From d6573464deaf572e492b2e2815a56c6b3632f045 Mon Sep 17 00:00:00 2001 From: nenharper Date: Tue, 16 Sep 2025 13:53:49 -0500 Subject: [PATCH 1/5] Create new data loader page --- .../developers/applications/data-loader.md | 186 ++++++------------ .../reference/applications/data-loader.md | 179 +++++++++++++++++ 2 files changed, 244 insertions(+), 121 deletions(-) create mode 100644 versioned_docs/version-4.6/reference/applications/data-loader.md diff --git a/versioned_docs/version-4.6/developers/applications/data-loader.md b/versioned_docs/version-4.6/developers/applications/data-loader.md index ba9f433c..1d5bafcc 100644 --- a/versioned_docs/version-4.6/developers/applications/data-loader.md +++ b/versioned_docs/version-4.6/developers/applications/data-loader.md @@ -4,178 +4,122 @@ title: Data Loader # Data Loader -The Data Loader is a built-in component that provides a reliable mechanism for loading data from JSON or YAML files into Harper tables as part of component deployment. This feature is particularly useful for ensuring specific records exist in your database when deploying components, such as seed data, configuration records, or initial application data. +Now that you’ve set up your first application, let’s bring it to life with some data. Applications are only as useful as the information they hold, and Harper makes it simple to seed your database with initial records, configuration values, or even test users, without needing to write a custom script. This is where the Data Loader comes in. -## Configuration +Think of the Data Loader as your shortcut for putting essential data in place from day one. Whether it’s a set of default settings, an admin user account, or sample data for development, the Data Loader ensures that when your application is deployed, it’s immediately usable. 
-To use the Data Loader, first specify your data files in the `config.yaml` in your component directory: +In this section, we’ll add a few dogs to our `Dog` table so our application starts with meaningful data. -```yaml -dataLoader: - files: 'data/*.json' -``` - -The Data Loader is an [Extension](../../reference/components#extensions) and supports the standard `files` configuration option. +## Creating a Data File -## Data File Format - -Data files can be structured as either JSON or YAML files containing the records you want to load. Each data file must specify records for a single table - if you need to load data into multiple tables, create separate data files for each table. - -### Basic Example - -Create a data file in your component's data directory (one table per file): +First, let’s make a `data` directory in our app and create a file called `dogs.json`: ```json { "database": "myapp", - "table": "users", + "table": "Dog", "records": [ { "id": 1, - "username": "admin", - "email": "admin@example.com", - "role": "administrator" + "name": "Harper", + "breed": "Labrador", + "age": 3, + "tricks": ["sit"] }, { "id": 2, - "username": "user1", - "email": "user1@example.com", - "role": "standard" + "name": "Balto", + "breed": "Husky", + "age": 5, + "tricks": ["run", "pull sled"] } ] } ``` -### Multiple Tables +This file tells Harper: _“Insert these two records into the `Dog` table when this app runs.”_ -To load data into multiple tables, create separate data files for each table: +## Connecting the Data Loader -**users.json:** - -```json -{ - "database": "myapp", - "table": "users", - "records": [ - { - "id": 1, - "username": "admin", - "email": "admin@example.com" - } - ] -} -``` - -**settings.yaml:** - -```yaml -database: myapp -table: settings -records: - - id: 1 - setting_name: app_name - setting_value: My Application - - id: 2 - setting_name: version - setting_value: '1.0.0' -``` - -## File Organization - -You can organize your data files in various ways: - -### 
Single File Pattern +Next, let’s tell Harper to use this file when running the application. Open `config.yaml` in the root of your project and add: ```yaml dataLoader: - files: 'data/seed-data.json' + files: 'data/dogs.json' ``` -### Multiple Files Pattern +That’s it. Now the Data Loader knows where to look. -```yaml -dataLoader: - files: - - 'data/users.json' - - 'data/settings.yaml' - - 'data/initial-products.json' -``` +## Running with Data -### Glob Pattern +Go ahead and start your app again: -```yaml -dataLoader: - files: 'data/**/*.{json,yaml,yml}' +```bash +harperdb dev . ``` -## Loading Behavior +This time, when Harper runs, it will automatically read `dogs.json` and load the records into the Dog table. You don’t need to write any import scripts or SQL statements, it just works. -When Harper starts up with a component that includes the Data Loader: +You can confirm the data is there by hitting the endpoint you created earlier: -1. The Data Loader reads all specified data files (JSON or YAML) -1. For each file, it validates that a single table is specified -1. Records are inserted or updated based on timestamp comparison: - - New records are inserted if they don't exist - - Existing records are updated only if the data file's modification time is newer than the record's updated time - - This ensures data files can be safely reloaded without overwriting newer changes -1. If records with the same primary key already exist, updates occur only when the file is newer - -Note: While the Data Loader can create tables automatically by inferring the schema from the provided records, it's recommended to define your table schemas explicitly using the [graphqlSchema](../applications/defining-schemas) component for better control and type safety. - -## Best Practices +```bash +curl http://localhost:9926/Dog/ +``` -1. 
**Define Schemas First**: While the Data Loader can infer schemas, it's strongly recommended to define your table schemas and relations explicitly using the [graphqlSchema](../applications/defining-schemas) component before loading data. This ensures proper data types, constraints, and relationships between tables. +You should see both `Harper` and `Balto` returned as JSON. -1. **One Table Per File**: Remember that each data file can only load records into a single table. Organize your files accordingly. +### Updating Records -1. **Idempotency**: Design your data files to be idempotent - they should be safe to load multiple times without creating duplicate or conflicting data. +What happens if you change the data file? Let’s update Harper’s age from 3 to 4 in `dogs.json.` -1. **Version Control**: Include your data files in version control to ensure consistency across deployments. +```json +{ + "id": 1, + "name": "Harper", + "breed": "Labrador", + "age": 4, + "tricks": ["sit"] +} +``` -1. **Environment-Specific Data**: Consider using different data files for different environments (development, staging, production). +When you save the file, Harper will notice the change and reload. The next time you query the endpoint, Harper’s age will be updated. -1. **Data Validation**: Ensure your data files are valid JSON or YAML and match your table schemas before deployment. +The Data Loader is designed to be safe and repeatable. If a record already exists, it will only update when the file is newer than the record. This means you can re-run deployments without worrying about duplicates. -1. **Sensitive Data**: Avoid including sensitive data like passwords or API keys directly in data files. Use environment variables or secure configuration management instead. +### Adding More Tables -## Example Component Structure +If your app grows and you want to seed more than just dogs, you can create additional files. 
For example, a `settings.yaml` file: -``` -my-component/ -├── config.yaml -├── data/ -│ ├── users.json -│ ├── roles.json -│ └── settings.json -├── schemas.graphql -└── roles.yaml +```yaml +database: myapp +table: Settings +records: + - id: 1 + setting_name: app_name + setting_value: Dog Tracker + - id: 2 + setting_name: version + setting_value: '1.0.0' ``` -With this structure, your `config.yaml` might look like: +Then add it to your config: ```yaml -# Load environment variables first -loadEnv: - files: '.env' +dataLoader: + files: + - 'data/dogs.json' + - 'data/settings.yaml' +``` -# Define schemas -graphqlSchema: - files: 'schemas.graphql' +Harper will read both files and load them into their respective tables. -# Define roles -roles: - files: 'roles.yaml' +## Key Takeaway -# Load initial data -dataLoader: - files: 'data/*.json' +With the Data Loader, your app doesn’t start empty. It starts ready to use. You define your schema, write a simple data file, and Harper takes care of loading it. This keeps your applications consistent across environments, safe to redeploy, and quick to get started with. -# Enable REST endpoints -rest: true -``` +In just a few steps, we’ve gone from an empty Dog table to a real application with data that’s instantly queryable. ## Related Documentation -- [Built-In Components](../../reference/components/built-in-extensions) -- [Extensions](../../reference/components/extensions) +- [Data Loader Reference](../../reference/applications/data-loader) – Complete configuration and format options. 
- [Bulk Operations](../operations-api/bulk-operations) - For loading data via the Operations API diff --git a/versioned_docs/version-4.6/reference/applications/data-loader.md b/versioned_docs/version-4.6/reference/applications/data-loader.md new file mode 100644 index 00000000..46763a94 --- /dev/null +++ b/versioned_docs/version-4.6/reference/applications/data-loader.md @@ -0,0 +1,179 @@ +--- +title: Data Loader +--- + +# Data Loader + +The Data Loader is a built-in component that provides a reliable mechanism for loading data from JSON or YAML files into Harper tables during component deployment. It is typically used to ensure that specific records exist in a database when deploying components, such as seed data, configuration records, or initial application data. + +## Configuration + +Enable the Data Loader in your component’s `config.yaml` file by specifying one or more data files: + +```yaml +dataLoader: + files: 'data/*.json' +``` + +The Data Loader is an Extension and supports the standard `files` configuration option. + +## Data File Format + +Data files must be structured as JSON or YAML and contain records for a single table. + +If you need to load data into multiple tables, create a separate file for each table. 
+ +### Basic Example + +**`users.json`** + +```json +{ + "database": "myapp", + "table": "users", + "records": [ + { + "id": 1, + "username": "admin", + "email": "admin@example.com", + "role": "administrator" + }, + { + "id": 2, + "username": "user1", + "email": "user1@example.com", + "role": "standard" + } + ] +} +``` + +### Multiple Tables + +To load multiple tables, use separate files: +**`users.json`** + +```json +{ + "database": "myapp", + "table": "users", + "records": [ + { + "id": 1, + "username": "admin", + "email": "admin@example.com" + } + ] +} +``` + +**`settings.yaml`** + +```yaml +database: myapp +table: settings +records: + - id: 1 + setting_name: app_name + setting_value: My Application + - id: 2 + setting_name: version + setting_value: '1.0.0' +``` + +## File Organization + +Data files can be referenced in several ways: + +**Single File Pattern** + +```yaml +dataLoader: + files: 'data/seed-data.json' +``` + +**Multiple Files Pattern** + +```yaml +dataLoader: + files: + - 'data/users.json' + - 'data/settings.yaml' + - 'data/initial-products.json' +``` + +**Glob pattern** + +```yaml +dataLoader: + files: 'data/**/*.{json,yaml,yml}' +``` + +## Loading Behavior + +When Harper starts with a component that includes the Data Loader: + +- The Data Loader reads all specified data files (.json, .yaml, .yml). +- For each file, it validates that only one table is specified. +- Records are inserted or updated based on timestamp comparison: + - New records are inserted if they do not exist. + - Existing records are updated only if the data file’s modification time is newer than the record’s last updated time. +- This behavior ensures files can be safely reloaded without overwriting newer changes. + +If a record with the same primary key already exists, updates occur only when the file is newer. 
+ +:::info +Note: The Data Loader can infer table schemas from the provided records, but it is recommended to explicitly define schemas with the `graphqlSchema` component for type safety and better control. +::: + +## Best Practices + +- **Define schemas first** – Explicit schemas ensure correct types, constraints, and relationships. +- **One table per file** – Each file should only define records for one table. +- **Idempotency** – Write data files so they can be safely reloaded without creating duplicates. +- **Version control** – Commit data files to ensure consistent deployments. +- **Environment-specific data** – Use different data files per environment (development, staging, production). +- **Validate data** – Confirm JSON/YAML syntax and schema compatibility before deployment. +- **Avoid sensitive data** – Do not store credentials or API keys in data files. Use environment variables or secure configuration management instead. + +## Example Component Structure + +``` +my-component/ +├── config.yaml +├── data/ +│ ├── users.json +│ ├── roles.json +│ └── settings.json +├── schemas.graphql +└── roles.yaml +``` + +**`config.yaml`** + +```yaml +# Load environment variables +loadEnv: + files: '.env' + +# Define schemas +graphqlSchema: + files: 'schemas.graphql' + +# Define roles +roles: + files: 'roles.yaml' + +# Load initial data +dataLoader: + files: 'data/*.json' + +# Enable REST endpoints +rest: true +``` + +## Related Documentation + +- [Data Loader (Application Guide)](../../developers/applications/data-loader) – Step-by-step walkthrough with examples. +- [Bulk Operations](../../developers/operations-api/bulk-operations) – Load data programmatically via the Operations API. +- [Extensions](../components/extensions) – General reference for Harper Extensions. 
From bb2b4013086df8a13d308707a1adbcb5cd9d7af2 Mon Sep 17 00:00:00 2001 From: nenharper Date: Wed, 24 Sep 2025 01:56:55 -0500 Subject: [PATCH 2/5] Update versioned_docs/version-4.6/developers/applications/data-loader.md Co-authored-by: Ethan Arrowood --- .../version-4.6/developers/applications/data-loader.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/versioned_docs/version-4.6/developers/applications/data-loader.md b/versioned_docs/version-4.6/developers/applications/data-loader.md index 1d5bafcc..f4aef8ca 100644 --- a/versioned_docs/version-4.6/developers/applications/data-loader.md +++ b/versioned_docs/version-4.6/developers/applications/data-loader.md @@ -4,7 +4,7 @@ title: Data Loader # Data Loader -Now that you’ve set up your first application, let’s bring it to life with some data. Applications are only as useful as the information they hold, and Harper makes it simple to seed your database with initial records, configuration values, or even test users, without needing to write a custom script. This is where the Data Loader comes in. +Now that you’ve set up your first application, let’s bring it to life with some data. Applications are only as useful as the information they hold, and Harper makes it simple to seed your database with initial records, configuration values, or even test users, without needing to write a custom script. This is where the Data Loader plugin comes in. Think of the Data Loader as your shortcut for putting essential data in place from day one. Whether it’s a set of default settings, an admin user account, or sample data for development, the Data Loader ensures that when your application is deployed, it’s immediately usable. 
From 601ec044e331b70ef349aa7ceec16ef7b301f7f0 Mon Sep 17 00:00:00 2001 From: nenharper Date: Wed, 24 Sep 2025 01:57:05 -0500 Subject: [PATCH 3/5] Update versioned_docs/version-4.6/reference/applications/data-loader.md Co-authored-by: Ethan Arrowood --- .../version-4.6/reference/applications/data-loader.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/versioned_docs/version-4.6/reference/applications/data-loader.md b/versioned_docs/version-4.6/reference/applications/data-loader.md index 46763a94..75cf1368 100644 --- a/versioned_docs/version-4.6/reference/applications/data-loader.md +++ b/versioned_docs/version-4.6/reference/applications/data-loader.md @@ -4,7 +4,7 @@ title: Data Loader # Data Loader -The Data Loader is a built-in component that provides a reliable mechanism for loading data from JSON or YAML files into Harper tables during component deployment. It is typically used to ensure that specific records exist in a database when deploying components, such as seed data, configuration records, or initial application data. +The Data Loader is a built-in plugin that provides a reliable mechanism for loading data from JSON or YAML files into Harper tables during component deployment. It is typically used to ensure that specific records exist in a database when deploying components, such as seed data, configuration records, or initial application data. 
## Configuration From 84b7385d6a0d746dfd311428866d0ff36dc33723 Mon Sep 17 00:00:00 2001 From: nenharper Date: Wed, 24 Sep 2025 07:53:49 -0500 Subject: [PATCH 4/5] Address comments --- .../{data-loader.md => loading-data.md} | 27 ++++++++++++------- 1 file changed, 18 insertions(+), 9 deletions(-) rename versioned_docs/version-4.6/developers/applications/{data-loader.md => loading-data.md} (84%) diff --git a/versioned_docs/version-4.6/developers/applications/data-loader.md b/versioned_docs/version-4.6/developers/applications/loading-data.md similarity index 84% rename from versioned_docs/version-4.6/developers/applications/data-loader.md rename to versioned_docs/version-4.6/developers/applications/loading-data.md index f4aef8ca..c6b0e18c 100644 --- a/versioned_docs/version-4.6/developers/applications/data-loader.md +++ b/versioned_docs/version-4.6/developers/applications/loading-data.md @@ -1,8 +1,8 @@ --- -title: Data Loader +title: Loading Data --- -# Data Loader +# Loading Data Now that you’ve set up your first application, let’s bring it to life with some data. Applications are only as useful as the information they hold, and Harper makes it simple to seed your database with initial records, configuration values, or even test users, without needing to write a custom script. This is where the Data Loader plugin comes in. @@ -68,6 +68,12 @@ curl http://localhost:9926/Dog/ You should see both `Harper` and `Balto` returned as JSON. +:::info +💡 Notice the trailing `/` in the URL (`/Dog/`). This tells Harper to return all records in the table. Leaving it off would look for a single record instead. + +For more details on querying tables, resources, and records with the REST plugin, see the [REST reference docs](../../developers/rest). +::: + ### Updating Records What happens if you change the data file? Let’s update Harper’s age from 3 to 4 in `dogs.json.` @@ -88,18 +94,20 @@ The Data Loader is designed to be safe and repeatable. 
If a record already exist ### Adding More Tables -If your app grows and you want to seed more than just dogs, you can create additional files. For example, a `settings.yaml` file: +If your app grows and you want to seed more than just dogs, you can create additional files. For example, a `breeds.yaml` file: ```yaml database: myapp -table: Settings +table: Breed records: - id: 1 - setting_name: app_name - setting_value: Dog Tracker + name: Labrador + size: Large + lifespan: 12 - id: 2 - setting_name: version - setting_value: '1.0.0' + name: Husky + size: Medium + lifespan: 14 ``` Then add it to your config: @@ -108,7 +116,7 @@ Then add it to your config: dataLoader: files: - 'data/dogs.json' - - 'data/settings.yaml' + - 'data/breeds.yaml' ``` Harper will read both files and load them into their respective tables. @@ -123,3 +131,4 @@ In just a few steps, we’ve gone from an empty Dog table to a real application - [Data Loader Reference](../../reference/applications/data-loader) – Complete configuration and format options. - [Bulk Operations](../operations-api/bulk-operations) - For loading data via the Operations API +- [Plugins](../../reference/components/plugins) – For adding custom functionality to applications. 
From 65c7e249d66d184f5969b6103bd2e8828de856e9 Mon Sep 17 00:00:00 2001 From: Nathan Heskew Date: Tue, 7 Oct 2025 13:55:20 -0700 Subject: [PATCH 5/5] fixing links --- release-notes/v4-tucker/4.6.0.md | 2 +- .../version-4.6/reference/applications/data-loader.md | 2 +- .../version-4.6/reference/components/built-in-extensions.md | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/release-notes/v4-tucker/4.6.0.md b/release-notes/v4-tucker/4.6.0.md index 3188224c..6157db8c 100644 --- a/release-notes/v4-tucker/4.6.0.md +++ b/release-notes/v4-tucker/4.6.0.md @@ -25,7 +25,7 @@ An important change is that logging to standard out/error will _not_ include the ### Data Loader -4.6 includes a new [data loader](/docs/developers/applications/data-loader) that can be used to load data into HarperDB as part of a component. The data loader can be used to load data from JSON file and can be deployed and distributed with a component to provide a reliable mechanism for ensuring specific records are loaded into Harper. +4.6 includes a new [data loader](/docs/reference/applications/data-loader) that can be used to load data into HarperDB as part of a component. The data loader can be used to load data from JSON file and can be deployed and distributed with a component to provide a reliable mechanism for ensuring specific records are loaded into Harper. ### Resource API Upgrades diff --git a/versioned_docs/version-4.6/reference/applications/data-loader.md b/versioned_docs/version-4.6/reference/applications/data-loader.md index 75cf1368..189d588a 100644 --- a/versioned_docs/version-4.6/reference/applications/data-loader.md +++ b/versioned_docs/version-4.6/reference/applications/data-loader.md @@ -174,6 +174,6 @@ rest: true ## Related Documentation -- [Data Loader (Application Guide)](../../developers/applications/data-loader) – Step-by-step walkthrough with examples. 
+- [Data Loader (Application Guide)](../../developers/applications/loading-data) – Step-by-step walkthrough with examples. - [Bulk Operations](../../developers/operations-api/bulk-operations) – Load data programmatically via the Operations API. - [Extensions](../components/extensions) – General reference for Harper Extensions. diff --git a/versioned_docs/version-4.6/reference/components/built-in-extensions.md b/versioned_docs/version-4.6/reference/components/built-in-extensions.md index 49ec5fcb..33dcc7a0 100644 --- a/versioned_docs/version-4.6/reference/components/built-in-extensions.md +++ b/versioned_docs/version-4.6/reference/components/built-in-extensions.md @@ -24,7 +24,7 @@ Load data from JSON or YAML files into Harper tables as part of component deploy This component is an [Extension](..#extensions) and can be configured with the `files` configuration option. -Complete documentation for this feature is available here: [Data Loader](../../developers/applications/data-loader) +Complete documentation for this feature is available here: [Data Loader](../../developers/applications/loading-data) ```yaml dataLoader: