diff --git a/release-notes/v4-tucker/4.6.0.md b/release-notes/v4-tucker/4.6.0.md index 3188224c..6157db8c 100644 --- a/release-notes/v4-tucker/4.6.0.md +++ b/release-notes/v4-tucker/4.6.0.md @@ -25,7 +25,7 @@ An important change is that logging to standard out/error will _not_ include the ### Data Loader -4.6 includes a new [data loader](/docs/developers/applications/data-loader) that can be used to load data into HarperDB as part of a component. The data loader can be used to load data from JSON file and can be deployed and distributed with a component to provide a reliable mechanism for ensuring specific records are loaded into Harper. +4.6 includes a new [data loader](/docs/reference/applications/data-loader) that can be used to load data into HarperDB as part of a component. The data loader can be used to load data from JSON files and can be deployed and distributed with a component to provide a reliable mechanism for ensuring specific records are loaded into Harper. ### Resource API Upgrades diff --git a/versioned_docs/version-4.6/developers/applications/data-loader.md b/versioned_docs/version-4.6/developers/applications/data-loader.md deleted file mode 100644 index ba9f433c..00000000 --- a/versioned_docs/version-4.6/developers/applications/data-loader.md +++ /dev/null @@ -1,181 +0,0 @@ ---- -title: Data Loader ---- - -# Data Loader - -The Data Loader is a built-in component that provides a reliable mechanism for loading data from JSON or YAML files into Harper tables as part of component deployment. This feature is particularly useful for ensuring specific records exist in your database when deploying components, such as seed data, configuration records, or initial application data. 
- -## Configuration - -To use the Data Loader, first specify your data files in the `config.yaml` in your component directory: - -```yaml -dataLoader: - files: 'data/*.json' -``` - -The Data Loader is an [Extension](../../reference/components#extensions) and supports the standard `files` configuration option. - -## Data File Format - -Data files can be structured as either JSON or YAML files containing the records you want to load. Each data file must specify records for a single table - if you need to load data into multiple tables, create separate data files for each table. - -### Basic Example - -Create a data file in your component's data directory (one table per file): - -```json -{ - "database": "myapp", - "table": "users", - "records": [ - { - "id": 1, - "username": "admin", - "email": "admin@example.com", - "role": "administrator" - }, - { - "id": 2, - "username": "user1", - "email": "user1@example.com", - "role": "standard" - } - ] -} -``` - -### Multiple Tables - -To load data into multiple tables, create separate data files for each table: - -**users.json:** - -```json -{ - "database": "myapp", - "table": "users", - "records": [ - { - "id": 1, - "username": "admin", - "email": "admin@example.com" - } - ] -} -``` - -**settings.yaml:** - -```yaml -database: myapp -table: settings -records: - - id: 1 - setting_name: app_name - setting_value: My Application - - id: 2 - setting_name: version - setting_value: '1.0.0' -``` - -## File Organization - -You can organize your data files in various ways: - -### Single File Pattern - -```yaml -dataLoader: - files: 'data/seed-data.json' -``` - -### Multiple Files Pattern - -```yaml -dataLoader: - files: - - 'data/users.json' - - 'data/settings.yaml' - - 'data/initial-products.json' -``` - -### Glob Pattern - -```yaml -dataLoader: - files: 'data/**/*.{json,yaml,yml}' -``` - -## Loading Behavior - -When Harper starts up with a component that includes the Data Loader: - -1. 
The Data Loader reads all specified data files (JSON or YAML) -1. For each file, it validates that a single table is specified -1. Records are inserted or updated based on timestamp comparison: - - New records are inserted if they don't exist - - Existing records are updated only if the data file's modification time is newer than the record's updated time - - This ensures data files can be safely reloaded without overwriting newer changes -1. If records with the same primary key already exist, updates occur only when the file is newer - -Note: While the Data Loader can create tables automatically by inferring the schema from the provided records, it's recommended to define your table schemas explicitly using the [graphqlSchema](../applications/defining-schemas) component for better control and type safety. - -## Best Practices - -1. **Define Schemas First**: While the Data Loader can infer schemas, it's strongly recommended to define your table schemas and relations explicitly using the [graphqlSchema](../applications/defining-schemas) component before loading data. This ensures proper data types, constraints, and relationships between tables. - -1. **One Table Per File**: Remember that each data file can only load records into a single table. Organize your files accordingly. - -1. **Idempotency**: Design your data files to be idempotent - they should be safe to load multiple times without creating duplicate or conflicting data. - -1. **Version Control**: Include your data files in version control to ensure consistency across deployments. - -1. **Environment-Specific Data**: Consider using different data files for different environments (development, staging, production). - -1. **Data Validation**: Ensure your data files are valid JSON or YAML and match your table schemas before deployment. - -1. **Sensitive Data**: Avoid including sensitive data like passwords or API keys directly in data files. Use environment variables or secure configuration management instead. 
- -## Example Component Structure - -``` -my-component/ -├── config.yaml -├── data/ -│ ├── users.json -│ ├── roles.json -│ └── settings.json -├── schemas.graphql -└── roles.yaml -``` - -With this structure, your `config.yaml` might look like: - -```yaml -# Load environment variables first -loadEnv: - files: '.env' - -# Define schemas -graphqlSchema: - files: 'schemas.graphql' - -# Define roles -roles: - files: 'roles.yaml' - -# Load initial data -dataLoader: - files: 'data/*.json' - -# Enable REST endpoints -rest: true -``` - -## Related Documentation - -- [Built-In Components](../../reference/components/built-in-extensions) -- [Extensions](../../reference/components/extensions) -- [Bulk Operations](../operations-api/bulk-operations) - For loading data via the Operations API diff --git a/versioned_docs/version-4.6/developers/applications/loading-data.md b/versioned_docs/version-4.6/developers/applications/loading-data.md new file mode 100644 index 00000000..c6b0e18c --- /dev/null +++ b/versioned_docs/version-4.6/developers/applications/loading-data.md @@ -0,0 +1,134 @@ +--- +title: Loading Data +--- + +# Loading Data + +Now that you’ve set up your first application, let’s bring it to life with some data. Applications are only as useful as the information they hold, and Harper makes it simple to seed your database with initial records, configuration values, or even test users, without needing to write a custom script. This is where the Data Loader plugin comes in. + +Think of the Data Loader as your shortcut for putting essential data in place from day one. Whether it’s a set of default settings, an admin user account, or sample data for development, the Data Loader ensures that when your application is deployed, it’s immediately usable. + +In this section, we’ll add a few dogs to our `Dog` table so our application starts with meaningful data. 
+ +## Creating a Data File + +First, let’s make a `data` directory in our app and create a file called `dogs.json`: + +```json +{ + "database": "myapp", + "table": "Dog", + "records": [ + { + "id": 1, + "name": "Harper", + "breed": "Labrador", + "age": 3, + "tricks": ["sit"] + }, + { + "id": 2, + "name": "Balto", + "breed": "Husky", + "age": 5, + "tricks": ["run", "pull sled"] + } + ] +} +``` + +This file tells Harper: _“Insert these two records into the `Dog` table when this app runs.”_ + +## Connecting the Data Loader + +Next, let’s tell Harper to use this file when running the application. Open `config.yaml` in the root of your project and add: + +```yaml +dataLoader: + files: 'data/dogs.json' +``` + +That’s it. Now the Data Loader knows where to look. + +## Running with Data + +Go ahead and start your app again: + +```bash +harperdb dev . +``` + +This time, when Harper runs, it will automatically read `dogs.json` and load the records into the `Dog` table. You don’t need to write any import scripts or SQL statements; it just works. + +You can confirm the data is there by hitting the endpoint you created earlier: + +```bash +curl http://localhost:9926/Dog/ +``` + +You should see both `Harper` and `Balto` returned as JSON. + +:::info +💡 Notice the trailing `/` in the URL (`/Dog/`). This tells Harper to return all records in the table. Leaving it off would look for a single record instead. + +For more details on querying tables, resources, and records with the REST plugin, see the [REST reference docs](../../developers/rest). +::: + +### Updating Records + +What happens if you change the data file? Let’s update Harper’s age from 3 to 4 in `dogs.json`. + +```json +{ + "id": 1, + "name": "Harper", + "breed": "Labrador", + "age": 4, + "tricks": ["sit"] +} +``` + +When you save the file, Harper will notice the change and reload. The next time you query the endpoint, Harper’s age will be updated. + +The Data Loader is designed to be safe and repeatable. 
If a record already exists, it will only update when the file is newer than the record. This means you can re-run deployments without worrying about duplicates. + +### Adding More Tables + +If your app grows and you want to seed more than just dogs, you can create additional files. For example, a `breeds.yaml` file: + +```yaml +database: myapp +table: Breed +records: + - id: 1 + name: Labrador + size: Large + lifespan: 12 + - id: 2 + name: Husky + size: Medium + lifespan: 14 +``` + +Then add it to your config: + +```yaml +dataLoader: + files: + - 'data/dogs.json' + - 'data/breeds.yaml' +``` + +Harper will read both files and load them into their respective tables. + +## Key Takeaway + +With the Data Loader, your app doesn’t start empty. It starts ready to use. You define your schema, write a simple data file, and Harper takes care of loading it. This keeps your applications consistent across environments, safe to redeploy, and quick to get started with. + +In just a few steps, we’ve gone from an empty Dog table to a real application with data that’s instantly queryable. + +## Related Documentation + +- [Data Loader Reference](../../reference/applications/data-loader) – Complete configuration and format options. +- [Bulk Operations](../operations-api/bulk-operations) - For loading data via the Operations API +- [Plugins](../../reference/components/plugins) – For adding custom functionality to applications. diff --git a/versioned_docs/version-4.6/reference/applications/data-loader.md b/versioned_docs/version-4.6/reference/applications/data-loader.md new file mode 100644 index 00000000..189d588a --- /dev/null +++ b/versioned_docs/version-4.6/reference/applications/data-loader.md @@ -0,0 +1,179 @@ +--- +title: Data Loader +--- + +# Data Loader + +The Data Loader is a built-in plugin that provides a reliable mechanism for loading data from JSON or YAML files into Harper tables during component deployment. 
It is typically used to ensure that specific records exist in a database when deploying components, such as seed data, configuration records, or initial application data. + +## Configuration + +Enable the Data Loader in your component’s `config.yaml` file by specifying one or more data files: + +```yaml +dataLoader: + files: 'data/*.json' +``` + +The Data Loader is an Extension and supports the standard `files` configuration option. + +## Data File Format + +Data files must be structured as JSON or YAML and contain records for a single table. + +If you need to load data into multiple tables, create a separate file for each table. + +### Basic Example + +**`users.json`** + +```json +{ + "database": "myapp", + "table": "users", + "records": [ + { + "id": 1, + "username": "admin", + "email": "admin@example.com", + "role": "administrator" + }, + { + "id": 2, + "username": "user1", + "email": "user1@example.com", + "role": "standard" + } + ] +} +``` + +### Multiple Tables + +To load multiple tables, use separate files: +**`users.json`** + +```json +{ + "database": "myapp", + "table": "users", + "records": [ + { + "id": 1, + "username": "admin", + "email": "admin@example.com" + } + ] +} +``` + +**`settings.yaml`** + +```yaml +database: myapp +table: settings +records: + - id: 1 + setting_name: app_name + setting_value: My Application + - id: 2 + setting_name: version + setting_value: '1.0.0' +``` + +## File Organization + +Data files can be referenced in several ways: + +**Single File Pattern** + +```yaml +dataLoader: + files: 'data/seed-data.json' +``` + +**Multiple Files Pattern** + +```yaml +dataLoader: + files: + - 'data/users.json' + - 'data/settings.yaml' + - 'data/initial-products.json' +``` + +**Glob pattern** + +```yaml +dataLoader: + files: 'data/**/*.{json,yaml,yml}' +``` + +## Loading Behavior + +When Harper starts with a component that includes the Data Loader: + +- The Data Loader reads all specified data files (.json, .yaml, .yml). 
+- For each file, it validates that only one table is specified. +- Records are inserted or updated based on timestamp comparison: + - New records are inserted if they do not exist. + - Existing records are updated only if the data file’s modification time is newer than the record’s last updated time. +- This behavior ensures files can be safely reloaded without overwriting newer changes. + +If a record with the same primary key already exists, updates occur only when the file is newer. + +:::info +The Data Loader can infer table schemas from the provided records, but it is recommended to explicitly define schemas with the `graphqlSchema` component for type safety and better control. +::: + +## Best Practices + +- **Define schemas first** – Explicit schemas ensure correct types, constraints, and relationships. +- **One table per file** – Each file should only define records for one table. +- **Idempotency** – Write data files so they can be safely reloaded without creating duplicates. +- **Version control** – Commit data files to ensure consistent deployments. +- **Environment-specific data** – Use different data files per environment (development, staging, production). +- **Validate data** – Confirm JSON/YAML syntax and schema compatibility before deployment. +- **Avoid sensitive data** – Do not store credentials or API keys in data files. Use environment variables or secure configuration management instead. 
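The loading rules and the "Validate data" practice above can be sketched as a small pre-deployment check. This is only an illustration, not Harper's implementation: the function names `validate_data_file` and `should_write` are hypothetical, the primary key is assumed to be `id` as in the examples, and timestamps are assumed to be comparable numbers such as epoch seconds.

```python
import json
from typing import Any, Dict, List, Optional


def validate_data_file(doc: Dict[str, Any]) -> List[str]:
    """Return the problems found in one parsed data file (empty list if none).

    Hypothetical helper: checks the shape used by the examples in this
    document, i.e. a single "table" name plus a list of "records".
    """
    problems: List[str] = []
    if not isinstance(doc.get("table"), str):
        problems.append("'table' must be a single table name")
    records = doc.get("records")
    if not isinstance(records, list):
        problems.append("'records' must be a list")
        return problems

    seen_ids = set()
    for index, record in enumerate(records):
        # Assumption: every record carries its primary key under "id".
        if not isinstance(record, dict) or "id" not in record:
            problems.append(f"record {index} has no 'id'")
            continue
        marker = repr(record["id"])  # repr() keeps 1 and "1" distinct
        if marker in seen_ids:
            problems.append(f"duplicate id {record['id']!r}")
        seen_ids.add(marker)
    return problems


def should_write(file_mtime: float, record_updated: Optional[float]) -> bool:
    """Mirror the documented rule: insert a missing record, and update an
    existing one only when the data file is newer than the record."""
    return record_updated is None or file_mtime > record_updated


if __name__ == "__main__":
    sample = json.loads(
        '{"database": "myapp", "table": "users", "records": [{"id": 1}, {"id": 1}]}'
    )
    for problem in validate_data_file(sample):
        print(problem)
```

A check like this could run in CI before a component is deployed, so that a malformed or duplicate-laden data file fails early rather than at load time. The `should_write` function is included only to make the timestamp rule concrete; the actual comparison is performed by the Data Loader itself.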
+ +## Example Component Structure + +``` +my-component/ +├── config.yaml +├── data/ +│ ├── users.json +│ ├── roles.json +│ └── settings.json +├── schemas.graphql +└── roles.yaml +``` + +**`config.yaml`** + +```yaml +# Load environment variables +loadEnv: + files: '.env' + +# Define schemas +graphqlSchema: + files: 'schemas.graphql' + +# Define roles +roles: + files: 'roles.yaml' + +# Load initial data +dataLoader: + files: 'data/*.json' + +# Enable REST endpoints +rest: true +``` + +## Related Documentation + +- [Data Loader (Application Guide)](../../developers/applications/loading-data) – Step-by-step walkthrough with examples. +- [Bulk Operations](../../developers/operations-api/bulk-operations) – Load data programmatically via the Operations API. +- [Extensions](../components/extensions) – General reference for Harper Extensions. diff --git a/versioned_docs/version-4.6/reference/components/built-in-extensions.md b/versioned_docs/version-4.6/reference/components/built-in-extensions.md index 49ec5fcb..33dcc7a0 100644 --- a/versioned_docs/version-4.6/reference/components/built-in-extensions.md +++ b/versioned_docs/version-4.6/reference/components/built-in-extensions.md @@ -24,7 +24,7 @@ Load data from JSON or YAML files into Harper tables as part of component deploy This component is an [Extension](..#extensions) and can be configured with the `files` configuration option. -Complete documentation for this feature is available here: [Data Loader](../../developers/applications/data-loader) +Complete documentation for this feature is available here: [Data Loader](../../developers/applications/loading-data) ```yaml dataLoader: