| 1 | +# gulp-etl-tap-mysql # |
| 2 | + |
| 3 | +*(This plugin is being developed from **gulp-etl-tap-csv**. The original readme from [gulp-etl-tap-csv](https://github.com/gulpetl/gulp-etl-tap-csv) is below.)* |
| 4 | + |
| 5 | +This plugin converts CSV files to **gulp-etl** **Message Stream** files. It was originally adapted from the [gulp-etl-handlelines](https://github.com/gulpetl/gulp-etl-handlelines) model plugin, and it is a **gulp-etl** wrapper for [csv-parse](https://csv.js.org/parse/). |
| 6 | + |
| 7 | +This is a **[gulp-etl](https://gulpetl.com/)** plugin, and as such it is a [gulp](https://gulpjs.com/) plugin. **gulp-etl** plugins work with [ndjson](http://ndjson.org/) data streams/files, which we call **Message Streams** and which are compliant with the [Singer specification](https://github.com/singer-io/getting-started/blob/master/docs/SPEC.md#output). In the **gulp-etl** ecosystem, **taps** tap into an outside format or system (in this case, a CSV file) and convert its contents/output to a Message Stream, and **targets** convert/output Message Streams to an outside format or system. In this way, these modules can be stacked to convert from one format or system to another, either directly or with transformations or other parsing in between. Message Streams look like this: |
| 8 | + |
| 9 | +``` |
| 10 | +{"type": "SCHEMA", "stream": "users", "key_properties": ["id"], "schema": {"required": ["id"], "type": "object", "properties": {"id": {"type": "integer"}}}} |
| 11 | +{"type": "RECORD", "stream": "users", "record": {"id": 1, "name": "Chris"}} |
| 12 | +{"type": "RECORD", "stream": "users", "record": {"id": 2, "name": "Mike"}} |
| 13 | +{"type": "SCHEMA", "stream": "locations", "key_properties": ["id"], "schema": {"required": ["id"], "type": "object", "properties": {"id": {"type": "integer"}}}} |
| 14 | +{"type": "RECORD", "stream": "locations", "record": {"id": 1, "name": "Philadelphia"}} |
| 15 | +{"type": "STATE", "value": {"users": 2, "locations": 1}} |
| 16 | +``` |
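Because every line of a Message Stream is one complete JSON object, a stream can be processed one line at a time. The following is a minimal sketch of that idea, not code from **gulp-etl** itself: `summarizeMessageStream` is a hypothetical helper name, and the sample data is the example above.

```javascript
// Hypothetical helper (not part of gulp-etl): walk a Message Stream line by line
// and collect record counts per stream, the streams that declared a SCHEMA,
// and the most recent STATE value.
function summarizeMessageStream(text) {
  const summary = { recordCounts: {}, schemas: [], state: null };
  for (const line of text.split(/\r?\n/)) {
    if (line.trim() === '') continue; // tolerate blank lines, e.g. a trailing newline
    const message = JSON.parse(line); // each non-blank line is one JSON object
    if (message.type === 'RECORD') {
      const count = summary.recordCounts[message.stream] || 0;
      summary.recordCounts[message.stream] = count + 1;
    } else if (message.type === 'SCHEMA') {
      summary.schemas.push(message.stream);
    } else if (message.type === 'STATE') {
      summary.state = message.value; // keep only the latest STATE
    }
  }
  return summary;
}

const sample = [
  '{"type": "SCHEMA", "stream": "users", "key_properties": ["id"], "schema": {"required": ["id"], "type": "object", "properties": {"id": {"type": "integer"}}}}',
  '{"type": "RECORD", "stream": "users", "record": {"id": 1, "name": "Chris"}}',
  '{"type": "RECORD", "stream": "users", "record": {"id": 2, "name": "Mike"}}',
  '{"type": "SCHEMA", "stream": "locations", "key_properties": ["id"], "schema": {"required": ["id"], "type": "object", "properties": {"id": {"type": "integer"}}}}',
  '{"type": "RECORD", "stream": "locations", "record": {"id": 1, "name": "Philadelphia"}}',
  '{"type": "STATE", "value": {"users": 2, "locations": 1}}',
].join('\n');

const result = summarizeMessageStream(sample);
console.log(result);
```

For the sample above, this counts two `users` records and one `locations` record.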
| 17 | + |
| 18 | +### Usage |
| 19 | +**gulp-etl** plugins accept a configObj as the first parameter; the configObj |
| 20 | +contains any info the plugin needs. For this plugin the configObj is the "Options" object for [csv-parse](https://csv.js.org/parse/), described [here](https://csv.js.org/parse/options/). The only difference is that the "columns" property cannot be falsy, since a falsy value would cause each row to be returned |
| 21 | +as an array instead of an object. A falsy value for columns will be overridden to true. |
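The "columns" override can be pictured with a short sketch. This only illustrates the rule described above and assumes nothing about how the plugin implements it; `normalizeConfig` is a hypothetical name.

```javascript
// Hypothetical illustration of the "columns" rule; the plugin's real code may differ.
function normalizeConfig(configObj) {
  // Copy first so the caller's object is not mutated.
  const config = Object.assign({}, configObj);
  // Any falsy value (undefined, false, null, 0, '') is replaced with true,
  // so that rows come back as objects rather than arrays.
  if (!config.columns) {
    config.columns = true;
  }
  return config;
}

const fromEmpty = normalizeConfig({ delimiter: ';' });
const fromFalse = normalizeConfig({ columns: false });
const fromArray = normalizeConfig({ columns: ['id', 'name'] });

console.log(fromEmpty, fromFalse, fromArray);
```

Truthy values such as an explicit array of column names pass through unchanged; only falsy values are replaced.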
| 22 | + |
| 23 | +##### Sample gulpfile.js |
| 24 | +``` |
| 25 | +/* parse all .CSV files in a folder into Message Stream files in a different folder */ |
| 26 | + |
| 27 | +const gulp = require('gulp') |
| 28 | +const rename = require('gulp-rename') |
| 29 | +const tapCsv = require('gulp-etl-tap-csv').tapCsv |
| 30 | + |
| 31 | +exports.default = function() { |
| 32 | +    return gulp.src('data/*.csv') |
| 33 | +        .pipe(tapCsv({ columns: true })) |
| 34 | +        .pipe(rename({ extname: ".ndjson" })) // rename to *.ndjson |
| 35 | +        .pipe(gulp.dest('output/')); |
| 36 | +} |
| 37 | +``` |
| 38 | +### Quick Start for Coding on This Plugin |
| 39 | +* Dependencies: |
| 40 | + * [git](https://git-scm.com/downloads) |
| 41 | + * [nodejs](https://nodejs.org/en/download/releases/) - At least v6.3 (6.9 for Windows) required for TypeScript debugging |
| 42 | + * npm (installs with Node) |
| 43 | + * typescript - installed as a development dependency |
| 44 | +* Clone this repo and run `npm install` to install npm packages |
| 45 | +* Debug: with [VS Code](https://code.visualstudio.com/download) use `Open Folder` to open the project folder, then hit F5 to debug. This runs the TypeScript source directly using [ts-node](https://www.npmjs.com/package/ts-node), without compiling to JavaScript first |
| 46 | +* Test: `npm test` or `npm t` |
| 47 | +* Compile to JavaScript: `npm run build` |
| 48 | + |
| 49 | +### Testing |
| 50 | + |
| 51 | +We use [Jest](https://facebook.github.io/jest/docs/en/getting-started.html) for testing. All of our tests are in the `test` folder. |
| 52 | + |
| 53 | +- Run `npm test` to run the test suites |
| 54 | + |
| 55 | + |
| 56 | + |
| 57 | +Note: This document is written in [Markdown](https://daringfireball.net/projects/markdown/). We like to use [Typora](https://typora.io/) and [Markdown Preview Plus](https://chrome.google.com/webstore/detail/markdown-preview-plus/febilkbfcbhebfnokafefeacimjdckgl?hl=en-US) for our Markdown work. |