diff --git a/.gitignore b/.gitignore index eaa4e5e..abd38aa 100644 --- a/.gitignore +++ b/.gitignore @@ -1,2 +1,101 @@ -itcont.txt -node_modules/ \ No newline at end of file +# Logs +logs +*.log +npm-debug.log* +yarn-debug.log* +yarn-error.log* +lerna-debug.log* + +# Diagnostic reports (https://nodejs.org/api/report.html) +report.[0-9]*.[0-9]*.[0-9]*.[0-9]*.json + +# Runtime data +pids +*.pid +*.seed +*.pid.lock + +# Directory for instrumented libs generated by jscoverage/JSCover +lib-cov + +# Coverage directory used by tools like istanbul +coverage +*.lcov + +# nyc test coverage +.nyc_output + +# Grunt intermediate storage (https://gruntjs.com/creating-plugins#storing-task-files) +.grunt + +# Bower dependency directory (https://bower.io/) +bower_components + +# node-waf configuration +.lock-wscript + +# Compiled binary addons (https://nodejs.org/api/addons.html) +build/Release + +# Dependency directories +node_modules/ +jspm_packages/ + +# TypeScript v1 declaration files +typings/ + +# TypeScript cache +*.tsbuildinfo + +# Optional npm cache directory +.npm + +# Optional eslint cache +.eslintcache + +# Microbundle cache +.rpt2_cache/ +.rts2_cache_cjs/ +.rts2_cache_es/ +.rts2_cache_umd/ + +# Optional REPL history +.node_repl_history + +# Output of 'npm pack' +*.tgz + +# Yarn Integrity file +.yarn-integrity + +# dotenv environment variables file +.env +.env.test + +# parcel-bundler cache (https://parceljs.org/) +.cache + +# next.js build output +.next + +# nuxt.js build output +.nuxt + +# gatsby files +.cache/ +public + +# vuepress build output +.vuepress/dist + +# Serverless directories +.serverless/ + +# FuseBox cache +.fusebox/ + +# DynamoDB Local files +.dynamodb/ + +# Reformatted output data +database/mongodb_version4/reformatted/ diff --git a/.vscode/launch.json b/.vscode/launch.json deleted file mode 100644 index 524e500..0000000 --- a/.vscode/launch.json +++ /dev/null @@ -1,14 +0,0 @@ -{ - // Use IntelliSense to learn about possible attributes. 
- // Hover to view descriptions of existing attributes. - // For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387 - "version": "0.2.0", - "configurations": [ - { - "type": "node", - "request": "launch", - "name": "Launch Program", - "program": "${workspaceFolder}/readFileStream.js" - } - ] -} \ No newline at end of file diff --git a/.vscode/settings.json b/.vscode/settings.json deleted file mode 100644 index 7a73a41..0000000 --- a/.vscode/settings.json +++ /dev/null @@ -1,2 +0,0 @@ -{ -} \ No newline at end of file diff --git a/README.md b/README.md index 5834927..7c7fbee 100644 --- a/README.md +++ b/README.md @@ -1,18 +1,37 @@ -# Node.js Large File / Data Reading & Performance Testing +# Node.js Read Large Files Challenge -This is an example of 3 different ways to use Node.js to process big data files. One file is the Node.js' `fs.readFile()`, another is with Node.js' `fs.createReadSteam()`, and the final is with the help of the NPM module `EventStream`. +The challenge is to efficiently process a really large text file sourced from the Federal Election Commission. The input data consists of records of monetary contributions by individuals to political entities. + +Code provided in this repository is in the form of Node.js scripts. They showcase 3 different approaches to processing big data files. One script utilizes the Node.js `fs.readFile()` API, another utilizes `fs.createReadStream()`, and the final script incorporates the external NPM module `EventStream`. -There is also use of `console.time` and `console.timeEnd` to determine the performance of the 3 different implementations, and which is most efficient processing the files. +## Performance Testing of the Different Large File Reading Strategies + +`console.time` and `console.timeEnd` are used to measure the performance of the 3 different implementations and to determine which one processes the input files most efficiently. 
+ +### To Download the Really Large FEC File + +The text file to be processed consists of records of political campaign contributions by individuals during the 2018 election cycle. -### To Download the Really Large File Download the large file zip here: https://www.fec.gov/files/bulk-downloads/2018/indiv18.zip -The main file in the zip: `itcont.txt`, can only be processed by the `readFileEventStream.js` file, the other two implementations can't handle the 2.55GB file size in memory (Node.js can only hold about 1.5GB in memory at one time).* +### To Download the Dictionary and Header Files + +The indiv18.zip contains files in a delimited style similar to comma separated values, with a pipe symbol '|' separating the fields. There are 21 fields. To make sense of them, you need to get additional files from the data_dictionaries folder. A "documentation" folder is provided which contains the two files listed below. However, these files apply to the 2018 election data. If the file layouts have changed in subsequent election years, you will need to download the correct ones for the election cycle you are processing. Generally, you will want to download from the Federal Election Commission "bulk downloads" site. Check the data_dictionaries folder for files named as shown below. Download them if needed: + +bulk-downloads/data_dictionaries/indiv_dictionary.txt + +bulk-downloads/data_dictionaries/indiv_header_file.csv -*Caveat: You can override the standard Node memory limit using the CLI arugment `max-old-space-size=XYZ`. To run, pass in `node --max-old-space-size=8192 .js` (this will increase Node's memory limit to 8gb - just be careful not to make it too large that Node kills off other processes or crashes because its run out of memory) +indiv_dictionary.txt explains the data provided in each field of a contribution record. indiv_header_file.csv is formatted as a header record in comma separated values format, with one heading for each field provided in the contribution record. 
+ +The indiv18.zip file contains several files in the archive, some of which are quite large. The zip file alone can take 5+ minutes to download, depending on connection speed. + +The main file in the zip archive, `itcont.txt`, is the largest in size at 2.55 GiB. It can only be processed by the `readFileEventStream.js` script file. The other two scripts in this repository can't handle the input file size in memory. Node.js can only hold about 1.5GB in memory at one time.* + +*Caveat: You can override the standard Node memory limit using the CLI argument `--max-old-space-size=XYZ`. To run, pass in `node --max-old-space-size=8192 .js`. This will increase Node's memory limit to 8 GiB - just be careful not to make the value so large that Node kills off other processes or crashes because it runs out of memory. ### To Run -Before the first run, run `npm install` from the command line to install the `event-stream` and `performance.now` packages from Node. +Before the first run, run `npm install` from the command line to install the `event-stream` and `performance.now` packages from npm. You may want to check the package.json file to adjust which versions of the external modules you are installing. Add the file path for one of the files (could be the big one `itcont.txt` or any of its smaller siblings in the `indiv18` folder that were just downloaded), and type the command `node ` in the command line. @@ -21,5 +40,34 @@ Then you'll see the answers required from the file printed out to the terminal. ### To Check Performance Testing Use one of the smaller files contained within the `indiv18` folder - they're all about 400MB and can be used with all 3 implementations. Run those along with the `console.time` and `performance.now()` references and you can see which solution is more performant and by how much. 
+### Option: Put FEC Contribution Records in a MongoDB v4.x Database Collection +It is possible to reformat the input records to a JavaScript Object Notation (JSON) format compatible with MongoDB database version 4.x. You must do some additional preparation work. The instructions here assume you are familiar with the Linux command line and Linux-based utilities such as sed and egrep. + +Download and unzip the indiv18.zip file. Download the header file noted above. Make note of the path where you unzipped the contribution files. +The header file is in comma separated values format, using actual commas ',' as the separator. You must change the separator to a pipe symbol '|'. + +`sed 's/\,/\|/g' < indiv_header_file.csv > test1.csv` + +You must append individual contribution records to this test1.csv file. For testing purposes, use egrep to extract records of interest, such as contributors employed by particular companies. + +`egrep 'PFIZER' itcont_2018_20181228_52010302.txt >> test1.csv` + +Navigate to the database/mongodb_version4 folder. + +Create a new folder named 'reformatted' in that folder. + +On the command line, issue: + +`node reformat_fec_data_to_json.js path/to/your/test1.csv` + +The input file test1.csv is reformatted to JSON and the output file is in the reformatted/ folder that you created. It will have a *.json extension. You can change the name of the output file by changing the writeStream arguments in the reformat_fec_data_to_json.js script. + +You can then import this reformatted data into a MongoDB version 4.x collection using the mongoimport utility, like so: + +`mongoimport --db fecdata --collection t1 --file reformatted/test1.json` + +The advantage of loading this data into a MongoDB collection is that you can then perform aggregation queries on the collection using the db.collection.aggregate() method of MongoDB. You can also index the collection as you prefer. + +Contributor BobCochran has only tested the script with 271,237 input records. 
To test the reformatting, Node.js versions 10.16.3 and 12.3.0 were used. The reformatted data was added to a standalone instance of MongoDB Enterprise server version 4.0.13, running on an Ubuntu version 18.04.3 LTS server. diff --git a/database/mongodb_version4/reformat_fec_data_to_json.js b/database/mongodb_version4/reformat_fec_data_to_json.js new file mode 100644 index 0000000..88bf1c8 --- /dev/null +++ b/database/mongodb_version4/reformat_fec_data_to_json.js @@ -0,0 +1,155 @@ +/* This code uses the Node.js readline API to read political campaign + * donation data obtained from the United States Federal Election + * Commission (the "FEC"). Each line of input from the *.txt file + * is reformatted into an output record that is in JSON format, and + * uses the specific data types documented by the MongoDB version 4.x + * database server. + * + * The specific input files being reformatted by this code are the + * FEC records of political donations by individuals, of USD $200.00 + * or more. For example, the "indiv20.zip" file in the FEC bulk + * downloads area contains multiple *.txt files, each of which holds + * records of political donations of $200.00 or more by named individuals. + * The record layout is provided in the "documentation" folder appearing + * at the root folder of this repository. + * + */ + +const fs = require('fs'); +const readline = require('readline'); + +//Count number of lines + +var lineCount = 0; + +//An array that holds the header line of the csv file. +var myHdr = []; + +const rl = readline.createInterface({ + + input: fs.createReadStream(process.argv[2]), + + crlfDelay: Infinity + +}); + +// Create a writeStream so we can write the reformatted output to a file + +const writeStream = fs.createWriteStream( "./reformatted/test1.json", { encoding: "utf8"} ); + +// Split and save the first line -- treat that as the header line. 
+ +rl.on('line', (line) => { + + lineCount++ + + if (lineCount === 1) { + + /* Code by the original author splits a line using a + * technique like this: + * + * myHdr = line.split('|')[3] + * + * It has the effect of skipping the first 3 elements and + * capturing the fourth element -- and only the fourth. + * What I wish to do is different: split every field out, + * in order to reformat them into json-ified records. + */ + + myHdr = line.split('|') + + console.log('Elements from the header line are ' + myHdr) + + } + + if (lineCount > 1) { + + var myTrans = line.split('|') + + var jstring = "{ " + + for (let i = 0; i < 21; i++) { + + /* The 13th index value is the transaction date. This needs to be reformatted + * from a MMDDYYYY string to a YYYY-MM-DD string that can be converted to + * ISO8601 date format acceptable to the MongoDB 'mongoimport' utility. + */ + + if (i === 13) { + + var myDateStr = myTrans[i] + + var theISODt = "ISODate\(\"" + myDateStr[4] + myDateStr[5] + myDateStr[6] + myDateStr[7] + "\-" + myDateStr[0] + myDateStr[1] + "\-" + + theISODt = theISODt + myDateStr[2] + myDateStr[3] + "T00\:00\:00Z\"" + "\)" + + jstring = jstring + "\"" + myHdr[i] + "\" : " + theISODt + "\, " + + } + + /* The 14th index value is the transaction amount field. Reformat this into a + * $numberDecimal value (also known as Decimal128). The value has to be formatted + * like so: "TRANSACTION_AMT" : {"$numberDecimal" : "120.00"} + */ + + else if (i === 14) { + + var myAmt = myTrans[i] + + /* Is the amount field a real number? 
*/ + + if (myAmt !== "") { + + var theContr = "\{\"\$numberDecimal\" \: \"" + myAmt + "\.00\"\}" + + jstring = jstring + "\"" + myHdr[i] + "\" : " + theContr + "\, " + +// console.log("The myTrans array " + myTrans) + +// console.log("The myAmt value " + myAmt) + +// console.log("The typeof for myAmt " + typeof myAmt) + + } else { + + + var theContr = "\{\"\$numberDecimal\" \: \"0" + "\.00\"\}" + + jstring = jstring + "\"" + myHdr[i] + "\" : " + theContr + "\, " + + + } + + } + + /* The 20th index value is the final field to be reformatted. We want to close the + * string with a valid JSON closing brace. + */ + + else if (i === 20) { + + jstring = jstring + "\"" + myHdr[i] + "\" : " + "\"" + myTrans[i] + "\"" + " \}" + + } else { + + jstring = jstring + "\"" + myHdr[i] + "\" : " + "\"" + myTrans[i] + "\"\, " + + } + + } + +// console.log(jstring) + + writeStream.write(jstring) + +} } ); + +rl.on('close', () => { + + console.log('Number of lines processed is ' + lineCount) + +}) + +function isEmptyOrSpaces(str){ + return str === null || str.match(/^ *$/) !== null; +} diff --git a/database/mongodb_version4/test_array2.js b/database/mongodb_version4/test_array2.js new file mode 100644 index 0000000..87f9ead --- /dev/null +++ b/database/mongodb_version4/test_array2.js @@ -0,0 +1,36 @@ +function splitString(stringToSplit, separator) { + const arrayOfStrings = stringToSplit.split(separator); + + console.log('The original string is: "' + stringToSplit + '"'); + console.log('The separator is: "' + separator + '"'); + + if (arrayOfStrings[14] === "") { + arrayOfStrings[14] = 0 + console.log("The transaction amount has been replaced.") + } + + if (arrayOfStrings.includes(undefined)) { + + console.log("There are undefined or empty elements in the arrayOfStrings") + } + console.log("The Object.values " + Object.values(arrayOfStrings)) + console.log(Object.values(arrayOfStrings).length) + console.log(arrayOfStrings.length) + console.log('The array has ' + arrayOfStrings.length 
+ ' elements: ' + arrayOfStrings.join('/')); +} + +const tempestString = 'Oh brave new world that has such people in it.'; +const monthString = 'Jan,Feb,Mar,Apr,May,Jun,Jul,Aug,Sep,Oct,Nov,Dec'; +const monthString2 = 'Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep||Nov|Dec'; +const fecString = 'C00339655|N|YE|P|201901179143856769|15|IND|COCHRAN, ERNEST W|PARIS|TX|754606333|TEXAS ONCOLOGY, P.A.|PHYSICIAN SHAREHOLDER MED ONC|12312018|||201901021615-165|1305336|||4021920191640570973' + +const space = ' '; +const comma = ','; +const pipe = '|' + +//splitString(tempestString, space); +//splitString(tempestString); +//splitString(monthString2, pipe); +splitString(fecString, pipe) + + diff --git a/documentation/indiv_dictionary.txt b/documentation/indiv_dictionary.txt new file mode 100644 index 0000000..109fee8 --- /dev/null +++ b/documentation/indiv_dictionary.txt @@ -0,0 +1,288 @@ +revised 09/29/2008 +INDIVIDUAL CONTRIBUTIONS FILE +Federal Election Commission +999 E Street, NW +Washington, DC 20463 + +DATA DESCRIPTION +Individual Contributions File + +The zipped files should be downloaded as binary and unzipped. + +Summary: The individual contributions file contains each contribution from an individual to a federal committee if the contribution was at least $200. + +Universe: All individual contributions $200 or more. + +Associated Files: + +Data File: INDIVXX.ZIP +Frequency Counts: INDIVXX.TXT +Data Dictionary: INDIV_DICTIONARY.TXT + + +The variables have been formatted in the following ways: + + + +Mnemonics +Currently +Used by the Field +CommissionVariable Columns Desc. 
+ + +--------------------------------------------------------------- + +ITEM-FILER Filer Identification Number 1-9 9s +ITEM-AMEND Amendment Indicator 10 1s +ITEM-REPT Report Type 11-13 3s +ITEM-PGI Primary-General Indicator 14 1s +ITEM-MICRO Microfilm Location (YYOORRRFFFF) 15-25 11s +ITEM-TRANS Transaction Type 26-28 3s +ITEM-NAME Contributor/Lender/Transfer Name 29-62 34s +ITEM-CTY City/Town 63-80 18s +ITEM-ST State 81-82 2s +ITEM-ZIP Zip Code 83-87 5s +ITEM-OCCU Occupation 88-122 35s +IT-TMN Transaction Date - Month 123-124 2d +IT-TDY Transaction Date - Day 125-126 2d +IT-TCC Transaction Date - Century 127-128 2d +IT-TYY Transaction Date - Year 129-130 2d +ITEM-AMT Amount 131-137 7n +ITEM-OID Other Identification Number 138-146 9s +ITEM-RN FEC Record Number 147-153 7s + +Data Type: s = string (alpha or alpha-numeric); d = date; n = numeric + + +Variable Documentation + + + Filer Identification Number + Columns 1-9 + String + +A 9-character alpha-numeric code assigned to a committee by the Federal Election Commission. + + --------- + Amendment Indicator + Columns 10-10 + String + +A AMENDMENT +C CONSOLIDATED +M MULTI-CANDIDATE +N NEW +S SECONDARY +T TERMINATED + +Indicates if the report being filed is new (N), an Amendment (A) to a previous report, or a termination (T) report. + + --------- + Report Type + Columns 11-13 + String + +Indicates the type of report filed. 
+ +10D PRE-ELECTION +10G PRE-GENERAL +10P PRE-PRIMARY +10R PRE-RUN-OFF +10S PRE-SPECIAL +12C PRE-CONVENTION +12G PRE-GENERAL +12P PRE-PRIMARY +12R PRE-RUN-OFF +12S PRE-SPECIAL +30D POST-ELECTION +30G POST-GENERAL +30P POST-PRIMARY +30R POST-RUN-OFF +30S POST-SPECIAL +60D POST-ELECTION +ADJ COMP ADJUST AMEND +CA COMPREHENSIVE AMEND +M1 JANUARY MONTHLY +M10 OCTOBER MONTHLY +M11 NOVEMBER MONTHLY +M12 DECEMBER MONTHLY +M2 FEBRUARY MONTHLY +M3 MARCH MONTHLY +M4 APRIL MONTHLY +M5 MAY MONTHLY +M6 JUNE MONTHLY +M7 JULY MONTHLY +M8 AUGUST MONTHLY +M9 SEPTEMBER MONTHLY +MY MID-YEAR REPORT +Q1 APRIL QUARTERLY +Q2 JULY QUARTERLY +Q3 OCTOBER QUARTERLY +TER TERMINATION REPORT +YE YEAR-END +90S POST INAUGURAL SUPPLEMENT +90D POST INAUGURAL +48H 48 HOUR NOTIFICATION +24H 24 HOUR NOTIFICATION +. + +-------- + Primary-General Indicator + Columns 14-14 + String + +C CONVENTION +G GENERAL +P PRIMARY +R RUNOFF +S SPECIAL + +This code indicates the type of election or if the committee is retiring debt. Numeric codes are for those committees that are retiring previous election cycle debt. Alpha codes are for those committees active in the current election cycle. + + --------- + Microfilm Location (YYOORRRFFFF) + Columns 15-25 + String + +Indicates the physical location of the filing. 
+ + --------- + Transaction Type + Columns 26-28 + String + +10 NON-FEDERAL RECEIPT FROM PERSONS LEVIN (L-1A) +11 TRIBAL CONTRIBUTION +12 NON-FEDERAL OTHER RECEIPT LEVIN (L-2) +13 INAUGURAL DONATION ACCEPTED +15 CONTRIBUTION +15C CONTRIBUTION FROM CANDIDATE +15E EARMARKED CONTRIBUTION +15F LOANS FORGIVEN BY CANDIDATE +15I EARMARKED INTERMEDIARY IN +15J MEMO (FILER'S % OF CONTRIBUTION GIVEN TO JOIN +15T EARMARKED INTERMEDIARY TREASURY IN +15Z IN-KIND CONTRIBUTION RECEIVED FROM REGISTERED +16C LOANS RECEIVED FROM THE CANDIDATE +16F LOANS RECEIVED FROM BANKS +16G LOAN FROM INDIVIDUAL +16H LOAN FROM CANDIDATE/COMMITTEE +16J LOAN REPAYMENTS FROM INDIVIDUAL +16K LOAN REPAYMENTS FROM CANDIDATE/COMMITTEE +16L LOAN REPAYMENTS RECEIVED FROM UNREGISTERED EN +16R LOANS RECEIVED FROM REGISTERED FILERS +16U LOAN RECEIVED FROM UNREGISTERED ENTITY +17R CONTRIBUTION REFUND RECEIVED FROM REGISTERED +17U REF/REB/RET RECEIVED FROM UNREGISTERED ENTITY +17Y REF/REB/RET FROM INDIVIDUAL/CORPORATION +17Z REF/REB/RET FROM CANDIDATE/COMMITTEE +18G TRANSFER IN AFFILIATED +18H HONORARIUM RECEIVED +18J MEMO (FILER'S % OF CONTRIBUTION GIVEN TO JOIN +18K CONTRIBUTION RECEIVED FROM REGISTERED FILER +18S RECEIPTS FROM SECRETARY OF STATE +18U CONTRIBUTION RECEIVED FROM UNREGISTERED COMMI +19 ELECTIONEERING COMMUNICATION DONATION RECEIVE +19J MEMO (ELECTIONEERING COMMUNICATION % OF DONAT +20 DISBURSEMENT - EXEMPT FROM LIMITS +20A NON-FEDERAL DISBURSEMENT LEVIN (L-4A) VOTER R +20B NON-FEDERAL DISBURSEMENT LEVIN (L-4B) VOTER I +20C LOAN REPAYMENTS MADE TO CANDIDATE +20D NON-FEDERAL DISBURSEMENT LEVIN (L-4D) GENERIC +20F LOAN REPAYMENTS MADE TO BANKS +20G LOAN REPAYMENTS MADE TO INDIVIDUAL +20R LOAN REPAYMENTS MADE TO REGISTERED FILER +20V NON-FEDERAL DISBURSEMENT LEVIN (L-4C) GET OUT +22G LOAN TO INDIVIDUAL +22H LOAN TO CANDIDATE/COMMITTEE +22J LOAN REPAYMENT TO INDIVIDUAL +22K LOAN REPAYMENT TO CANDIDATE/COMMITTEE +22L LOAN REPAYMENT TO BANK +22R CONTRIBUTION REFUND TO UNREGISTERED ENTITY +22U LOAN 
REPAID TO UNREGISTERED ENTITY +22X LOAN MADE TO UNREGISTERED ENTITY +22Y CONTRIBUTION REFUND TO INDIVIDUAL +22Z CONTRIBUTION REFUND TO CANDIDATE/COMMITTEE +23Y INAUGURAL DONATION REFUND +24A INDEPENDENT EXPENDITURE AGAINST +24C COORDINATED EXPENDITURE +24E INDEPENDENT EXPENDITURE FOR +24F COMMUNICATION COST FOR CANDIDATE (C7) +24G TRANSFER OUT AFFILIATED +24H HONORARIUM TO CANDIDATE +24I EARMARKED INTERMEDIARY OUT +24K CONTRIBUTION MADE TO NON-AFFILIATED +24N COMMUNICATION COST AGAINST CANDIDATE (C7) +24P CONTRIBUTION MADE TO POSSIBLE CANDIDATE +24R ELECTION RECOUNT DISBURSEMENT +24T EARMARKED INTERMEDIARY TREASURY OUT +24U CONTRIBUTION MADE TO UNREGISTERED +24Z IN-KIND CONTRIBUTION MADE TO REGISTERED FILER +29 ELECTIONEERING COMMUNICATION DISBURSEMENT(S) + + + --------- + Name (Contributor / Lender / Transfer) + Columns 20-62 + String + +Reported name of the contributor. + + --------- + City + Columns 63-80 + String + + --------- + State + Columns 81-82 + String + + --------- + US Postal ZIP Code + Columns 83-87 + String + + Note: City, State, and ZIP Code information are reported. + + --------- + Occupation + Columns 88-122 + String + + Reported occupation of donor. + + --------- + Columns 123-124 + Date + + --------- + Day + Columns 125-126 + Date + + --------- + Year + Columns 127-130 + Date + + --------- + Amount + Columns 131-137 + Numeric + + In the fixed width text file, the amounts are in COBOL format. If the value is negative, the right most column will contain a special character: ] = -0, j = -1, k = -2, l = -3, m = -4, n = -5, o = -6, p = -7, q = -8, and r = -9. + + + --------- + Other Identification Number + Columns 138-146 + String + +For contributions from individuals this variable is null. For contributions from candidates or other committees this variable will indicate that contributor. 
+ + --------- + FEC Record Number + Columns 147-153 + String + diff --git a/documentation/indiv_header_file.csv b/documentation/indiv_header_file.csv new file mode 100644 index 0000000..50e8636 --- /dev/null +++ b/documentation/indiv_header_file.csv @@ -0,0 +1 @@ +CMTE_ID,AMNDT_IND,RPT_TP,TRANSACTION_PGI,IMAGE_NUM,TRANSACTION_TP,ENTITY_TP,NAME,CITY,STATE,ZIP_CODE,EMPLOYER,OCCUPATION,TRANSACTION_DT,TRANSACTION_AMT,OTHER_ID,TRAN_ID,FILE_NUM,MEMO_CD,MEMO_TEXT,SUB_ID