dummy_file_generator

version 1.1.21

Dummy .csv, flat text or json files generator written in Python 3.7

This tool generates dummy csv, flat text or json files based on the configuration settings you set up for your project(s).

How to install and run the tool as CLI
How to install and run the tool as an imported package
How to setup a new dummy file generator project
How to add a new source dataset for your project
Developer information (for further tool development)

How to install and run the tool as CLI

One common usage scenario can be load / stress / performance testing of file-processing data tools, allowing you to generate the files needed from a command line.

To install:

git clone https://github.com/datahappy1/dummy_file_generator c:\dummy_file_generator\
Set PYTHONPATH to c:\dummy_file_generator\

To run:

The CLI tool needs these MANDATORY arguments defining:

projectname --projectname or -pn defining the project name; the project-specific settings are loaded from the dummy_file_generator/configs/config.json file based on this name
absolutepath --generated_file_path or -gp defining the full output file path to the file you are about to generate

Provided arguments have higher precedence than fallback values in settings.py

The CLI tool can further consume these OPTIONAL arguments defining:

filesize --filesize or -fs defining the desired size (in kBs) of the output file
rowcount --rowcount or -rc defining the desired row count of the output file

Note: if you specify neither the filesize nor the rowcount, the default row_count value from settings.py is used (or the value you provide in the default_rowcount optional argument).
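The fallback order in the note above can be pictured with a small sketch. The function name resolve_target and the constant's value are hypothetical and are not taken from the tool's source. The text does not say what happens when both filesize and rowcount are supplied, so this sketch simply checks the file size first.

```python
# Hypothetical stand-in for the row count fallback kept in settings.py.
DEFAULT_ROW_COUNT = 100  # assumed value, for illustration only


def resolve_target(file_size=None, row_count=None, default_rowcount=None):
    """Return a (kind, value) pair describing when generation should stop."""
    if file_size is not None:
        return ("file_size_kb", file_size)
    if row_count is not None:
        return ("row_count", row_count)
    # Neither was given: prefer default_rowcount, else the settings.py value.
    if default_rowcount is not None:
        return ("row_count", default_rowcount)
    return ("row_count", DEFAULT_ROW_COUNT)
```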

The CLI tool also supports these OPTIONAL arguments that can be used to override values in settings.py:

logging_level --logging_level or -ll defining the Python logging level
default_rowcount --default_rowcount or -drc defining the rowcount fallback value used when neither row_count nor file_size is set
file_encoding --file_encoding or -fen defining the generated files encoding
file_line_ending --file_line_ending or -fle defining the file line ending

These two OPTIONAL arguments are typically needed when running the tool as an imported package, but you can use them even with this tool running as CLI:

data_files_location --data_files_location or -dfl defining the path to the source .txt data files
config_json_path --config_json_path or -cjp defining the custom path to your config.json file
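To see how the flags listed above fit together, here is a minimal argparse sketch that mirrors them. It is an illustration only and not the tool's actual parser: the build_parser name, the int types and the None defaults are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented flags (illustrative only)."""
    parser = argparse.ArgumentParser(description="dummy file generator (sketch)")
    # Mandatory arguments
    parser.add_argument("-pn", "--projectname", required=True)
    parser.add_argument("-gp", "--generated_file_path", required=True)
    # Optional size / row count (int types are an assumption)
    parser.add_argument("-fs", "--filesize", type=int, default=None)
    parser.add_argument("-rc", "--rowcount", type=int, default=None)
    # Optional overrides of settings.py values
    parser.add_argument("-ll", "--logging_level", default=None)
    parser.add_argument("-drc", "--default_rowcount", type=int, default=None)
    parser.add_argument("-fen", "--file_encoding", default=None)
    parser.add_argument("-fle", "--file_line_ending", default=None)
    # Optional custom locations
    parser.add_argument("-dfl", "--data_files_location", default=None)
    parser.add_argument("-cjp", "--config_json_path", default=None)
    return parser


args = build_parser().parse_args(
    ["-pn", "dummy1", "-gp", r"c:\myfiles\dummy1file.csv", "-fs", "256"]
)
print(args.projectname, args.generated_file_path, args.filesize)
```

Leaving every optional default at None makes it easy to tell "not provided" apart from a real value, which suits the rule that provided arguments take precedence over settings.py fallbacks.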

Example of how to run the tool with the -fs argument to set the desired filesize of 256 kB:

cd c:\dummy_file_generator\dummy_file_generator
python c:\dummy_file_generator\dummy_file_generator\__main__.py -pn dummy1 -gp c:\myfiles\dummy1file.csv -fs 256

Example of how to run the tool with the -rc argument to set the desired rowcount of 1000 rows:

cd c:\dummy_file_generator\dummy_file_generator
python c:\dummy_file_generator\dummy_file_generator\__main__.py -pn dummy1 -gp c:\myfiles\dummy1file.csv -rc 1000

How to install and run the tool as an imported package

One common usage scenario can be load / stress / performance testing of file-processing data tools, where you can generate dummy text files during the test fixtures / setup.

To install:

pip install dummy-file-generator

You are strongly encouraged to use a Python virtual environment or Pipenv.

To run:

The dummy file generator imported package needs these MANDATORY arguments defining:

projectname --projectname or -pn, based on the project name, the dummy file specific settings from config.json file are loaded
generated_file_path --generated_file_path or -gp defining the full output file path to the file you are about to generate

Provided arguments have higher precedence than fallback values in settings.py

The dummy file generator imported package can further consume these OPTIONAL arguments defining:

filesize --filesize or -fs defining the desired size (in kBs) of the output file
rowcount --rowcount or -rc defining the desired row count of the output file

Note: if you specify neither the filesize nor the rowcount, the DEFAULT_ROW_COUNT value from settings.py is used (you can override the DEFAULT_ROW_COUNT value in settings.py using the default_rowcount optional argument).

data_files_location --data_files_location or -dfl defining the path to the source .txt data files
config_json_path --config_json_path or -cjp defining the custom path to your config.json file
logging_level --logging_level or -ll defining the Python logging level
default_rowcount --default_rowcount or -drc defining the rowcount fallback value used when neither row_count nor file_size is set
file_encoding --file_encoding or -fen defining the generated files encoding
file_line_ending --file_line_ending or -fle defining the file line ending

In the example below, the project_scope_kwargs arguments project_name, data_files_location, config_json_path and default_rowcount are used to create a DummyFileGenerator class instance. The file_scope_kwargs arguments generated_file_path, file_size, file_encoding and file_line_ending are used to set up the generated file properties. Once there is an instance of DummyFileGenerator, you can use it to generate as many files as needed by calling only the write_output_file method with its specific file_scope_kwargs arguments.

Example of how to run:

from dummy_file_generator import DummyFileGenerator as Dfg, DummyFileGeneratorException

logging_level = "INFO"

# Project-scope settings: used once, when the generator instance is created.
# Raw strings keep the Windows backslashes literal.
project_scope_kwargs = {
    "project_name": "dummy1",
    "data_files_location": r"c:\dfg_files\my_data_files",
    "config_json_path": r"c:\dfg_files\my_configs\config.json",
    "default_rowcount": None,
}

try:
    dfg = Dfg(logging_level, **project_scope_kwargs)
except DummyFileGeneratorException as dfg_err:
    raise dfg_err

# File-scope settings: passed on every write_output_file call.
file_scope_kwargs = {
    "generated_file_path": r"C:\dfg\bin\file1.csv",
    "file_size": 1024,
    # "row_count": 1000,
    "file_encoding": "utf8",
    "file_line_ending": "\n",
}

try:
    dfg.write_output_file(**file_scope_kwargs)
except DummyFileGeneratorException as dfg_err:
    raise dfg_err
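Since one instance can write many files, a small helper can loop over several sets of file-scope arguments. The write_many name is hypothetical. The helper only relies on the write_output_file method shown in the example and accepts any object that provides it.

```python
def write_many(generator, file_specs):
    """Call write_output_file once per spec and return the paths written."""
    written = []
    for spec in file_specs:
        # Each spec is one set of file-scope keyword arguments.
        generator.write_output_file(**spec)
        written.append(spec["generated_file_path"])
    return written
```

With the dfg instance from the example, this could be called as write_many(dfg, [file_scope_kwargs, other_file_scope_kwargs]), where other_file_scope_kwargs is a second dictionary of the same shape.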

How to setup a new dummy file generator project

You need to generate dummy files based on the content of the text files in your data_files folder, and these source text files need to have this plain text format:

This tool picks a random item from each of the files configured for your project in config.json and uses these values to populate the "columns" of each written row.
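The random pick per column can be sketched as follows. This is not the tool's implementation: load_values and make_row are hypothetical names, and the sketch assumes each data file holds one candidate value per line.

```python
import random
from pathlib import Path


def load_values(path):
    """Read a data file, assuming one candidate value per non-empty line."""
    text = Path(path).read_text(encoding="utf8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def make_row(columns, data_dir, rng=random):
    """Pick one random value per configured column, in column order."""
    return [
        rng.choice(load_values(Path(data_dir) / column["datafile"]))
        for column in columns
    ]
```

Passing a seeded random.Random instance as rng makes the picks repeatable, which can help in test fixtures.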

- How to generate a .csv file

If you need to generate a dummy .csv file containing 3 columns for Names, Dates and IDs, the "project" JSON object in your config.json would need to be set up like this:

{
  "project_name":"dummy1",
  "file_type":"csv",
  "header":true,
  "csv_value_separator": ",",
  "csv_quoting": "ALL",
  "csv_quote_char": "'",
  "csv_escape_char": "\\",
  "columns":[
    {
      "column_name":"Name",
      "datafile":"first_names.txt"
    },
    {
      "column_name":"Date",
      "datafile":"dates.txt"
    },
    {
      "column_name":"ID",
      "datafile":"ids.txt"
    }
  ]
}

This configuration generates a file like this sample:

'Name','Date','ID'
'Hank','2004-05-22','23432'
'Joe','2000-03-12','445'
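The csv_* settings map naturally onto Python's csv module. The sketch below shows one way such a config could drive csv.writer. The render_csv name and the QUOTING table are assumptions, and which csv_quoting values the tool accepts besides "ALL" is not stated here.

```python
import csv
import io

# Assumed mapping from config strings to csv module constants.
QUOTING = {"ALL": csv.QUOTE_ALL, "MINIMAL": csv.QUOTE_MINIMAL}


def render_csv(project, rows, line_ending="\n"):
    """Render rows as csv text using the project's csv_* settings."""
    buffer = io.StringIO()
    writer = csv.writer(
        buffer,
        delimiter=project["csv_value_separator"],
        quoting=QUOTING[project["csv_quoting"]],
        quotechar=project["csv_quote_char"],
        escapechar=project["csv_escape_char"],
        lineterminator=line_ending,
    )
    if project.get("header"):
        # The header row is built from the configured column names.
        writer.writerow([c["column_name"] for c in project["columns"]])
    writer.writerows(rows)
    return buffer.getvalue()


project = {
    "header": True,
    "csv_value_separator": ",",
    "csv_quoting": "ALL",
    "csv_quote_char": "'",
    "csv_escape_char": "\\",
    "columns": [{"column_name": "Name"}, {"column_name": "Date"}, {"column_name": "ID"}],
}
print(render_csv(project, [["Hank", "2004-05-22", "23432"]]))
```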

- How to generate a .txt flat file:

If you need to generate a dummy .txt flat file containing 3 columns for Names, Dates and IDs with specific column lengths defined, the "project" JSON object in your config.json would need to be set up like this:

{
  "project_name":"dummy2",
  "file_type":"flat",
  "header":true,
  "columns":[
    {
      "column_name":"Name",
      "column_len":10,
      "datafile":"first_names.txt"
    },
    {
      "column_name":"Date",
      "column_len":12,
      "datafile":"dates.txt"
    },
    {
      "column_name":"ID",
      "column_len":9,
      "datafile":"ids.txt"
    }      
  ]
}

This configuration generates a file like this sample:

Name      Date        ID       
Hank      2004-05-22  23432    
Joe       2000-03-12  445
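The fixed-width layout can be reproduced by padding each value to its column_len. In this sketch the render_flat_line name is hypothetical, and cutting over-long values to the column length is an assumption. How the tool treats values longer than column_len is not described here.

```python
def render_flat_line(values, columns):
    """Pad (or cut) each value to its configured column_len and join them."""
    return "".join(
        str(value).ljust(column["column_len"])[: column["column_len"]]
        for value, column in zip(values, columns)
    )


columns = [
    {"column_name": "Name", "column_len": 10},
    {"column_name": "Date", "column_len": 12},
    {"column_name": "ID", "column_len": 9},
]
# The header line is rendered from the column names themselves.
print(render_flat_line([c["column_name"] for c in columns], columns))
print(render_flat_line(["Hank", "2004-05-22", "23432"], columns))
```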

- How to generate a .json file:

If you need to generate a dummy .json file containing 3 columns for Names, Dates and IDs, the "project" JSON object in your config.json would need to be set up like this:

{
  "project_name":"dummy3",
  "file_type":"json",
  "columns":[
    {
      "column_name":"Name",
      "datafile":"first_names.txt"
    },
    {
      "column_name":"Date",
      "datafile":"dates.txt"
    },
    {
      "column_name":"ID",
      "datafile":"ids.txt"
    }      
  ]
}

This configuration generates a file like this sample:

[{"Name": "Hank", "Date": "2004-05-22", "ID": "23432"},
{"Name": "Joe", "Date": "2000-03-12", "ID": "445"}]

If you need to generate a more complex dummy .json file containing 3 columns for Names, Dates and IDs plus an array-like column Identifiers that holds one IDs array element and one object with ID1 and ID2 attributes, the "project" JSON object in your config.json would need to be set up like this:

{
  "project_name": "dummy4",
  "file_type": "json",
  "columns": [
    {
      "column_name": "Name",
      "datafile": "first_names.txt"
    },
    {
      "column_name": "Date",
      "datafile": "dates.txt"
    },
    {
      "column_name": "ID",
      "datafile": "ids.txt"
    },
    {
      "column_name": "Identifiers",
      "__array_columns": [
        {
          "datafile": "ids.txt"
        },
        {
          "columns": [
            {
              "column_name": "ID1",
              "datafile": "ids.txt"
            },
            {
              "column_name": "ID2",
              "datafile": "ids.txt"
            }
          ]
        }
      ]
    }
  ]
}

This configuration generates a file like this sample:

[{"Name": "Hank", "Date": "2004-05-22", "ID": "23432", "Identifiers": ["445", {"ID1": "11111", "ID2": "145546566345"}]},
{"Name": "Joe", "Date": "2000-03-12", "ID": "445", "Identifiers": ["11111", {"ID1": "145546566345", "ID2": "156765"}]}]

The JSON file configuration allows nested objects only one level deep, and they have to be defined in the __array_columns array.
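One way to picture how a column config with __array_columns turns into a row is the sketch below. The build_value and build_row names are hypothetical, and pick stands in for whatever returns a random value for a datafile. The recursion would accept deeper nesting than the one level the tool allows, so it illustrates the shape only, not the tool's validation.

```python
import json


def build_value(column, pick):
    """Build the value for one column config entry."""
    if "__array_columns" in column:
        # Array-like column: one element per entry in __array_columns.
        return [build_value(item, pick) for item in column["__array_columns"]]
    if "columns" in column:
        # Nested object: one attribute per nested column.
        return {c["column_name"]: build_value(c, pick) for c in column["columns"]}
    return pick(column["datafile"])


def build_row(columns, pick):
    """Build one output object keyed by column_name."""
    return {c["column_name"]: build_value(c, pick) for c in columns}


columns = [
    {"column_name": "Name", "datafile": "first_names.txt"},
    {
        "column_name": "Identifiers",
        "__array_columns": [
            {"datafile": "ids.txt"},
            {"columns": [
                {"column_name": "ID1", "datafile": "ids.txt"},
                {"column_name": "ID2", "datafile": "ids.txt"},
            ]},
        ],
    },
]
fixed = {"first_names.txt": "Hank", "ids.txt": "445"}  # deterministic stand-in
print(json.dumps(build_row(columns, fixed.__getitem__)))
```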

How to add a new source dataset for your project

Whenever you need a new source .txt file, just add it to your data_files folder. The filename needs to correspond with the datafile value in your config.json file.

If running as a standalone CLI tool, the data_files folder is located here: dummy_file_generator/data_files

When running as an imported package, the data_files folder is wherever you specify it to be using the data_files_location argument.

Now you can use this new data file in your project setup in config.json file.
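Because every datafile value must match a file in the data_files folder, a quick check before generating can catch typos. The sketch below is not part of the tool. Both function names are hypothetical, and it assumes a project object shaped like the config.json examples in this document.

```python
from pathlib import Path


def referenced_datafiles(columns):
    """Collect every datafile name used in a columns list, nested ones included."""
    names = set()
    for column in columns:
        if "datafile" in column:
            names.add(column["datafile"])
        for key in ("columns", "__array_columns"):
            names |= referenced_datafiles(column.get(key, []))
    return names


def missing_datafiles(project, data_files_location):
    """Return the referenced datafile names that do not exist in the folder."""
    folder = Path(data_files_location)
    return sorted(
        name
        for name in referenced_datafiles(project["columns"])
        if not (folder / name).is_file()
    )
```

An empty list means every referenced data file was found in the given folder.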

Developer information

Testing using Pytest

Pytest unit and performance tests are also a part of this repository. You can install Pytest using pip install pytest

To run tests:

cd c:\dummy_file_generator\dummy_file_generator
python -m pytest c:\dummy_file_generator\tests

When running the tests from an IDE, make sure the current working directory is set to c:\dummy_file_generator
