Commit 615c56b

Feature generate from existing data (#163)
* wip * wip * wip * wip * wip * wip * wip * wip * wip * wip * wip * wip * wip * wip * wip * wip * wip * wip * wip * wip
1 parent 37f0c31 commit 615c56b

11 files changed: +694 -104 lines changed

CHANGELOG.md

Lines changed: 2 additions & 0 deletions
@@ -12,6 +12,8 @@ All notable changes to the Databricks Labs Data Generator will be documented in
 * Additional build ordering enhancements to reduce circumstances where explicit base column must be specified

 #### Added
+* Scripting of data generation code from schema (Experimental)
+* Scripting of data generation code from dataframe (Experimental)
 * Added top level `random` attribute to data generator specification constructor


README.md

Lines changed: 1 addition & 0 deletions
@@ -51,6 +51,7 @@ used in other computations
 * use of SQL expressions in synthetic data generation
 * plugin mechanism to allow use of 3rd party libraries such as Faker
 * Use within a Databricks Delta Live Tables pipeline as a synthetic data generation source
+* Generate synthetic data generation code from existing schema or data (experimental)

 Details of these features can be found in the online documentation -
 [online documentation](https://databrickslabs.github.io/dbldatagen/public_docs/index.html).

dbldatagen/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -27,7 +27,7 @@
 from .datagen_constants import DEFAULT_RANDOM_SEED, RANDOM_SEED_RANDOM, RANDOM_SEED_FIXED, \
     RANDOM_SEED_HASH_FIELD_NAME, MIN_PYTHON_VERSION, MIN_SPARK_VERSION
 from .utils import ensure, topologicalSort, mkBoundsList, coalesce_values, \
-    deprecated, parse_time_interval, DataGenError, split_list_matching_condition
+    deprecated, parse_time_interval, DataGenError, split_list_matching_condition, strip_margins
 from ._version import __version__
 from .column_generation_spec import ColumnGenerationSpec
 from .column_spec_options import ColumnSpecOptions
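
Reviewer note: the hunk above only shows `strip_margins` being exported from `.utils`; its body is not part of this excerpt. As a rough illustration of what a margin-stripping helper typically does when emitting generated code from an indented template, here is a minimal sketch. The name `strip_margins_sketch`, its signature, and its behavior are assumptions made for illustration and may differ from the actual `dbldatagen.utils.strip_margins`.

```python
def strip_margins_sketch(text: str, margin_char: str = "|") -> str:
    """Hypothetical helper, not the dbldatagen implementation.

    For each line whose first non-whitespace character is margin_char,
    drop the leading whitespace and the margin character itself.
    Lines without a leading margin character are left unchanged.
    """
    cleaned = []
    for line in text.split("\n"):
        stripped = line.lstrip()
        if stripped.startswith(margin_char):
            # Keep everything after the margin character, including indentation.
            cleaned.append(stripped[len(margin_char):])
        else:
            cleaned.append(line)
    return "\n".join(cleaned)


# An indented template, as might be used when scripting generated code.
template = """import dbldatagen as dg
             |
             |# generated code would go here
             |"""
print(strip_margins_sketch(template))
```

A helper of this kind lets code templates stay indented alongside the surrounding source while the emitted script starts at column zero.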
