
Commit e84be44

Merge pull request #4455 from Blargian/generate_JSON_functions
Functions: start generating JSON functions from system tables
2 parents d25295f + 60789ef commit e84be44

5 files changed: 7 additions and 3 deletions


docs/best-practices/json_type.md

Lines changed: 1 addition & 1 deletion

@@ -224,7 +224,7 @@ ORDER BY doc.update_date
 We provide a type hint for the `update_date` column in the JSON definition, as we use it in the ordering/primary key. This helps ClickHouse to know that this column won't be null and ensures it knows which `update_date` sub-column to use (there may be multiple for each type, so this is ambiguous otherwise).
 :::
 
-We can insert into this table and view the subsequently inferred schema using the [`JSONAllPathsWithTypes`](/sql-reference/functions/json-functions#jsonallpathswithtypes) function and [`PrettyJSONEachRow`](/interfaces/formats/PrettyJSONEachRow) output format:
+We can insert into this table and view the subsequently inferred schema using the [`JSONAllPathsWithTypes`](/sql-reference/functions/json-functions#JSONAllPathsWithTypes) function and [`PrettyJSONEachRow`](/interfaces/formats/PrettyJSONEachRow) output format:
 
 ```sql
 INSERT INTO arxiv FORMAT JSONAsObject

docs/cloud/onboard/02_migrate/01_migration_guides/04_snowflake/02_migration_guide.md

Lines changed: 1 addition & 1 deletion

@@ -101,7 +101,7 @@ input_format_parquet_case_insensitive_column_matching = 1 -- Column matching bet
 :::note Note on nested column structures
 The `VARIANT` and `OBJECT` columns in the original Snowflake table schema will be output as JSON strings by default, forcing us to cast these when inserting them into ClickHouse.
 
-Nested structures such as `some_file` are converted to JSON strings on copy by Snowflake. Importing this data requires us to transform these structures to Tuples at insert time in ClickHouse, using the [JSONExtract function](/sql-reference/functions/json-functions#jsonextract) as shown above.
+Nested structures such as `some_file` are converted to JSON strings on copy by Snowflake. Importing this data requires us to transform these structures to Tuples at insert time in ClickHouse, using the [JSONExtract function](/sql-reference/functions/json-functions#JSONExtract) as shown above.
 :::
 
 ## Test successful data export {#3-testing-successful-data-export}

docs/getting-started/index.md

Lines changed: 1 addition & 0 deletions

@@ -42,6 +42,7 @@ by https://github.com/ClickHouse/clickhouse-docs/blob/main/scripts/autogenerate-
 | [Foursquare places](/getting-started/example-datasets/foursquare-places) | Dataset with over 100 million records containing information about places on a map, such as shops, restaurants, parks, playgrounds, and monuments. |
 | [GitHub Events Dataset](/getting-started/example-datasets/github-events) | Dataset containing all events on GitHub from 2011 to Dec 6 2020, with a size of 3.1 billion records. |
 | [Hacker News dataset](/getting-started/example-datasets/hacker-news) | Dataset containing 28 million rows of hacker news data. |
+| [Hacker News Vector Search dataset](/getting-started/example-datasets/hackernews-vector-search-dataset) | Dataset containing 28+ million Hacker News postings & their vector embeddings |
 | [LAION 5B dataset](/getting-started/example-datasets/laion-5b-dataset) | Dataset containing 100 million vectors from the LAION 5B dataset |
 | [Laion-400M dataset](/getting-started/example-datasets/laion-400m-dataset) | Dataset containing 400 million images with English image captions |
 | [New York Public Library "What's on the Menu?" Dataset](/getting-started/example-datasets/menus) | Dataset containing 1.3 million records of historical data on the menus of hotels, restaurants and cafes with the dishes along with their prices. |

docs/integrations/data-ingestion/data-formats/json/other.md

Lines changed: 1 addition & 1 deletion

@@ -70,7 +70,7 @@ SELECT JSONExtractString(tags, 'holidays') AS holidays FROM people
 1 row in set. Elapsed: 0.002 sec.
 ```
 
-Notice how the functions require both a reference to the `String` column `tags` and a path in the JSON to extract. Nested paths require functions to be nested e.g. `JSONExtractUInt(JSONExtractString(tags, 'car'), 'year')` which extracts the column `tags.car.year`. The extraction of nested paths can be simplified through the functions [`JSON_QUERY`](/sql-reference/functions/json-functions#json_query) and [`JSON_VALUE`](/sql-reference/functions/json-functions#json_value).
+Notice how the functions require both a reference to the `String` column `tags` and a path in the JSON to extract. Nested paths require functions to be nested e.g. `JSONExtractUInt(JSONExtractString(tags, 'car'), 'year')` which extracts the column `tags.car.year`. The extraction of nested paths can be simplified through the functions [`JSON_QUERY`](/sql-reference/functions/json-functions#JSON_QUERY) and [`JSON_VALUE`](/sql-reference/functions/json-functions#json_value).
 
 Consider the extreme case with the `arxiv` dataset where we consider the entire body to be a `String`.
 

scripts/settings/autogenerate-settings.sh

Lines changed: 3 additions & 0 deletions

@@ -266,6 +266,7 @@ if [ -f "$FUNCTION_SQL_FILE" ]; then
 "Encryption"
 "Hash"
 "Introspection"
+"JSON"
 )
 
 for CATEGORY in "${FUNCTION_CATEGORIES[@]}"; do
@@ -376,6 +377,7 @@ insert_src_files=(
 "encryption-functions.md"
 "hash-functions.md"
 "introspection-functions.md"
+"json-functions.md"
 )
 
 insert_dest_files=(
@@ -394,6 +396,7 @@ insert_dest_files=(
 "docs/sql-reference/functions/encryption-functions.md"
 "docs/sql-reference/functions/hash-functions.md"
 "docs/sql-reference/functions/introspection.md"
+"docs/sql-reference/functions/json-functions.md"
 )
 
 echo "[$SCRIPT_NAME] Inserting generated markdown content between AUTOGENERATED_START and AUTOGENERATED_END tags"
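
Note on the script change: the three additions only make sense together, because `insert_src_files` and `insert_dest_files` appear to be parallel arrays in which entry i of one pairs with entry i of the other. The sketch below illustrates that pattern and one possible way to splice generated content between the marker tags. It is not the code from `autogenerate-settings.sh`: the helper names `check_pairs`, `list_pairs`, and `insert_between_markers` are hypothetical, and the `awk` approach and the shortened arrays are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; function names and the awk-based splice are
# assumptions, not taken from autogenerate-settings.sh.
set -euo pipefail

# Parallel arrays: index i in one array pairs with index i in the other.
insert_src_files=(
  "introspection-functions.md"
  "json-functions.md"
)
insert_dest_files=(
  "docs/sql-reference/functions/introspection.md"
  "docs/sql-reference/functions/json-functions.md"
)

# Guard against adding an entry to one array but not the other.
check_pairs() {
  if [ "${#insert_src_files[@]}" -ne "${#insert_dest_files[@]}" ]; then
    echo "source/destination arrays differ in length" >&2
    return 1
  fi
}

# Print each source file next to the destination it maps to.
list_pairs() {
  local i
  for i in "${!insert_src_files[@]}"; do
    printf '%s -> %s\n' "${insert_src_files[$i]}" "${insert_dest_files[$i]}"
  done
}

# Replace whatever sits between the AUTOGENERATED_START and
# AUTOGENERATED_END marker lines in $2 with the contents of $1.
# Both marker lines are kept.
insert_between_markers() {
  local src="$1" dest="$2" tmp
  tmp="$(mktemp)"
  awk -v src="$src" '
    /AUTOGENERATED_START/ {
      print
      while ((getline line < src) > 0) print line
      close(src)
      skip = 1
      next
    }
    /AUTOGENERATED_END/ { skip = 0 }
    !skip { print }
  ' "$dest" > "$tmp"
  mv "$tmp" "$dest"
}
```

A length check like `check_pairs` is cheap insurance for this kind of change, since forgetting one of the two array entries would silently shift every later pairing.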
