Commit c4b256c

Merge branch 'main' of https://github.com/ClickHouse/clickhouse-docs into diataxis-llm

2 parents 9c23ab9 + d7196a4

File tree

20 files changed, +149 −6 lines changed


docs/best-practices/json_type.md

Lines changed: 1 addition & 1 deletion
@@ -225,7 +225,7 @@ ORDER BY doc.update_date
 We provide a type hint for the `update_date` column in the JSON definition, as we use it in the ordering/primary key. This helps ClickHouse to know that this column won't be null and ensures it knows which `update_date` sub-column to use (there may be multiple for each type, so this is ambiguous otherwise).
 :::
 
-We can insert into this table and view the subsequently inferred schema using the [`JSONAllPathsWithTypes`](/sql-reference/functions/json-functions#jsonallpathswithtypes) function and [`PrettyJSONEachRow`](/interfaces/formats/PrettyJSONEachRow) output format:
+We can insert into this table and view the subsequently inferred schema using the [`JSONAllPathsWithTypes`](/sql-reference/functions/json-functions#JSONAllPathsWithTypes) function and [`PrettyJSONEachRow`](/interfaces/formats/PrettyJSONEachRow) output format:
 
 ```sql
 INSERT INTO arxiv FORMAT JSONAsObject
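
For context, the corrected anchor points at the function used to introspect the inferred JSON schema. A minimal sketch of the pattern the changed sentence describes, assuming the `arxiv` table and its `doc` JSON column defined earlier in that guide:

```sql
-- After inserting rows, list every JSON path inferred for the `doc` column
-- together with its inferred type, pretty-printed one row per line.
SELECT JSONAllPathsWithTypes(doc)
FROM arxiv
LIMIT 1
FORMAT PrettyJSONEachRow;
```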

docs/cloud/onboard/02_migrate/01_migration_guides/04_snowflake/02_migration_guide.md

Lines changed: 1 addition & 1 deletion
@@ -102,7 +102,7 @@ input_format_parquet_case_insensitive_column_matching = 1 -- Column matching bet
 :::note Note on nested column structures
 The `VARIANT` and `OBJECT` columns in the original Snowflake table schema will be output as JSON strings by default, forcing us to cast these when inserting them into ClickHouse.
 
-Nested structures such as `some_file` are converted to JSON strings on copy by Snowflake. Importing this data requires us to transform these structures to Tuples at insert time in ClickHouse, using the [JSONExtract function](/sql-reference/functions/json-functions#jsonextract) as shown above.
+Nested structures such as `some_file` are converted to JSON strings on copy by Snowflake. Importing this data requires us to transform these structures to Tuples at insert time in ClickHouse, using the [JSONExtract function](/sql-reference/functions/json-functions#JSONExtract) as shown above.
 :::
 
 ## Test successful data export {#3-testing-successful-data-export}
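
The transformation that note refers to looks roughly like the following (a sketch only; the table names and tuple shape are illustrative, not taken from the migration guide):

```sql
-- Snowflake exports VARIANT/OBJECT columns as JSON strings; parse them into a
-- typed Tuple at insert time rather than storing the raw string.
INSERT INTO target_table
SELECT
    id,
    JSONExtract(some_file, 'Tuple(name String, size UInt64)') AS some_file
FROM staging_table;
```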

docs/getting-started/index.md

Lines changed: 1 addition & 0 deletions
@@ -43,6 +43,7 @@ by https://github.com/ClickHouse/clickhouse-docs/blob/main/scripts/autogenerate-
 | [Foursquare places](/getting-started/example-datasets/foursquare-places) | Dataset with over 100 million records containing information about places on a map, such as shops, restaurants, parks, playgrounds, and monuments. |
 | [GitHub Events Dataset](/getting-started/example-datasets/github-events) | Dataset containing all events on GitHub from 2011 to Dec 6 2020, with a size of 3.1 billion records. |
 | [Hacker News dataset](/getting-started/example-datasets/hacker-news) | Dataset containing 28 million rows of hacker news data. |
+| [Hacker News Vector Search dataset](/getting-started/example-datasets/hackernews-vector-search-dataset) | Dataset containing 28+ million Hacker News postings & their vector embeddings |
 | [LAION 5B dataset](/getting-started/example-datasets/laion-5b-dataset) | Dataset containing 100 million vectors from the LAION 5B dataset |
 | [Laion-400M dataset](/getting-started/example-datasets/laion-400m-dataset) | Dataset containing 400 million images with English image captions |
 | [New York Public Library "What's on the Menu?" Dataset](/getting-started/example-datasets/menus) | Dataset containing 1.3 million records of historical data on the menus of hotels, restaurants and cafes with the dishes along with their prices. |

docs/integrations/data-ingestion/clickpipes/index.md

Lines changed: 9 additions & 0 deletions
@@ -22,6 +22,7 @@ import Mongodbsvg from '@site/static/images/integrations/logos/mongodb.svg';
 import redpanda_logo from '@site/static/images/integrations/logos/logo_redpanda.png';
 import clickpipes_stack from '@site/static/images/integrations/data-ingestion/clickpipes/clickpipes_stack.png';
 import cp_custom_role from '@site/static/images/integrations/data-ingestion/clickpipes/cp_custom_role.png';
+import cp_advanced_settings from '@site/static/images/integrations/data-ingestion/clickpipes/cp_advanced_settings.png';
 import Image from '@theme/IdealImage';
 
 # Integrating with ClickHouse Cloud
@@ -82,6 +83,14 @@ Steps:
 
 <Image img={cp_custom_role} alt="Assign a custom role" size="lg" border/>
 
+## Adjusting ClickPipes advanced settings {#clickpipes-advanced-settings}
+ClickPipes provides sensible defaults that cover the requirements of most use cases. If your use case requires additional fine-tuning, you can adjust the following settings:
+
+- **Streaming max insert wait time**: Configures the maximum wait period before inserting data into the ClickHouse cluster. Applies to streaming ClickPipes (e.g., Kafka, Kinesis).
+- **Object storage polling interval**: Configures how frequently ClickPipes checks object storage for new data. Applies to object storage ClickPipes (e.g., S3, GCS).
+
+<Image img={cp_advanced_settings} alt="Advanced settings for ClickPipes" size="lg" border/>
+
 ## Error reporting {#error-reporting}
 ClickPipes will store errors in two separate tables depending on the type of error encountered during the ingestion process.
 ### Record Errors {#record-errors}

docs/integrations/data-ingestion/clickpipes/postgres/faq.md

Lines changed: 11 additions & 1 deletion
@@ -7,6 +7,9 @@ title: 'ClickPipes for Postgres FAQ'
 doc_type: 'reference'
 ---
 
+import failover_slot from '@site/static/images/integrations/data-ingestion/clickpipes/postgres/failover_slot.png'
+import Image from '@theme/IdealImage';
+
 # ClickPipes for Postgres FAQ
 
 ### How does idling affect my Postgres CDC ClickPipe? {#how-does-idling-affect-my-postgres-cdc-clickpipe}
@@ -33,7 +36,7 @@ To set the replica identity to FULL, you can use the following SQL command:
 ```sql
 ALTER TABLE your_table_name REPLICA IDENTITY FULL;
 ```
-REPLICA IDENTITY FULL also enabled replication of unchanged TOAST columns. More on that [here](./toast).
+REPLICA IDENTITY FULL also enables replication of unchanged TOAST columns. More on that [here](./toast).
 
 Note that using `REPLICA IDENTITY FULL` can have performance implications, including faster WAL growth, especially for tables without a primary key and with frequent updates or deletes, as it requires more data to be logged for each change. If you have any doubts or need assistance with setting up primary keys or replica identities for your tables, please reach out to our support team for guidance.
@@ -346,3 +349,10 @@ If your initial load has completed without error but your destination ClickHouse
 Also worth checking:
 - If the user has sufficient permissions to read the source tables.
 - If there are any row policies on ClickHouse side which might be filtering out rows.
+
+### Can I have the ClickPipe create a replication slot with failover enabled? {#failover-slot}
+Yes, for a Postgres ClickPipe with replication mode as CDC or Snapshot + CDC, you can have ClickPipes create a replication slot with failover enabled by toggling the switch below in the `Advanced Settings` section while creating the ClickPipe. Note that your Postgres version must be 17 or above to use this feature.
+
+<Image img={failover_slot} border size="md"/>
+
+If the source is configured accordingly, the slot is preserved after failovers to a Postgres read replica, ensuring continuous data replication. Learn more [here](https://www.postgresql.org/docs/current/logical-replication-failover.html).
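
Once such a slot exists, it can be checked from the source database; a minimal verification query, assuming Postgres 17 or above (where `pg_replication_slots` exposes a `failover` column):

```sql
-- Confirm the logical slot created by ClickPipes has failover enabled.
SELECT slot_name, slot_type, failover
FROM pg_replication_slots
WHERE slot_type = 'logical';
```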

docs/integrations/data-ingestion/clickpipes/postgres/schema-changes.md

Lines changed: 4 additions & 2 deletions
@@ -11,6 +11,8 @@ ClickPipes for Postgres can detect schema changes in the source tables and, in s
 
 | Schema Change Type | Behaviour |
 | ----------------------------------------------------------------------------------- | ------------------------------------- |
-| Adding a new column (`ALTER TABLE ADD COLUMN ...`) | Propagated automatically. The new column(s) will be populated for all rows replicated after the schema change |
-| Adding a new column with a default value (`ALTER TABLE ADD COLUMN ... DEFAULT ...`) | Propagated automatically. The new column(s) will be populated for all rows replicated after the schema change, but existing rows will not show the default value without a full table refresh |
+| Adding a new column (`ALTER TABLE ADD COLUMN ...`) | Propagated automatically once the table gets an insert/update/delete. The new column(s) will be populated for all rows replicated after the schema change |
+| Adding a new column with a default value (`ALTER TABLE ADD COLUMN ... DEFAULT ...`) | Propagated automatically once the table gets an insert/update/delete. The new column(s) will be populated for all rows replicated after the schema change, but existing rows will not show the default value without a full table refresh |
 | Dropping an existing column (`ALTER TABLE DROP COLUMN ...`) | Detected, but **not** propagated. The dropped column(s) will be populated with `NULL` for all rows replicated after the schema change |
+
+Note that column additions are propagated at the end of a batch's sync, which occurs once the sync interval elapses or the pull batch size is reached. More information on controlling syncs is available [here](./controlling_sync.md).
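
As a concrete illustration of the first two rows of this table (the table name and columns here are hypothetical), a column added on the source Postgres database only reaches ClickHouse after subsequent DML is replicated:

```sql
-- On the source Postgres database:
ALTER TABLE events ADD COLUMN region TEXT DEFAULT 'unknown';

-- The new column is detected but not yet propagated. It appears in ClickHouse
-- only after the next insert/update/delete on `events` is replicated, at the
-- end of that batch's sync; pre-existing rows keep NULL rather than 'unknown'
-- unless a full table refresh is performed.
INSERT INTO events (id) VALUES (42);
```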

docs/integrations/data-ingestion/data-formats/json/other.md

Lines changed: 1 addition & 1 deletion
@@ -71,7 +71,7 @@ SELECT JSONExtractString(tags, 'holidays') AS holidays FROM people
 1 row in set. Elapsed: 0.002 sec.
 ```
 
-Notice how the functions require both a reference to the `String` column `tags` and a path in the JSON to extract. Nested paths require functions to be nested e.g. `JSONExtractUInt(JSONExtractString(tags, 'car'), 'year')` which extracts the column `tags.car.year`. The extraction of nested paths can be simplified through the functions [`JSON_QUERY`](/sql-reference/functions/json-functions#json_query) and [`JSON_VALUE`](/sql-reference/functions/json-functions#json_value).
+Notice how the functions require both a reference to the `String` column `tags` and a path in the JSON to extract. Nested paths require functions to be nested e.g. `JSONExtractUInt(JSONExtractString(tags, 'car'), 'year')` which extracts the column `tags.car.year`. The extraction of nested paths can be simplified through the functions [`JSON_QUERY`](/sql-reference/functions/json-functions#JSON_QUERY) and [`JSON_VALUE`](/sql-reference/functions/json-functions#json_value).
 
 Consider the extreme case with the `arxiv` dataset where we consider the entire body to be a `String`.
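
The simplification the changed sentence describes can be seen side by side (a sketch reusing the `people`/`tags` example from this hunk's context):

```sql
-- Nested extraction: one function call per level of the JSON path.
SELECT JSONExtractUInt(JSONExtractString(tags, 'car'), 'year') AS year FROM people;

-- The same value via a single JSONPath expression.
SELECT JSON_VALUE(tags, '$.car.year') AS year FROM people;
```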

Lines changed: 110 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,110 @@
1+
---
2+
slug: /use-cases/AI_ML/AIChat
3+
sidebar_label: 'AI Chat'
4+
title: 'Using AI Chat in ClickHouse Cloud'
5+
pagination_prev: null
6+
pagination_next: null
7+
description: 'Guide to enabling and using the AI Chat feature in ClickHouse Cloud Console'
8+
keywords: ['AI', 'ClickHouse Cloud', 'Chat', 'SQL Console', 'Agent', 'Docs AI']
9+
show_related_blogs: true
10+
sidebar_position: 2
11+
---
12+
13+
import Link from '@docusaurus/Link';
14+
import Image from '@theme/IdealImage';
15+
import img_open from '@site/static/images/use-cases/AI_ML/AIChat/1_open_chat.png';
16+
import img_consent from '@site/static/images/use-cases/AI_ML/AIChat/2_consent.png';
17+
import img_modes from '@site/static/images/use-cases/AI_ML/AIChat/3_modes.png';
18+
import img_thinking from '@site/static/images/use-cases/AI_ML/AIChat/4_thinking.png';
19+
import img_history from '@site/static/images/use-cases/AI_ML/AIChat/5_history.png';
20+
import img_result_actions from '@site/static/images/use-cases/AI_ML/AIChat/6_result_actions.png';
21+
import img_new_tab from '@site/static/images/use-cases/AI_ML/AIChat/7_open_in_editor.png';
22+
23+
# Using AI Chat in ClickHouse Cloud
24+
25+
> This guide explains how to enable and use the AI Chat feature in the ClickHouse Cloud Console.
26+
27+
<VerticalStepper headerLevel="h2">
28+
29+
## Prerequisites {#prerequisites}
30+
31+
1. You must have access to a ClickHouse Cloud organization with AI features enabled (contact your org admin or support if unavailable).
32+
33+
## Open the AI Chat panel {#open-panel}
34+
35+
1. Navigate to a ClickHouse Cloud service.
36+
2. In the left sidebar, click the sparkle icon labeled “Ask AI”.
37+
3. (Shortcut) Press <kbd>⌘</kbd> + <kbd>'</kbd> (macOS) or <kbd>Ctrl</kbd> + <kbd>'</kbd> (Linux/Windows) to toggle open.
38+
39+
<Image img={img_open} alt="Open AI Chat flyout" size="md"/>
40+
41+
## Accept the data usage consent (first run) {#consent}
42+
43+
1. On first use you are prompted with a consent dialog describing data handling and third‑party LLM sub-processors.
44+
2. Review and accept to proceed. If you decline, the panel will not open.
45+
46+
<Image img={img_consent} alt="Consent dialog" size="md"/>
47+
48+
## Choose a chat mode {#modes}
49+
50+
AI Chat currently supports:
51+
52+
- **Agent**: Multi‑step reasoning over schema + metadata (service must be awake).
53+
- **Docs AI (Ask)**: Focused Q&A grounded in official ClickHouse documentation and best‑practice references.
54+
55+
Use the mode selector at the bottom-left of the flyout to switch.
56+
57+
<Image img={img_modes} alt="Mode selection" size="sm"/>
58+
59+
## Compose and send a message {#compose}
60+
61+
1. Type your question (e.g. “Create a materialized view to aggregate daily events by user”).
62+
2. Press <kbd>Enter</kbd> to send (use <kbd>Shift</kbd> + <kbd>Enter</kbd> for a newline).
63+
3. While the model is processing you can click “Stop” to interrupt.
64+
65+
## Understanding “Agent” thinking steps {#thinking-steps}
66+
67+
In Agent mode you may see expandable intermediate “thinking” or planning steps. These provide transparency into how the assistant forms its answer. Collapse or expand as needed.
68+
69+
<Image img={img_thinking} alt="Thinking steps" size="md"/>
70+
71+
## Starting new chats {#new-chats}
72+
73+
Click the “New Chat” button to clear context and begin a fresh session.
74+
75+
## Viewing chat history {#history}
76+
77+
1. The lower section lists your recent chats.
78+
2. Select a previous chat to load its messages.
79+
3. Delete a conversation using the trash icon.
80+
81+
<Image img={img_history} alt="Chat history list" size="md"/>
82+
83+
## Working with generated SQL {#sql-actions}
84+
85+
When the assistant returns SQL:
86+
87+
- Review for correctness.
88+
- Click “Open in editor” to load the query into a new SQL tab.
89+
- Modify and execute within the Console.
90+
91+
<Image img={img_result_actions} alt="Result actions" size="md"/>
92+
93+
<Image img={img_new_tab} alt="Open generated query in editor" size="md"/>
94+
95+
## Stopping or interrupting a response {#interrupt}
96+
97+
If a response is taking too long or diverging:
98+
99+
1. Click the “Stop” button (visible while processing).
100+
2. The message is marked as interrupted; you can refine your prompt and resend.
101+
102+
## Keyboard shortcuts {#shortcuts}
103+
104+
| Action | Shortcut |
105+
| ------ | -------- |
106+
| Open AI Chat | `⌘ + '` / `Ctrl + '` |
107+
| Send message | `Enter` |
108+
| New line | `Shift + Enter` |
109+
110+
</VerticalStepper>
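
For the example prompt in this new guide (“Create a materialized view to aggregate daily events by user”), the SQL the assistant returns for review might look something like this (a sketch; the `events` source table and its columns are hypothetical, not part of this commit):

```sql
-- Hypothetical source table: events(user_id UInt64, event_time DateTime, ...).
CREATE MATERIALIZED VIEW daily_events_by_user
ENGINE = SummingMergeTree
ORDER BY (user_id, day)
AS SELECT
    user_id,
    toDate(event_time) AS day,
    count() AS event_count
FROM events
GROUP BY user_id, day;
```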

scripts/aspell-dict-file.txt

Lines changed: 1 addition & 0 deletions
@@ -271,6 +271,7 @@ autovacuum
 VACUUM
 resync
 Resync
+failovers
 --docs/integrations/data-ingestion/clickpipes/mysql/faq.md--
 PlanetScale
 Planetscale

scripts/aspell-ignore/en/aspell-dict.txt

Lines changed: 1 addition & 0 deletions
@@ -3035,6 +3035,7 @@ resultset
 resync
 resynchronization
 resyncing
+failovers
 retentions
 rethrow
 retransmit
