
Commit f61cb22

Merge branch 'main' into redo_translation_zh
2 parents: c680440 + 6f30e6e

15 files changed: +132 additions, −45 deletions

docs/cloud/guides/index.md

Lines changed: 1 addition & 19 deletions
@@ -7,22 +7,4 @@ doc_type: 'landing-page'
 ---
 
 <!--AUTOGENERATED_START-->
-| Page | Description |
-|-----|-----|
-| [Accessing S3 Data Securely](/cloud/security/secure-s3) | This article demonstrates how ClickHouse Cloud customers can leverage role-based access to authenticate with Amazon Simple Storage Service(S3) and access their data securely. |
-| [AWS PrivateLink](/manage/security/aws-privatelink) | This document describes how to connect to ClickHouse Cloud using AWS PrivateLink. |
-| [Azure Private Link](/cloud/security/azure-privatelink) | How to set up Azure Private Link |
-| [Cloud Compatibility](/whats-new/cloud-compatibility) | This guide provides an overview of what to expect functionally and operationally in ClickHouse Cloud. |
-| [Cloud IP Addresses](/manage/security/cloud-endpoints-api) | This page documents the Cloud Endpoints API security features within ClickHouse. It details how to secure your ClickHouse deployments by managing access through authentication and authorization mechanisms. |
-| [Common Access Management Queries](/cloud/security/common-access-management-queries) | This article shows the basics of defining SQL users and roles and applying those privileges and permissions to databases, tables, rows, and columns. |
-| [Configuring organization and service role assignments within the console](/cloud/guides/sql-console/configure-org-service-role-assignments) | Guide showing how to configure org and service role assignments within the console |
-| [Configuring SQL console role assignments](/cloud/guides/sql-console/config-sql-console-role-assignments) | Guide showing how to configure SQL console role assignments |
-| [Data masking in ClickHouse](/cloud/guides/data-masking) | A guide to data masking in ClickHouse |
-| [Gather your connection details](/cloud/guides/sql-console/gather-connection-details) | Gather your connection details |
-| [GCP Private Service Connect](/manage/security/gcp-private-service-connect) | This document describes how to connect to ClickHouse Cloud using Google Cloud Platform (GCP) Private Service Connect (PSC), and how to disable access to your ClickHouse Cloud services from addresses other than GCP PSC addresses using ClickHouse Cloud IP access lists. |
-| [Inviting new users](/cloud/security/inviting-new-users) | This page describes how administrators can invite new users to their organisation and assign roles to them |
-| [Multi tenancy](/cloud/bestpractices/multi-tenancy) | Best practices to implement multi tenancy |
-| [SAML SSO Setup](/cloud/security/saml-setup) | How to set up SAML SSO with ClickHouse Cloud |
-| [Setting IP Filters](/cloud/security/setting-ip-filters) | This page explains how to set IP filters in ClickHouse Cloud to control access to ClickHouse services. |
-| [Usage limits](/cloud/bestpractices/usage-limits) | Describes the recommended usage limits in ClickHouse Cloud |
-<!--AUTOGENERATED_END-->
+<!--AUTOGENERATED_END-->

docs/deployment-guides/replication-sharding-examples/02_2_shards_1_replica.md

Lines changed: 16 additions & 14 deletions
@@ -557,11 +557,7 @@ SHOW DATABASES;
 
 ## Create a table on the cluster {#creating-a-table}
 
-Now that the database has been created, you will create a distributed table.
-Distributed tables are tables which have access to shards located on different
-hosts and are defined using the `Distributed` table engine. The distributed table
-acts as the interface across all the shards in the cluster.
-
+Now that the database has been created, you will create a table.
 Run the following query from any of the host clients:
 
 ```sql
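The `CREATE TABLE` statement itself falls outside this hunk's context. Based on the `Distributed` definition later in this file's diff, which references `uk.uk_price_paid_local` on `cluster_2S_1R`, it presumably looks roughly like the following sketch (the column list and sorting key are hypothetical, not from the diff):

```sql
-- Sketch only: columns and ORDER BY are illustrative.
CREATE TABLE uk.uk_price_paid_local ON CLUSTER cluster_2S_1R
(
    `price` UInt32,
    `date` Date,
    `town` LowCardinality(String)
    -- remaining columns omitted
)
ENGINE = MergeTree
ORDER BY (town, date);
```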
@@ -608,8 +604,6 @@ SHOW TABLES IN uk;
 └─────────────────────┘
 ```
 
-## Insert data into a distributed table {#inserting-data}
-
 Before we insert the UK price paid data, let's perform a quick experiment to see
 what happens when we insert data into an ordinary table from either host.
 
@@ -622,7 +616,7 @@ CREATE TABLE test.test_table ON CLUSTER cluster_2S_1R
 `id` UInt64,
 `name` String
 )
-ENGINE = ReplicatedMergeTree
+ENGINE = MergeTree()
 ORDER BY id;
 ```
 
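For context, the experiment described around this hunk amounts to inserting one row on each host and then reading locally; a minimal sketch (the values are hypothetical, not the rows used in the guide):

```sql
-- On the first host:
INSERT INTO test.test_table (id, name) VALUES (1, 'first row');

-- On the second host:
INSERT INTO test.test_table (id, name) VALUES (2, 'second row');

-- On either host, a local read returns only that host's row,
-- because MergeTree (unlike ReplicatedMergeTree) does not replicate:
SELECT * FROM test.test_table;
```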
@@ -654,16 +648,18 @@ SELECT * FROM test.test_table;
 -- └────┴────────────────────┘
 ```
 
-You will notice that only the row that was inserted into the table on that
+You will notice that, unlike with a `ReplicatedMergeTree` table, only the row that was inserted into the table on that
 particular host is returned and not both rows.
 
-To read the data from the two shards we need an interface which can handle queries
+To read the data across the two shards, we need an interface which can handle queries
 across all the shards, combining the data from both shards when we run select queries
-on it, and handling the insertion of data to the separate shards when we run insert queries.
+on it, or inserting data to both shards when we run insert queries.
 
-In ClickHouse this interface is called a distributed table, which we create using
+In ClickHouse this interface is called a **distributed table**, which we create using
 the [`Distributed`](/engines/table-engines/special/distributed) table engine. Let's take a look at how it works.
 
+## Create a distributed table {#create-distributed-table}
+
 Create a distributed table with the following query:
 
 ```sql
@@ -674,8 +670,12 @@ ENGINE = Distributed('cluster_2S_1R', 'test', 'test_table', rand())
 In this example, the `rand()` function is chosen as the sharding key so that
 inserts are randomly distributed across the shards.
 
-Now query the distributed table from either host and you will get back
-both of the rows which were inserted on the two hosts:
+Now query the distributed table from either host, and you will get back
+both of the rows which were inserted on the two hosts, unlike in our previous example:
+
+```sql
+SELECT * FROM test.test_table_dist;
+```
 
 ```sql
 ┌─id─┬─name───────────────┐
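The full definition of `test.test_table_dist` is truncated by the diff context. Given the engine clause shown in the hunk header, a plausible reconstruction (the `AS` clause is an assumption) would be:

```sql
-- Sketch: reuse the local table's structure for the distributed interface.
CREATE TABLE test.test_table_dist ON CLUSTER cluster_2S_1R
AS test.test_table
ENGINE = Distributed('cluster_2S_1R', 'test', 'test_table', rand());
```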
@@ -694,6 +694,8 @@ ON CLUSTER cluster_2S_1R
 ENGINE = Distributed('cluster_2S_1R', 'uk', 'uk_price_paid_local', rand());
 ```
 
+## Insert data into a distributed table {#inserting-data-into-distributed-table}
+
 Now connect to either of the hosts and insert the data:
 
 ```sql
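The insert statement after this hunk is cut off. Writing through the distributed table is an ordinary `INSERT`; the engine routes each row to a shard according to the sharding key. A hedged sketch (the source table name is hypothetical; the guide actually loads the UK price paid dataset):

```sql
-- Sketch only: 'uk.uk_price_paid_staging' is an illustrative source.
INSERT INTO uk.uk_price_paid_distributed
SELECT * FROM uk.uk_price_paid_staging;
```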

docs/deployment-guides/replication-sharding-examples/03_2_shards_2_replicas.md

Lines changed: 20 additions & 9 deletions
@@ -586,12 +586,9 @@ SHOW DATABASES;
 └────────────────────┘
 ```
 
-## Create a distributed table on the cluster {#creating-a-table}
+## Create a table on the cluster {#creating-a-table}
 
-Now that the database has been created, next you will create a distributed table.
-Distributed tables are tables which have access to shards located on different
-hosts and are defined using the `Distributed` table engine. The distributed table
-acts as the interface across all the shards in the cluster.
+Now that the database has been created, next you will create a table with replication.
 
 Run the following query from any of the host clients:
 
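The query itself sits below this hunk's context window. In a 2-shard, 2-replica topology it would create a `ReplicatedMergeTree` table `ON CLUSTER`; a sketch under those assumptions (the cluster name, columns, and sorting key are hypothetical; the table name comes from a later hunk in this file):

```sql
CREATE TABLE uk.uk_price_paid_local ON CLUSTER cluster_2S_2R
(
    `price` UInt32,
    `date` Date
    -- remaining columns omitted
)
ENGINE = ReplicatedMergeTree
ORDER BY (date);
```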
@@ -663,14 +660,16 @@ SHOW TABLES IN uk;
 
 ## Insert data into a distributed table {#inserting-data-using-distributed}
 
-To insert data into the distributed table, `ON CLUSTER` cannot be used as it does
+To insert data into the table, `ON CLUSTER` cannot be used as it does
 not apply to DML (Data Manipulation Language) queries such as `INSERT`, `UPDATE`,
 and `DELETE`. To insert data, it is necessary to make use of the
 [`Distributed`](/engines/table-engines/special/distributed) table engine.
+As you learned in the [guide](/architecture/horizontal-scaling) for setting up a cluster with 2 shards and 1 replica, distributed tables are tables which have access to shards located on different
+hosts and are defined using the `Distributed` table engine.
+The distributed table acts as the interface across all the shards in the cluster.
 
 From any of the host clients, run the following query to create a distributed table
-using the existing table we created previously with `ON CLUSTER` and use of the
-`ReplicatedMergeTree`:
+using the existing replicated table we created in the previous step:
 
 ```sql
 CREATE TABLE IF NOT EXISTS uk.uk_price_paid_distributed
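The `CREATE TABLE` statement is truncated by the hunk. Mirroring the 2-shard 1-replica example earlier in this commit, the full form is presumably along these lines (the cluster name `cluster_2S_2R` and the `AS` clause are assumptions):

```sql
CREATE TABLE IF NOT EXISTS uk.uk_price_paid_distributed
ON CLUSTER cluster_2S_2R
AS uk.uk_price_paid_local
ENGINE = Distributed('cluster_2S_2R', 'uk', 'uk_price_paid_local', rand());
```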
@@ -749,4 +748,16 @@ SELECT count(*) FROM uk.uk_price_paid_local;
 └──────────┘
 ```
 
-</VerticalStepper>
+</VerticalStepper>
+
+## Conclusion {#conclusion}
+
+The advantage of this cluster topology with 2 shards and 2 replicas is that it provides both scalability and fault tolerance.
+Data is distributed across separate hosts, reducing storage and I/O requirements per node, while queries are processed in parallel across both shards for improved performance and memory efficiency.
+Critically, the cluster can tolerate the loss of one node and continue serving queries without interruption, as each shard has a backup replica available on another node.
+
+The main disadvantage of this cluster topology is the increased storage overhead: it requires twice the storage capacity compared to a setup without replicas, as each shard is duplicated.
+Additionally, while the cluster can survive a single node failure, losing two nodes simultaneously may render the cluster inoperable, depending on which nodes fail and how shards are distributed.
+This topology strikes a balance between availability and cost, making it suitable for production environments where some level of fault tolerance is required without the expense of higher replication factors.
+
+To learn how ClickHouse Cloud processes queries, offering both scalability and fault-tolerance, see the section ["Parallel Replicas"](/deployment-guides/parallel-replicas).

docs/deployment-guides/replication-sharding-examples/_snippets/_working_example.mdx

Lines changed: 1 addition & 1 deletion
@@ -2,5 +2,5 @@
 The following steps will walk you through setting up the cluster from
 scratch. If you prefer to skip these steps and jump straight to running the
 cluster, you can obtain the example
-files from the [examples repository](https://github.com/ClickHouse/examples/tree/main/docker-compose-recipes)
+files from the examples repository ['docker-compose-recipes' directory](https://github.com/ClickHouse/examples/tree/main/docker-compose-recipes/recipes).
 :::

docs/integrations/data-ingestion/clickpipes/postgres/faq.md

Lines changed: 4 additions & 0 deletions
@@ -356,3 +356,7 @@ Yes, for a Postgres ClickPipe with replication mode as CDC or Snapshot + CDC, yo
 <Image img={failover_slot} border size="md"/>
 
 If the source is configured accordingly, the slot is preserved after failovers to a Postgres read replica, ensuring continuous data replication. Learn more [here](https://www.postgresql.org/docs/current/logical-replication-failover.html).
+
+### I am seeing errors like `Internal error encountered during logical decoding of aborted sub-transaction` {#transient-logical-decoding-errors}
+
+This error suggests a transient issue with the logical decoding of an aborted sub-transaction, and is specific to custom implementations of Aurora Postgres. Since the error comes from the `ReorderBufferPreserveLastSpilledSnapshot` routine, it suggests that logical decoding is unable to read the snapshot spilled to disk. It may be worth trying to increase [`logical_decoding_work_mem`](https://www.postgresql.org/docs/current/runtime-config-resource.html#GUC-LOGICAL-DECODING-WORK-MEM) to a higher value.
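On self-managed Postgres the setting can be raised with standard SQL; on Aurora it must instead be changed through the cluster parameter group. A sketch (the `256MB` value is an illustrative starting point, not a recommendation from this FAQ):

```sql
-- Check the current value (the Postgres default is 64MB):
SHOW logical_decoding_work_mem;

-- Raise it cluster-wide and reload the configuration:
ALTER SYSTEM SET logical_decoding_work_mem = '256MB';
SELECT pg_reload_conf();
```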
docs/integrations/data-visualization/dot-and-clickhouse.md

Lines changed: 72 additions & 0 deletions
@@ -0,0 +1,72 @@
+---
+sidebar_label: 'Dot'
+slug: /integrations/dot
+keywords: ['clickhouse', 'dot', 'ai', 'chatbot', 'mysql', 'integrate', 'ui', 'virtual assistant']
+description: 'AI Chatbot | Dot is an intelligent virtual data assistant that answers business data questions, retrieves definitions and relevant data assets, and can even assist with data modelling, powered by ClickHouse.'
+title: 'Dot'
+doc_type: 'guide'
+---
+
+import Image from '@theme/IdealImage';
+import dot_01 from '@site/static/images/integrations/data-visualization/dot_01.png';
+import dot_02 from '@site/static/images/integrations/data-visualization/dot_02.png';
+import CommunityMaintainedBadge from '@theme/badges/CommunityMaintained';
+
+# Dot
+
+<CommunityMaintainedBadge/>
+
+[Dot](https://www.getdot.ai/) is your **AI Data Analyst**.
+It connects directly to ClickHouse so you can ask data questions in natural language, discover data, test hypotheses, and answer why questions — directly in Slack, Microsoft Teams, ChatGPT or the native Web UI.
+
+## Pre-requisites {#pre-requisites}
+
+- A ClickHouse database, either self-hosted or in [ClickHouse Cloud](https://clickhouse.com/cloud)
+- A [Dot](https://www.getdot.ai/) account
+- A [Hashboard](https://www.hashboard.com/) account and project.
+
+## Connecting Dot to ClickHouse {#connecting-dot-to-clickhouse}
+
+<Image size="md" img={dot_01} alt="Configuring ClickHouse connection in Dot (light mode)" border />
+<br/>
+
+1. In the Dot UI, go to **Settings → Connections**.
+2. Click on **Add new connection** and select **ClickHouse**.
+3. Provide your connection details:
+   - **Host**: ClickHouse server hostname or ClickHouse Cloud endpoint
+   - **Port**: `9440` (secure native interface) or `9000` (default TCP)
+   - **Username / Password**: user with read access
+   - **Database**: optionally set a default schema
+4. Click **Connect**.
+
+<Image img={dot_02} alt="Connecting ClickHouse" size="sm"/>
+
+Dot uses **query-pushdown**: ClickHouse handles the heavy number-crunching at scale, while Dot ensures correct and trusted answers.
+
+## Highlights {#highlights}
+
+Dot makes data accessible through conversation:
+
+- **Ask in natural language**: Get answers without writing SQL.
+- **Why analysis**: Ask follow-up questions to understand trends and anomalies.
+- **Works where you work**: Slack, Microsoft Teams, ChatGPT, or the web app.
+- **Trusted results**: Dot validates queries against your schemas and definitions to minimize errors.
+- **Scalable**: Built on query-pushdown, pairing Dot's intelligence with ClickHouse's speed.
+
+## Security and governance {#security}
+
+Dot is enterprise-ready:
+
+- **Permissions & roles**: Inherits ClickHouse user access controls
+- **Row-level security**: Supported if configured in ClickHouse
+- **TLS / SSL**: Enabled by default for ClickHouse Cloud; configure manually for self-hosted
+- **Governance & validation**: Training/validation space helps prevent hallucinations
+- **Compliance**: SOC 2 Type I certified
+
+## Additional resources {#additional-resources}
+
+- Dot website: [https://www.getdot.ai/](https://www.getdot.ai/)
+- Documentation: [https://docs.getdot.ai/](https://docs.getdot.ai/)
+- Dot app: [https://app.getdot.ai/](https://app.getdot.ai/)
+
+Now you can use **ClickHouse + Dot** to analyze your data conversationally — combining Dot's AI assistant with ClickHouse's fast, scalable analytics engine.
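Step 3 of the connection guide above assumes a ClickHouse user with read access. If you need to create one for Dot, a minimal sketch (the user name, password, and database are hypothetical):

```sql
-- Dedicated read-only user for Dot:
CREATE USER dot_reader IDENTIFIED WITH sha256_password BY 'choose-a-strong-password';

-- Allow Dot to list and read tables in the target database:
GRANT SHOW TABLES, SELECT ON analytics.* TO dot_reader;
```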

docs/integrations/data-visualization/index.md

Lines changed: 2 additions & 0 deletions
@@ -29,6 +29,7 @@ Now that your data is in ClickHouse, it's time to analyze it, which often involv
 - [Astrato](./astrato-and-clickhouse.md)
 - [Chartbrew](./chartbrew-and-clickhouse.md)
 - [Deepnote](./deepnote.md)
+- [Dot](./dot-and-clickhouse.md)
 - [Draxlr](./draxlr-and-clickhouse.md)
 - [Embeddable](./embeddable-and-clickhouse.md)
 - [Explo](./explo-and-clickhouse.md)
@@ -53,6 +54,7 @@ Now that your data is in ClickHouse, it's time to analyze it, which often involv
 | [AWS QuickSight](./quicksight-and-clickhouse.md) | MySQL interface | ✅ | ✅ | Works with some limitations, see [the documentation](./quicksight-and-clickhouse.md) for more details |
 | [Chartbrew](./chartbrew-and-clickhouse.md) | ClickHouse official connector | ✅ | ✅ | |
 | [Deepnote](./deepnote.md) | Native connector | ✅ | ✅ | |
+| [Dot](./dot-and-clickhouse.md) | Native connector | ✅ | ✅ | |
 | [Explo](./explo-and-clickhouse.md) | Native connector | ✅ | ✅ | |
 | [Fabi.ai](./fabi-and-clickhouse.md) | Native connector | ✅ | ✅ | |
 | [Grafana](./grafana/index.md) | ClickHouse official connector | ✅ | ✅ | |
