Commit be0bebe

Merge pull request #4515 from ClickHouse/joe/update-chconnect-docs-add-time-time64
Joe/update chconnect docs add time time64
2 parents 330f4f6 + 7db25dc commit be0bebe

13 files changed

+4308
-1046
lines changed

docs/cloud/reference/01_changelog/01_changelog.md

Lines changed: 1 addition & 1 deletion
@@ -1292,7 +1292,7 @@ This release brings an officially supported Metabase integration, a major Java c
12921292
- [Metabase](/integrations/data-visualization/metabase-and-clickhouse.md) plugin: Became an official solution maintained by ClickHouse
12931293
- [dbt](/integrations/data-ingestion/etl-tools/dbt/index.md) plugin: Added support for [multiple threads](https://github.com/ClickHouse/dbt-clickhouse/blob/main/CHANGELOG.md)
12941294
- [Grafana](/integrations/data-visualization/grafana/index.md) plugin: Better handling of connection errors
1295-
- [Python](/integrations/language-clients/python/index.md) client: [Streaming support](/integrations/language-clients/python/index.md#streaming-queries) for insert operation
1295+
- [Python](/integrations/language-clients/python/index.md) client: [Streaming support](/integrations/language-clients/python/advanced-querying.md#streaming-queries) for insert operation
12961296
- [Go](/integrations/language-clients/go/index.md) client: [Bug fixes](https://github.com/ClickHouse/clickhouse-go/blob/main/CHANGELOG.md): close canceled connections, better handling of connection errors
12971297
- [JS](/integrations/language-clients/js.md) client: [Breaking changes in exec/insert](https://github.com/ClickHouse/clickhouse-js/releases/tag/0.0.12); exposed query_id in the return types
12981298
- [Java](https://github.com/ClickHouse/clickhouse-java#readme) client / JDBC driver major release
Lines changed: 89 additions & 0 deletions
@@ -0,0 +1,89 @@
1+
---
2+
sidebar_label: 'Additional Options'
3+
sidebar_position: 3
4+
keywords: ['clickhouse', 'python', 'options', 'settings']
5+
description: 'Additional Options for ClickHouse Connect'
6+
slug: /integrations/language-clients/python/additional-options
7+
title: 'Additional Options'
8+
doc_type: 'reference'
9+
---
10+
11+
# Additional options {#additional-options}
12+
13+
ClickHouse Connect provides a number of additional options for advanced use cases.
14+
15+
## Global settings {#global-settings}
16+
17+
There are a small number of settings that control ClickHouse Connect behavior globally. They are accessed from the top level `common` package:
18+
19+
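```python
from clickhouse_connect import common

# Change a global setting (do this before creating any clients)
common.set_setting('autogenerate_session_id', False)

# Read back the current value of a global setting
common.get_setting('invalid_setting_action')  # returns the configured value, e.g. 'drop'
```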
26+
27+
:::note
28+
The common settings `autogenerate_session_id`, `product_name`, and `readonly` should _always_ be modified before creating a client with the `clickhouse_connect.get_client` method. Changing these settings after client creation does not affect the behavior of existing clients.
29+
:::
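As a minimal sketch of the ordering described in the note, the snippet below changes the session and product settings first and only then creates a client. The product name `my_app`, its version, the `localhost` host, and the helper names `format_product_name` and `create_configured_client` are assumptions made for illustration; they are not part of ClickHouse Connect.

```python
def format_product_name(name: str, version: str) -> str:
    """Build a product string in the '<product name>/<product version>' form."""
    return f'{name}/{version}'


def create_configured_client(host: str = 'localhost'):
    """Apply global settings first, then create the client (needs a reachable server)."""
    # Imported lazily so format_product_name can be used without the library installed
    import clickhouse_connect
    from clickhouse_connect import common

    # Set these before get_client; clients that already exist are not affected
    common.set_setting('autogenerate_session_id', False)
    common.set_setting('product_name', format_product_name('my_app', '1.0.0'))
    return clickhouse_connect.get_client(host=host)
```

Calling `create_configured_client()` requires a running ClickHouse server; the formatting helper can be checked on its own.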
30+
31+
The following global settings are currently defined:
32+
33+
| Setting Name                        | Default | Options                 | Description |
|-------------------------------------|---------|-------------------------|-------------|
| autogenerate_session_id             | True    | True, False             | Autogenerate a new UUID(1) session ID (if not provided) for each client session. If no session ID is provided (either at the client or query level), ClickHouse will generate a random internal ID for each query. |
| dict_parameter_format               | 'json'  | 'json', 'map'           | Controls whether parameterized queries convert a Python dictionary to JSON or ClickHouse Map syntax. `json` should be used for inserts into JSON columns, `map` for ClickHouse Map columns. |
| invalid_setting_action              | 'error' | 'drop', 'send', 'error' | Action to take when an invalid or readonly setting is provided (either for the client session or query). If `drop`, the setting is ignored; if `send`, the setting is sent to ClickHouse; if `error`, a client-side `ProgrammingError` is raised. |
| max_connection_age                  | 600     |                         | Maximum seconds that an HTTP Keep Alive connection will be kept open/reused. This prevents bunching of connections against a single ClickHouse node behind a load balancer/proxy. Defaults to 10 minutes. |
| product_name                        |         |                         | A string that is passed with the query to ClickHouse for tracking the app using ClickHouse Connect. Should be in the form `<product name>/<product version>`. |
| readonly                            | 0       | 0, 1                    | Implied "read_only" ClickHouse settings for versions prior to 19.17. Can be set to match the ClickHouse "read_only" value for settings to allow operation with very old ClickHouse versions. |
| send_os_user                        | True    | True, False             | Include the detected operating system user in client information sent to ClickHouse (HTTP User-Agent string). |
| send_integration_tags               | True    | True, False             | Include the used integration libraries/version (e.g. Pandas/SQLAlchemy/etc.) in client information sent to ClickHouse (HTTP User-Agent string). |
| use_protocol_version                | True    | True, False             | Use the client protocol version. This is needed for `DateTime` timezone columns but breaks with the current version of chproxy. |
| max_error_size                      | 1024    |                         | Maximum number of characters returned in client error messages. Use 0 for this setting to get the full ClickHouse error message. Defaults to 1024 characters. |
| http_buffer_size                    | 10MB    |                         | Size (in bytes) of the "in-memory" buffer used for HTTP streaming queries. |
| preserve_pandas_datetime_resolution | False   | True, False             | When True and using pandas 2.x, preserves the datetime64/timedelta64 dtype resolution (e.g., 's', 'ms', 'us', 'ns'). If False (or on pandas <2.x), coerces to nanosecond ('ns') resolution for compatibility. |
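The three `invalid_setting_action` values can be illustrated with a small standalone sketch. This is not the library's implementation: `SettingError` is a hypothetical stand-in for the client-side `ProgrammingError`, and `filter_settings` and `valid_names` are names invented for this example.

```python
class SettingError(Exception):
    """Hypothetical stand-in for the client-side ProgrammingError."""


def filter_settings(settings: dict, valid_names: set, action: str = 'error') -> dict:
    """Return the settings that would be sent to the server under the given action."""
    if action not in ('drop', 'send', 'error'):
        raise ValueError(f'unknown action: {action}')
    result = {}
    for name, value in settings.items():
        if name in valid_names or action == 'send':
            # Valid settings are always kept; 'send' forwards invalid ones as-is
            result[name] = value
        elif action == 'error':
            raise SettingError(f'invalid setting: {name}')
        # action == 'drop': the invalid setting is silently skipped
    return result
```

With `{'max_threads': 4, 'bogus': 1}` and only `max_threads` valid, `drop` keeps just `max_threads`, `send` keeps both, and `error` raises.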
47+
48+
## Compression {#compression}
49+
50+
ClickHouse Connect supports lz4, zstd, brotli, and gzip compression for both query results and inserts. Keep in mind that compression usually trades network bandwidth/transfer speed against CPU usage (both on the client and the server).
51+
52+
To receive compressed data, the ClickHouse server setting `enable_http_compression` must be set to 1, or the user must have permission to change the setting on a "per query" basis.
53+
54+
Compression is controlled by the `compress` parameter when calling the `clickhouse_connect.get_client` factory method. By default, `compress` is set to `True`, which triggers the default compression settings. For queries executed with the `query` client method (and indirectly, `query_np` and `query_df`), ClickHouse Connect adds the `Accept-Encoding` header with the `lz4`, `zstd`, `br` (brotli, if the brotli library is installed), `gzip`, and `deflate` encodings. (For the majority of requests the ClickHouse server will return a `zstd` compressed payload.) For inserts, ClickHouse Connect by default compresses insert blocks with `lz4` and sends the `Content-Encoding: lz4` HTTP header.
57+
58+
The `get_client` `compress` parameter can also be set to a specific compression method, one of `lz4`, `zstd`, `br`, or `gzip`. That method will then be used for both inserts and query results (if supported by the ClickHouse server). The required `zstd` and `lz4` compression libraries are installed by default with ClickHouse Connect. If `br`/brotli is specified, the brotli library must be installed separately.
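A minimal sketch of selecting a specific method follows. The helper names `check_compress_method` and `create_compressed_client`, the `localhost` host, and the check for a module named `brotli` are assumptions for illustration; only the `compress` argument to `get_client` comes from the text above.

```python
import importlib.util

# Methods accepted by the compress parameter, as listed above
SUPPORTED_METHODS = ('lz4', 'zstd', 'br', 'gzip')


def check_compress_method(method: str) -> str:
    """Validate a compression method name before passing it to get_client."""
    if method not in SUPPORTED_METHODS:
        raise ValueError(f'unsupported compression method: {method}')
    # brotli is not installed by default, so check for it explicitly
    if method == 'br' and importlib.util.find_spec('brotli') is None:
        raise RuntimeError('brotli library is not installed')
    return method


def create_compressed_client(method: str = 'zstd', host: str = 'localhost'):
    """Create a client using one method for inserts and query results (needs a server)."""
    import clickhouse_connect  # imported lazily; requires clickhouse-connect

    return clickhouse_connect.get_client(host=host, compress=check_compress_method(method))
```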
59+
60+
Note that the `raw*` client methods don't use the compression specified by the client configuration.
61+
62+
We also recommend against using `gzip` compression, as it is significantly slower than the alternatives for both compressing and decompressing data.
63+
64+
## HTTP proxy support {#http-proxy-support}
65+
66+
ClickHouse Connect adds basic HTTP proxy support using the `urllib3` library. It recognizes the standard `HTTP_PROXY` and `HTTPS_PROXY` environment variables. Note that these environment variables apply to every client created with the `clickhouse_connect.get_client` method. Alternatively, to configure a proxy per client, use the `http_proxy` or `https_proxy` arguments to the `get_client` method. For details on the implementation of HTTP proxy support, see the [urllib3](https://urllib3.readthedocs.io/en/stable/advanced-usage.html#http-and-https-proxies) documentation.
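The per-client configuration might look like the sketch below. It assumes that `https_proxy` is the argument to use for a secure (HTTPS) connection and `http_proxy` for a plain HTTP one; the helper names and the proxy address shown in the usage note are hypothetical.

```python
def proxy_kwargs(proxy: str, secure: bool) -> dict:
    """Pick the get_client keyword that matches the connection type (assumed mapping)."""
    return {'https_proxy': proxy} if secure else {'http_proxy': proxy}


def create_proxied_client(host: str, proxy: str, secure: bool = True):
    """Create a client that reaches the server through the given proxy (needs a server)."""
    import clickhouse_connect  # imported lazily; requires clickhouse-connect

    return clickhouse_connect.get_client(host=host, secure=secure, **proxy_kwargs(proxy, secure))
```

For example, `create_proxied_client('clickhouse.example.com', 'proxy.example.com:8080')` would route that one client through the proxy without touching the environment variables.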
67+
68+
To use a SOCKS proxy, you can send a `urllib3` `SOCKSProxyManager` as the `pool_mgr` argument to `get_client`. Note that this will require installing the PySocks library either directly or using the `[socks]` option for the `urllib3` dependency.
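A sketch of the SOCKS setup, assuming PySocks is installed and a SOCKS5 proxy is listening on a hypothetical `localhost:1080`. The helper names are invented for this example; the `socks5h` scheme asks `urllib3` to resolve host names through the proxy.

```python
def socks_url(host: str, port: int, remote_dns: bool = True) -> str:
    """Build a SOCKS5 proxy URL; 'socks5h' resolves DNS on the proxy side."""
    scheme = 'socks5h' if remote_dns else 'socks5'
    return f'{scheme}://{host}:{port}'


def create_socks_client(host: str, proxy_url: str):
    """Create a client whose HTTP pool goes through a SOCKS proxy (needs a server)."""
    # Both imports are lazy: they require clickhouse-connect, urllib3, and PySocks
    from urllib3.contrib.socks import SOCKSProxyManager
    import clickhouse_connect

    pool_mgr = SOCKSProxyManager(proxy_url)
    return clickhouse_connect.get_client(host=host, pool_mgr=pool_mgr)
```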
69+
70+
## "Old" JSON data type {#old-json-data-type}
71+
72+
The experimental `Object` (or `Object('json')`) data type is deprecated and should be avoided in a production environment. ClickHouse Connect continues to provide limited support for the data type for backward compatibility. Note that this support does not include queries that are expected to return "top level" or "parent" JSON values as dictionaries or the equivalent, and such queries will result in an exception.
73+
74+
## "New" Variant/Dynamic/JSON datatypes (experimental feature) {#new-variantdynamicjson-datatypes-experimental-feature}
75+
76+
Beginning with the 0.8.0 release, `clickhouse-connect` provides experimental support for the new (also experimental) ClickHouse types Variant, Dynamic, and JSON.
77+
78+
### Usage notes {#usage-notes}
79+
- JSON data can be inserted as either a Python dictionary or a JSON string containing a JSON object `{}`. Other forms of JSON data are not supported.
80+
- Queries using subcolumns/paths for these types will return the type of the subcolumn.
81+
- See the main ClickHouse [documentation](https://clickhouse.com/docs) for other usage notes.
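The first usage note can be turned into a simple pre-insert check. This is an illustrative helper, not part of `clickhouse-connect`: `is_insertable_json` is a hypothetical name, and it only tests whether a value is a Python dictionary or a string that parses to a JSON object.

```python
import json


def is_insertable_json(value) -> bool:
    """Return True for a dict or a JSON string whose top-level value is an object."""
    if isinstance(value, dict):
        return True
    if isinstance(value, str):
        try:
            return isinstance(json.loads(value), dict)
        except json.JSONDecodeError:
            return False
    return False
```

A JSON array string such as `'[1, 2]'` is valid JSON but not an object, so the check rejects it.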
82+
83+
### Known limitations {#known-limitations}
84+
- Each of these types must be enabled in the ClickHouse settings before using.
85+
- The "new" JSON type is available starting with the ClickHouse 24.8 release.
86+
- Due to internal format changes, `clickhouse-connect` is only compatible with Variant types beginning with the ClickHouse 24.7 release.
87+
- Returned JSON objects will only return the `max_dynamic_paths` number of elements (which defaults to 1024). This will be fixed in a future release.
88+
- Inserts into `Dynamic` columns will always be the String representation of the Python value. This will be fixed in a future release, once https://github.com/ClickHouse/ClickHouse/issues/70395 has been fixed.
89+
- The implementation for the new types has not been optimized in C code, so performance may be somewhat slower than for simpler, established data types.
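The version limits above can be checked before relying on these types. The sketch below is a standalone helper with hypothetical names (`parse_version`, `new_type_support`); it assumes the server version is available as a dotted string such as `24.8.2.3` and covers only the two thresholds stated in the list.

```python
def parse_version(version: str) -> tuple:
    """Extract (major, minor) from a dotted ClickHouse version string."""
    major, minor = version.split('.')[:2]
    return int(major), int(minor)


def new_type_support(server_version: str) -> dict:
    """Report which of the new types the listed version limits allow."""
    version = parse_version(server_version)
    return {
        'Variant': version >= (24, 7),  # compatible beginning with 24.7
        'JSON': version >= (24, 8),     # available starting with 24.8
    }
```

Each type still has to be enabled in the ClickHouse settings, so a `True` result here is necessary but not sufficient.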
