# ClickHouse Connect driver API {#clickhouse-connect-driver-api}

-***Note:*** Passing keyword arguments is recommended for most api methods given the number of possible arguments, most of which are optional.
+:::note
+Passing keyword arguments is recommended for most api methods given the number of possible arguments, most of which are optional.

*Methods not documented here are not considered part of the API, and may be removed or changed.*
+:::

## Client Initialization {#client-initialization}

@@ -332,7 +334,7 @@ Use the `Client.command` method to send SQL queries to the ClickHouse server tha
| data | str or bytes |*None*| Optional data to include with the command as the POST body. |
| settings | dict |*None*| See [settings description](#settings-argument). |
| use_database | bool | True | Use the client database (specified when creating the client). False means the command will use the default ClickHouse server database for the connected user. |
-| external_data | ExternalData |*None*| An ExternalData object containing file or binary data to use with the query. See [Advanced Queries (External Data)](#external-data)|
+| external_data | ExternalData |*None*| An ExternalData object containing file or binary data to use with the query. See [Advanced Queries (External Data)](advanced-querying.md#external-data)|

- `command` can be used for DDL statements. If the SQL "command" does not return data, a "query summary" dictionary is returned instead. This dictionary encapsulates the ClickHouse X-ClickHouse-Summary and X-ClickHouse-Query-Id headers, including the key/value pairs `written_rows`, `written_bytes`, and `query_id`.

@@ -367,8 +369,8 @@ The `Client.query` method is the primary way to retrieve a single "batch" datase
| query_tz | str |*None*| A timezone name from the `zoneinfo` database. This timezone will be applied to all datetime or Pandas Timestamp objects returned by the query. |
| column_tzs | dict |*None*| A dictionary of column name to timezone name. Like `query_tz`, but allows specifying different timezones for different columns. |
| use_extended_dtypes | bool | True | Use Pandas extended dtypes (like StringArray), and pandas.NA and pandas.NaT for ClickHouse NULL values. Applies only to `query_df` and `query_df_stream` methods. |
-| external_data | ExternalData |*None*| An ExternalData object containing file or binary data to use with the query. See [Advanced Queries (External Data)](#external-data)|
-| context | QueryContext |*None*| A reusable QueryContext object can be used to encapsulate the above method arguments. See [Advanced Queries (QueryContexts)](#querycontexts)|
+| external_data | ExternalData |*None*| An ExternalData object containing file or binary data to use with the query. See [Advanced Queries (External Data)](advanced-querying.md#external-data)|
+| context | QueryContext |*None*| A reusable QueryContext object can be used to encapsulate the above method arguments. See [Advanced Queries (QueryContexts)](advanced-querying.md#querycontexts)|
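
As a rough illustration of the timezone arguments in the table above, the sketch below passes `query_tz` and `column_tzs` as keyword arguments. The function name `fetch_events`, the `events` table, and its columns are hypothetical; `parameters`, `column_names`, and `result_rows` are used as they exist in ClickHouse Connect, although they are not shown in this excerpt. The client is passed in so the snippet does not assume any particular connection settings.

```python
def fetch_events(client, since_id):
    """Query a hypothetical `events` table and return (column_names, rows).

    `client` is assumed to be a ClickHouse Connect client, for example one
    created with clickhouse_connect.get_client(...).
    """
    result = client.query(
        # Table and column names are illustrative only.
        "SELECT id, created_at, shipped_at FROM events WHERE id > {since:UInt64}",
        parameters={"since": since_id},
        # Applied to every datetime column unless overridden below.
        query_tz="UTC",
        # Per-column override, keyed by column name.
        column_tzs={"shipped_at": "America/Denver"},
    )
    return result.column_names, result.result_rows
```

Keeping every optional argument as a keyword makes it easy to see which timezone setting applies to which column.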

### The `QueryResult` object {#the-queryresult-object}

@@ -389,7 +391,7 @@ The base `query` method returns a `QueryResult` object with the following public

The `*_stream` properties return a Python Context that can be used as an iterator for the returned data. They should only be accessed indirectly using the Client `*_stream` methods.

-The complete details of streaming query results (using StreamContext objects) are outlined in [Advanced Queries (Streaming Queries)](#streaming-queries).
+The complete details of streaming query results (using StreamContext objects) are outlined in [Advanced Queries (Streaming Queries)](advanced-querying.md#streaming-queries).

## Consuming query results with NumPy, Pandas or Arrow {#consuming-query-results-with-numpy-pandas-or-arrow}

@@ -412,7 +414,7 @@ The ClickHouse Connect Client provides multiple methods for retrieving data as a
- `query_arrow_stream` -- Returns query data in PyArrow RecordBlocks
- `query_df_arrow_stream` -- Returns each ClickHouse Block of query data as an arrow-backed Pandas DataFrame or a Polars DataFrame depending on the kwarg `dataframe_library` (default is "pandas").

-Each of these methods returns a `ContextStream` object that must be opened via a `with` statement to start consuming the stream. See [Advanced Queries (Streaming Queries)](#streaming-queries) for details and examples.
+Each of these methods returns a `ContextStream` object that must be opened via a `with` statement to start consuming the stream. See [Advanced Queries (Streaming Queries)](advanced-querying.md#streaming-queries) for details and examples.
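
The `with` requirement can be sketched as follows. This is a minimal example, not the documented reference usage: the helper name `count_streamed_rows` is hypothetical, and it uses `query_row_block_stream`, one of the client's streaming methods that is not listed in this excerpt. Each item yielded by that stream is assumed to be a block, that is, a sequence of rows.

```python
def count_streamed_rows(client, query):
    """Count rows block by block without holding the full result in memory."""
    total = 0
    # The stream object only starts delivering data once it is opened
    # with a `with` statement; the context closes the stream on exit.
    with client.query_row_block_stream(query) as stream:
        for block in stream:
            # Each block is a sequence of rows received from the server.
            total += len(block)
    return total
```

Leaving the `with` block releases the stream even if processing raises part-way through.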

## Pandas and Polars {#pandas-and-polars}

@@ -492,18 +494,20 @@ For the common use case of inserting multiple records into ClickHouse, there is
| column_type_names | Sequence of ClickHouse type names |*None*| A list of ClickHouse datatype names. If neither column_types or column_type_names is specified, ClickHouse Connect will execute a "pre-query" to retrieve all the column types for the table. |
| column_oriented | bool | False | If True, the `data` argument is assumed to be a Sequence of columns (and no "pivot" will be necessary to insert the data). Otherwise `data` is interpreted as a Sequence of rows. |
| settings | dict |*None*| See [settings description](#settings-argument). |
-| context | InsertContext |*None*| A reusable InsertContext object can be used to encapsulate the above method arguments. See [Advanced Inserts (InsertContexts)](#insertcontexts)|
+| context | InsertContext |*None*| A reusable InsertContext object can be used to encapsulate the above method arguments. See [Advanced Inserts (InsertContexts)](advanced-inserting.md#insertcontexts)|

This method returns a "query summary" dictionary as described under the "command" method. An exception will be raised if the insert fails for any reason.
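
To make the `column_oriented` flag concrete, here is a small sketch that pivots row data into columns before calling `insert`. The helper name `insert_people`, the `people` table, and its two columns are hypothetical; the keyword names `table`, `data`, `column_names`, and `column_oriented` follow the arguments of the `insert` method.

```python
def insert_people(client, rows):
    """Insert (id, name) rows into a hypothetical `people` table, column-wise."""
    # Pivot a sequence of rows into a sequence of columns, e.g.
    # [(1, "Ann"), (2, "Bob")] becomes [[1, 2], ["Ann", "Bob"]].
    columns = [list(col) for col in zip(*rows)]
    return client.insert(
        table="people",
        data=columns,
        column_names=["id", "name"],
        # Tells the client that `data` is already a sequence of columns.
        column_oriented=True,
    )
```

Passing the original `rows` with `column_oriented=False` (the default) should insert the same data; the column form simply skips the pivot the client would otherwise perform.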

-There are two specialized versions of the main `insert` method:
+There are three specialized versions of the main `insert` method:

- `insert_df` -- Instead of Python Sequence of Sequences `data` argument, the second parameter of this method requires a `df` argument that must be a Pandas DataFrame instance. ClickHouse Connect automatically processes the DataFrame as a column oriented datasource, so the `column_oriented` parameter is not required or available.
- `insert_arrow` -- Instead of a Python Sequence of Sequences `data` argument, this method requires an `arrow_table`. ClickHouse Connect passes the Arrow table unmodified to the ClickHouse server for processing, so only the `database` and `settings` arguments are available in addition to `table` and `arrow_table`.
- `insert_df_arrow` -- Instead of a Python Sequence of Sequences `data` argument, the second parameter of this method requires a `df` that must be an arrow-backed Pandas DataFrame or a Polars DataFrame instance. ClickHouse Connect will automatically determine if the DataFrame is a Pandas or Polars type. If Pandas, validation will be performed to ensure that each column's dtype backend is Arrow-based and an error will be raised if any are not.

-*Note:* A NumPy array is a valid Sequence of Sequences and can be used as the `data` argument to the main `insert` method, so a specialized method is not required.
+:::note
+A NumPy array is a valid Sequence of Sequences and can be used as the `data` argument to the main `insert` method, so a specialized method is not required.
+:::

## File Inserts {#file-inserts}

@@ -570,14 +574,14 @@ For use cases which do not require transformation between ClickHouse data and na

The `Client.raw_query` method allows direct usage of the ClickHouse HTTP query interface using the client connection. The return value is an unprocessed `bytes` object. It offers a convenient wrapper with parameter binding, error handling, retries, and settings management using a minimal interface:
-| query | str |*Required*| Any valid ClickHouse query |
-| parameters | dict or iterable |*None*| See [parameters description](#parameters-argument). |
-| settings | dict |*None*| See [settings description](#settings-argument). |
-| fmt | str |*None*| ClickHouse Output Format for the resulting bytes. (ClickHouse uses TSV if not specified) |
-| use_database | bool | True | Use the ClickHouse Connect client-assigned database for the query context |
-| external_data | ExternalData |*None*| An ExternalData object containing file or binary data to use with the query. See [Advanced Queries (External Data)](#external-data)|
+| query | str |*Required*| Any valid ClickHouse query |
+| parameters | dict or iterable |*None*| See [parameters description](#parameters-argument). |
+| settings | dict |*None*| See [settings description](#settings-argument). |
+| fmt | str |*None*| ClickHouse Output Format for the resulting bytes. (ClickHouse uses TSV if not specified) |
+| use_database | bool | True | Use the ClickHouse Connect client-assigned database for the query context |
+| external_data | ExternalData |*None*| An ExternalData object containing file or binary data to use with the query. See [Advanced Queries (External Data)](advanced-querying.md#external-data)|

It is the caller's responsibility to handle the resulting `bytes` object. Note that the `Client.query_arrow` is just a thin wrapper around this method using the ClickHouse `Arrow` output format.
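
One possible way to handle the returned `bytes` is sketched below: request the `CSV` output format through `fmt` and decode the payload with the standard library. The helper name `query_csv_rows` is hypothetical, and the sketch assumes the result is UTF-8 text small enough to hold in memory.

```python
import csv
import io


def query_csv_rows(client, query):
    """Run `query` through raw_query and parse the CSV payload into lists of strings."""
    # raw_query returns unprocessed bytes in the requested output format.
    raw = client.raw_query(query, fmt="CSV")
    # Assumption: the payload is valid UTF-8 text.
    text = raw.decode("utf-8")
    # Every value comes back as a string; type conversion is left to the caller.
    return list(csv.reader(io.StringIO(text)))
```

Because no native type conversion happens on this path, it suits cases where the bytes are written straight to a file or handed to another parser.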