Commit 9633c25

edits
1 parent cb8f4e1 commit 9633c25

1 file changed: 21 additions and 17 deletions

docs/integrations/language-clients/python/driver-api.md

@@ -9,9 +9,11 @@ title: 'ClickHouse Connect Driver API'

 # ClickHouse Connect driver API {#clickhouse-connect-driver-api}

-***Note:*** Passing keyword arguments is recommended for most api methods given the number of possible arguments, most of which are optional.
+:::note
+Passing keyword arguments is recommended for most api methods given the number of possible arguments, most of which are optional.

 *Methods not documented here are not considered part of the API, and may be removed or changed.*
+:::

 ## Client Initialization {#client-initialization}

@@ -332,7 +334,7 @@ Use the `Client.command` method to send SQL queries to the ClickHouse server tha
 | data | str or bytes | *None* | Optional data to include with the command as the POST body. |
 | settings | dict | *None* | See [settings description](#settings-argument). |
 | use_database | bool | True | Use the client database (specified when creating the client). False means the command will use the default ClickHouse server database for the connected user. |
-| external_data | ExternalData | *None* | An ExternalData object containing file or binary data to use with the query. See [Advanced Queries (External Data)](#external-data) |
+| external_data | ExternalData | *None* | An ExternalData object containing file or binary data to use with the query. See [Advanced Queries (External Data)](advanced-querying.md#external-data) |

 - `command` can be used for DDL statements. If the SQL "command" does not return data, a "query summary" dictionary is returned instead. This dictionary encapsulates the ClickHouse X-ClickHouse-Summary and X-ClickHouse-Query-Id headers, including the key/value pairs `written_rows`,`written_bytes`, and `query_id`.
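To illustrate the `command` context lines in this hunk, here is a minimal sketch of a DDL call made with keyword arguments, plus a helper that reads the documented summary keys. The `events` table and both function names are hypothetical. The sketch assumes the summary behaves like a plain mapping, as the text describes; some client versions may wrap it in an object instead.

```python
def create_events_table(client):
    """Run a DDL statement through `command`, passing arguments by keyword.

    `client` is assumed to be a ClickHouse Connect client; the `events`
    table is a made-up example.
    """
    return client.command(
        cmd=(
            "CREATE TABLE IF NOT EXISTS events "
            "(id UInt32, name String) ENGINE MergeTree ORDER BY id"
        ),
        use_database=True,  # run against the client's configured database
    )


def written_counts(summary):
    """Return (written_rows, written_bytes) from a summary mapping.

    Header-derived values may arrive as strings, so convert them to int;
    missing keys are treated as zero.
    """
    return (
        int(summary.get("written_rows", 0)),
        int(summary.get("written_bytes", 0)),
    )
```

Usage would look like `summary = create_events_table(client)` followed by `written_counts(summary)`, given a connected client.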

@@ -367,8 +369,8 @@ The `Client.query` method is the primary way to retrieve a single "batch" datase
 | query_tz | str | *None* | A timezone name from the `zoneinfo` database. This timezone will be applied to all datetime or Pandas Timestamp objects returned by the query. |
 | column_tzs | dict | *None* | A dictionary of column name to timezone name. Like `query_tz`, but allows specifying different timezones for different columns. |
 | use_extended_dtypes | bool | True | Use Pandas extended dtypes (like StringArray), and pandas.NA and pandas.NaT for ClickHouse NULL values. Applies only to `query_df` and `query_df_stream` methods. |
-| external_data | ExternalData | *None* | An ExternalData object containing file or binary data to use with the query. See [Advanced Queries (External Data)](#external-data) |
-| context | QueryContext | *None* | A reusable QueryContext object can be used to encapsulate the above method arguments. See [Advanced Queries (QueryContexts)](#querycontexts) |
+| external_data | ExternalData | *None* | An ExternalData object containing file or binary data to use with the query. See [Advanced Queries (External Data)](advanced-querying.md#external-data) |
+| context | QueryContext | *None* | A reusable QueryContext object can be used to encapsulate the above method arguments. See [Advanced Queries (QueryContexts)](advanced-querying.md#querycontexts) |

 ### The `QueryResult` object {#the-queryresult-object}
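As a rough illustration of the `query` parameters shown in this hunk, the sketch below passes `column_tzs` by keyword and turns the returned `QueryResult` into a list of dictionaries. The `events` table, its columns, the timezone choice, and the function name are all assumptions made for the example; `query_tz` could be passed instead to apply a single timezone to every datetime column.

```python
def fetch_recent_events(client, limit=10):
    """Query the hypothetical `events` table and return rows as dicts.

    `client` is assumed to be a ClickHouse Connect client.
    """
    result = client.query(
        query=(
            "SELECT id, created_at FROM events "
            "ORDER BY created_at DESC LIMIT {n:UInt32}"
        ),
        parameters={"n": limit},
        # Per-column timezone; other datetime columns are left unchanged.
        column_tzs={"created_at": "America/Denver"},
    )
    # Pair each row with the column names reported by the result object.
    return [dict(zip(result.column_names, row)) for row in result.result_rows]
```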

@@ -389,7 +391,7 @@ The base `query` method returns a `QueryResult` object with the following public

 The `*_stream` properties return a Python Context that can be used as an iterator for the returned data. They should only be accessed indirectly using the Client `*_stream` methods.

-The complete details of streaming query results (using StreamContext objects) are outlined in [Advanced Queries (Streaming Queries)](#streaming-queries).
+The complete details of streaming query results (using StreamContext objects) are outlined in [Advanced Queries (Streaming Queries)](advanced-querying.md#streaming-queries).

 ## Consuming query results with NumPy, Pandas or Arrow {#consuming-query-results-with-numpy-pandas-or-arrow}

@@ -412,7 +414,7 @@ The ClickHouse Connect Client provides multiple methods for retrieving data as a
 - `query_arrow_stream` -- Returns query data in PyArrow RecordBlocks
 - `query_df_arrow_stream` -- Returns each ClickHouse Block of query data as an arrow-backed Pandas DataFrame or a Polars DataFrame depending on the kwarg `dataframe_library` (default is "pandas").

-Each of these methods returns a `ContextStream` object that must be opened via a `with` statement to start consuming the stream. See [Advanced Queries (Streaming Queries)](#streaming-queries) for details and examples.
+Each of these methods returns a `ContextStream` object that must be opened via a `with` statement to start consuming the stream. See [Advanced Queries (Streaming Queries)](advanced-querying.md#streaming-queries) for details and examples.

 ## Pandas and Polars {#pandas-and-polars}
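The requirement that a stream be opened via a `with` statement can be sketched as follows. This is an illustrative helper, not code from the commit. It assumes the `query_row_block_stream` client method and that each block it yields is an iterable of rows; the function name is hypothetical.

```python
def count_streamed_rows(client, query):
    """Count rows by consuming a streaming query block by block.

    `client` is assumed to be a ClickHouse Connect client.
    """
    total = 0
    # The stream only starts delivering data once the context is entered.
    with client.query_row_block_stream(query=query) as stream:
        for block in stream:
            for _row in block:
                total += 1
    return total
```

Processing one block at a time keeps memory bounded by the block size rather than by the full result set, which is the usual reason to prefer the `*_stream` methods for large queries.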

@@ -492,18 +494,20 @@ For the common use case of inserting multiple records into ClickHouse, there is
 | column_type_names | Sequence of ClickHouse type names | *None* | A list of ClickHouse datatype names. If neither column_types or column_type_names is specified, ClickHouse Connect will execute a "pre-query" to retrieve all the column types for the table. |
 | column_oriented | bool | False | If True, the `data` argument is assumed to be a Sequence of columns (and no "pivot" will be necessary to insert the data). Otherwise `data` is interpreted as a Sequence of rows. |
 | settings | dict | *None* | See [settings description](#settings-argument). |
-| context | InsertContext | *None* | A reusable InsertContext object can be used to encapsulate the above method arguments. See [Advanced Inserts (InsertContexts)](#insertcontexts) |
+| context | InsertContext | *None* | A reusable InsertContext object can be used to encapsulate the above method arguments. See [Advanced Inserts (InsertContexts)](advanced-inserting.md#insertcontexts) |
 | transport_settings | dict | *None* | Optional dictionary of transport-level settings (HTTP headers, etc.) |

 This method returns a "query summary" dictionary as described under the "command" method. An exception will be raised if the insert fails for any reason.

-There are two specialized versions of the main `insert` method:
+There are three specialized versions of the main `insert` method:

 - `insert_df` -- Instead of Python Sequence of Sequences `data` argument, the second parameter of this method requires a `df` argument that must be a Pandas DataFrame instance. ClickHouse Connect automatically processes the DataFrame as a column oriented datasource, so the `column_oriented` parameter is not required or available.
 - `insert_arrow` -- Instead of a Python Sequence of Sequences `data` argument, this method requires an `arrow_table`. ClickHouse Connect passes the Arrow table unmodified to the ClickHouse server for processing, so only the `database` and `settings` arguments are available in addition to `table` and `arrow_table`.
 - `insert_df_arrow` -- Instead of a Python Sequence of Sequences `data` argument, the second parameter of this method requires a `df` that must be an arrow-backed Pandas DataFrame or a Polars DataFrame instance. ClickHouse Connect will automatically determine if the DataFrame is a Pandas or Polars type. If Pandas, validation will be performed to ensure that each column's dtype backend is Arrow-based and an error will be raised if any are not.

-*Note:* A NumPy array is a valid Sequence of Sequences and can be used as the `data` argument to the main `insert` method, so a specialized method is not required.
+:::note
+A NumPy array is a valid Sequence of Sequences and can be used as the `data` argument to the main `insert` method, so a specialized method is not required.
+:::

 ## File Inserts {#file-inserts}
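The difference between row-oriented and column-oriented `data` described in the `column_oriented` row can be shown with a small sketch. The helper names, the table name, and the sample data are hypothetical; the `insert` keyword arguments are the ones listed in the parameter table.

```python
def rows_to_columns(rows):
    """Pivot a sequence of rows into a list of columns."""
    return [list(column) for column in zip(*rows)]


def insert_both_ways(client, table, rows, column_names):
    """Insert the same data twice: once as rows, once as columns.

    `client` is assumed to be a ClickHouse Connect client. Inserting twice
    is only for illustration; real code would pick one orientation.
    """
    # Default orientation: `data` is a sequence of rows.
    by_rows = client.insert(table=table, data=rows, column_names=column_names)
    # Column orientation: `data` is a sequence of columns, so the caller
    # pivots up front and sets `column_oriented=True`.
    by_columns = client.insert(
        table=table,
        data=rows_to_columns(rows),
        column_names=column_names,
        column_oriented=True,
    )
    return by_rows, by_columns
```

For example, `rows_to_columns([(1, "a"), (2, "b")])` yields `[[1, 2], ["a", "b"]]`, which is the shape expected when `column_oriented` is True.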

@@ -570,14 +574,14 @@ For use cases which do not require transformation between ClickHouse data and na

 The `Client.raw_query` method allows direct usage of the ClickHouse HTTP query interface using the client connection. The return value is an unprocessed `bytes` object. It offers a convenient wrapper with parameter binding, error handling, retries, and settings management using a minimal interface:

-| Parameter | Type | Default | Description |
-|---------------|------------------|------------|-------------------------------------------------------------------------------------------------------------------------------------|
-| query | str | *Required* | Any valid ClickHouse query |
-| parameters | dict or iterable | *None* | See [parameters description](#parameters-argument). |
-| settings | dict | *None* | See [settings description](#settings-argument). |
-| fmt | str | *None* | ClickHouse Output Format for the resulting bytes. (ClickHouse uses TSV if not specified) |
-| use_database | bool | True | Use the ClickHouse Connect client-assigned database for the query context |
-| external_data | ExternalData | *None* | An ExternalData object containing file or binary data to use with the query. See [Advanced Queries (External Data)](#external-data) |
+| Parameter | Type | Default | Description |
+|---------------|------------------|------------|---------------------------------------------------------------------------------------------------------------------------------------------------------|
+| query | str | *Required* | Any valid ClickHouse query |
+| parameters | dict or iterable | *None* | See [parameters description](#parameters-argument). |
+| settings | dict | *None* | See [settings description](#settings-argument). |
+| fmt | str | *None* | ClickHouse Output Format for the resulting bytes. (ClickHouse uses TSV if not specified) |
+| use_database | bool | True | Use the ClickHouse Connect client-assigned database for the query context |
+| external_data | ExternalData | *None* | An ExternalData object containing file or binary data to use with the query. See [Advanced Queries (External Data)](advanced-querying.md#external-data) |

 It is the caller's responsibility to handle the resulting `bytes` object. Note that the `Client.query_arrow` is just a thin wrapper around this method using the ClickHouse `Arrow` output format.
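Since handling the returned `bytes` is left to the caller, here is one minimal way it might be done for tab-separated output. Both function names are hypothetical, and the parser is deliberately simple: it splits on tabs and newlines and does not decode TSV escape sequences such as `\t` or `\N`, so it is only a starting point.

```python
def parse_tsv(raw):
    """Split tab-separated bytes into a list of rows of string fields.

    Assumes UTF-8 text with one record per line. Escape sequences are
    left untouched, so every value comes back as the raw string.
    """
    text = raw.decode("utf-8")
    if not text:
        return []
    # Drop only the final line terminator so empty values are preserved.
    if text.endswith("\n"):
        text = text[:-1]
    return [line.split("\t") for line in text.split("\n")]


def fetch_tsv_rows(client, query):
    """Run `raw_query` with an explicit output format and parse the bytes.

    `client` is assumed to be a ClickHouse Connect client.
    """
    raw = client.raw_query(query=query, fmt="TabSeparated")
    return parse_tsv(raw)
```

Passing `fmt` explicitly, rather than relying on the server default, keeps the parser and the output format visibly in sync.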
