From 056b9399d8b5461bbdfd2afde19536f6c5d5ee72 Mon Sep 17 00:00:00 2001
From: Melvyn Peignon
Date: Sun, 9 Nov 2025 01:03:39 +0100
Subject: [PATCH 1/4] Create onelake_catalog.md

---
 docs/use-cases/data_lake/onelake_catalog.md | 171 ++++++++++++++++++++
 1 file changed, 171 insertions(+)
 create mode 100644 docs/use-cases/data_lake/onelake_catalog.md

diff --git a/docs/use-cases/data_lake/onelake_catalog.md b/docs/use-cases/data_lake/onelake_catalog.md
new file mode 100644
index 00000000000..5dbac562c1b
--- /dev/null
+++ b/docs/use-cases/data_lake/onelake_catalog.md
@@ -0,0 +1,171 @@
+---
+slug: /use-cases/data-lake/onelake-catalog
+sidebar_label: 'Fabric OneLake'
+title: 'Fabric OneLake'
+pagination_prev: null
+pagination_next: null
+description: 'In this guide, we will walk you through the steps to query your data in Microsoft OneLake.'
+keywords: ['OneLake', 'Data Lake', 'Fabric']
+show_related_blogs: true
+doc_type: 'guide'
+---
+
+import BetaBadge from '@theme/badges/BetaBadge';
+
+
+
+ClickHouse supports integration with multiple catalogs (OneLake, Unity, Glue, Polaris, etc.). This guide will walk you through the steps to query your data stored in Microsoft OneLake using ClickHouse and [OneLake](https://learn.microsoft.com/en-us/fabric/onelake/onelake-overview).
+
+Microsoft OneLake supports multiple table formats for their lakehouse. With ClickHouse, you can query Iceberg tables.
+
+:::note
+As this feature is beta, you will need to enable it using:
+`SET allow_database_iceberg = 1;`
+:::
+
+## Gathering Requirements OneLake {#gathering-requirements}
+
+Before being able to query your table in Microsoft you will need to collect the following info:
+
+- A OneLake tenant ID (Your Entra ID)
+- A client ID
+- A client secret
+- A warehouse id and a data item id
+
+Microsoft OneLake [has a page](http://learn.microsoft.com/en-us/fabric/onelake/table-apis/table-apis-overview#prerequisites) in their documentation to help you finding the above informations.
+
+## Creating a connection between OneLake and ClickHouse {#creating-a-connection-between-unity-catalog-and-clickhouse}
+
+With the required info above you can now create a connection between Microsoft OneLake and ClickHouse, but before that you need to enable catalogs:
+
+```sql
+SET allow_database_iceberg=1
+```
+
+### Connect to OneLake {#connect-onelake}
+
+```sql
+CREATE DATABASE onelake_catalog
+ENGINE = DataLakeCatalog('https://onelake.table.fabric.microsoft.com/iceberg')
+SETTINGS
+catalog_type = 'onelake',
+warehouse = 'warehouse_id/data_item_id',
+onelake_tenant_id = '',
+oauth_server_uri = 'https://login.microsoftonline.com//oauth2/v2.0/token',
+auth_scope = 'https://storage.azure.com/.default',
+onelake_client_id = '',
+onelake_client_secret = ''
+```
+
+## Querying OneLake using ClickHouse {#querying-onelake-using-clickhouse}
+
+Now that the connection is in place, you can start querying OneLake:
+
+```sql
+SHOW TABLES FROM onelake_catalog
+
+Query id: 8f6124c4-45c2-4351-b49a-89dc13e548a7
+
+   ┌─name──────────────────────────┐
+1. │ year_2017.green_tripdata_2017 │
+2. │ year_2018.green_tripdata_2018 │
+3. │ year_2019.green_tripdata_2019 │
+4. │ year_2020.green_tripdata_2020 │
+5. │ year_2022.green_tripdata_2022 │
+   └───────────────────────────────┘
+```
+
+If you're using the Iceberg client, only the Delta tables with Uniform-enabled will be shown:
+
+To query a table:
+
+```sql
+SELECT *
+FROM onelake_catalog.`year_2017.green_tripdata_2017`
+LIMIT 1
+
+Query id: db6b4bda-cc58-4ca1-8891-e0d14f02c890
+
+Row 1:
+──────
+VendorID: 2
+lpep_pickup_datetime: 2017-05-18 16:55:43.000000
+lpep_dropoff_datetime: 2017-05-18 18:04:11.000000
+store_and_fwd_flag: N
+RatecodeID: 2
+PULocationID: 130
+DOLocationID: 48
+passenger_count: 2
+trip_distance: 12.43
+fare_amount: 52
+extra: 4.5
+mta_tax: 0.5
+tip_amount: 0
+tolls_amount: 33
+ehail_fee: ᴺᵁᴸᴸ
+improvement_surcharge: 0.3
+total_amount: 90.3
+payment_type: 2
+trip_type: 1
+congestion_surcharge: ᴺᵁᴸᴸ
+source_file: green_tripdata_2017-05.parquet
+```
+
+:::note Backticks required
+Backticks are required because ClickHouse doesn't support more than one namespace.
+:::
+
+To inspect the table DDL:
+
+```sql
+SHOW CREATE TABLE onelake_catalog.`year_2017.green_tripdata_2017`
+
+Query id: 8bd5bd8e-83be-453e-9a88-32de12ba7f24
+
+   ┌─statement───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
+1. │ CREATE TABLE onelake_catalog.`year_2017.green_tripdata_2017` ↴│
+   │↳( ↴│
+   │↳    `VendorID` Nullable(Int64), ↴│
+   │↳    `lpep_pickup_datetime` Nullable(DateTime64(6, 'UTC')), ↴│
+   │↳    `lpep_dropoff_datetime` Nullable(DateTime64(6, 'UTC')), ↴│
+   │↳    `store_and_fwd_flag` Nullable(String), ↴│
+   │↳    `RatecodeID` Nullable(Int64), ↴│
+   │↳    `PULocationID` Nullable(Int64), ↴│
+   │↳    `DOLocationID` Nullable(Int64), ↴│
+   │↳    `passenger_count` Nullable(Int64), ↴│
+   │↳    `trip_distance` Nullable(Float64), ↴│
+   │↳    `fare_amount` Nullable(Float64), ↴│
+   │↳    `extra` Nullable(Float64), ↴│
+   │↳    `mta_tax` Nullable(Float64), ↴│
+   │↳    `tip_amount` Nullable(Float64), ↴│
+   │↳    `tolls_amount` Nullable(Float64), ↴│
+   │↳    `ehail_fee` Nullable(Float64), ↴│
+   │↳    `improvement_surcharge` Nullable(Float64), ↴│
+   │↳    `total_amount` Nullable(Float64), ↴│
+   │↳    `payment_type` Nullable(Int64), ↴│
+   │↳    `trip_type` Nullable(Int64), ↴│
+   │↳    `congestion_surcharge` Nullable(Float64), ↴│
+   │↳    `source_file` Nullable(String) ↴│
+   │↳) ↴│
+   │↳ENGINE = Iceberg('abfss://@onelake.dfs.fabric.microsoft.com//Tables/year_2017/green_tripdata_2017') │
+   └─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
+```
+
+## Loading data from your Data Lake into ClickHouse {#loading-data-from-onelake-into-clickhouse}
+
+If you need to load data from OneLake into ClickHouse:
+
+```sql
+CREATE TABLE trips
+ENGINE = MergeTree
+ORDER BY coalesce(VendorID, 0)
+AS SELECT *
+FROM onelake_catalog.`year_2017.green_tripdata_2017`
+
+Query id: d15983a6-ef6a-40fe-80d5-19274b9fe328
+
+Ok.
+
+0 rows in set. Elapsed: 32.570 sec. Processed 11.74 million rows, 275.37 MB (360.36 thousand rows/s., 8.45 MB/s.)
+Peak memory usage: 1.31 GiB.
+```

From 5a829420960817854ef620b96ed0d72e10b077f1 Mon Sep 17 00:00:00 2001
From: Dominic Tran
Date: Sat, 8 Nov 2025 18:55:28 -0600
Subject: [PATCH 2/4] adding OneLake to aspell

---
 scripts/aspell-ignore/en/aspell-dict.txt | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/scripts/aspell-ignore/en/aspell-dict.txt b/scripts/aspell-ignore/en/aspell-dict.txt
index 47912a4586a..8928c71c64d 100644
--- a/scripts/aspell-ignore/en/aspell-dict.txt
+++ b/scripts/aspell-ignore/en/aspell-dict.txt
@@ -1,4 +1,4 @@
-personal_ws-1.1 en 3818
+personal_ws-1.1 en 3819
 AArch
 ACLs
 AICPA
@@ -3831,3 +3831,4 @@ SpanName
 lucene
 TrackedLink
 eventName
+OneLake

From d5ab627ec5b3769a7484b4f87767a850f1f08537 Mon Sep 17 00:00:00 2001
From: Melvyn Peignon
Date: Sun, 9 Nov 2025 02:06:16 +0100
Subject: [PATCH 3/4] Update docs/use-cases/data_lake/onelake_catalog.md

Co-authored-by: Dominic Tran

---
 docs/use-cases/data_lake/onelake_catalog.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/use-cases/data_lake/onelake_catalog.md b/docs/use-cases/data_lake/onelake_catalog.md
index 5dbac562c1b..158d6b56086 100644
--- a/docs/use-cases/data_lake/onelake_catalog.md
+++ b/docs/use-cases/data_lake/onelake_catalog.md
@@ -25,14 +25,14 @@ As this feature is beta, you will need to enable it using:
 
 ## Gathering Requirements OneLake {#gathering-requirements}
 
-Before being able to query your table in Microsoft you will need to collect the following info:
+Before querying your table in Microsoft Fabric, you'll need to collect the following information:
 
 - A OneLake tenant ID (Your Entra ID)
 - A client ID
 - A client secret
-- A warehouse id and a data item id
+- A warehouse ID and a data item ID
 
-Microsoft OneLake [has a page](http://learn.microsoft.com/en-us/fabric/onelake/table-apis/table-apis-overview#prerequisites) in their documentation to help you finding the above informations.
+See [Microsoft OneLake's documentation](http://learn.microsoft.com/en-us/fabric/onelake/table-apis/table-apis-overview#prerequisites) for help finding these values.
 
 ## Creating a connection between OneLake and ClickHouse {#creating-a-connection-between-unity-catalog-and-clickhouse}
 

From f5d0e1f3dd3a9f54fc33c706139a0f5f53a92b96 Mon Sep 17 00:00:00 2001
From: Dominic Tran
Date: Sat, 8 Nov 2025 19:13:35 -0600
Subject: [PATCH 4/4] adding OneLake's to aspell

---
 scripts/aspell-ignore/en/aspell-dict.txt | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/scripts/aspell-ignore/en/aspell-dict.txt b/scripts/aspell-ignore/en/aspell-dict.txt
index 8928c71c64d..076b4c879f9 100644
--- a/scripts/aspell-ignore/en/aspell-dict.txt
+++ b/scripts/aspell-ignore/en/aspell-dict.txt
@@ -1,4 +1,4 @@
-personal_ws-1.1 en 3819
+personal_ws-1.1 en 3820
 AArch
 ACLs
 AICPA
@@ -3832,3 +3832,4 @@ lucene
 TrackedLink
 eventName
 OneLake
+OneLake's