From 55b96becfe1b54e0e6531cbe177a5786923249dc Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Mon, 10 Nov 2025 06:25:44 +0000
Subject: [PATCH 1/3] Initial plan

From e55f7bc5ecb6b375a509daec3d915650f672107f Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Mon, 10 Nov 2025 06:28:40 +0000
Subject: [PATCH 2/3] docs: add parallelism parameter documentation for manual compaction

- Update English admin.md with parallelism examples
- Update English compaction.md with detailed parallelism usage
- Update Chinese admin.md with parallelism examples
- Update Chinese compaction.md with detailed parallelism usage

Co-authored-by: fengjiachun <3860496+fengjiachun@users.noreply.github.com>
---
 docs/reference/sql/admin.md | 11 ++++++++-
 .../manage-data/compaction.md | 23 ++++++++++++++++++-
 .../current/reference/sql/admin.md | 11 ++++++++-
 .../manage-data/compaction.md | 23 ++++++++++++++++++-
 4 files changed, 64 insertions(+), 4 deletions(-)

diff --git a/docs/reference/sql/admin.md b/docs/reference/sql/admin.md
index 8cd24ad5b6..9342a49231 100644
--- a/docs/reference/sql/admin.md
+++ b/docs/reference/sql/admin.md
@@ -32,6 +32,15 @@ For example:
 -- Flush the table test --
 admin flush_table("test");
 
---- Schedule a compaction for table test --
+-- Schedule a compaction for table test with default parallelism (1) --
 admin compact_table("test");
+
+-- Schedule a regular compaction with parallelism set to 2 --
+admin compact_table("test", "regular", "parallelism=2");
+
+-- Schedule an SWCS compaction with default time window and parallelism set to 2 --
+admin compact_table("test", "swcs", "parallelism=2");
+
+-- Schedule an SWCS compaction with custom time window and parallelism --
+admin compact_table("test", "swcs", "window=1800,parallelism=2");
 ```
diff --git a/docs/user-guide/deployments-administration/manage-data/compaction.md b/docs/user-guide/deployments-administration/manage-data/compaction.md
index 02d024567e..ff2a3df727 100644
--- a/docs/user-guide/deployments-administration/manage-data/compaction.md
+++ b/docs/user-guide/deployments-administration/manage-data/compaction.md
@@ -111,7 +111,11 @@ ADMIN COMPACT_TABLE(
 ```
 
 The `` parameter can be either `twcs` or `swcs` (case insensitive) which refer to Time Windowed Compaction Strategy and Strict Window Compaction Strategy respectively.
-For the `swcs` strategy, the `` specify the window size (in seconds) for splitting SST files. For example:
+For the `swcs` strategy, the `` can specify:
+- The window size (in seconds) for splitting SST files
+- The `parallelism` parameter to control the level of parallelism for compaction (defaults to 1)
+
+For example, to trigger compaction with a 1-hour window:
 
 ```sql
 ADMIN COMPACT_TABLE(
@@ -130,6 +134,23 @@ ADMIN COMPACT_TABLE(
 
 When executing this statement, GreptimeDB will split each SST file into segments with a time span of 1 hour (3600 seconds) and merge these segments into a single output, ensuring no overlapping files remain.
 
+You can also specify the `parallelism` parameter to speed up compaction by processing multiple files concurrently:
+
+```sql
+-- SWCS compaction with default time window and parallelism set to 2
+ADMIN COMPACT_TABLE("monitor", "swcs", "parallelism=2");
+
+-- SWCS compaction with custom time window and parallelism
+ADMIN COMPACT_TABLE("monitor", "swcs", "window=1800,parallelism=2");
+```
+
+The `parallelism` parameter is also available for regular compaction:
+
+```sql
+-- Regular compaction with parallelism set to 2
+ADMIN COMPACT_TABLE("monitor", "regular", "parallelism=2");
+```
+
 The following diagram shows the process of strict window compression:
 
 In Figure A, there are 3 overlapping SST files: `[0, 3]` (which includes timestamps 0, 1, 2, and 3), `[3, 8]`, and `[8, 10]`.
diff --git a/i18n/zh/docusaurus-plugin-content-docs/current/reference/sql/admin.md b/i18n/zh/docusaurus-plugin-content-docs/current/reference/sql/admin.md
index 569059d3e5..bccf7efe37 100644
--- a/i18n/zh/docusaurus-plugin-content-docs/current/reference/sql/admin.md
+++ b/i18n/zh/docusaurus-plugin-content-docs/current/reference/sql/admin.md
@@ -31,6 +31,15 @@ GreptimeDB 提供了一些管理函数来管理数据库和数据:
 -- 刷新表 test --
 admin flush_table("test");
 
---- 为表 test 启动 compaction 任务 --
+-- 为表 test 启动 compaction 任务,默认并行度为 1 --
 admin compact_table("test");
+
+-- 启动常规 compaction,并行度设置为 2 --
+admin compact_table("test", "regular", "parallelism=2");
+
+-- 启动 SWCS compaction,使用默认时间窗口,并行度设置为 2 --
+admin compact_table("test", "swcs", "parallelism=2");
+
+-- 启动 SWCS compaction,自定义时间窗口和并行度 --
+admin compact_table("test", "swcs", "window=1800,parallelism=2");
 ```
diff --git a/i18n/zh/docusaurus-plugin-content-docs/current/user-guide/deployments-administration/manage-data/compaction.md b/i18n/zh/docusaurus-plugin-content-docs/current/user-guide/deployments-administration/manage-data/compaction.md
index d45fa36129..eee1ebeec2 100644
--- a/i18n/zh/docusaurus-plugin-content-docs/current/user-guide/deployments-administration/manage-data/compaction.md
+++ b/i18n/zh/docusaurus-plugin-content-docs/current/user-guide/deployments-administration/manage-data/compaction.md
@@ -108,7 +108,11 @@ ADMIN COMPACT_TABLE(
 ```
 
 `` 参数可以是 `twcs` 或 `swcs`(大小写不敏感),分别指定时间窗口压缩策略和严格窗口压缩策略。
-对于 `swcs` 策略, `` 指定用于拆分 SST 文件的窗口大小(以秒为单位)。例如:
+对于 `swcs` 策略, `` 可以指定:
+- 用于拆分 SST 文件的窗口大小(以秒为单位)
+- `parallelism` 参数用于控制压缩的并行度(默认为 1)
+
+例如,触发使用 1 小时窗口的压缩:
 
 ```sql
 ADMIN COMPACT_TABLE(
@@ -127,6 +131,23 @@ ADMIN COMPACT_TABLE(
 
 执行此语句时,GreptimeDB 会将每个 SST 文件按 1 小时(3600 秒)的时间跨度拆分成多个分块,并将这些分块合并为一个输出文件,确保没有重叠的文件。
 
+您还可以指定 `parallelism` 参数来通过并发处理多个文件以加速压缩:
+
+```sql
+-- 使用默认时间窗口和并行度为 2 的 SWCS 压缩
+ADMIN COMPACT_TABLE("monitor", "swcs", "parallelism=2");
+
+-- 使用自定义时间窗口和并行度的 SWCS 压缩
+ADMIN COMPACT_TABLE("monitor", "swcs", "window=1800,parallelism=2");
+```
+
+`parallelism` 参数同样适用于常规压缩:
+
+```sql
+-- 并行度为 2 的常规压缩
+ADMIN COMPACT_TABLE("monitor", "regular", "parallelism=2");
+```
+
 下图展示了一次 SWCS 压缩的过程:
 
 在图 A 中,有 3 个重叠的 SST 文件,分别是 `[0, 3]`(也就是包含 0、1、2、3 的时间戳)、`[3, 8]` 和 `[8, 10]`。

From 511f8826fbc3664d991c32132f30707608387834 Mon Sep 17 00:00:00 2001
From: jeremyhi
Date: Tue, 11 Nov 2025 11:17:00 +0800
Subject: [PATCH 3/3] Update i18n/zh/docusaurus-plugin-content-docs/current/user-guide/deployments-administration/manage-data/compaction.md

---
 .../deployments-administration/manage-data/compaction.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/i18n/zh/docusaurus-plugin-content-docs/current/user-guide/deployments-administration/manage-data/compaction.md b/i18n/zh/docusaurus-plugin-content-docs/current/user-guide/deployments-administration/manage-data/compaction.md
index eee1ebeec2..fd5f73a9cc 100644
--- a/i18n/zh/docusaurus-plugin-content-docs/current/user-guide/deployments-administration/manage-data/compaction.md
+++ b/i18n/zh/docusaurus-plugin-content-docs/current/user-guide/deployments-administration/manage-data/compaction.md
@@ -131,7 +131,7 @@ ADMIN COMPACT_TABLE(
 
 执行此语句时,GreptimeDB 会将每个 SST 文件按 1 小时(3600 秒)的时间跨度拆分成多个分块,并将这些分块合并为一个输出文件,确保没有重叠的文件。
 
-您还可以指定 `parallelism` 参数来通过并发处理多个文件以加速压缩:
+你还可以指定 `parallelism` 参数来通过并发处理多个文件以加速压缩:
 
 ```sql
 -- 使用默认时间窗口和并行度为 2 的 SWCS 压缩