Commit a106daa

Copilot and killme2008 authored
docs: Add SSTS_INDEX_META information schema documentation (#2193)
Signed-off-by: Dennis Zhuang <killme2008@gmail.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: killme2008 <14142+killme2008@users.noreply.github.com>
Co-authored-by: Dennis Zhuang <killme2008@gmail.com>
1 parent 128a8da commit a106daa

7 files changed: +258 -2 lines changed

docs/reference/sql/information-schema/overview.md

Lines changed: 1 addition & 0 deletions
@@ -63,3 +63,4 @@ There is still lots of work to do for `INFORMATION_SCHEMA`. The tracking [issue]
 | [`FLOWS`](./flows.md) | Provides the flow information.|
 | [`PROCEDURE_INFO`](./procedure-info.md) | Procedure information.|
 | [`PROCESS_LIST`](./process-list.md) | Running queries information.|
+| [`SSTS_INDEX_META`](./ssts-index-meta.md) | Provides SST index metadata including inverted indexes, fulltext indexes, and bloom filters.|

docs/reference/sql/information-schema/ssts-index-meta.md

Lines changed: 125 additions & 0 deletions
@@ -0,0 +1,125 @@
---
keywords: [SST index metadata, Puffin index, inverted index, fulltext index, bloom filter, index metadata]
description: Provides access to SST (Sorted String Table) index metadata, including information about inverted indexes, fulltext indexes, and bloom filters stored in Puffin format.
---

# SSTS_INDEX_META

The `SSTS_INDEX_META` table provides access to SST (Sorted String Table) index metadata collected from the manifest. This table surfaces information about Puffin index metadata, including inverted indexes, fulltext indexes, and bloom filters.

:::tip NOTE
This table is not available on [GreptimeCloud](https://greptime.cloud/).
:::

```sql
USE INFORMATION_SCHEMA;
DESC SSTS_INDEX_META;
```

The output is as follows:

```sql
+-----------------+--------+-----+------+---------+---------------+
| Column          | Type   | Key | Null | Default | Semantic Type |
+-----------------+--------+-----+------+---------+---------------+
| table_dir       | String |     | NO   |         | FIELD         |
| index_file_path | String |     | NO   |         | FIELD         |
| region_id       | UInt64 |     | NO   |         | FIELD         |
| table_id        | UInt32 |     | NO   |         | FIELD         |
| region_number   | UInt32 |     | NO   |         | FIELD         |
| region_group    | UInt8  |     | NO   |         | FIELD         |
| region_sequence | UInt32 |     | NO   |         | FIELD         |
| file_id         | String |     | NO   |         | FIELD         |
| index_file_size | UInt64 |     | YES  |         | FIELD         |
| index_type      | String |     | NO   |         | FIELD         |
| target_type     | String |     | NO   |         | FIELD         |
| target_key      | String |     | NO   |         | FIELD         |
| target_json     | String |     | NO   |         | FIELD         |
| blob_size       | UInt64 |     | NO   |         | FIELD         |
| meta_json       | String |     | YES  |         | FIELD         |
| node_id         | UInt64 |     | YES  |         | FIELD         |
+-----------------+--------+-----+------+---------+---------------+
```

Fields in the `SSTS_INDEX_META` table are described as follows:

- `table_dir`: The directory path of the table.
- `index_file_path`: The full path to the Puffin index file.
- `region_id`: The ID of the region.
- `table_id`: The ID of the table.
- `region_number`: The region number within the table.
- `region_group`: The group identifier for the region.
- `region_sequence`: The sequence number of the region.
- `file_id`: The unique identifier of the index file (UUID).
- `index_file_size`: The size of the index file in bytes.
- `index_type`: The type of index. Possible values include:
  - `inverted`: Inverted index for fast term lookups
  - `fulltext_bloom`: Combined fulltext and bloom filter index
  - `bloom_filter`: Bloom filter for fast membership tests
- `target_type`: The type of target being indexed. Typically `column` for column-based indexes.
- `target_key`: The key identifying the target (e.g., column ID).
- `target_json`: JSON representation of the target configuration, such as `{"column":0}`.
- `blob_size`: The size of the blob data in bytes.
- `meta_json`: JSON metadata containing index-specific information such as:
  - For inverted indexes: FST size, bitmap type, segment row count, etc.
  - For bloom filters: bloom filter size, row count, segment count
  - For fulltext indexes: analyzer type, case sensitivity settings
- `node_id`: The ID of the datanode where the index is located.

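The `region_id`, `table_id`, `region_number`, `region_group`, and `region_sequence` columns appear to be related: in the sample row on this page, `region_id` is `7791070674944` and `table_id` is `1814`, which equals `1814 << 32`. The sketch below assumes that layout (table ID in the upper 32 bits, region number in the lower 32 bits) and additionally assumes that `region_group` is the top 8 bits of the region number and `region_sequence` the lower 24 bits. The helper `decode_region_id` is hypothetical, and the bit widths should be checked against the GreptimeDB source before relying on them.

```python
from typing import NamedTuple


class RegionParts(NamedTuple):
    table_id: int
    region_number: int
    region_group: int
    region_sequence: int


def decode_region_id(region_id: int) -> RegionParts:
    """Split a 64-bit region ID into its parts (the bit layout is an assumption)."""
    table_id = region_id >> 32                   # assumed: upper 32 bits
    region_number = region_id & 0xFFFFFFFF       # assumed: lower 32 bits
    region_group = region_number >> 24           # assumed: top 8 bits of the region number
    region_sequence = region_number & 0xFFFFFF   # assumed: lower 24 bits
    return RegionParts(table_id, region_number, region_group, region_sequence)


# region_id taken from the sample row on this page.
print(decode_region_id(7791070674944))
```

If the assumption holds, the decoded values should match the dedicated columns in the same row, which makes this a quick consistency check rather than a replacement for them.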
## Examples

Query all index metadata:

```sql
SELECT * FROM INFORMATION_SCHEMA.SSTS_INDEX_META;
```

Query index metadata for a specific table by joining with the `TABLES` table:

```sql
SELECT s.*
FROM INFORMATION_SCHEMA.SSTS_INDEX_META s
JOIN INFORMATION_SCHEMA.TABLES t ON s.table_id = t.table_id
WHERE t.table_name = 'my_table';
```

Query only inverted index metadata:

```sql
SELECT table_dir, index_file_path, index_type, target_json, meta_json
FROM INFORMATION_SCHEMA.SSTS_INDEX_META
WHERE index_type = 'inverted';
```

Query index metadata grouped by index type:

```sql
SELECT index_type, COUNT(*) as count, SUM(index_file_size) as total_size
FROM INFORMATION_SCHEMA.SSTS_INDEX_META
GROUP BY index_type;
```

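Each row carries both `blob_size` and `index_file_size`. If one Puffin file can hold several index blobs (an assumption; this page does not confirm it), the same `index_file_size` would repeat on every row for that file, and a plain `SUM(index_file_size)` would count it more than once. A minimal client-side sketch, using made-up rows and hypothetical helper names, that sums file sizes once per `file_id` and blob sizes per index type:

```python
from collections import defaultdict

# Hypothetical rows shaped like (file_id, index_type, index_file_size, blob_size).
rows = [
    ("file-a", "inverted", 1000, 400),
    ("file-a", "bloom_filter", 1000, 500),
    ("file-b", "bloom_filter", 838, 688),
]


def total_index_file_size(rows):
    """Sum index_file_size once per file_id; NULL sizes (None) are skipped."""
    size_by_file = {}
    for file_id, _index_type, file_size, _blob_size in rows:
        if file_size is not None:
            size_by_file[file_id] = file_size
    return sum(size_by_file.values())


def blob_size_by_type(rows):
    """Sum blob_size per index_type; blobs are per row, so no dedup is needed."""
    totals = defaultdict(int)
    for _file_id, index_type, _file_size, blob_size in rows:
        totals[index_type] += blob_size
    return dict(totals)


print(total_index_file_size(rows))
print(blob_size_by_type(rows))
```

Summing `blob_size` per `index_type` avoids the question entirely, since a blob size belongs to exactly one row.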
Output example:

```sql
mysql> SELECT * FROM INFORMATION_SCHEMA.SSTS_INDEX_META LIMIT 1\G
*************************** 1. row ***************************
      table_dir: data/greptime/public/1814/
index_file_path: data/greptime/public/1814/1814_0000000000/data/index/aba4af59-1247-4bfb-a20b-69242cdd9374.puffin
      region_id: 7791070674944
       table_id: 1814
  region_number: 0
   region_group: 0
region_sequence: 0
        file_id: aba4af59-1247-4bfb-a20b-69242cdd9374
index_file_size: 838
     index_type: bloom_filter
    target_type: column
     target_key: 2147483652
    target_json: {"column":2147483652}
      blob_size: 688
      meta_json: {"bloom":{"bloom_filter_size":640,"row_count":2242,"rows_per_segment":1024,"segment_count":3}}
        node_id: 0
1 row in set (0.02 sec)
```
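Because `target_json` and `meta_json` are returned as strings, a client has to parse them before using the values. The sketch below parses the two JSON strings from the sample row and checks that `segment_count` equals `ceil(row_count / rows_per_segment)`. That relationship is an assumption that happens to hold for this row (`ceil(2242 / 1024) = 3`), not a documented guarantee, and `summarize_bloom_meta` is a hypothetical helper name.

```python
import json
import math

# Values copied from the sample row.
target_json = '{"column":2147483652}'
meta_json = (
    '{"bloom":{"bloom_filter_size":640,"row_count":2242,'
    '"rows_per_segment":1024,"segment_count":3}}'
)


def summarize_bloom_meta(raw: str) -> dict:
    """Extract bloom filter statistics from a meta_json string."""
    bloom = json.loads(raw)["bloom"]
    # Assumed relationship: one segment per rows_per_segment rows, rounded up.
    expected_segments = math.ceil(bloom["row_count"] / bloom["rows_per_segment"])
    return {
        "row_count": bloom["row_count"],
        "segment_count": bloom["segment_count"],
        "expected_segments": expected_segments,
    }


column_id = json.loads(target_json)["column"]
summary = summarize_bloom_meta(meta_json)
print(column_id, summary)
```

Since `meta_json` is nullable, real code would need to handle a missing value before calling `json.loads`.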

docs/reference/sql/overview.md

Lines changed: 3 additions & 1 deletion
@@ -21,7 +21,6 @@ description: GreptimeDB SQL statements.
 * [DISTINCT](./distinct.md)
 * [DROP](./drop.md)
 * [EXPLAIN](./explain.md)
-* [Functions](./functions/overview.md)
 * [GROUP BY](./group_by.md)
 * [HAVING](./having.md)
 * [INSERT](./insert.md)
@@ -38,6 +37,9 @@ description: GreptimeDB SQL statements.
 * [WHERE](./where.md)
 * [WITH](./with.md)

+## Functions
+* [Functions](./functions/overview.md)
+
 ## Information Schema
 * [INFORMATION_SCHEMA](./information-schema/overview.md)

i18n/zh/docusaurus-plugin-content-docs/current/reference/sql/information-schema/overview.md

Lines changed: 1 addition & 0 deletions
@@ -61,3 +61,4 @@ description: INFORMATION_SCHEMA 提供对系统元数据的访问,例如数据
 | [`FLOWS`](./flows.md) | 提供 Flow 相关信息。|
 | [`PROCEDURE_INFO`](./procedure-info.md) | 提供 Procedure 相关信息。|
 | [`PROCESS_LIST`](./process-list.md) | 提供集群内正在执行的查询信息|
+| [`SSTS_INDEX_META`](./ssts-index-meta.md) | 提供 SST 索引元数据,包括倒排索引、全文索引和布隆过滤器。|

i18n/zh/docusaurus-plugin-content-docs/current/reference/sql/information-schema/ssts-index-meta.md

Lines changed: 124 additions & 0 deletions
@@ -0,0 +1,124 @@
---
keywords: [SST 索引元数据, Puffin 索引, 倒排索引, 全文索引, 布隆过滤器, 索引元数据]
description: 提供对 SST(排序字符串表)索引元数据的访问,包括以 Puffin 格式存储的倒排索引、全文索引和布隆过滤器的信息。
---

# SSTS_INDEX_META

`SSTS_INDEX_META` 表提供对从清单中收集的 SST(排序字符串表)索引元数据的访问。该表显示 Puffin 索引元数据的信息,包括倒排索引、全文索引和布隆过滤器。

:::tip 注意
此表在 [GreptimeCloud](https://greptime.cloud/) 上不可用。
:::

```sql
USE INFORMATION_SCHEMA;
DESC SSTS_INDEX_META;
```

输出如下:

```sql
+-----------------+--------+-----+------+---------+---------------+
| Column          | Type   | Key | Null | Default | Semantic Type |
+-----------------+--------+-----+------+---------+---------------+
| table_dir       | String |     | NO   |         | FIELD         |
| index_file_path | String |     | NO   |         | FIELD         |
| region_id       | UInt64 |     | NO   |         | FIELD         |
| table_id        | UInt32 |     | NO   |         | FIELD         |
| region_number   | UInt32 |     | NO   |         | FIELD         |
| region_group    | UInt8  |     | NO   |         | FIELD         |
| region_sequence | UInt32 |     | NO   |         | FIELD         |
| file_id         | String |     | NO   |         | FIELD         |
| index_file_size | UInt64 |     | YES  |         | FIELD         |
| index_type      | String |     | NO   |         | FIELD         |
| target_type     | String |     | NO   |         | FIELD         |
| target_key      | String |     | NO   |         | FIELD         |
| target_json     | String |     | NO   |         | FIELD         |
| blob_size       | UInt64 |     | NO   |         | FIELD         |
| meta_json       | String |     | YES  |         | FIELD         |
| node_id         | UInt64 |     | YES  |         | FIELD         |
+-----------------+--------+-----+------+---------+---------------+
```

`SSTS_INDEX_META` 表中的字段描述如下:

- `table_dir`:表的目录路径。
- `index_file_path`:Puffin 索引文件的完整路径。
- `region_id`:Region 的 ID。
- `table_id`:表的 ID。
- `region_number`:表中的 Region 编号。
- `region_group`:Region 的组标识符。
- `region_sequence`:Region 的序列号。
- `file_id`:索引文件的唯一标识符(UUID)。
- `index_file_size`:索引文件的大小(字节)。
- `index_type`:索引的类型。可能的值包括:
  - `inverted`:用于快速词条查找的倒排索引
  - `fulltext_bloom`:全文索引和布隆过滤器的组合索引
  - `bloom_filter`:用于快速成员测试的布隆过滤器
- `target_type`:被索引目标的类型。通常是 `column`,表示基于列的索引。
- `target_key`:标识目标的键(例如,列 ID)。
- `target_json`:目标配置的 JSON 表示,例如 `{"column":0}`。
- `blob_size`:blob 数据的大小(字节)。
- `meta_json`:包含索引特定信息的 JSON 元数据,例如:
  - 对于倒排索引:FST 大小、位图类型、段行数等
  - 对于布隆过滤器:布隆过滤器大小、行数、段数
  - 对于全文索引:分析器类型、大小写敏感设置
- `node_id`:索引所在数据节点的 ID。

## 示例

查询所有索引元数据:

```sql
SELECT * FROM INFORMATION_SCHEMA.SSTS_INDEX_META;
```

通过与 `TABLES` 表连接查询特定表的索引元数据:

```sql
SELECT s.*
FROM INFORMATION_SCHEMA.SSTS_INDEX_META s
JOIN INFORMATION_SCHEMA.TABLES t ON s.table_id = t.table_id
WHERE t.table_name = 'my_table';
```

仅查询倒排索引元数据:

```sql
SELECT table_dir, index_file_path, index_type, target_json, meta_json
FROM INFORMATION_SCHEMA.SSTS_INDEX_META
WHERE index_type = 'inverted';
```

按索引类型分组查询索引元数据:

```sql
SELECT index_type, COUNT(*) as count, SUM(index_file_size) as total_size
FROM INFORMATION_SCHEMA.SSTS_INDEX_META
GROUP BY index_type;
```

输出样例:

```sql
mysql> SELECT * FROM INFORMATION_SCHEMA.SSTS_INDEX_META LIMIT 1\G
*************************** 1. row ***************************
      table_dir: data/greptime/public/1814/
index_file_path: data/greptime/public/1814/1814_0000000000/data/index/aba4af59-1247-4bfb-a20b-69242cdd9374.puffin
      region_id: 7791070674944
       table_id: 1814
  region_number: 0
   region_group: 0
region_sequence: 0
        file_id: aba4af59-1247-4bfb-a20b-69242cdd9374
index_file_size: 838
     index_type: bloom_filter
    target_type: column
     target_key: 2147483652
    target_json: {"column":2147483652}
      blob_size: 688
      meta_json: {"bloom":{"bloom_filter_size":640,"row_count":2242,"rows_per_segment":1024,"segment_count":3}}
        node_id: 0
1 row in set (0.02 sec)
```

i18n/zh/docusaurus-plugin-content-docs/current/reference/sql/overview.md

Lines changed: 3 additions & 1 deletion
@@ -23,7 +23,6 @@ description: GreptimeDB SQL 语句.
 * [DISTINCT](./distinct.md)
 * [DROP](./drop.md)
 * [EXPLAIN](./explain.md)
-* [Functions](./functions/overview.md)
 * [GROUP BY](./group_by.md)
 * [HAVING](./having.md)
 * [INSERT](./insert.md)
@@ -40,6 +39,9 @@ description: GreptimeDB SQL 语句.
 * [WHERE](./where.md)
 * [WITH](./with.md)

+## 函数
+* [Functions](./functions/overview.md)
+
 ## 系统表

 * [INFORMATION_SCHEMA](./information-schema/overview.md)

sidebars.ts

Lines changed: 1 addition & 0 deletions
@@ -719,6 +719,7 @@ const sidebars: SidebarsConfig = {
 'reference/sql/information-schema/runtime-metrics',
 'reference/sql/information-schema/cluster-info',
 'reference/sql/information-schema/process-list',
+'reference/sql/information-schema/ssts-index-meta',
 ],
 },
 ],
