docs/integrations/data-ingestion/clickpipes/postgres/faq.md

For manually created publications, please add any tables you want to the publication.
If you're replicating from a Postgres read replica/hot standby, you will need to create your own publication on the primary instance, which will automatically propagate to the standby. The ClickPipe will not be able to manage the publication in this case as you're unable to create publications on a standby.
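
For example, a minimal sketch of creating such a publication on the primary (the publication and table names here are placeholders, not from the original doc):

```sql
-- Run this on the primary; the publication metadata replicates to the standby.
CREATE PUBLICATION clickpipes_pub FOR TABLE public.orders, public.customers;
```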
- **At Minimum:** Set [`max_slot_wal_keep_size`](https://www.postgresql.org/docs/devel/runtime-config-replication.html#GUC-MAX-SLOT-WAL-KEEP-SIZE) to retain at least **two days' worth** of WAL data.
- **For Large Databases (High Transaction Volume):** Retain at least **2-3 times** the peak WAL generation per day.
- **For Storage-Constrained Environments:** Tune this conservatively to **avoid disk exhaustion** while ensuring replication stability.

#### How to calculate the right value {#how-to-calculate-the-right-value}

To determine the right setting, measure the WAL generation rate:

##### For PostgreSQL 10+ {#for-postgresql-10}

```sql
SELECT pg_wal_lsn_diff(pg_current_wal_insert_lsn(), '0/0') / 1024 / 1024 AS wal_generated_mb;
```

##### For PostgreSQL 9.6 and below {#for-postgresql-96-and-below}

```sql
SELECT pg_xlog_location_diff(pg_current_xlog_insert_location(), '0/0') / 1024 / 1024 AS wal_generated_mb;
```

* Multiply that number by 2 or 3 to provide sufficient retention.
* Set `max_slot_wal_keep_size` to the resulting value in MB or GB.

##### Example {#example}

If your database generates 100 GB of WAL per day, set:
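
A minimal sketch applying the 2x multiplier from the guidance above (100 GB × 2 = 200 GB; tune the multiplier to your workload):

```sql
-- Retain roughly twice the daily WAL volume for this example workload.
ALTER SYSTEM SET max_slot_wal_keep_size = '200GB';
-- The setting is reloadable without a restart.
SELECT pg_reload_conf();
```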
The most common cause of replication slot invalidation is a low `max_slot_wal_keep_size` setting.
In rare cases, we have seen this issue occur even when `max_slot_wal_keep_size` is not configured. This could be due to an intricate and rare bug in PostgreSQL, although the cause remains unclear.
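
If you suspect slot invalidation, one way to check slot health is to query `pg_replication_slots` (a sketch; the `wal_status` and `safe_wal_size` columns are available in PostgreSQL 13+):

```sql
SELECT slot_name,
       active,
       wal_status,                          -- 'lost' means the slot has been invalidated
       safe_wal_size / 1024 / 1024 AS safe_wal_mb
FROM pg_replication_slots;
```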

### I am seeing out of memory (OOMs) on ClickHouse while my ClickPipe is ingesting data. Can you help? {#i-am-seeing-out-of-memory-ooms-on-clickhouse-while-my-clickpipe-is-ingesting-data-can-you-help}

One common reason for OOMs on ClickHouse is that your service is undersized. This means that your current service configuration doesn't have enough resources (e.g., memory or CPU) to handle the ingestion load effectively. We strongly recommend scaling up the service to meet the demands of your ClickPipe data ingestion.
Another reason we've observed is the presence of downstream Materialized Views with expensive `JOIN`s.
- Another optimization for JOINs is to explicitly filter the tables through `subqueries` or `CTEs` and then perform the `JOIN` across these subqueries, as sketched below. This provides the planner with hints on how to efficiently filter rows and perform the `JOIN`.
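
A sketch of that pattern (the tables and columns here are hypothetical):

```sql
-- Filter each side down first, then JOIN the reduced sets.
SELECT o.order_id, c.name
FROM
(
    SELECT order_id, customer_id
    FROM orders
    WHERE created_at > now() - INTERVAL 1 DAY
) AS o
INNER JOIN
(
    SELECT customer_id, name
    FROM customers
    WHERE active = 1
) AS c ON o.customer_id = c.customer_id;
```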

### I am seeing an `invalid snapshot identifier` during the initial load. What should I do? {#i-am-seeing-an-invalid-snapshot-identifier-during-the-initial-load-what-should-i-do}

The `invalid snapshot identifier` error occurs when there is a connection drop between ClickPipes and your Postgres database. This can happen due to gateway timeouts, database restarts, or other transient issues.
It is recommended that you do not carry out any disruptive operations, such as upgrades or restarts, on your Postgres database while the Initial Load is in progress, and that you ensure the network connection to your database is stable.
To resolve this issue, you can trigger a resync from the ClickPipes UI. This will restart the initial load process from the beginning.

### What happens if I drop a publication in Postgres? {#what-happens-if-i-drop-a-publication-in-postgres}

Dropping a publication in Postgres will break your ClickPipe connection since the publication is required for the ClickPipe to pull changes from the source. When this happens, you'll typically receive an error alert indicating that the publication no longer exists.
To resolve this, recreate the publication with the required tables; if you have partitioned tables, create it with `publish_via_partition_root`:

```sql
CREATE PUBLICATION <pub_name>
FOR TABLE <...>, <...>
WITH (publish_via_partition_root = true);
```

### What if I am seeing `Unexpected Datatype` errors or `Cannot parse type XX ...` {#what-if-i-am-seeing-unexpected-datatype-errors}

This error typically occurs when the source Postgres database has a datatype which cannot be mapped during ingestion.
For more specific issues, refer to the possibilities below.

### `Cannot parse type Decimal(XX, YY), expected non-empty binary data with size equal to or less than ...` {#cannot-parse-type-decimal-expected-non-empty-binary-data-with-size-equal-to-or-less-than}

Postgres `NUMERIC`s have very high precision (up to 131072 digits before the decimal point and up to 16383 digits after the decimal point), while the ClickHouse `Decimal` type allows a maximum of 76 digits with a scale of 39.
The system assumes that the size does not _usually_ get that high and performs an optimistic cast, since the source table can have a large number of rows or rows can arrive during the CDC phase.
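
For illustration, a contrived value that Postgres happily stores as `NUMERIC` but that is wider than any ClickHouse `Decimal` (80 significant digits):

```sql
-- Valid in Postgres, but exceeds ClickHouse's 76-digit Decimal maximum.
SELECT '12345678901234567890123456789012345678901234567890123456789012345678901234567890'::numeric;
```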
The current workaround is to map the `NUMERIC` type to `String` on ClickHouse. To enable this, please raise a ticket with the support team, and it will be enabled for your ClickPipes.

### I'm seeing errors like `invalid memory alloc request size <XXX>` during replication/slot creation {#postgres-invalid-memalloc-bug}

There was a bug introduced in Postgres patch versions 17.5/16.9/15.13/14.18/13.21 due to which certain workloads can cause an exponential increase in memory usage, leading to a memory allocation request of more than 1 GB, which Postgres considers invalid. This bug [has been fixed](https://github.com/postgres/postgres/commit/d87d07b7ad3b782cb74566cd771ecdb2823adf6a) and will be in the next Postgres patch series (17.6...). Please check with your Postgres provider when this patch version will be available for upgrade. If an upgrade isn't immediately possible, a resync of the pipe will be needed when it hits the error.
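
To check which patch release you are running (and therefore whether you may be on an affected version):

```sql
SELECT version();
-- or, for just the version number:
SHOW server_version;
```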
### I need to maintain a complete historical record in ClickHouse, even when the data is deleted from the source Postgres database. Can I completely ignore DELETE and TRUNCATE operations from Postgres in ClickPipes? {#ignore-delete-truncate}
Yes! Before creating your Postgres ClickPipe, create a publication without DELETE operations. For example:
```sql
CREATE PUBLICATION <pub_name> FOR TABLES IN SCHEMA <schema_name> WITH (publish = 'insert,update');
```
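
To double-check which operations a publication will publish, you can inspect `pg_publication` (a quick sanity check; these columns exist in Postgres 11+):

```sql
SELECT pubname, pubinsert, pubupdate, pubdelete, pubtruncate
FROM pg_publication
WHERE pubname = '<pub_name>';
```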
Then when [setting up](https://clickhouse.com/docs/integrations/clickpipes/postgres#configuring-the-replication-settings) your Postgres ClickPipe, make sure this publication name is selected.
Note that TRUNCATE operations are ignored by ClickPipes and will not be replicated to ClickHouse.
### Why can I not replicate my table which has a dot in it? {#replicate-table-dot}
PeerDB currently has a limitation where dots in source table identifiers (that is, in the schema name or the table name) are not supported for replication: PeerDB splits the identifier on the dot and therefore cannot discern which part is the schema and which is the table.
Work is underway to support specifying the schema and table separately to get around this limitation.