| 1 | +--- |
| 2 | +title: Change Data Capture |
| 3 | +description: Stream database mutations and drop events to Kafka or local file sinks |
| 4 | +--- |
| 5 | + |
| 6 | +:::note |
| 7 | +**Enterprise Feature**: Change Data Capture requires a Dgraph Enterprise license. See [License](../../admin/enterprise-features/license) for details. |
| 8 | +::: |
| 9 | + |
| 10 | +Change Data Capture (CDC) streams database mutations and drop events to external sinks (Kafka or local files). CDC tracks all `set` and `delete` mutations except those affecting password fields, along with all drop events. Live Loader events are recorded; Bulk Loader events are not. |
| 11 | + |
| 12 | +CDC events are based on Raft log changes. If the Alpha leader cannot reach the sink, the Raft logs grow as events accumulate until the sink becomes available again. Enable CDC on all Alpha nodes to avoid interruptions in the event stream. |
| 13 | + |
| 14 | +## Enable CDC with Kafka |
| 15 | + |
| 16 | +Kafka records CDC events under the `dgraph-cdc` topic. Create the topic before events are sent to the broker. |
| 17 | + |
| 18 | +Start Dgraph Alpha with the `--cdc` option: |
| 19 | + |
| 20 | +```bash |
| 21 | +dgraph alpha --cdc "kafka=kafka-hostname:port; sasl-user=tstark; sasl-password=m3Ta11ic" |
| 22 | +``` |
| 23 | + |
| 24 | +For localhost Kafka without SASL authentication: |
| 25 | + |
| 26 | +```bash |
| 27 | +dgraph alpha --cdc "kafka=localhost:9092" |
| 28 | +``` |
| 29 | + |
| 30 | +For TLS-enabled Kafka clusters, the `ca-cert` option is required. The certificate can be self-signed. |
| 31 | + |
| 32 | +## Enable CDC with File Sink |
| 33 | + |
| 34 | +To stream CDC events to a local unencrypted file, start Dgraph Alpha with: |
| 35 | + |
| 36 | +```bash |
| 37 | +dgraph alpha --cdc "file=local-file-path" |
| 38 | +``` |
| 39 | + |
| 40 | +## Command Reference |
| 41 | + |
| 42 | +The `--cdc` option supports the following sub-options: |
| 43 | + |
| 44 | +| Sub-option | Example `dgraph alpha` command option | Notes | |
| 45 | +|------------------|-------------------------------------------|----------------------------------------------------------------------| |
| 46 | +| `tls` | `--cdc "tls=false"` | Boolean flag that enables or disables TLS when connecting to Kafka | |
| 47 | +| `ca-cert` | `--cdc "ca-cert=/cert-dir/ca.crt"` | Path and filename of the CA root certificate used for TLS encryption. If not specified and `tls=true`, Dgraph uses the system certificates | |
| 48 | +| `client-cert` | `--cdc "client-cert=/c-certs/client.crt"` | Path and filename of the client certificate used for TLS encryption | |
| 49 | +| `client-key` | `--cdc "client-key=/c-certs/client.key"` | Path and filename of the client certificate private key | |
| 50 | +| `file` | `--cdc "file=/sink-dir/cdc-file"` | Path and filename of a local file sink (alternative to Kafka sink) | |
| 51 | +| `kafka` | `--cdc "kafka=kafka-hostname; sasl-user=tstark; sasl-password=m3Ta11ic"` | Hostname(s) of the Kafka hosts. May require authentication using the `sasl-user` and `sasl-password` sub-options | |
| 52 | +| `sasl-user` | `--cdc "kafka=kafka-hostname; sasl-user=tstark; sasl-password=m3Ta11ic"` | SASL username for Kafka. Requires the `kafka` and `sasl-password` sub-options | |
| 53 | +| `sasl-password` | `--cdc "kafka=kafka-hostname; sasl-user=tstark; sasl-password=m3Ta11ic"` | SASL password for Kafka. Requires the `kafka` and `sasl-user` sub-options | |
| 54 | +| `sasl-mechanism` | `--cdc "kafka=kafka-hostname; sasl-mechanism=PLAIN"` | The SASL mechanism for Kafka (PLAIN, SCRAM-SHA-256 or SCRAM-SHA-512). The default is PLAIN | |
| 55 | + |
| 56 | +## Data Format |
| 57 | + |
| 58 | +CDC events are in JSON format. Example: |
| 59 | + |
| 60 | +```json |
| 61 | +{ "key": "0", "value": {"meta":{"commit_ts":5},"type":"mutation","event":{"operation":"set","uid":2,"attr":"counter.val","value":1,"value_type":"int"}}} |
| 62 | +``` |
| 63 | + |
| 64 | +The `meta.commit_ts` value increases with each CDC event. Use this value to identify duplicate events that may occur due to Raft leadership changes. |
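
A consumer might filter out duplicates along the following lines. This is a minimal sketch, not part of Dgraph: the function name `dedupe_events` is hypothetical, and it assumes that events arrive one JSON document per line and that a duplicate is byte-for-byte equivalent to the original event once its keys are sorted. It compares the whole event rather than `commit_ts` alone, on the assumption that several events written by one transaction may share a commit timestamp.

```python
import json


def dedupe_events(lines):
    """Yield parsed CDC events, skipping exact repeats (hypothetical helper)."""
    seen = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # Records shaped like {"key": ..., "value": {...}} wrap the event;
        # bare events (as in the mutation examples) are used as-is.
        event = record.get("value", record)
        # Canonical JSON so that key order does not affect the comparison.
        fingerprint = json.dumps(event, sort_keys=True)
        if fingerprint in seen:
            continue
        seen.add(fingerprint)
        yield event
```

The `seen` set grows without bound in this sketch. A long-running consumer would likely need to discard fingerprints once `commit_ts` has advanced well past them.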
| 65 | + |
| 66 | +### Mutation Events |
| 67 | + |
| 68 | +**Set mutation:** |
| 69 | + |
| 70 | +```json |
| 71 | +{"meta":{"commit_ts":29},"type":"mutation","event":{"operation":"set","uid":3,"attr":"counter.val","value":10,"value_type":"int"}} |
| 72 | +``` |
| 73 | + |
| 74 | +**Delete mutation:** |
| 75 | + |
| 76 | +```json |
| 77 | +{"meta":{"commit_ts":44},"type":"mutation","event":{"operation":"del","uid":7,"attr":"Author.name","value":"_STAR_ALL","value_type":"default"}} |
| 78 | +``` |
| 79 | + |
| 80 | +### Drop Events |
| 81 | + |
| 82 | +**Drop all:** |
| 83 | + |
| 84 | +```json |
| 85 | +{"meta":{"commit_ts":13},"type":"drop","event":{"operation":"all"}} |
| 86 | +``` |
| 87 | + |
| 88 | +The `operation` field specifies the drop operation: `attribute`, `type`, `data`, or `all`. |
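
To show how a consumer could branch on the `type` and `operation` fields from the examples above, here is a small sketch. The function name `describe_event` is hypothetical, and the code reads only the fields that appear in the sample events. Drop events other than `all` may carry additional fields that are not handled here.

```python
def describe_event(event):
    """Return a one-line summary of a parsed CDC event (hypothetical helper)."""
    ts = event["meta"]["commit_ts"]
    kind = event["type"]
    body = event["event"]
    if kind == "mutation":
        # operation is "set" or "del" in the sample events
        return f'{ts}: {body["operation"]} <{body["uid"]}> {body["attr"]} = {body["value"]!r}'
    if kind == "drop":
        # operation is one of: attribute, type, data, all
        return f'{ts}: drop {body["operation"]}'
    raise ValueError(f"unexpected event type: {kind!r}")
```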
| 89 | + |
| 90 | +## Multi-Tenancy |
| 91 | + |
| 92 | +In a [multi-tenant environment](../../admin/enterprise-features/multitenancy), CDC events streamed to Kafka are distributed across Kafka partitions by the Kafka client based on the multi-tenancy namespace. |
| 93 | + |
| 94 | +## Limitations |
| 95 | + |
| 96 | +- CDC events track only new values, not old values updated or removed by mutations or drop operations |
| 97 | +- Schema updates are not tracked |
| 98 | +- CDC can only be configured when starting Alpha nodes with the `dgraph alpha` command |
| 99 | +- Node crashes or Raft leadership changes may result in duplicate events, but no data loss |