[DEPRECATED] Compute Partition
This page describes an SMT that will be removed in future releases. Please use the new SMT Partition Routing. |
By default, when Debezium detects a change in a data collection, it emits a change event to an Apache Kafka topic with a single partition. As described in Customization of Kafka Connect automatic topic creation, you can customize the default configuration to route events to multiple partitions, based on a hash of the primary key.
However, in some cases, you might want Debezium to route events to a specific partition. The compute partition SMT enables you to route events to destination partitions based on a specified column name value. Debezium uses the hash of the specified value to determine the destination partition.
Example: Basic configuration
You configure the compute partition transformation in the Debezium connector’s Kafka Connect configuration. The configuration specifies the following parameters:
-
The data collection column to use to calculate the destination partition.
-
The maximum number of partitions permitted for the data collection.
The SMT only processes events that originate in specified data collections. Events from other data collections are ignored.
To configure a Debezium connector to route events to a specific partition, configure the ComputePartition
SMT in the Kafka Connect configuration for the Debezium connector.
For example, you might add the following configuration in your connector configuration.
...
topic.creation.default.partitions=2
topic.creation.default.replication.factor=1
...
transforms=ComputePartition
transforms.ComputePartition.type=io.debezium.transforms.partitions.ComputePartition
transforms.ComputePartition.partition.data-collections.field.mappings=inventory.products:name,inventory.orders:purchaser
transforms.ComputePartition.partition.data-collections.partition.num.mappings=inventory.products:2,inventory.orders:2
...
The configuration in the preceding example enables partition computation for the products
and orders
data collections.
The configuration specifies that the SMT uses the name
column to compute the partition for the products
data collection.
The number of partitions is set to 2
.
The number of partitions that you specify must match the number of partitions that are specified in the Kafka topic configuration.
Based on the configuration in the example, for the following Products
table, the SMT routes change events for all records that have the field name hammer
to the same partition.
That is, the items with id
values 104
, 105
, and 106
are routed to the same partition.
id |
name |
description |
weight |
101 |
scooter |
Small 2-wheel scooter |
3.14 |
102 |
car battery |
12V car battery |
8.1 |
103 |
12-pack drill bits |
12-pack of drill bits with sizes ranging from #40 to #3 |
0.8 |
104 |
hammer |
12oz carpenter’s hammer |
0.75 |
105 |
hammer |
14oz carpenter’s hammer |
0.875 |
106 |
hammer |
16oz carpenter’s hammer |
1.0 |
107 |
rocks |
box of assorted rocks |
5.3 |
108 |
jacket |
water-resistant black windbreaker |
0.1 |
109 |
spare tire |
24-inch spare tire |
22.2 |
Configuration options
The following table lists the configuration options that you can use with the compute partition SMT.
Property |
Default |
Description |
A comma-separated list of colon-delimited pairs that specify the columns to use for a specific data collection, for example, |
||
A comma-separated list of colon-delimited pairs that specify the number of partitions to use for a specific data collection, for example, |