Outbox Event Router
The outbox pattern is a way to safely and reliably exchange data between multiple (micro) services. An outbox pattern implementation avoids inconsistencies between a service’s internal state (as typically persisted in its database) and state in events consumed by services that need the same data.
To implement the outbox pattern in a Debezium application, configure a Debezium connector to:
- Capture changes in an outbox table
- Apply the Debezium outbox event router single message transformation (SMT)
A Debezium connector that is configured to apply the outbox SMT should capture changes that occur in an outbox table only. For more information, see Options for applying the transformation selectively.
A connector can capture changes in more than one outbox table only if each outbox table has the same structure.
See Reliable Microservices Data Exchange With the Outbox Pattern to learn about why the outbox pattern is useful and how it works.
For an example that you can run, see the outbox pattern demo, which is in the Debezium examples repository. It includes an example of how to configure a Debezium connector to run the outbox event router SMT.
The outbox event router SMT is not compatible with the MongoDB connector. MongoDB users can run the MongoDB outbox event router SMT.
Example outbox message
To understand how the Debezium outbox event router SMT is configured, review the following example of a Debezium outbox message:
# Kafka Topic: outbox.event.order
# Kafka Message key: "1"
# Kafka Message Headers: "id=4d47e190-0402-4048-bc2c-89dd54343cdc"
# Kafka Message Timestamp: 1556890294484
{
"{\"id\": 1, \"lineItems\": [{\"id\": 1, \"item\": \"Debezium in Action\", \"status\": \"ENTERED\", \"quantity\": 2, \"totalPrice\": 39.98}, {\"id\": 2, \"item\": \"Debezium for Dummies\", \"status\": \"ENTERED\", \"quantity\": 1, \"totalPrice\": 29.99}], \"orderDate\": \"2019-01-31T12:13:01\", \"customerId\": 123}"
}
A Debezium connector that is configured to apply the outbox event router SMT generates the above message by transforming a Debezium raw message like this:
# Kafka Message key: "4d47e190-0402-4048-bc2c-89dd54343cdc"
# Kafka Message Headers: ""
# Kafka Message Timestamp: 1556890294484
{
"before": null,
"after": {
"id": "4d47e190-0402-4048-bc2c-89dd54343cdc",
"aggregateid": "1",
"aggregatetype": "Order",
"payload": "{\"id\": 1, \"lineItems\": [{\"id\": 1, \"item\": \"Debezium in Action\", \"status\": \"ENTERED\", \"quantity\": 2, \"totalPrice\": 39.98}, {\"id\": 2, \"item\": \"Debezium for Dummies\", \"status\": \"ENTERED\", \"quantity\": 1, \"totalPrice\": 29.99}], \"orderDate\": \"2019-01-31T12:13:01\", \"customerId\": 123}",
"timestamp": 1556890294344,
"type": "OrderCreated"
},
"source": {
"version": "3.1.0.Alpha1",
"connector": "postgresql",
"name": "dbserver1-bare",
"db": "orderdb",
"ts_usec": 1556890294448870,
"txId": 584,
"lsn": 24064704,
"schema": "inventory",
"table": "outboxevent",
"snapshot": false,
"last_snapshot_record": null,
"xmin": null
},
"op": "c",
"ts_ms": 1556890294484,
"ts_us": 1556890294484651,
"ts_ns": 1556890294484651402
}
This example of a Debezium outbox message is based on the default outbox event router configuration, which assumes an outbox table structure and event routing based on aggregates. To customize behavior, the outbox event router SMT provides numerous configuration options.
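The mapping between the raw change event and the outbox message can be sketched in a few lines of Python. This is only an illustration of the default behavior described in this section, not the SMT's implementation; the function name route_outbox_event and the dictionary layout of the result are hypothetical, and lowercasing the topic name is an assumption inferred from the example, where the aggregatetype value Order appears in the topic outbox.event.order.

```python
def route_outbox_event(change_event):
    """Illustrative sketch of the default routing; not the SMT's real code."""
    if change_event.get("op") != "c":
        # This sketch handles only insert events from the outbox table.
        return None
    row = change_event["after"]
    return {
        # Topic: fixed prefix plus the aggregatetype value. Lowercasing is an
        # assumption taken from the example (Order -> outbox.event.order).
        "topic": "outbox.event." + row["aggregatetype"].lower(),
        "key": row["aggregateid"],           # becomes the message key
        "headers": {"id": row["id"]},        # unique event ID as a header
        "timestamp": change_event["ts_ms"],  # Debezium event timestamp
        "value": row["payload"],             # payload column, passed through
    }


raw_event = {
    "op": "c",
    "ts_ms": 1556890294484,
    "after": {
        "id": "4d47e190-0402-4048-bc2c-89dd54343cdc",
        "aggregateid": "1",
        "aggregatetype": "Order",
        "type": "OrderCreated",
        "payload": "{\"id\": 1, \"customerId\": 123}",
    },
}

message = route_outbox_event(raw_event)
print(message["topic"], message["key"], message["headers"])
```

Only the after section and ts_ms of the change event contribute to the result; the before and source sections are dropped.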
Basic outbox table
The default outbox event router SMT configuration assumes that your outbox table has the following columns:
Column | Type | Modifiers
--------------+------------------------+-----------
id | uuid | not null
aggregatetype | character varying(255) | not null
aggregateid | character varying(255) | not null
type | character varying(255) | not null
payload | jsonb |
Column | Effect |
---|---|
id | Contains the unique ID of the event. In an outbox message, this value is a header. You can use this ID, for example, to remove duplicate messages. |
aggregatetype | Contains a value that the SMT appends to the name of the topic to which the connector emits an outbox message. The default behavior is that this value replaces the default ${routedByValue} variable in the route.topic.replacement SMT option. For example, an aggregatetype value of Order results in the topic outbox.event.order, as in the example outbox message. |
aggregateid | Contains the event key, which provides an ID for the payload. The SMT uses this value as the key in the emitted outbox message. This is important for maintaining correct order in Kafka partitions. |
payload | A representation of the outbox change event. The default structure is JSON. By default, the Kafka message value consists solely of the payload column value. |
Additional custom columns | Any additional columns from the outbox table can be added to outbox events either within the payload section or as a message header. |
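To show how an application might fill a table with these columns, the following Python sketch writes a business row and its outbox event in one transaction. It rests on several assumptions made only to keep it self-contained: SQLite stands in for the service's database, TEXT columns replace the uuid and jsonb types, and the purchase_order table and the place_order function are hypothetical.

```python
import json
import sqlite3
import uuid


def place_order(conn, order):
    """Store an order and its outbox event in a single transaction."""
    event_id = str(uuid.uuid4())
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "INSERT INTO purchase_order (id, customer_id) VALUES (?, ?)",
            (order["id"], order["customerId"]),
        )
        conn.execute(
            "INSERT INTO outboxevent (id, aggregatetype, aggregateid, type, payload) "
            "VALUES (?, ?, ?, ?, ?)",
            (event_id, "Order", str(order["id"]), "OrderCreated", json.dumps(order)),
        )
    return event_id


conn = sqlite3.connect(":memory:")
# Hypothetical business table, used only for this illustration.
conn.execute(
    "CREATE TABLE purchase_order (id INTEGER PRIMARY KEY, customer_id INTEGER NOT NULL)"
)
# Same column names as the default outbox table; types simplified for SQLite.
conn.execute(
    "CREATE TABLE outboxevent ("
    "id TEXT PRIMARY KEY, aggregatetype TEXT NOT NULL, "
    "aggregateid TEXT NOT NULL, type TEXT NOT NULL, payload TEXT)"
)

event_id = place_order(conn, {"id": 1, "customerId": 123})
row = conn.execute(
    "SELECT aggregatetype, aggregateid, type FROM outboxevent WHERE id = ?",
    (event_id,),
).fetchone()
print(row)
```

Because both inserts share one transaction, the outbox row exists only if the business row was stored as well, which is the consistency guarantee that the outbox pattern relies on.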
Basic configuration
To configure a Debezium connector to support the outbox pattern, configure the outbox.EventRouter SMT.
To obtain the default behavior of the SMT, add it to the connector configuration without specifying any options, as in the following example:
transforms=outbox,...
transforms.outbox.type=io.debezium.transforms.outbox.EventRouter
The connector might emit many types of event messages (for example, heartbeat messages, tombstone messages, or metadata messages about transactions or schema changes). To apply the transformation only to events that originate in the outbox table, define an SMT predicate statement that selectively applies the transformation to those events only.
Options for applying the transformation selectively
In addition to the change event messages that a Debezium connector emits when a database change occurs, the connector also emits other types of messages, including heartbeat messages, and metadata messages about schema changes and transactions. Because the structure of these other messages differs from the structure of the change event messages that the SMT is designed to process, it’s best to configure the connector to selectively apply the SMT, so that it processes only the intended data change messages. You can use one of the following methods to configure the connector to apply the SMT selectively:
- Configure an SMT predicate for the transformation.
- Use the route.topic.regex configuration option for the SMT.
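As a sketch of the predicate approach, the following Python snippet builds a connector configuration in which the Kafka Connect TopicNameMatches predicate restricts the SMT to records from the outbox table's topic. The predicate name isOutboxTable, the sample topic names, and the pattern are assumptions for illustration; adjust the pattern to the topic that your connector uses for outbox table changes. The helper smt_applies is hypothetical and only checks the pattern against sample topic names, assuming that the pattern has to match the entire topic name.

```python
import re

# Connector properties expressed as a Python dict, for example to be sent as
# JSON to the Kafka Connect REST API. Only the transformation-related keys
# are shown.
connector_config = {
    "transforms": "outbox",
    "transforms.outbox.type": "io.debezium.transforms.outbox.EventRouter",
    "transforms.outbox.predicate": "isOutboxTable",
    "predicates": "isOutboxTable",
    "predicates.isOutboxTable.type":
        "org.apache.kafka.connect.transforms.predicates.TopicNameMatches",
    # Assumed pattern: any topic whose last segment is the outbox table name.
    "predicates.isOutboxTable.pattern": r".*\.outboxevent",
}


def smt_applies(topic):
    """Return True if the predicate pattern matches the whole topic name."""
    pattern = connector_config["predicates.isOutboxTable.pattern"]
    return re.fullmatch(pattern, topic) is not None


for topic in (
    "dbserver1.inventory.outboxevent",   # outbox table changes
    "dbserver1.inventory.orders",        # another captured table
    "__debezium-heartbeat.dbserver1",    # heartbeat messages
):
    print(topic, smt_applies(topic))
```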
Payload serialization format
The outbox event router SMT supports arbitrary payload formats.
The SMT passes on payload column values that it reads from the outbox table without modification.
The way that the SMT converts these column values into Kafka message fields depends on how you configure the SMT.
Common payload formats for serializing data are JSON and Avro.
Using JSON as the payload format
The default serialization format for the outbox event router SMT is JSON.
To use this format, the data type of the source column must be JSON (for example, jsonb in PostgreSQL).
Expanding escaped JSON String as JSON
When a Debezium outbox message represents the payload as a JSON String, the resulting Kafka message escapes the string as in the following example:
# Kafka Topic: outbox.event.order
# Kafka Message key: "1"
# Kafka Message Headers: "id=4d47e190-0402-4048-bc2c-89dd54343cdc"
# Kafka Message Timestamp: 1556890294484
{
"{\"id\": 1, \"lineItems\": [{\"id\": 1, \"item\": \"Debezium in Action\", \"status\": \"ENTERED\", \"quantity\": 2, \"totalPrice\": 39.98}, {\"id\": 2, \"item\": \"Debezium for Dummies\", \"status\": \"ENTERED\", \"quantity\": 1, \"totalPrice\": 29.99}], \"orderDate\": \"2019-01-31T12:13:01\", \"customerId\": 123}"
}
The outbox event router enables you to expand the message content to "real" JSON, deducing the companion schema from the JSON document. The resulting Kafka message is formatted as in the following example:
# Kafka Topic: outbox.event.order
# Kafka Message key: "1"
# Kafka Message Headers: "id=4d47e190-0402-4048-bc2c-89dd54343cdc"
# Kafka Message Timestamp: 1556890294484
{
"id": 1, "lineItems": [{"id": 1, "item": "Debezium in Action", "status": "ENTERED", "quantity": 2, "totalPrice": 39.98}, {"id": 2, "item": "Debezium for Dummies", "status": "ENTERED", "quantity": 1, "totalPrice": 29.99}], "orderDate": "2019-01-31T12:13:01", "customerId": 123
}
To enable JSON expansion, set the table.expand.json.payload option to true, and use the JsonConverter as shown in the following example:
transforms=outbox,...
transforms.outbox.type=io.debezium.transforms.outbox.EventRouter
transforms.outbox.table.expand.json.payload=true
value.converter=org.apache.kafka.connect.json.JsonConverter
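The difference between the escaped and the expanded form can be illustrated with a short Python sketch. It mimics the documented behavior only in outline: the payload string is parsed as JSON, and if there is no content or parsing fails, the value is kept as is. The function name expand_json_payload is hypothetical, and the sketch does not derive a schema the way the SMT does.

```python
import json


def expand_json_payload(value):
    """Return the parsed JSON document, or the original value if parsing fails."""
    if not isinstance(value, str) or not value.strip():
        return value  # no content: keep as is
    try:
        return json.loads(value)
    except json.JSONDecodeError:
        return value  # parsing error: keep as is


escaped = "{\"id\": 1, \"orderDate\": \"2019-01-31T12:13:01\", \"customerId\": 123}"
expanded = expand_json_payload(escaped)

print(type(escaped).__name__, type(expanded).__name__)
print(expanded["customerId"])
```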
Using Apache Avro as the payload format
Apache Avro is a common framework for serializing data. Using Avro can be beneficial for message format governance and for ensuring that outbox event schemas evolve in a backwards-compatible way.
How a source application produces Avro formatted content for outbox message payloads is out of the scope of this documentation.
One possibility is to leverage the KafkaAvroSerializer class to serialize GenericRecord instances.
To ensure that the Kafka message value is the exact Avro binary data, apply the following configuration to the connector:
transforms=outbox,...
transforms.outbox.type=io.debezium.transforms.outbox.EventRouter
value.converter=io.debezium.converters.BinaryDataConverter
By default, the payload column value (the Avro data) is the only message value.
When data is stored in Avro format, the column must use a binary data type, such as bytea in PostgreSQL.
The value converter for the SMT must also be set to BinaryDataConverter, so that it propagates the binary value of the payload column as-is into the Kafka message value.
Debezium connectors can be configured to emit heartbeat, transaction metadata, or schema change events (support varies by connector).
The BinaryDataConverter cannot serialize these events, so additional configuration must be provided so that the converter knows how to serialize them.
As an example, the following configuration illustrates using the Apache Kafka JsonConverter with no schemas:
transforms=outbox,...
transforms.outbox.type=io.debezium.transforms.outbox.EventRouter
value.converter=io.debezium.converters.BinaryDataConverter
value.converter.delegate.converter.type=org.apache.kafka.connect.json.JsonConverter
value.converter.delegate.converter.type.schemas.enable=false
The delegate Converter implementation is specified by the delegate.converter.type option.
If the converter needs any extra configuration options, you can specify them as well, such as disabling schemas with schemas.enable=false, as shown above.
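The division of labor between the binary converter and its delegate can be pictured with the following Python sketch. It is a simplified model under stated assumptions, not the converter's code: the real converter works with Kafka Connect schemas rather than Python types, and the names convert_value and json_without_schema are hypothetical.

```python
import json


def convert_value(value, delegate=None):
    """Pass binary values through unchanged; hand everything else to the delegate."""
    if isinstance(value, (bytes, bytearray)):
        return bytes(value)  # for example, an Avro payload from the outbox table
    if delegate is None:
        raise TypeError("non-binary value and no delegate converter configured")
    return delegate(value)


def json_without_schema(value):
    """Stand-in for a JSON converter with schemas disabled."""
    return json.dumps(value).encode("utf-8")


avro_payload = b"\x02\x06abc"           # arbitrary bytes used as sample data
heartbeat = {"ts_ms": 1556890294484}    # simplified non-binary event

print(convert_value(avro_payload))
print(convert_value(heartbeat, json_without_schema))
```

Without a delegate, the second call would fail, which corresponds to the situation in which heartbeat or schema change events reach a converter that can handle only binary data.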
The following example illustrates how to configure the SMT to use a delegate converter with an Apicurio Registry to convert data into Avro format:
transforms=outbox,...
transforms.outbox.type=io.debezium.transforms.outbox.EventRouter
value.converter=io.debezium.converters.BinaryDataConverter
value.converter.delegate.converter.type=io.apicurio.registry.utils.converter.AvroConverter
value.converter.delegate.converter.apicurio.registry.url=http://apicurio:8080/apis/registry/v2
value.converter.delegate.converter.apicurio.registry.auto-register=true
value.converter.delegate.converter.registry.find-latest=true
Finally, the following example illustrates how to configure the SMT to use a delegate converter with a Confluent Schema Registry to convert data into Avro format:
transforms=outbox,...
transforms.outbox.type=io.debezium.transforms.outbox.EventRouter
value.converter=io.debezium.converters.BinaryDataConverter
value.converter.delegate.converter.type=io.confluent.connect.avro.AvroConverter
value.converter.delegate.converter.type.basic.auth.credentials.source=USER_INFO
value.converter.delegate.converter.type.basic.auth.user.info={CREDENTIALS}
value.converter.delegate.converter.type.schema.registry.url={URL}
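In the preceding configuration examples, because the AvroConverter is configured as a delegate converter, third-party libraries are required. Information about how to add third-party libraries to the classpath is beyond the scope of this document.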
Emitting messages with additional fields
Your outbox table might contain columns whose values you want to add to the emitted outbox messages. For example, consider an outbox table that has a value of purchase-order in the aggregatetype column and another column, eventType, whose possible values are order-created and order-shipped. Additional fields can be added with the syntax column:placement:alias.
The allowed values for placement are:
- header
- envelope
- partition
To emit the eventType column value in the outbox message header, configure the SMT like this:
transforms=outbox,...
transforms.outbox.type=io.debezium.transforms.outbox.EventRouter
transforms.outbox.table.fields.additional.placement=eventType:header:type
The result is a header on the Kafka message with type as its key, and the value of the eventType column as its value.
To emit the eventType column value in the outbox message envelope, configure the SMT like this:
transforms=outbox,...
transforms.outbox.type=io.debezium.transforms.outbox.EventRouter
transforms.outbox.table.fields.additional.placement=eventType:envelope:type
To control which partition the outbox message is produced on, configure the SMT like this:
transforms=outbox,...
transforms.outbox.type=io.debezium.transforms.outbox.EventRouter
transforms.outbox.table.fields.additional.placement=partitionColumn:partition
Note that for the partition placement, adding an alias has no effect.
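The column:placement:alias syntax can be made concrete with a small parser. The following Python sketch is an illustration only; the function name parse_additional_fields is hypothetical, and falling back to the column name when no alias is given is an assumption rather than something this section states.

```python
VALID_PLACEMENTS = {"header", "envelope", "partition"}


def parse_additional_fields(setting):
    """Split a comma-separated list of column:placement[:alias] entries."""
    fields = []
    for entry in setting.split(","):
        parts = [part.strip() for part in entry.split(":")]
        if len(parts) not in (2, 3):
            raise ValueError("expected column:placement[:alias], got %r" % entry)
        column, placement = parts[0], parts[1]
        if placement not in VALID_PLACEMENTS:
            raise ValueError("unknown placement %r" % placement)
        # Assumption: without an alias, the column name is used. For the
        # partition placement the alias has no effect either way.
        alias = parts[2] if len(parts) == 3 else column
        fields.append((column, placement, alias))
    return fields


parsed = parse_additional_fields("eventType:header:type,partitionColumn:partition")
for column, placement, alias in parsed:
    print(column, placement, alias)
```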
Configuration options
The following table describes the options that you can specify for the outbox event router SMT. In the table, the Group column indicates a configuration option classification for Kafka.
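Option | Default | Group | Description |
---|---|---|---|
table.op.invalid.behavior | warn | Table | Determines the behavior of the SMT when there is an UPDATE operation on the outbox table. Possible settings are warn (the SMT logs a warning and continues to the next outbox table record), error (the SMT logs an error and continues to the next outbox table record), and fatal (the SMT logs an error and the connector stops processing). All changes in an outbox table are expected to be INSERT operations. |
table.field.event.id | id | Table | Specifies the outbox table column that contains the unique event ID. This ID is stored in the emitted event's headers under the id key. |
table.field.event.key | aggregateid | Table | Specifies the outbox table column that contains the event key. When this column contains a value, the SMT uses that value as the key in the emitted outbox message. This is important for maintaining correct order in Kafka partitions. |
table.field.event.timestamp | | Table | By default, the timestamp in the emitted outbox message is the Debezium event timestamp. To use a different timestamp in outbox messages, set this option to an outbox table column that contains the timestamp that you want to be in emitted outbox messages. |
table.field.event.payload | payload | Table | Specifies the outbox table column that contains the event payload. |
table.expand.json.payload | false | Table | Specifies whether to expand a String payload into JSON. If no content is found, or if a parsing error occurs, the content is kept as is. |
table.json.payload.null.behavior | ignore | Table | When the JSON expansion property table.expand.json.payload is set to true, determines how the SMT handles null values in the JSON payload. Possible settings are ignore (ignore the null value) and optional_bytes (keep the null value and treat it as optional bytes). |
table.fields.additional.placement | | Table, Envelope | Specifies one or more outbox table columns that you want to add to outbox message headers or envelopes. Specify a comma-separated list of pairs. In each pair, specify the name of a column and whether you want the value to be in the header or the envelope. Separate the values in the pair with a colon, for example: eventType:header. To specify an alias for the column, specify a trio with the alias as the third value, for example: eventType:header:type. The second value is the placement, and it must be one of header, envelope, or partition. Configuration examples are in emitting additional fields in Debezium outbox messages. |
table.fields.additional.error.on.missing | true | Table, Envelope | Specifies whether this transformation throws an error if a field specified by the table.fields.additional.placement option is not found in the outbox payload. |
table.field.event.schema.version | | Table, Schema | When set, this value is used as the schema version as described in the Kafka Connect Schema Javadoc. |
route.by.field | aggregatetype | Router | Specifies the name of a column in the outbox table. The default behavior is that the value in this column becomes a part of the name of the topic to which the connector emits the outbox messages. An example is in the description of the expected outbox table. |
route.topic.regex | (?<routedByValue>.*) | Router | Specifies a regular expression that the outbox SMT applies in the RegexRouter to outbox table records. This regular expression is part of the setting of the route.topic.replacement SMT option. |
route.topic.replacement | outbox.event.${routedByValue} | Router | Specifies the name of the topic to which the connector emits outbox messages. The default topic name is outbox.event. followed by the aggregatetype column value in the outbox table record. |
route.tombstone.on.empty.payload | false | Router | Indicates whether an empty or null payload causes the connector to emit a tombstone event. |
tracing.span.context.field | tracingspancontext | Tracing | The name of the field containing tracing span context. |
tracing.operation.name | debezium-read | Tracing | The operation name representing the Debezium processing span. |
tracing.with.context.field.only | false | Tracing | When true, only events that have a serialized context field are traced. |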
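How the Router options work together can be sketched in Python. The sketch assumes that the regular expression has to match the whole routed value, that the value is lowercased before routing (inferred from the example topic outbox.event.order for the aggregatetype value Order), and that the value is left unchanged when the expression does not match. The functions java_to_python and route_topic are hypothetical helpers; the first one only translates the Java named-group syntax used by the option values into Python's re syntax for simple cases.

```python
import re


def java_to_python(regex, replacement):
    """Translate Java named groups and ${name} references (simple cases only)."""
    py_regex = re.sub(r"\(\?<([A-Za-z][A-Za-z0-9]*)>", r"(?P<\1>", regex)
    py_replacement = re.sub(r"\$\{([A-Za-z][A-Za-z0-9]*)\}", r"\\g<\1>", replacement)
    return py_regex, py_replacement


def route_topic(row,
                route_by_field="aggregatetype",
                topic_regex="(?<routedByValue>.*)",
                topic_replacement="outbox.event.${routedByValue}"):
    """Compute a topic name from an outbox row using the three router settings."""
    py_regex, py_replacement = java_to_python(topic_regex, topic_replacement)
    value = row[route_by_field].lower()  # assumption, see lead-in
    match = re.fullmatch(py_regex, value)
    if match is None:
        return value  # assumption: no match leaves the value unchanged
    return match.expand(py_replacement)


print(route_topic({"aggregatetype": "Order"}))
print(route_topic({"aggregatetype": "Order"},
                  topic_replacement="events.${routedByValue}.v1"))
```

Changing only the replacement, as in the second call, keeps the routing column and the expression at their assumed defaults while producing a differently named topic.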
Distributed tracing
The outbox event router SMT supports distributed tracing. For more details, see the tracing documentation.