
We are excited to announce a candidate release for Debezium 3.1, 3.1.0.CR1.
This new release includes several improvements with the JDBC sink and MySQL connectors, support for ISO string temporal values and Keyspace heartbeats with Vitess, key-based routing for RabbitMQ, and more. Let’s dive in and take a look at these new features and improvements.
Breaking changes
With any new release of software, there are often breaking changes. The Debezium 3.1.0.CR1 release is no exception, so let’s discuss the major change you should know about.
Query timeout now applies to Oracle LogMiner queries
When the Oracle connector executes its initial query to fetch data from LogMiner, the database.query.timeout.ms connector configuration property now controls how long the query may run before it is cancelled (DBZ-8830). When upgrading, check the connector metric MaxDurationOfFetchQueryInMilliseconds to determine whether this property needs adjusting. By default, the timeout is 10 minutes; setting it to 0 disables it.
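As a rough illustration, raising the timeout on an Oracle connector might look like the fragment below. Only database.query.timeout.ms comes from this release; the connector name and the 20-minute value are placeholders chosen for the example, not recommendations.

```properties
# Hypothetical Oracle connector fragment; the name and value are illustrative.
name=oracle-inventory-connector
connector.class=io.debezium.connector.oracle.OracleConnector
# Allow the LogMiner fetch query to run for up to 20 minutes (value in milliseconds).
# The default is 10 minutes (600000); 0 disables the timeout.
database.query.timeout.ms=1200000
```

One way to pick a value is to compare it against the MaxDurationOfFetchQueryInMilliseconds you currently observe and leave some headroom above it.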
New features and improvements
The upgrade to Debezium 3.1.0.CR1 introduces new features and improvements across several components:
Core: Centralize logging of sensitive data
We understand that databases house all sorts of information, and that some columns may contain sensitive information. We take pride in making sure that information remains safe and secure. For this reason, we generally prefer to avoid logging sensitive information at INFO, WARN, or ERROR levels.
However, there were some corner cases where sensitive column values could be logged at DEBUG or TRACE levels. We added the io.debezium.util.Loggings class several versions ago to centralize this logging, but not all call sites were using the Loggings class (DBZ-8525).
By default, users will notice that the Loggings class records the sensitive information in the logs, rather than the original logger including it in its own log entry. If you prefer to omit the sensitive information, you can use your logging configuration to set a logging level specifically for io.debezium.util.Loggings.
For example, if you need to provide your logs to someone but want the sensitive information omitted, the following configuration can achieve that goal.
log4j.logger.io.debezium=TRACE,stdout
log4j.logger.io.debezium.util.Loggings=ERROR,stdout
This configuration will omit all sensitive information while logging all non-sensitive information at TRACE level.
JDBC: Improved performance
We received several community reports that during peak volume, some databases were experiencing unusually high CPU utilization. After investigation, we identified that several SQL queries were executed too frequently, causing the high CPU utilization and reducing connector write throughput (DBZ-8570). Users should now find that the JDBC sink’s write throughput is higher and CPU utilization is lower than before.
JDBC: Automatic retries on connection errors
For a Kafka Connect producer (source), if a connector throws a RetriableException and Kafka Connect is configured to support retries on errors, the runtime will automatically stop and restart the connector. This provides a useful way to tear down and recreate resources such as database connections.
But for a Kafka Connect consumer (sink), the lifecycle of the connector works differently. When the connector throws an error, the runtime doesn’t stop and restart the connector, but instead calls the put method again. This can be problematic for certain connection errors because specific resources are not automatically recreated.
Starting with Debezium 3.1, a new JDBC sink connector property, connection.restart.on.errors, allows the JDBC sink to retry connection failures (DBZ-8727).
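A minimal sketch of enabling this on a JDBC sink follows. It assumes the property takes a boolean value; the connector name, topic, URL, and credentials are placeholders.

```properties
# Hypothetical JDBC sink fragment; connection details are placeholders.
name=jdbc-sink-orders
connector.class=io.debezium.connector.jdbc.JdbcSinkConnector
topics=orders
connection.url=jdbc:postgresql://localhost:5432/inventory
connection.username=dbuser
connection.password=dbpassword
# Assumed to be a boolean: retry after connection failures (DBZ-8727).
connection.restart.on.errors=true
```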
JDBC: Handle BYTES as VARBINARY for SQL Server targets
A new JDBC sink mapping has been added for converting a Kafka BYTES field to the VARBINARY column data type (DBZ-8790). This allows source connectors that serialize unknown or other binary data as a Kafka BYTES field to be correctly mapped to a SQL Server target with the VARBINARY column data type.
MySQL: Improved error handling for duplicate server id/uuid
For most connectors, Debezium adopts the philosophy of retrying all SQLException or IOException related failures. This strategy has been quite useful, allowing users to utilize the runtime retry mechanism as needed.
However, for MySQL this presents a unique corner case when there are conflicts with the configured server id/uuid. MySQL uses the server id/uuid to uniquely identify an instance in the cluster topology. If more than one server uses the same id/uuid, the instance will throw a SQLException and enter a retry/backoff loop on startup.
With Debezium 3.1, the error handling prefers a fail-fast approach for this specific case (DBZ-8786). If you are a MySQL user and notice your connectors entering a FAILED status more frequently, we recommend checking whether this case applies to you. If it does, ensure that your configuration always uses a unique server id/uuid value.
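To make the uniqueness requirement concrete, here is a sketch of two MySQL connectors reading from the same cluster, each in its own configuration file. The host, names, and id values are hypothetical; the point is only that database.server.id differs between them and does not collide with any real server or replica in the topology.

```properties
# --- connector-a.properties (hypothetical) ---
name=mysql-connector-a
connector.class=io.debezium.connector.mysql.MySqlConnector
database.hostname=mysql.example.internal
topic.prefix=inventory_a
# Must be unique across the MySQL topology, including other connectors.
database.server.id=184054

# --- connector-b.properties (hypothetical, separate file) ---
# name=mysql-connector-b
# connector.class=io.debezium.connector.mysql.MySqlConnector
# database.hostname=mysql.example.internal
# topic.prefix=inventory_b
# database.server.id=184055
```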
Vitess: Keyspace heartbeat support
Vitess v21 introduced a new binlog watermarking strategy for VStream. This feature sends a "heartbeat"-like event indicating that the shard’s binlog events up to the provided timestamp have been received by the VStream client.
A new configuration option, vitess.stream.keyspace.heartbeats, can be set to true to include the heartbeat events written to the keyspace heartbeat tables (DBZ-8775). The table.include.list should also include the heartbeat table, using the format <keyspace>.heartbeat.
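Put together, a connector capturing keyspace heartbeats might be configured roughly as follows. The keyspace name commerce, the orders table, and the connector name are placeholders for illustration.

```properties
# Hypothetical Vitess connector fragment; "commerce" is a placeholder keyspace.
name=vitess-commerce-connector
connector.class=io.debezium.connector.vitess.VitessConnector
vitess.keyspace=commerce
# Emit events written to the keyspace heartbeat tables (DBZ-8775).
vitess.stream.keyspace.heartbeats=true
# Include the heartbeat table alongside the tables you already capture.
table.include.list=commerce.orders,commerce.heartbeat
```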
Vitess: Support for ISO string temporal precision mode
Debezium Server: Key routing support for RabbitMQ
In Debezium 3.1, we have changed how you route events using configuration. The new approach uses a strategy-based design that retains the old behaviors and introduces a new key-based routing mechanism (DBZ-8752).
First and foremost, the rabbitmq.routingKeyFromTopicName property is deprecated and will be removed in a future release. This functionality has been folded into the new rabbitmq.routingKey.source configuration property, which can be set to one of the following values:
- static: When using the static routing source, the RabbitMQ sink will use the rabbitmq.routingKey static value you have specified in the sink’s configuration. As this value is set in the configuration and read only during the sink startup, the value is static and does not change over the runtime of the sink.
- topic: When using the topic routing source, the RabbitMQ sink will source the routing key based on the destination topic name. This mode replaces the old rabbitmq.routingKeyFromTopicName configuration property behavior, which is now deprecated.
- key: When using the new key routing source, the RabbitMQ sink will source the routing key based on the event’s record key. This gives you the flexibility to route on the raw Debezium change event’s key, or to use a custom transformation to change the event’s key in-flight before sending the event to RabbitMQ.
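The three routing sources could be selected in a Debezium Server configuration roughly as sketched below. The debezium.sink. prefix is an assumption based on the usual Debezium Server convention, and the routing key value inventory-events is a placeholder.

```properties
# Hypothetical Debezium Server fragment (application.properties).
# The debezium.sink. prefix is assumed, not taken from the release notes.
debezium.sink.type=rabbitmq

# Option 1: fixed routing key taken from the configuration.
#debezium.sink.rabbitmq.routingKey.source=static
#debezium.sink.rabbitmq.routingKey=inventory-events

# Option 2: routing key derived from the destination topic name
# (replaces the deprecated rabbitmq.routingKeyFromTopicName).
#debezium.sink.rabbitmq.routingKey.source=topic

# Option 3: routing key derived from the event's record key.
debezium.sink.rabbitmq.routingKey.source=key
```

Only one option should be active at a time; the other two are left commented out for comparison.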
Examples: Debezium optimized for GraalVM
Change Data Capture (CDC) is widely used in various contexts, such as microservices communication, legacy system modernization, and cache invalidation. The core idea of this pattern is to detect and track changes in a data source (e.g., a database) and propagate them to other systems in real-time or near real-time. Debezium is a CDC platform that provides a wide range of connectors for most data sources. Beyond capturing changes, it also offers transformation capabilities through an intuitive UI for defining Debezium instances.
Check out our recent blog post, Superfast Debezium, which walks you through the latest example of using Debezium with GraalVM!
Other changes
The following are some noteworthy changes in 3.1.0.CR1:
- The first cdc message always lost when using debezium engine to capture oracle data DBZ-8141
- Update format-maven-plugin to 2.26.0 DBZ-8695
- Centralize helm chart repo DBZ-8707
- OTEL libs are not loaded to Docker image DBZ-8767
- Change the documentation of minimum Java version requirement from 11 to 21 DBZ-8771
- Add delete.tombstone.handling.mode to ConfigDef returned by config method and change its display name DBZ-8776
- Signal Channel Kafka restart snapshot multiple snapshot after connector restart DBZ-8780
- Update Debezium platform images in values.yaml DBZ-8781
- Allow Debezium server to use Kafka Connect format for the records DBZ-8782
- Sources and home in debezium platform helm chart points to old repo DBZ-8784
- Write README for debezium-chart repo DBZ-8785
- Remove Helm from Debezium operator manifest README DBZ-8791
- Write blog post about the recent changes on charts.debezium.io DBZ-8792
- DebeziumServerPostgresIT randomly fails DBZ-8821
- Test keyspace heartbeats during snapshot DBZ-8824
- Make methods for adding fields into the record reuseable DBZ-8825
- Enable build of debezium platform images DBZ-8829
- Unexpected null value for Field Configuration deprecated aliases DBZ-8832
In total, 30 issues were resolved in Debezium 3.1.0.CR1. The list of changes can also be found in our release notes.
A big thank you to all the contributors from the community who worked diligently on this release: Bhagyashree Goyal, Chris Cranford, Giovanni Panice, Jiri Pechanec, Katsumi Miyajima, Mario Fiore Vitale, Minjae Lee, Rajendra Dangwal, Robert Roldan, Thomas Thornton, Victor Castaño, Vojtech Juranek, Yuriy Vikulov, and Zakariae Ben Allal!
About Debezium
Debezium is an open source distributed platform that turns your existing databases into event streams, so applications can see and respond almost instantly to each committed row-level change in the databases. Debezium is built on top of Kafka and provides Kafka Connect compatible connectors that monitor specific database management systems. Debezium records the history of data changes in Kafka logs, so your application can be stopped and restarted at any time and can easily consume all of the events it missed while it was not running, ensuring that all events are processed correctly and completely. Debezium is open source under the Apache License, Version 2.0.
Get involved
We hope you find Debezium interesting and useful, and want to give it a try. Follow us on Twitter @debezium, chat with us on Zulip, or join our mailing list to talk with the community. All of the code is open source on GitHub, so build the code locally and help us improve our existing connectors and add even more connectors. If you find problems or have ideas for how we can improve Debezium, please let us know or log an issue.