With spring upon us and the team having sprung into action, we are pleased to announce the immediate release of Debezium 2.6.0.Final. This release includes dozens of new features, bug fixes, and improvements from the valiant efforts of the team and community contributors. Overall, 249 issues were resolved with contributions from more than 56 contributors. Let’s take a moment to review all the changes.

Breaking changes

While we try to avoid any potential breaking changes between minor releases, such changes are sometimes inevitable. The upgrade to Debezium 2.6 includes a total of 7 unique breaking changes:

MySQL
  • The MySQL driver was updated to version 8.3.0, and this driver is not compatible with MySQL 5.x. If you still need to use an older MySQL version, please downgrade the driver after installation to a version that is compatible with your database (DBZ-7652).

MongoDB
  • The MongoDB connector no longer supports the replica_set mode (DBZ-7260). This feature has been deprecated for several versions, and work toward its removal has been ongoing throughout Debezium 2.x. If you are using the replica_set mode, you will need to make adjustments when moving to Debezium 2.6+.

SQL Server
  • The SQL Server connector was not capturing all schemas when the connector was first deployed, and instead, was only capturing the schemas based on the tables defined in the configuration’s include list. This was a bug that could prevent users from easily adding new tables to the connector when expecting that the new table’s schema would already exist in the schema history topic. The connector now correctly honors the store.only.captured.tables.ddl configuration option (DBZ-7593).

    For existing connector deployments, if you do not specifically set the store.only.captured.tables.ddl property for the schema history topic, the connector will begin capturing schema changes for all relevant tables in your database. If you want to prevent this and retain the prior behavior, you will need to adjust your connector configuration by adding schema.history.internal.store.only.captured.tables.ddl with a value of true.
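
    For example, retaining the prior behavior requires only a single property in the connector configuration, shown here in the same style as the other configuration snippets in this post:

    "schema.history.internal.store.only.captured.tables.ddl": "true"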

Oracle
  • In older versions of Debezium, users were required to manually install the ojdbc8.jar JDBC driver. With 2.6, the connector now bundles the Oracle JDBC driver with the connector, so manual installation is no longer necessary (DBZ-7364).

We’ve also updated the driver to version 21.11.0.0; please verify that you do not end up with multiple driver versions after upgrading to Debezium 2.6 (DBZ-7365).

Vitess
  • The task configuration format used by previous versions of the connector could destabilize the Kafka Connect cluster. To resolve the problem, Debezium 2.6 introduces a new configuration format that is incompatible with the previous format (DBZ-7250). When upgrading, you may experience a NullPointerException and an error indicating that the connector was unable to instantiate a task because it contains an invalid task configuration.

    If you experience this problem, delete and re-create the connector using the same name and configuration as before. Because the name is the same, the connector will start and reuse the offsets it last stored, but it will not reuse the old task configurations, avoiding the start-up failure.

  • The Vitess connector previously used the timestamp of the BEGIN message as the source timestamp. It now uses the COMMIT timestamp instead, matching the behavior of other connectors (DBZ-7628).

Container Images
  • The handling of the MAVEN_DEP_DESTINATION environment variable has changed in the connect-base container image, which is the basis for debezium/connect. It is no longer used for downloading all dependencies, including connectors, but only for general purpose Maven Central located dependencies (DBZ-7551). If you were using custom images that relied on this environment variable, your image build steps may require modifications.

New features and improvements

Debezium 2.6 also introduces many improvements and features; let’s take a look at each individually.

Db2 for iSeries connector

Debezium 2.6 introduces a brand-new connector for streaming changes from Db2 on iSeries/AS400 using the IBM iJournal system. This connector is the result of a multi-year development effort by the community, and we’re pleased that the community has allowed it to be distributed under the Debezium umbrella.

The new connector can be obtained from Maven Central using the following coordinates or a direct download.

<dependency>
    <groupId>io.debezium</groupId>
    <artifactId>debezium-connector-ibmi</artifactId>
    <version>2.6.0.Final</version>
</dependency>

The documentation for this new connector is still a work-in-progress. If you have any questions, please be sure to reach out to the team on Zulip or the mailing list.

Java 17 now compile-time requirement

Debezium 3.0, which will debut later this fall, will shift the Java baseline requirement for using Debezium from Java 11 to Java 17. In preparation for Debezium 3 later this year, Debezium 2.6 and 2.7 now require Java 17 at compile time (DBZ-7387).

If you are a Debezium user, and you consume Debezium connectors, this will require no action on your part. You can continue to use Java 11 for now without issue, understanding that Debezium 3 will require Java 17 later this year.

If you are developing Debezium connectors, Java 17 is now the baseline for compiling the Debezium source. If you have already been using Java 17, no action is needed on your part. If you were previously using Java 11, you will need to move to Java 17 in order to compile from source.

If you are using the Debezium Quarkus Outbox Extension (not the Outbox SMT): as Quarkus 3.7+ moves to Java 17 as its baseline, the Debezium Quarkus Outbox Extension now requires Java 17 as a baseline for both runtime and compile time.

We expect this transition to be seamless for most users, as it has no impact on the runtime of Debezium’s connectors or Debezium Server at this time.

Asynchronous Embedded Engine

If you’re hearing about the Embedded Engine for the first time: Debezium ships with three ways to run its connectors. The most common is to deploy Debezium on Kafka Connect, while the second most common is Debezium Server, a ready-made runtime for Debezium connectors. The third option is the Embedded Engine. It is what Debezium uses internally for its test suite, it’s the foundation for Debezium Server, and it’s meant to provide a way to embed Debezium connectors inside your own application. The Embedded Engine is used by a variety of external contributors and frameworks; most notably, Apache Flink relies heavily on it for its Debezium-based CDC connectors.

One of the biggest new features of Debezium 2.6 is the asynchronous Embedded Engine. This new asynchronous version is the foundation on which Debezium Server and the future of embedding Debezium are based. This change focuses on several key goals and initiatives:

  • Run multiple source tasks for a given connector, if the connector supports multiple tasks

  • Run time-consuming code (transformations or serialization) in dedicated threads

  • Allow additional performance gains by optionally relaxing the total ordering of event dispatch

  • Open the door to future technologies such as virtual threads and delegating work to external workers

  • Better integration with Debezium Operator for Kubernetes and Debezium UI

  • Seamlessly integrate with Quarkus for Debezium Server

This new asynchronous model does not include or focus on the following:

  • Implement parallelization inside a connector’s main capture loop.

  • Remove any dependency from Kafka Connect

  • Add support for multiple source connectors per Engine deployment

  • Add support for sink connectors

Even if a connector is single-threaded and does not support multiple tasks, a connector deployment using the Embedded Engine or Debezium Server can take advantage of the new asynchronous model. A large portion of the time spent during event dispatch goes to the transformation and serialization phases, so utilizing the new dedicated worker threads for those stages improves throughput.

For developers who want to get started with the new asynchronous embedded engine, the debezium-embedded artifact now includes a new package, io.debezium.embedded.async, which contains all the components pertinent to using this new implementation. The asynchronous engine can be constructed in a similar way to the serial version using the builder pattern, shown below.

final DebeziumEngine engine = new AsyncEngine.AsyncEngineBuilder()
    .using(properties)                       // connector and engine configuration
    .notifying(this::changeConsumerHandler)  // callback invoked for change events
    .build();
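
As with the serial engine, the engine returned by the builder is a Runnable that you execute on a thread you control and close on shutdown. A minimal sketch (error handling elided), assuming the engine built above, might look like this:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Run the engine on a dedicated thread; DebeziumEngine implements Runnable.
final ExecutorService executor = Executors.newSingleThreadExecutor();
executor.execute(engine);

// ... later, on shutdown, close the engine to stop polling and release
// resources, then shut the executor down.
engine.close();
executor.shutdown();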

We encourage everyone to take a look at the new asynchronous Embedded Engine model and let us know your thoughts, and whether you spot any bugs or problems. We will be updating the documentation in coming releases to highlight all the benefits and changes, including examples. Until then, you can find all the details in the design document, DDD-7.

New Unified Snapshot Modes

The snapshot process is an integral part of each connector’s lifecycle, and it’s responsible for collecting and sending all the historical data that exists in your data store to your target systems, if desired. For Debezium users who work with multiple connector types, we understand that having differing snapshot modes across connectors can sometimes be confusing to work with. So this change is designed to address that.

For many of you who may have already tried or installed Debezium 2.6 pre-releases, you’re already using the unified snapshot SPI, as it was designed to be a drop-in replacement initially, requiring no changes. This release finishes that work for MongoDB and Db2.

Of these changes, the most notable include the following:

  • All snapshot modes are available to all connectors, excluding never, which remains specific to MySQL. This means that connectors that previously did not support a snapshot mode, such as when_needed, can now use that mode to retake a snapshot when the connector identifies that it’s necessary.

  • The schema_only_recovery mode has been deprecated and replaced by recovery.

  • The schema_only mode has also been deprecated and replaced by no_data.

All deprecated modes will remain available until Debezium 3 later this year. This provides users with about six months to adjust scripts, configurations, and processes in advance.
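
For example, a connector that previously used the deprecated schema_only mode would move to its new equivalent with the following configuration fragment:

"snapshot.mode": "no_data"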

New Matching Collections API added

One of the team’s ongoing tasks includes the migration of Debezium UI’s backend into the main Debezium repository. One benefit of doing this is that we can identify where code overlaps between a connector’s runtime and the UI, and develop interface contracts to expose this shared data.

Thanks to a community contribution for DBZ-7167, the RelationalBaseSourceConnector contract has been adjusted and a new method introduced to return a list of table names that match the connector’s specific configuration. Any connector that implements this abstract base class will need to implement this new method.

Source transaction id changes

All Debezium change events contain a special metadata block called the source information block. This part of the event payload is responsible for providing metadata about the change event, including the unique identifier of the change, the time the change happened, the database and table the change is in reference to, as well as transaction metadata about the transaction that the change participated in.

In Debezium 2.6, the transaction_id field in the source information block will no longer be provided unless the field is populated with a value. This should present no issue for users as this field was only populated when the connector was configured with provide.transaction.metadata set to true (DBZ-7380).

If you have tooling that expects the source information block’s transaction_id field to exist even though it is optional, you will need to adjust that behavior, as the field will no longer be present unless populated.
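
For reference, when provide.transaction.metadata is set to true, the field continues to appear in the source block as before; the identifier value below is illustrative only:

"source": {
  ...
  "transaction_id": "571"
}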

Improved event timestamp precision

Debezium 2.6 introduces a new community-requested feature to improve the precision of timestamps in change events. Users will now notice the addition of four new fields, two at the envelope level and two in the source information block, as shown below:

{
  "source": {
    ...,
    "ts_us": "1559033904863123",
    "ts_ns": "1559033904863123000"
  },
  "ts_us": "1580390884335451",
  "ts_ns": "1580390884335451325"
}

The envelope values will always provide both microsecond (ts_us) and nanosecond (ts_ns) values, while the microsecond and nanosecond values in the source information block may be truncated to a lower precision if the source database does not provide that level of precision.

Scoped key/trust-store support with MongoDB

Debezium supports secure connections; however, MongoDB requires that key/trust-store configurations be supplied as JVM process arguments, which is less than ideal for environments like the cloud. As a first step toward aligning how secure connection configuration is specified across our connectors, Debezium 2.6 for MongoDB now supports specifying scoped key/trust-store configurations in the connector configuration (DBZ-7379).

The MongoDB connector now includes the following new configuration properties:

mongodb.ssl.keystore

Specifies the path to the SSL keystore file.

mongodb.ssl.keystore.password

Specifies the credentials to open and access the SSL keystore provided by mongodb.ssl.keystore.

mongodb.ssl.keystore.type

Specifies the SSL keystore file type, defaults to PKCS12.

mongodb.ssl.truststore

Specifies the path to the SSL truststore file.

mongodb.ssl.truststore.password

Specifies the credentials to open and access the SSL truststore provided by mongodb.ssl.truststore.

mongodb.ssl.truststore.type

Specifies the SSL truststore file type, defaults to PKCS12.
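
Putting these together, a connector configuration fragment using the new properties might look like the following; the paths and passwords are placeholders:

"mongodb.ssl.keystore": "/path/to/keystore.p12",
"mongodb.ssl.keystore.password": "<keystore-password>",
"mongodb.ssl.truststore": "/path/to/truststore.p12",
"mongodb.ssl.truststore.password": "<truststore-password>"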

MongoDB UUID key support for Incremental snapshots

As a small improvement to the Incremental Snapshot process for the Debezium for MongoDB connector, Debezium 2.6 adds support for the UUID data type, allowing this data type to be used within the Incremental Snapshot process like other data types (DBZ-7451).

MongoDB post-image changes

The MongoDB connector’s event payload can be configured to include the full document that was changed in an update. The connector previously made an opinionated choice about how the full document would be fetched as part of the change stream; however, this behavior did not match expectations in all use cases.

Debezium 2.6 introduces a new configuration option, capture.mode.full.update.type, allowing the connector to explicitly control how the change stream’s full document lookup should be handled (DBZ-7299). The default value for this option is lookup, meaning that the database will make a separate look-up to fetch the full document. If you are working with MongoDB 6+, you can also elect to use post_image to rely on MongoDB change stream’s post-image support.
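
For example, on MongoDB 6+ you can opt into change stream post-images with the following connector configuration:

"capture.mode.full.update.type": "post_image"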

Incremental snapshot row-value constructors for PostgreSQL

The PostgreSQL driver supports a SQL syntax called a row-value constructor using the ROW() function. This allows a query to express predicate conditions more efficiently when working with multi-column primary keys that have a suitable index. The incremental snapshot process is an ideal candidate for the ROW() function, as the process involves issuing a series of SELECT statements to fetch data in chunks. Each statement, aka chunk query, should ideally be as efficient as possible to minimize the overhead of these queries and maximize the throughput of your WAL changes to your topics.

No configuration changes are needed; the query issued for PostgreSQL incremental snapshots has been adjusted to take advantage of this syntax, so users who utilize incremental snapshots should see performance improvements.

An example of the old query used might look like this for a simple table:

SELECT *
  FROM users
 WHERE (a = 10 AND (b > 2 OR b IS NULL)) OR (a > 10) OR (a IS NULL)
 ORDER BY a, b LIMIT 1024

The new implementation constructs this query using the ROW() function as follows:

SELECT *
  FROM users
 WHERE row(a,b) > row(10,2)
ORDER BY a, b LIMIT 1024

We’d be interested in any feedback on this change, and what performance improvements are observed.

SQL Server query improvements

The Debezium SQL Server connector utilizes a common SQL Server function, fn_cdc_get_all_changes…​, to fetch all the relevant captured changes for a given table. This query performs several unions and only ever returns data from one of the union sub-queries, which can be inefficient.

Debezium 2.6 for SQL Server introduces a new configuration property, data.query.mode, that can be used to influence which method the connector uses to gather the details about table changes (DBZ-7273). The default remains unchanged from older releases, using the value function to delegate to the aforementioned function. A new option, direct, can be used instead to build the query directly within the connector to gather the changes more efficiently.
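
To opt into the new behavior, add the following to the connector configuration:

"data.query.mode": "direct"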

Oracle Infinispan cache improvements

The Debezium Oracle connector maintains a buffer of all in-flight transactions, and this buffer can be allocated off-heap using Infinispan. The user configuration can specify that if an in-flight transaction lasts longer than a given number of milliseconds, the transaction is abandoned or discarded by the buffer, meaning that the transaction will be forgotten and not emitted by the connector.

To improve metrics integration with frameworks like Grafana and Prometheus, a new JMX metric, AbandonedTransactionCount, was added to track the number of transactions abandoned by the connector during its runtime.

Oracle Redo SQL per event with LogMiner

We have improved the Oracle connector’s event structure so that inserts, updates, and deletes can optionally contain the SQL reconstructed by LogMiner in the source information block. This is an opt-in feature that you must enable explicitly, as it can easily more than double the size of your existing event payload.

To enable the inclusion of the REDO SQL as part of the change event, add the following connector configuration:

"log.mining.include.redo.sql": "true"

With this option enabled, the source information block contains a new field redo_sql, as shown below:

"source": {
  ...
  "redo_sql": "INSERT INTO \"DEBEZIUM\".\"TEST\" (\"ID\",\"DATA\") values ('1', 'Test');"
}

This feature cannot be used with lob.enabled set to true due to how LogMiner reconstructs the SQL related to CLOB, BLOB, and XML data types. If the above configuration is added with lob.enabled set to true, the connector will fail to start with an error about this misconfiguration.

Oracle LogMiner transaction buffer improvements

A new delay-strategy for transaction registration has been added when using LogMiner. This strategy effectively delays the creation of the transaction record in the buffer until we observe the first captured change for that transaction.

For users who use the Infinispan cache or who have lob.enabled set to true, this delayed strategy cannot be used due to how specific operations are handled in these two modes of the connector.

Delaying transaction registration has a number of benefits, which include:

  • Reducing the overhead on the transaction cache, especially in highly concurrent transaction scenarios.

  • Avoiding the buffering of long-running transactions that contain no changes captured by the connector.

  • Helping advance the low-watermark SCN in the offsets more efficiently in specific scenarios.

We are looking into how we can bring this change to Infinispan-based users in a future build; however, due to the nature of how lob.enabled works with LogMiner, this feature won’t be possible for that use case.

Oracle LogMiner Hybrid Mining Strategy

Debezium 2.6 also introduces a new Oracle LogMiner mining strategy called hybrid, which can be enabled by setting the configuration property log.mining.strategy to the value hybrid, as shown below. This new strategy is designed to support all schema evolution features of the default mining strategy while taking advantage of all the performance optimizations of the online catalog strategy.
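
To enable the new strategy, add the following connector configuration:

"log.mining.strategy": "hybrid"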

The main problem with the online_catalog strategy is that if a mining step observes both a schema change and a data change for the same table, LogMiner is incapable of reconstructing the SQL correctly, which results in the table name appearing as OBJ# xxxxxx or the columns represented as COL1, COL2, and so on. To avoid this while using the online catalog strategy, we recommend that users perform schema changes in a lock-step pattern so that no mining step observes both a schema change and a data change together; however, this is not always feasible.

The new hybrid strategy works by tracking a table’s object id at the database level and then using this identifier to look up the schema associated with the table from Debezium’s relational table model. In short, this allows Debezium to do what Oracle LogMiner is unable to do in these specific corner cases. The table name will be taken from the relational model’s table name and columns will be mapped by column position.

Unfortunately, Oracle does not provide a way to reconstruct failed SQL operations for CLOB, BLOB, and XML data types. This means that the new hybrid strategy cannot be used with lob.enabled set to true. If a connector is started using the hybrid strategy with lob.enabled set to true, the connector will fail to start and report a configuration failure.

XML Support for OpenLogReplicator

The Debezium Oracle connector supports connections with OpenLogReplicator, allowing Oracle users to stream changes directly from the transaction logs. The latest build of OpenLogReplicator, version 1.5.0, has added support for XML column types.

To get started streaming XML with OpenLogReplicator, please upgrade the OpenLogReplicator process to 1.5.0 and restart the replicator process. Be aware that if you want to stream binary-based XML column data, you will need to enable that feature in the OpenLogReplicator configuration.

Informix appends LSN to Transaction Identifier

Informix only increases the transaction identifier when there are concurrent transactions; otherwise, the value remains identical across sequential transactions. This can prove difficult for users who want to utilize the transaction metadata to order change events in a post-processing step.

Debezium 2.6 for Informix will now append the log sequence number (LSN) to the transaction identifier so that users can easily sort change events based on the transaction metadata. The transaction identifier field will now use the format <id>:<lsn>. This change affects transaction metadata events and the source information block for change events, as shown below:

Transaction Begin Event
{
  "status": "BEGIN",
  "id": "571:53195829",
  ...
}
Transaction End Event
{
  "status": "END",
  "id": "571:53195832",
  ...
}
Change Events
{
  ...
  "source": {
    "id": "571:53195832"
    ...
  }
}

Support for the Spanner NEW_ROW_AND_OLD_VALUES value capture type

Google Spanner’s value capture type controls how the change stream represents change data in the event stream, and it is configured when constructing the change stream.

Spanner introduced a new value capture mode called NEW_ROW_AND_OLD_VALUES, which is responsible for capturing all values of tracked columns, both modified and unmodified, whenever any column changes. This new mode is an improvement over NEW_ROW because it also includes the capture of old values, making it align with what you typically observe with other Debezium connectors.
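
Because the value capture type is a property of the change stream itself, it is selected in the Spanner DDL rather than in the connector configuration. A sketch in GoogleSQL, with an illustrative stream name, might look like this:

CREATE CHANGE STREAM everything_stream
  FOR ALL
  OPTIONS (value_capture_type = 'NEW_ROW_AND_OLD_VALUES');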

New arbitrary payload formats

While it’s common for users to utilize serialization based on JSON, Avro, Protobuf, or CloudEvents, there may be reasons to use a simpler format. Thanks to a community contribution as part of DBZ-7512, Debezium can be configured to use two new formats called simplestring and binary.

The simplestring and binary formats are configured in Debezium Server using the debezium.format configuration properties. For simplestring, the payload will be serialized into the topic as a single STRING data type; for binary, the payload will be serialized as BYTES using a byte[] (byte array).
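
For example, to emit the event value as raw bytes from Debezium Server, switch the value format with the usual format property (adjust the key format analogously if needed):

debezium.format.value=binary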

TRACE level logging for Debezium Server

Debezium Server is a ready-made runtime for Debezium source connectors that uses the Quarkus framework to manage the source and sink deployments. As most Debezium Server users who have reached out with questions or bugs are aware, we often ask for TRACE-level logs, and this has often proven difficult because the minimum logging level is a build-time configuration in Quarkus, requiring a full rebuild of Debezium Server.

With the Debezium 2.6 release, this is no longer required. The build-time configuration has been adjusted to include the TRACE logging level by default, so moving forward users can simply set the log level to TRACE and restart Debezium Server to obtain the logs (DBZ-7369).
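
For example, assuming a standard Debezium Server setup, raising the runtime log level is now just a matter of setting the usual Quarkus property (or its environment-variable equivalent) and restarting:

quarkus.log.level=TRACE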

Google PubSub Ordering Key Support

The Debezium Server Google PubSub sink adapter has received a small update in Debezium 2.6. If you are streaming changes that have foreign key relationships, you may have wondered whether it was possible to specify an ordering key so that foreign key constraints could be maintained.

Debezium 2.6 introduces a new configurable property for the Google PubSub sink adapter, ordering.key, which allows the sink adapter to use an externally provided ordering key from the connector configuration for the events rather than using the default behavior based on the event’s key (DBZ-7435).
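
As a sketch, the sink configuration might look like the following; the debezium.sink.pubsub prefix follows the usual Debezium Server sink property convention, so confirm the exact property path against the documentation:

debezium.sink.pubsub.ordering.key=<your-ordering-key>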

CloudEvents schema name customization

When using a schema registry, event schemas are registered under a name so that pipelines can look them up later. The same applies when pairing CloudEvents-formatted messages with a schema registry, and in Debezium 2.6, you can explicitly control how the name is registered.

By default, the schema for a CloudEvent message will be automatically generated by the converter. However, if the auto generated schema names are not sufficient, you can adjust the configuration by specifying dataSchemaName, which can be set either to generate (the default behavior) or header to pull the schema name directly from the specified event header field.
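
As an illustration, if the CloudEvents converter is configured for your record values, the option might be set as follows; the value.converter prefix is an assumption based on a typical Kafka Connect setup, while dataSchemaName is the option named above:

"value.converter.dataSchemaName": "header"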

Timestamp converter improvements

Debezium introduced the new TimezoneConverter in Debezium 2.4, allowing users to target a specific time zone and convert the outgoing payload’s time values to that time zone. The original implementation was restricted to converting values within the before or after parts of the payload; however, thanks to an improvement delivered as part of DBZ-7022, the converter can now also convert other time-based fields in the metadata, such as ts_ms in the source information block.

This change helps improve lag-metric calculations in situations where the JVM running the connector uses a time zone that differs from the database’s, in which case the calculation of envelope ts_ms - source ts_ms includes a variance caused by the time zone. By using the TimezoneConverter to convert metadata fields, you can easily calculate the lag between those two fields without the time zone interfering.

Signal table watermark metadata

An incremental snapshot process requires a signal table to write open/close markers to coordinate the change boundaries with the data recorded in the transaction logs, unless you’re using MySQL’s read-only flavor. In some cases, users would like to be able to track the window time slot, knowing when the window was opened and closed.

Starting with Debezium 2.6, the data column in the signal table will be populated with the time window details, allowing users to obtain when the window was opened and closed. The following shows the details of the data column for each of the two signal markers:

Window Open Marker
{"openWindowTimestamp": "<window-open-time>"}
Window Close Marker
{"openWindowTimestamp": "<window-open-time>", "closeWindowTimestamp": "<window-close-time>"}

Cassandra configurable partition modes

When the Debezium Cassandra connector reads the commit logs, events are processed sequentially and added to a queue. If multiple queues exist, events are distributed among the queues based on a hash of the commit log filename. This could result in situations where events were emitted in non-chronological order.

With Debezium 2.6, the Cassandra connector’s hashing algorithm can now use the event’s partition columns to resolve the queue index for insertion. This provides a more stable insertion order so that events are emitted in the correct order.

A new configuration option has been added to opt in to this new behavior. Debezium users can set the new configuration property event.order.guarantee.mode to partition_values to take advantage of this new mode. By default, the property retains the old behavior with the value commitlog_file.
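
To opt in, add the following to the connector configuration:

"event.order.guarantee.mode": "partition_values"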

Outlook & What’s next?

With Debezium 2.6 released, the team has already started work on Debezium 2.7, which will be released later this year in June. This upcoming release will feature a standalone MariaDB connector, user-friendly offset manipulation, read-only incremental snapshots for relational connectors, and possibly a sneak peek at the first PoC for Debezium Server’s UI.

This next quarter is equally ambitious with its deliverables, and we’d like to ask you to join the conversation. You can read all the details on the project’s 2024 road map, and get in touch with us on the mailing list or Zulip chat. We would love to hear your feedback on the road map and any suggestions you may have that are not included.

This upcoming quarter will mark the last and final release in the Debezium 2.x release stream with Debezium 2.7. With a new major release brewing, now is the time for code clean-up and deprecation removal. If you have not taken a moment to review features that may be scheduled for removal, we ask that you do so and offer your feedback as soon as possible. We want to guarantee that the transition to Debezium 3 is as much of a drop-in replacement as possible, but we cannot do that without your help.

With spring in full swing, don’t forget to stop and enjoy the roses. Until next time…​

Chris Cranford

Chris is a software engineer at Red Hat. He previously was a member of the Hibernate ORM team and now works on Debezium. He lives in North Carolina just a few hours from Red Hat towers.

About Debezium

Debezium is an open source distributed platform that turns your existing databases into event streams, so applications can see and respond almost instantly to each committed row-level change in the databases. Debezium is built on top of Kafka and provides Kafka Connect compatible connectors that monitor specific database management systems. Debezium records the history of data changes in Kafka logs, so your application can be stopped and restarted at any time and can easily consume all of the events it missed while it was not running, ensuring that all events are processed correctly and completely. Debezium is open source under the Apache License, Version 2.0.

Get involved

We hope you find Debezium interesting and useful, and want to give it a try. Follow us on Twitter @debezium, chat with us on Zulip, or join our mailing list to talk with the community. All of the code is open source on GitHub, so build the code locally and help us improve our existing connectors and add even more connectors. If you find problems or have ideas how we can improve Debezium, please let us know or log an issue.