It’s my pleasure to announce the next release of the Debezium 2.3 series, 2.3.0.Beta1!

While this release focuses primarily on bug fixes and stability improvements, it also brings improvements to the PostgreSQL connector and to the new notification and channels subsystem. In addition, there are some compatibility-breaking changes.

This release contains changes for 22 issues, so let’s take a moment to dive into the new features and the noteworthy bug fixes and breaking changes!

Breaking Changes

Debezium recently introduced the JDBC storage module that allows you to store offsets and schema history data inside a relational database. The JDBC storage module used UTF-16 as its default encoding; however, most databases use UTF-8. This release of Debezium aligns the JDBC storage module’s encoding to use UTF-8 moving forward.
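To see why an encoding change counts as a breaking change, here is a minimal Python sketch. It is not Debezium code, and the sample string is made up for illustration. It only shows that bytes written as UTF-16 cannot simply be read back as UTF-8; how the JDBC storage module itself treats data written by earlier versions is not covered here.

```python
# Illustration only: the sample value below is hypothetical, not real Debezium data.
sample = '{"key": "value"}'

as_utf16 = sample.encode("utf-16")  # starts with a byte-order mark, two bytes per ASCII character
as_utf8 = sample.encode("utf-8")    # one byte per ASCII character, no byte-order mark


def decodes_as_utf8(data: bytes) -> bool:
    """Return True if the bytes form valid UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(len(as_utf8), len(as_utf16))
print(decodes_as_utf8(as_utf8), decodes_as_utf8(as_utf16))
```

The UTF-16 bytes fail to decode as UTF-8 because the byte-order mark is not a valid UTF-8 sequence, which is why the two encodings are not interchangeable for stored data.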

PostgreSQL Replica Identity Changes

Debezium 2.3 introduces a new PostgreSQL connector feature called "Autoset Replica Identity".

Replica identity is PostgreSQL’s way of determining which column values are written to the write-ahead log to identify rows that are updated or deleted. This new feature allows configuring a table’s replica identity via connector configuration and delegating the responsibility of setting this configuration to the connector at start-up.

The new configuration option, replica.identity.autoset.values, specifies a comma-separated list of table and replica identity tuples. If a table included in this list already has a replica identity, that identity will be overwritten to match what is specified in this configuration. PostgreSQL supports several replica identity types; more information on these can be found in the documentation.

Each element of replica.identity.autoset.values uses the format <fully-qualified-table-name>:<replica-identity>. The example below configures two tables to use the full replica identity:

{
  "replica.identity.autoset.values": "public.table1:FULL,public.table2:FULL"
}

Be mindful that if the user account used by the connector does not have the appropriate database permissions to set a table’s replica identity, the use of this feature will result in a failure. In the event of a failure due to permissions, you must make sure the proper replica identity is set manually using a database account with the right permissions.
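If you do need to set the replica identity manually, the following Python sketch may help. It is not the connector's own parser; the function names are hypothetical and it assumes only the FULL, DEFAULT, and NOTHING identities (index-based identities are left out). It splits a value in the format described above and prints the equivalent PostgreSQL ALTER TABLE statements for a database administrator to review and run.

```python
# Sketch only: function names are hypothetical and this is not Debezium's implementation.
# Assumption: only these three identities are handled; index-based identities are omitted.
SUPPORTED_IDENTITIES = {"FULL", "DEFAULT", "NOTHING"}


def parse_autoset_values(value: str) -> dict:
    """Split 'schema.table:IDENTITY,...' into a {table: identity} mapping."""
    mapping = {}
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        # Split on the last colon so the table name is kept whole.
        table, sep, identity = entry.rpartition(":")
        identity = identity.strip().upper()
        if not sep or not table.strip() or identity not in SUPPORTED_IDENTITIES:
            raise ValueError(f"Unsupported entry: {entry!r}")
        mapping[table.strip()] = identity
    return mapping


def to_alter_statements(mapping: dict) -> list:
    """Build one ALTER TABLE statement per table.

    Table names are inserted verbatim, so only use this with trusted input.
    """
    return [
        f"ALTER TABLE {table} REPLICA IDENTITY {identity};"
        for table, identity in mapping.items()
    ]


if __name__ == "__main__":
    config_value = "public.table1:FULL,public.table2:FULL"
    for statement in to_alter_statements(parse_autoset_values(config_value)):
        print(statement)
```

The generated statements must be run by an account that is allowed to alter the tables in question.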

Correlate Incremental Snapshot notification ids

Debezium 2.3 introduces a new notification and channels subsystem. This subsystem allows you to send a signal via a variety of channels that include the filesystem, Kafka topic, and database table out of the box; however, the feature is extendable. In addition, this subsystem also includes the ability to send notifications about the status of the initial snapshots and incremental snapshots if they’re used. These notifications can help facilitate an easier line of communication between Debezium and other third-party systems that may need to know when an incremental or traditional snapshot has finished and whether it finished successfully or not.

In this release, the notification and channels subsystem has been improved to correlate the signal to the notification. So when you send a signal and it is consumed by Debezium, any notification that is raised will contain a reference to the signal, allowing any third-party or external process to know precisely which signal the notification references.

This should help close the gap in distributed communication across applications or processes relying on the new notification and channels subsystem.
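As a rough illustration of what this correlation enables, here is a small Python sketch of an external process that remembers the signals it has sent and matches incoming notifications against them. The class and the "signal_id" and "status" field names are hypothetical; they are not the actual Debezium notification schema, so check the documentation for the real payload layout.

```python
from dataclasses import dataclass, field


@dataclass
class SignalTracker:
    """Hypothetical helper that matches notifications to previously sent signals."""

    pending: dict = field(default_factory=dict)    # signal id -> description
    completed: dict = field(default_factory=dict)  # signal id -> (description, status)

    def signal_sent(self, signal_id: str, description: str) -> None:
        """Remember a signal we sent so we can recognize its notification later."""
        self.pending[signal_id] = description

    def on_notification(self, notification: dict) -> bool:
        """Return True if the notification refers to one of our pending signals."""
        signal_id = notification.get("signal_id")  # assumed field name
        if signal_id not in self.pending:
            return False
        description = self.pending.pop(signal_id)
        self.completed[signal_id] = (description, notification.get("status"))
        return True


if __name__ == "__main__":
    tracker = SignalTracker()
    tracker.signal_sent("sig-1", "incremental snapshot of public.table1")
    matched = tracker.on_notification({"signal_id": "sig-1", "status": "COMPLETED"})
    print(matched, tracker.completed)
```

Without a reference back to the signal, a process like this would have to guess which request a notification belongs to, for example by table name or timing.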

Other fixes

There were quite a number of bug fixes and stability changes in this release. Some noteworthy ones are:

  • Debezium Server stops sending events to Google Cloud Pub/Sub DBZ-5175

  • Snapshot step 5 - Reading structure of captured tables time too long DBZ-6439

  • Oracle parallel snapshots do not properly set PDB context when using multitenancy DBZ-6457

  • [MariaDB] Add support for userstat plugin keywords DBZ-6459

  • Debezium Server cannot recover from Google Pub/Sub errors DBZ-6461

  • Db2 connector can fail with NPE on notification sending DBZ-6485

  • BigDecimal fails when queue memory size limit is in place DBZ-6490

  • ORACLE table can not be captured, got runtime.NoViableAltException DBZ-6492

  • Signal poll interval has incorrect default value DBZ-6496

  • Oracle JDBC driver 23.x throws ORA-18716 - not in any time zone DBZ-6502

  • Alpine postgres images should use llvm/clang 15 explicitly DBZ-6506

  • ExtractNewRecordState SMT in combination with HeaderToValue SMT results in Unexpected field name exception DBZ-6486

Altogether, 22 issues were fixed for this release. A big thank you to all the contributors from the community who worked on this release: Angshuman Dey, Anisha Mohanty, Chris Cranford, Harvey Yue, Ismail Simsek, Jakub Cechacek, Jiri Pechanec, Jochen Schalanda, Kanthi Subramanian, Mario Fiore Vitale, Martin Medek, and Vojtech Juranek!

What’s next?

With Debezium 2.3 being released under a condensed schedule, you can expect the CR1 release within the next 1-2 weeks. The plan is to release Debezium 2.3.0.Final in the middle of June and for the team to begin preparation for Debezium 2.4.

As we begin to prepare to move toward Debezium 2.4, we would love to hear your feedback or suggestions. The roadmap will be updated in the coming week, so please be sure to get in touch with us on the mailing list or our chat if you have any ideas or suggestions.

Until next time…

Chris Cranford

Chris is a software engineer at Red Hat. He previously was a member of the Hibernate ORM team and now works on Debezium. He lives in North Carolina just a few hours from Red Hat towers.

   


About Debezium

Debezium is an open source distributed platform that turns your existing databases into event streams, so applications can see and respond almost instantly to each committed row-level change in the databases. Debezium is built on top of Kafka and provides Kafka Connect compatible connectors that monitor specific database management systems. Debezium records the history of data changes in Kafka logs, so your application can be stopped and restarted at any time and can easily consume all of the events it missed while it was not running, ensuring that all events are processed correctly and completely. Debezium is open source under the Apache License, Version 2.0.

Get involved

We hope you find Debezium interesting and useful, and want to give it a try. Follow us on Twitter @debezium, chat with us on Zulip, or join our mailing list to talk with the community. All of the code is open source on GitHub, so build the code locally and help us improve our existing connectors and add even more connectors. If you find problems or have ideas for how we can improve Debezium, please let us know or log an issue.