While it has only been two short weeks since our first preview release for the Debezium 2.5 release stream, I am happy to announce the immediate availability of the next preview release, Debezium 2.5.0.Alpha2.

This release includes a variety of improvements, including batch support for the JDBC Sink connector, seamless support for MongoDB documents that exceed the 16MB barrier, MySQL 8.2 compatibility, and notification improvements for SQL Server. Additionally, this release includes a variety of bug fixes and several breaking changes.

Let’s take a closer look at the changes and improvements included in Debezium 2.5.0.Alpha2; as always, you can find the complete list of changes for this release in the release notes. Please remember to take special note of any breaking changes that could affect your upgrade path.

Breaking changes

While we strive to avoid breaking changes, sometimes those changes are inevitable to evolve in the right direction. This release includes a variety of breaking changes.

MongoDB default connection mode changed

The upgrade to Debezium 2.5 changes the MongoDB connector’s default connection mode. In previous builds, the default connection mode was replica_set; with Debezium 2.5 it is now sharded. If you were connecting to a sharded cluster without explicitly setting a connection mode, and were therefore relying on the default behavior, you must review your connector configuration and make adjustments. (DBZ-7108)

Overall, this change is part of a larger effort to remove the replica_set mode entirely. Please be sure to review your connector configurations for all MongoDB connectors when upgrading.

This breaking change invalidates existing connector offsets and a new snapshot will be triggered by default when upgrading. If a snapshot is not needed or wanted, you will need to adjust your connector configuration’s snapshot.mode accordingly.
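As a rough illustration of the two adjustments described above, here is a minimal sketch that builds connector properties in plain Java. It assumes the connection mode is set through the mongodb.connection.mode property and that never is an accepted snapshot.mode value for the MongoDB connector; please verify both against the connector documentation for your version. The class and method names are hypothetical.

```java
import java.util.Properties;

final class MongoDbConnectionModeSketch {

    // Option A: state the pre-2.5 default explicitly instead of relying on it.
    // Assumption: the property key is "mongodb.connection.mode".
    static Properties keepReplicaSetMode(Properties base) {
        Properties props = new Properties();
        props.putAll(base);
        props.setProperty("mongodb.connection.mode", "replica_set");
        return props;
    }

    // Option B: adopt the new default and opt out of the snapshot that the
    // upgrade would otherwise trigger.
    // Assumption: "never" is a valid snapshot.mode value for this connector.
    static Properties adoptShardedWithoutSnapshot(Properties base) {
        Properties props = new Properties();
        props.putAll(base);
        props.setProperty("mongodb.connection.mode", "sharded");
        props.setProperty("snapshot.mode", "never");
        return props;
    }

    public static void main(String[] args) {
        Properties base = new Properties();
        base.setProperty("connector.class", "io.debezium.connector.mongodb.MongoDbConnector");
        base.setProperty("topic.prefix", "inventory"); // hypothetical value
        keepReplicaSetMode(base).forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

Since the replica_set mode is slated for removal, option A is best treated as a temporary measure while you plan the move to sharded.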

Debezium Embedded Engine Deprecated APIs removed

Part of the team’s focus in Debezium 2.5 was to improve the Debezium Embedded Engine’s experience. With that goal in mind, we took this preview release as an opportunity to clean up the embedded engine’s API.

If your usage of the Debezium Embedded Engine utilized any of the previously deprecated APIs on EmbeddedEngine, you will find those methods have since been removed. (DBZ-7100) The recommended path forward is to make sure that you’re using the DebeziumEngine interface provided by the debezium-api artifact.

MySQL 5.7 support now best-effort

The MySQL community announced that MySQL 5.7 would enter its End of Life cycle at the end of October 2023, or just last month. This means that the MySQL community has no plans to continue offering security or bug fix patches for that edition of MySQL.

In line with this upstream announcement, and like other vendors, Debezium is making adjustments too. Starting with Debezium 2.5, we will no longer test or support MySQL 5.7 in full capacity; MySQL 5.7 enters what we call "best-effort" support. (DBZ-6874)

CloudEvents - configuration option renamed

If you are presently using the CloudEvents converter to emit events that conform to the CloudEvents format, it’s important to note that the configuration option metadata.location was renamed to metadata.source. You will need to be sure to update your connector configurations to reflect this change with Debezium 2.5 and onward. (DBZ-7060)
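If you manage many connector configurations, the rename can be applied mechanically. The following is a minimal sketch of such a migration over a plain key/value map; it is not part of Debezium. It assumes the option may appear with a converter prefix (for example value.converter.), and the class name, sample keys, and sample values are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class CloudEventsOptionMigration {

    static final String OLD_NAME = "metadata.location";
    static final String NEW_NAME = "metadata.source";

    // Returns a copy of the configuration in which every key that is, or ends
    // with, ".metadata.location" is renamed to the matching "metadata.source" key.
    // Values and all other keys are copied unchanged, in their original order.
    static Map<String, String> migrate(Map<String, String> config) {
        Map<String, String> migrated = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : config.entrySet()) {
            String key = entry.getKey();
            if (key.equals(OLD_NAME) || key.endsWith("." + OLD_NAME)) {
                key = key.substring(0, key.length() - OLD_NAME.length()) + NEW_NAME;
            }
            migrated.put(key, entry.getValue());
        }
        return migrated;
    }

    public static void main(String[] args) {
        Map<String, String> config = new LinkedHashMap<>();
        config.put("value.converter.metadata.location", "header"); // sample value only
        config.put("topic.prefix", "inventory");                   // hypothetical value
        migrate(config).forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```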

New features and improvements

Debezium 2.5 also introduces quite a number of improvements. Let’s take a look at each of these individually.

JDBC Sink Batch Support

Debezium first introduced the JDBC sink connector in March 2023 as a part of Debezium 2.2. Over the last several months, this connector has seen numerous iterations to improve its stability, feature set, and capabilities. Debezium 2.5 builds on those efforts, introducing batch writes. (DBZ-6317)

In previous versions, the connector processed each topic event separately; the new batch-write mode instead collects events into buckets and writes those changes to the target system using as few transactions as possible. This change increases the connector’s throughput and makes its interactions with the target database far more efficient.
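To make the bucket idea concrete, here is a toy sketch that groups events by destination table and compares the number of writes in each approach. It is not the connector’s implementation: the ChangeEvent type, the grouping key, and the "one write per bucket" cost model are all simplifying assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class BatchBucketsSketch {

    // Hypothetical stand-in for a consumed change event.
    static final class ChangeEvent {
        final String table;
        final String payload;

        ChangeEvent(String table, String payload) {
            this.table = table;
            this.payload = payload;
        }
    }

    // Groups events by destination table, preserving arrival order per table.
    static Map<String, List<ChangeEvent>> bucketByTable(List<ChangeEvent> events) {
        Map<String, List<ChangeEvent>> buckets = new LinkedHashMap<>();
        for (ChangeEvent event : events) {
            buckets.computeIfAbsent(event.table, t -> new ArrayList<>()).add(event);
        }
        return buckets;
    }

    // Toy cost model: one write per event.
    static int writesWhenProcessedSeparately(List<ChangeEvent> events) {
        return events.size();
    }

    // Toy cost model: one write per bucket.
    static int writesWhenBatched(List<ChangeEvent> events) {
        return bucketByTable(events).size();
    }

    public static void main(String[] args) {
        List<ChangeEvent> events = List.of(
                new ChangeEvent("orders", "o1"),
                new ChangeEvent("customers", "c1"),
                new ChangeEvent("orders", "o2"));
        System.out.println("separate writes: " + writesWhenProcessedSeparately(events));
        System.out.println("batched writes: " + writesWhenBatched(events));
    }
}
```

In a real sink, each bucket would typically map to a JDBC batch executed inside a transaction; the sketch only counts writes to show where the savings come from.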

Seamless MongoDB large document handling

Debezium has introduced several changes around large document processing in recent releases; however, those changes primarily focused on handling that use case with MongoDB 4 and 5. While these improvements certainly help for those older versions, the MongoDB community has introduced a way in MongoDB 6 to seamlessly deal with this at the database pipeline level.

Debezium 2.5’s MongoDB connector now uses the $changeStreamSplitLargeEvent aggregation feature, introduced as part of MongoDB 6.0.9. This avoids the BSONObjectTooLarge exception when working with documents that would exceed the 16MB document size limit of MongoDB. This new feature is controlled by the oversize.handling.mode option, which defaults to fail. Please adjust this configuration if you would like to take advantage of this new, opt-in feature. (DBZ-6726)

Debezium is simply utilizing an underlying feature of the MongoDB database. As such, the database still has some limitations discussed in the MongoDB documentation that could still lead to exceptions with large documents that don’t adhere to MongoDB’s split rules.
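For readers curious about what splitting means in practice, the sketch below reassembles a split event from its fragments using plain Java maps. It assumes, based on the MongoDB documentation for $changeStreamSplitLargeEvent, that each fragment carries a splitEvent marker with fragment and of counters and that the event’s top-level fields are distributed across the fragments. This is an illustration only, not the connector’s code, and the class name is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SplitEventMergeSketch {

    // Merges fragments (given in stream order) back into one event by copying
    // every top-level field except the splitEvent marker itself.
    static Map<String, Object> merge(List<Map<String, Object>> fragments) {
        if (fragments.isEmpty()) {
            throw new IllegalArgumentException("no fragments to merge");
        }
        Map<String, Object> merged = new LinkedHashMap<>();
        int total = -1;
        int expectedIndex = 1;
        for (Map<String, Object> fragment : fragments) {
            Object marker = fragment.get("splitEvent");
            if (!(marker instanceof Map)) {
                throw new IllegalArgumentException("fragment has no splitEvent marker");
            }
            Map<?, ?> split = (Map<?, ?>) marker;
            int index = (Integer) split.get("fragment");
            int of = (Integer) split.get("of");
            if (total == -1) {
                total = of;
            }
            // Fragments must arrive as 1..total, all reporting the same total.
            if (index != expectedIndex || of != total) {
                throw new IllegalStateException("fragments are missing or out of order");
            }
            expectedIndex++;
            for (Map.Entry<String, Object> field : fragment.entrySet()) {
                if (!"splitEvent".equals(field.getKey())) {
                    merged.put(field.getKey(), field.getValue());
                }
            }
        }
        if (expectedIndex - 1 != total) {
            throw new IllegalStateException("expected " + total + " fragments but got " + (expectedIndex - 1));
        }
        return merged;
    }
}
```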

MySQL 8.2 support

The MySQL community recently released a new innovation release, MySQL 8.2.0, at the end of October 2023. This new release has been tested with Debezium and we’re happy to announce that we officially support it. (DBZ-6873)

SQL Server Notification Improvements

Debezium for SQL Server works by reading the changes captured by the database in what are called capture instances. These instances can come and go based on a user’s needs, and it can be difficult to know if Debezium has concluded its own capture process for a given capture instance.

Debezium 2.5 remedies this problem by emitting a new notification aggregate called Capture Instance, allowing any observer to realize when a capture instance is no longer in use by Debezium. This new notification includes a variety of connector details including the connector’s name along with the start, stop, and commit LSN values. (DBZ-7043)

Redis Schema History Retries now Limited

Debezium 2.5 introduces a new configuration option, schema.history.internal.redis.max.attempts, designed to limit the number of retry attempts while connecting to a Redis database when it becomes unavailable. Previously, the connector simply retried forever. This new option defaults to 10 but is user configurable. (DBZ-7120)
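The behavioral difference between "retry forever" and "retry up to a limit" can be sketched in a few lines. The helper below is a generic bounded-retry loop, not the Redis schema history implementation; the class and method names are hypothetical, and a real implementation would normally wait between attempts.

```java
import java.util.concurrent.Callable;

final class BoundedRetrySketch {

    // Invokes the action until it succeeds or maxAttempts calls have failed.
    // After the final failure, the last exception is wrapped and rethrown.
    static <T> T callWithRetries(Callable<T> action, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                last = e; // remember the failure and try again
            }
        }
        throw new IllegalStateException("giving up after " + maxAttempts + " attempts", last);
    }
}
```

With the default described above, the equivalent call would pass 10 as maxAttempts, so a Redis outage surfaces as a failure instead of blocking indefinitely.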

SQL Server Driver Updates

SQL Server 2019 introduced the ability to specify column-specific sensitivity classifications to provide better visibility and protections for sensitive data. Unfortunately, the current driver shipped with Debezium 2.4 and earlier does not support this feature. Debezium 2.5 introduces the latest 12.4.2 SQL Server driver so that users can now take advantage of this feature out of the box. (DBZ-7109)

Debezium Server Kinesis Sink Improvements

Debezium Server Kinesis users will be happy to note that there have been some reliability improvements to the sink adapter in Debezium 2.5. The Kinesis Sink will now automatically retry the delivery of a failed record up to a maximum of 5 attempts before the adapter triggers a failure. This should improve the sink adapter’s delivery reliability and help in situations where a batch of changes may overload the sink’s endpoint. (DBZ-7032)

Other fixes

In addition, there were quite a number of stability and bug fixes that made it into this release. These include the following:

  • Oracle RAC throws ORA-00310: archive log sequence required DBZ-5350

  • oracle missing CDC data DBZ-5656

  • Missing oracle cdc records DBZ-5750

  • Add (integration) tests for Oracle connector-specific Debezium Connect REST extension DBZ-6763

  • Intermittent failure of MongoDbReplicaSetAuthTest DBZ-6875

  • Connector frequently misses commit operations DBZ-6942

  • Missing events from Oracle 19c DBZ-6963

  • Mongodb tests in RHEL system testsuite are failing with DBZ 2.3.4 DBZ-6996

  • Use DebeziumEngine instead of EmbeddedEngine in the testsuite DBZ-7007

  • Debezium Embedded Infinispan Performs Slowly DBZ-7047

  • Field exclusion does not work with events of removed fields DBZ-7058

  • Update transformation property "delete.tombstone.handling.mode" to debezium doc DBZ-7062

  • JDBC sink connector not working with CloudEvent DBZ-7065

  • JDBC connection leak when error occurs during processing DBZ-7069

  • Some server tests fail due to @com.google.inject.Inject annotation DBZ-7077

  • Add MariaDB driver for testing and distribution DBZ-7085

  • Allow DS JMX to use username-password authentication on k8 DBZ-7087

  • HttpIT fails with "Unrecognized field subEvents" DBZ-7092

  • MySQL parser does not conform to arithmetical operation priorities DBZ-7095

  • VitessConnectorIT.shouldTaskFailIfColumnNameInvalid fails DBZ-7104

  • When RelationalBaseSourceConnector#validateConnection is called with invalid config [inside Connector#validate()] can lead to exceptions DBZ-7105

  • Debezium crashes on parsing MySQL DDL statement (specific INSERT) DBZ-7119

Altogether, 33 issues were fixed for this release. A big thank you to all the contributors from the community who worked on this release: Anatolii Popov, Anisha Mohanty, Bob Roldan, Chris Cranford, Harvey Yue, Ilyas Ahsan, Jakub Cechacek, Jiri Pechanec, Mario Fiore Vitale, Ondrej Babec, Rafael Câmara, René Kerner, Roman Kudryashov, Vadzim Ramanenka, Vojtech Juranek, and 蔡灿材!

What’s next?

As mentioned in our last release announcement, the cadence for Debezium 2.5 is condensed due to the upcoming holiday season. The next preview release for Debezium 2.5 will be our first and most likely only Beta release, later this month. We plan to conclude the Debezium 2.5 release series with a release candidate most likely the first week of December and a final release mid-way through December, just before the holiday break.

The team is also working on a maintenance release of Debezium 2.4, due out late this week. This update to Debezium 2.4 will bring a host of bug fixes and stability improvements already in Debezium 2.5 to the 2.4 release stream.

We are also moving forward on our review and process for MariaDB support. There will likely be some news on this in the coming weeks as we begin to find a path forward around this particular advancement. The team is also continuing the work on the Debezium Engine improvements, and much more. You can find all the details for our continued plans for Debezium 2.5 on our roadmap.

Lastly, there will be news later this week about the next Debezium community event. Please be on the look-out for this as we’d love to see as many of our community members as possible drop by our virtual event in early December. It’s a great way to meet the engineers who work on Debezium and the community contributors, ask questions, and gain insights into what is part of Debezium 2.5 and the path forward to 2.6 and 2.7 next year.

As always, please be sure to get in touch with us on the mailing list or Zulip chat if you have questions or feedback. Until next time, stay warm out there!

Chris Cranford

Chris is a software engineer at Red Hat. He previously was a member of the Hibernate ORM team and now works on Debezium. He lives in North Carolina just a few hours from Red Hat towers.

   


About Debezium

Debezium is an open source distributed platform that turns your existing databases into event streams, so applications can see and respond almost instantly to each committed row-level change in the databases. Debezium is built on top of Kafka and provides Kafka Connect compatible connectors that monitor specific database management systems. Debezium records the history of data changes in Kafka logs, so your application can be stopped and restarted at any time and can easily consume all of the events it missed while it was not running, ensuring that all events are processed correctly and completely. Debezium is open source under the Apache License, Version 2.0.

Get involved

We hope you find Debezium interesting and useful, and want to give it a try. Follow us on Twitter @debezium, chat with us on Zulip, or join our mailing list to talk with the community. All of the code is open source on GitHub, so build the code locally and help us improve our existing connectors and add even more connectors. If you find problems or have ideas for how we can improve Debezium, please let us know or log an issue.