Debezium 3.3.0.Alpha2 is out, bringing key fixes and powerful enhancements!

Highlights include heartbeat handling fixes, the ability to start MongoDB streaming from a precise oplog position, faster handling of PostgreSQL TOAST columns, extended TSVECTOR support in the JDBC sink, and improved publication DDL handling in PostgreSQL. The Debezium Platform also gets major usability boosts with clearer error messages, fine-grained logging configuration in the UI, and better source/destination definitions.

New features and improvements

The following describes all noteworthy new features and improvements in Debezium 3.3.0.Alpha2. For the complete list of changes, see the release notes.

Heartbeats are no longer emitted constantly

In Debezium 3.3.0.Alpha1, users who configured heartbeat.action.query reported that Debezium emitted heartbeat events constantly, regardless of the configured interval. This regression is fixed, and heartbeat.action.query should once again honor the configured heartbeat.interval.ms (DBZ-9340).
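
As a rough illustration, the two properties involved might sit together in a connector registration payload like the Python sketch below. Only the property names heartbeat.interval.ms and heartbeat.action.query come from this post; the connector name, the interval value, and the SQL statement (including the debezium_heartbeat table) are placeholders invented for the example.

```python
import json

# Hypothetical connector registration payload. The name, the interval, and
# the SQL statement are placeholders chosen for illustration only.
heartbeat_config = {
    "name": "inventory-connector",  # assumed connector name
    "config": {
        # Requested heartbeat interval in milliseconds (assumed value).
        "heartbeat.interval.ms": "10000",
        # Statement run against the source database with each heartbeat;
        # the debezium_heartbeat table is hypothetical.
        "heartbeat.action.query": "UPDATE debezium_heartbeat SET ts = now()",
    },
}

print(json.dumps(heartbeat_config, indent=2))
```

With the fix in place, the action query in such a setup should run at the configured interval rather than continuously.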

Start MongoDB from specific position

Debezium users can now start the MongoDB source connector at a specific position in the MongoDB oplog by setting a new connector configuration property, capture.start.op.time. The value should be a long that represents the BSON timestamp (DBZ-9240).

Leaving this configuration property in the connector configuration will result in the connector attempting to resume from the specified position when restarted.

When using this feature, we recommend removing the property once the connector begins to stream changes, so that any future restart honors the resume position stored in the connector offsets instead.
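
One way to derive such a value is sketched below in Python. A BSON timestamp is a 64-bit value whose upper 32 bits hold seconds since the Unix epoch and whose lower 32 bits hold an ordinal increment. The sketch assumes that capture.start.op.time expects exactly this packed long; the post does not spell out the encoding, so verify it against the connector documentation before relying on it. The helper name and the example date are hypothetical.

```python
from datetime import datetime, timezone


def bson_timestamp_to_long(seconds: int, increment: int = 0) -> int:
    """Pack epoch seconds and an ordinal into the 64-bit BSON timestamp layout."""
    if not (0 <= seconds < 2**32 and 0 <= increment < 2**32):
        raise ValueError("seconds and increment must each fit in 32 unsigned bits")
    # Upper 32 bits: seconds since the epoch; lower 32 bits: increment.
    return (seconds << 32) | increment


# Example start position (arbitrary date, first operation within that second).
start = datetime(2025, 8, 1, 0, 0, 0, tzinfo=timezone.utc)
op_time = bson_timestamp_to_long(int(start.timestamp()), 1)

# Assumption: the property takes the packed long as its value.
config_fragment = {"capture.start.op.time": str(op_time)}
print(config_fragment)
```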

PostgreSQL TSVECTOR data type support in JDBC sink

In Debezium 3.3.0.Alpha1 we introduced support for the text-search based vector data type called TSVECTOR as part of the PostgreSQL source connector (DBZ-8470). In this release, we’ve extended that support to the JDBC sink connector so that TSVECTOR values can be written to PostgreSQL targets (DBZ-8471). If the target is a non-PostgreSQL database, the value will be written into a character-based column instead.

PostgreSQL Publication DDL timeout

While we generally do not document internal configuration properties, we previously added an internal PostgreSQL connector configuration property, internal.create.slot.command.timeout, which applies a default timeout of 90 seconds when creating the connector’s replication slot. It addresses blocking transactions that would prevent the connector from advancing, because a replication slot cannot be created while transactions are active.

We’ve extended the coverage of this timeout in Debezium 3.3 to the DDL operations that create and alter the PostgreSQL connector’s publication (https://issues.redhat.com/browse/DBZ-9310). If you notice timeouts when creating or updating the publication or the slot, you may want to increase this configuration property (the default is 90 seconds) or set it to 0 to disable the timeout.
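
The Python sketch below shows how the property might be adjusted in a connector configuration. The with_slot_timeout helper and the base configuration are hypothetical, and the value is assumed to be expressed in seconds, in line with the 90-second default described above.

```python
def with_slot_timeout(config: dict, seconds: int) -> dict:
    """Return a copy of `config` with the slot/publication timeout set.

    Hypothetical helper; the unit is assumed to be seconds, and 0 disables
    the timeout as described in the text.
    """
    if seconds < 0:
        raise ValueError("timeout must be 0 (disabled) or a positive number")
    updated = dict(config)
    updated["internal.create.slot.command.timeout"] = str(seconds)
    return updated


# Placeholder base configuration for illustration.
base = {"connector.class": "io.debezium.connector.postgresql.PostgresConnector"}

longer = with_slot_timeout(base, 180)   # allow more time than the default
disabled = with_slot_timeout(base, 0)   # turn the timeout off entirely
print(longer)
print(disabled)
```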

Improved PostgreSQL TOAST-column performance

The pgoutput decoder in the Debezium PostgreSQL connector uses a specific pattern to determine whether a toasted column’s value matches a predefined list of marker objects that indicate the value is absent in the change event. However, this pattern was inefficient when the event payload contained large text or binary data, due to the cost of computing hash values before comparison.

To improve performance, the implementation now uses a direct equality check, avoiding expensive hash computations for large TOAST column payloads (DBZ-9345). This change reduces processing overhead when handling events with sizable text or binary data.
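
The difference between the two lookup styles can be sketched in Python as follows. This is only an analogy, not Debezium’s actual Java code: the marker values and function names are invented for the example. The point is that a hash-based lookup must hash the whole payload before comparing anything, whereas a direct equality check against a handful of markers can reject a large payload quickly, for instance as soon as the lengths differ.

```python
# Hypothetical marker objects standing in for "value is absent" sentinels.
MARKER_TEXT = "__hypothetical_unchanged_toast_text__"
MARKER_BYTES = b"__hypothetical_unchanged_toast_bytes__"
MARKERS = (MARKER_TEXT, MARKER_BYTES)
MARKER_SET = frozenset(MARKERS)


def is_marker_hashed(value) -> bool:
    # Set membership hashes `value` first; for a large str or bytes payload
    # the first hash computation reads the entire payload.
    return value in MARKER_SET


def is_marker_direct(value) -> bool:
    # Direct comparison against each marker; no hash of the payload is needed.
    return any(value == marker for marker in MARKERS)


large_payload = "x" * 5_000_000  # stand-in for a big TOAST column value
print(is_marker_direct(large_payload), is_marker_direct(MARKER_TEXT))
```

Both functions return the same answer for text and binary values; only the amount of work done on large payloads differs.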

SQL Server heartbeat improvements

The SQL Server connector now emits heartbeat events during periods when there are no changes in the capture instances for CDC-based tables (DBZ-9364). This should help keep the offsets synchronized while the LSN continues to advance in the database due to changes to non-captured tables.

Debezium Platform improvements

Debezium 3.3 also brings several improvements to the Debezium Platform that enhance the user experience.

The first improvement concerns error handling and messaging. The user interface now provides significantly better error descriptions to help users (DBZ-8836), as shown here:

Platform Error Improvements

In addition, the Platform now lets users define fine-grained logging configuration in the user interface, which can be extremely useful when debugging or diagnosing a connector-related problem (DBZ-8890), as seen here:

Platform Logging Improvements

Lastly, there were some improvements around adding details when defining source and destination types (DBZ-9373). We’ve included a video below that outlines how these work.

Other changes

  • Incremental snapshot offset failing to load on task restart DBZ-9209

  • Debezium Server Azure Event Hubs sink duplicates all previous events DBZ-9304

  • Archive log only mode does not pause mining when no more data available DBZ-9306

  • Create rest resource for the connection DBZ-9313

  • Events may be mistakenly processed multiple times using multiple tasks DBZ-9338

  • Allow redo thread flush scn adjustment to be configurable DBZ-9344

  • Fetching transaction event count can result in NullPointerException DBZ-9349

  • Ensure JAVA_OPTS env var is correct on dbz server startup DBZ-9352

  • Issue in ReselectColumnsPostProcessor when field’s schema type is BYTES DBZ-9356

  • MariaDB fails to parse ALTER TABLE using RENAME COLUMN IF EXISTS syntax DBZ-9358

  • Oracle fails to reselect columns when table structure changes and throws ORA-01466 DBZ-9359

  • Single quotes getting double quotes in a create operation DBZ-9366

  • Mining upper boundary is miscalculated when using archive log only mode DBZ-9370

  • Proper Kafka producer exception not logged due to record.key serialisation error DBZ-9378

In total, 38 issues were resolved in Debezium 3.3.0.Alpha2. The list of changes can also be found in our release notes.

Chris Cranford

Chris is a software engineer at IBM and formerly Red Hat where he works on Debezium and deepens his expertise in all things Oracle and Change Data Capture on a daily basis. He previously worked on Hibernate, the leading open-source JPA persistence framework, and continues to contribute to Quarkus. Chris is based in North Carolina, United States.

   


About Debezium

Debezium is an open source distributed platform that turns your existing databases into event streams, so applications can see and respond almost instantly to each committed row-level change in the databases. Debezium is built on top of Kafka and provides Kafka Connect compatible connectors that monitor specific database management systems. Debezium records the history of data changes in Kafka logs, so your application can be stopped and restarted at any time and can easily consume all of the events it missed while it was not running, ensuring that all events are processed correctly and completely. Debezium is open source under the Apache License, Version 2.0.

Get involved

We hope you find Debezium interesting and useful, and want to give it a try. Follow us on Twitter @debezium, chat with us on Zulip, or join our mailing list to talk with the community. All of the code is open source on GitHub, so build the code locally and help us improve our existing connectors and add even more connectors. If you find problems or have ideas for how we can improve Debezium, please let us know or log an issue.
