Debezium Blog

It is my pleasure to announce the release of Debezium 1.0.0.Beta2!

This new Debezium release includes several notable new features, enhancements, and fixes:

  • Support PostgreSQL LTREE columns with a logical data type (DBZ-1336)

  • Support for PostgreSQL 12 (DBZ-1542)

  • Validate that the configured PostgreSQL replication slot name contains no invalid characters (DBZ-1525)

  • Add MySQL DDL parser support for index creation VISIBLE and INVISIBLE keywords (DBZ-1534)

  • Add MySQL DDL parser support for granting SESSION_VARIABLES_ADMIN (DBZ-1535)

  • Fix MongoDB collection source struct field when collection name contains a dot (DBZ-1563)

  • Close idle transactions after performing a PostgreSQL snapshot (DBZ-1564)

As a follow-up to the recent Building Audit Logs with Change Data Capture and Stream Processing blog post, we’d like to extend the example with admin features that make it possible to capture and fix any missing transactional data.

In the above-mentioned blog post, a log enricher service combines data inserted or updated in the Vegetable database table with transaction context data such as:

  • Transaction id

  • Name of the user who performed the work

  • Use case behind the actual change, e.g. "CREATE VEGETABLE"

This all works well as long as all the changes are done via the vegetable service. But is this always the case?

What about maintenance activities or migration scripts executed directly at the database level? There are still a lot of such activities going on, either on purpose or because of old habits we are still trying to change…
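
To make the gap concrete, here is a minimal sketch of such an enrichment step. It assumes that change events and transaction context records are plain dictionaries joined on a transaction id. The `enrich` function and the field names (`tx_id`, `user_name`, `use_case`, `needs_review`) are hypothetical and are not taken from the example's actual code.

```python
from typing import Optional


def enrich(change_event: dict, tx_context: dict) -> dict:
    """Attach transaction context to a change event, if any exists.

    tx_context maps a transaction id to metadata such as the user name
    and the use case (hypothetical structure, for illustration only).
    """
    context: Optional[dict] = tx_context.get(change_event["tx_id"])
    enriched = dict(change_event)  # copy, so the input event is not mutated
    if context is None:
        # No context was recorded: the change bypassed the service,
        # e.g. a migration script run directly against the database.
        enriched["audit"] = None
        enriched["needs_review"] = True
    else:
        enriched["audit"] = dict(context)
        enriched["needs_review"] = False
    return enriched


tx_context = {571: {"user_name": "farmerbob", "use_case": "CREATE VEGETABLE"}}

# Change made through the service: context is available.
via_service = enrich({"tx_id": 571, "table": "vegetable", "op": "c"}, tx_context)

# Change made directly on the database: no context for this transaction.
direct_sql = enrich({"tx_id": 572, "table": "vegetable", "op": "u"}, tx_context)
```

In this sketch, events flagged with `needs_review` are the ones an admin feature would need to surface so that the missing transactional data can be supplied afterwards.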

Welcome to the Debezium community newsletter, in which we share all things CDC-related, including blog posts, group discussions, and Stack Overflow questions that are relevant to our user community.

History is in the making as Debezium begins to sprint to its 1.0 milestone. It’s my pleasure to announce the release of Debezium 1.0.0.Beta1!

This new Debezium release includes several notable new features, enhancements, and fixes:

  • ExtractNewDocumentState and EventRouter SMTs propagate heartbeat & schema change messages (DBZ-1513)

  • Provide an alternative mapping for INTERVAL columns via interval.handling.mode (DBZ-1498)

  • Ensure message keys have the right column order (DBZ-1507)

  • Warn of table locking problems in connector logs (DBZ-1280)

Let’s talk about TOAST. Toast? No, TOAST!

So what’s that? TOAST (The Oversized-Attribute Storage Technique) is a mechanism in Postgres which stores large column values in multiple physical rows, circumventing the page size limit of 8 KB.

Typically, TOAST storage is transparent to the user, so you don’t really have to care about it. There’s an exception, though: if a table row has changed, any unchanged values that were stored using the TOAST mechanism are not included in the message that Debezium receives from the database, unless they are part of the table’s replica identity. Consequently, such unchanged TOAST column values are not contained in the Debezium data change events sent to Apache Kafka. In this post we’re going to discuss different strategies for dealing with this situation.
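
As a rough illustration of the problem from a consumer's point of view, the sketch below keeps the last known state of a row and skips any column whose value equals a marker meaning "value not available". This is only one possible approach under stated assumptions, not necessarily one of the strategies the post describes. The marker string is assumed to match whatever placeholder your connector is configured to emit, so check the documentation for your connector version. `merge_row` is a hypothetical helper.

```python
# Assumed placeholder for unchanged TOAST values; verify it against
# your connector's configuration before relying on it.
UNAVAILABLE = "__debezium_unavailable_value"


def merge_row(previous: dict, after: dict) -> dict:
    """Merge the 'after' state of an update into the last known row state.

    Columns carrying the placeholder are treated as unchanged, so the
    previously known value is kept. If no previous value is known, the
    column is simply absent from the result.
    """
    merged = dict(previous)
    for column, value in after.items():
        if value == UNAVAILABLE:
            continue  # unchanged TOAST column: keep what we already have
        merged[column] = value
    return merged


previous = {"id": 1, "name": "Bob", "biography": "a very long text ..."}
after = {"id": 1, "name": "Robert", "biography": UNAVAILABLE}
current = merge_row(previous, after)
```

Note that this only works if the consumer has seen the full value at least once, for example from an initial snapshot or from the insert event for that row.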