Debezium Blog

It’s with great pleasure that I am announcing the release of Debezium 1.7.0.Final!

Key features of this release include substantial improvements to incremental snapshotting (introduced in Debezium 1.6), a web-based Debezium user interface, NATS support in Debezium Server, and support for running Apache Kafka without ZooKeeper via the Debezium Kafka container image.

Some exciting things also happened in the wider Debezium community over the last few months. For instance, we saw a CDC connector for ScyllaDB based on the Debezium connector framework, and there’s work happening towards a Debezium Server connector for Apache Iceberg (details about this coming soon in a guest post on this blog).

It’s my pleasure to announce the second release of the Debezium 1.7 series, 1.7.0.Beta1!

This release brings NATS Streaming support for Debezium Server along with many other fixes and enhancements. It is also the first release tested with Apache Kafka 2.8.

It’s my pleasure to announce the first release of the Debezium 1.7 series, 1.7.0.Alpha1!

With summer in full swing, this release brings additional improvements to the Debezium Oracle connector as well as to the other connectors.

I’m pleased to announce the release of Debezium 1.6.0.Final!

This release is packed with new features, including support for incremental snapshotting that can be toggled using the new Signal API. Based on the excellent paper DBLog: A Watermark Based Change-Data-Capture Framework by Netflix engineers Andreas Andreakis and Ioannis Papapanagiotou, the notion of incremental snapshotting addresses several requirements around snapshotting that came up repeatedly in the Debezium community:

It’s my pleasure to announce the release of Debezium 1.6.0.CR1!

This release adds skipped-operation optimizations for SQL Server, introduces heartbeat support for the Oracle connector, makes Oracle BLOB/CLOB support opt-in only, and provides a range of bug fixes and other improvements across different Debezium connectors.

It’s my pleasure to announce the release of Debezium 1.6.0.Beta2!

This release adds support for Pravega to Debezium Server, expands the snapshotting options of the Debezium Oracle connector, and provides a range of bug fixes and other improvements across different Debezium connectors.

Let me announce the bugfix release of Debezium 1.5, 1.5.2.Final!

This release is a rebuild of 1.5.1.Final using Java 8.

Let me announce the bugfix release of Debezium 1.5, 1.5.1.Final!

This release fixes a small set of issues discovered since the original release and makes a few improvements to the documentation.

I’m pleased to announce the release of Debezium 1.6.0.Beta1!

This release introduces incremental snapshot support for SQL Server and Db2, performance improvements for SQL Server, support for BLOB/CLOB for Oracle, and much more. Let’s take a few moments to explore some of these new features.

It’s my pleasure to announce the first release of the Debezium 1.6 series, 1.6.0.Alpha1!

This release brings the brand new feature called incremental snapshots for MySQL and PostgreSQL connectors, a Kafka sink for Debezium Server, as well as a wide range of bug fixes and other small feature additions.
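
To make the idea behind incremental snapshots more concrete, here is a minimal Python sketch of watermark-based deduplication in the style of the DBLog paper. It is an illustration only, not Debezium’s implementation: the function name `merge_chunk_with_log`, the tuple-based event shape, and the `"low"`/`"high"` marker names are all assumptions made for this example. The sketch assumes a chunk of rows was read from the table between two watermark writes, and that those watermarks later show up in the change log.

```python
def merge_chunk_with_log(chunk, log_events):
    """Interleave one snapshot chunk with the change log (hypothetical helper).

    chunk:      dict mapping primary key -> row, read between the two watermarks
    log_events: ordered list of (kind, key, row) tuples, where kind is
                "low", "high" (watermark markers) or "change" (a regular event)
    """
    output = []
    remaining = dict(chunk)   # chunk rows that have not been superseded yet
    window_open = False       # True while we are between the two watermarks

    for kind, key, row in log_events:
        if kind == "low":
            window_open = True
        elif kind == "high":
            # Rows that were not changed inside the window are still valid,
            # so they are emitted as snapshot ("read") events at this point.
            output.extend(("read", k, r) for k, r in remaining.items())
            remaining.clear()
            window_open = False
        else:
            if window_open:
                # A newer version of this row is in the log: drop the stale copy.
                remaining.pop(key, None)
            output.append(("change", key, row))
    return output


chunk = {1: "a", 2: "b", 3: "c"}
log = [
    ("change", 9, "x"),
    ("low", None, None),
    ("change", 2, "b2"),
    ("high", None, None),
    ("change", 3, "c2"),
]
for event in merge_chunk_with_log(chunk, log):
    print(event)
```

The key property is that a chunk row is never emitted after a newer log event for the same key from inside the window, which is why the snapshot can run while streaming continues.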

I’m thrilled to announce the release of Debezium 1.5.0.Final!

With Debezium 1.5, the LogMiner-based CDC implementation for Oracle moves from Incubating to Stable state, and there’s a brand-new implementation of the MySQL connector, which brings features like transaction metadata support. Other key features include support for a new "signalling table", which for instance can be used to implement schema changes with the Oracle connector, and support for TRUNCATE events with Postgres. There are also many improvements to the community-led connectors for Vitess and Apache Cassandra, as well as a wide range of bug fixes and other smaller improvements.

It’s my pleasure to announce the release of Debezium 1.5.0.CR1!

As we begin moving toward finalizing the Debezium 1.5 release stream, the Oracle connector has been promoted to stable and there were some TLS improvements for the Cassandra connector, as well as numerous bugfixes. Overall, 50 issues have been addressed for this release.

We are very happy to announce the release of Debezium 1.5.0.Beta2!

The main features of this release are the new Debezium Signaling Table support, Vitess SET type support, and a continued focus on minor improvements, bugfixes, and polish as we sprint to the finish line for the 1.5 release.

Overall, the community fixed 54 issues since the Beta1 release, some of which we’ll explore more in-depth below.

I’m very happy to announce the release of Debezium 1.5.0.Beta1!

This release adds a brand-new component, the web-based Debezium UI, as well as transaction metadata support for the MySQL connector, a large number of improvements to the LogMiner-based capture implementation for the Debezium Oracle connector, support for Vitess 9.0, and much more. Let’s explore some of the new features.

It’s my pleasure to announce the first release of the Debezium 1.5 series, 1.5.0.Alpha1!

This release brings many improvements to the LogMiner-based capture implementation for the Debezium Oracle connector, a large overhaul of the MySQL connector, as well as a wide range of bug fixes and other small feature additions.

I’m pleased to announce the release of Debezium 1.4.1.Final!

We highly recommend upgrading from 1.4.0.Final and earlier versions, as this release includes bug fixes and enhancements to several Debezium connectors, including the following:

I am pleased to announce the release of Debezium 1.4.0.Final!

This release concludes the major work put into Debezium over the last three months. Overall, the community fixed 117 issues during that time, including the following key features and changes:

  • New Vitess connector, featured in an in-depth blog post by Kewei Shang

  • Fine-grained selection of snapshotted tables

  • PostgreSQL Snapshotter completion hook

  • Distributed Tracing

  • MySQL support for create or read records emitted during snapshot

  • Many Oracle LogMiner adapter improvements

  • Full support for Oracle JDBC connection strings

  • Improved reporting of DDL errors

I’m pleased to announce the release of Debezium 1.4.0.CR1!

This release focuses primarily on polishing the 1.4 release.

I’m pleased to announce the release of Debezium 1.4.0.Beta1!

This release includes support for distributed tracing, lowercase table and schema naming for Db2, specifying MySQL snapshot records as create or read operations, and enhancements to Vitess for nullable and primary key columns.

I’m excited to announce the release of Debezium 1.4.0.Alpha2!

This second pass of the 1.4 release line provides a few useful new features:

  • New API hook for the PostgreSQL Snapshotter interface

  • Field renaming using ExtractNewRecordState SMT’s add.fields and add.headers configurations

I’m excited to announce the release of Debezium 1.3.1.Final!

This release primarily focuses on bugs that were reported after the 1.3 release. Most importantly, the following bugs in the LogMiner adapter of the Debezium connector for Oracle were fixed, thanks to the continued feedback from the Debezium community:

  • SQLExceptions thrown when using Oracle LogMiner (DBZ-2624)

  • LogMiner mining session stopped due to WorkerTask killed (DBZ-2629)

I am excited to announce the release of Debezium 1.4.0.Alpha1!

This first pass of the 1.4 release line provides a few useful new features:

  • New Vitess connector

  • Allow fine-grained selection of snapshotted tables

Overall, the community fixed 41 issues for this release. Let’s take a closer look at some of the highlights.

It’s with great pleasure that I’m announcing the release of Debezium 1.3.0.Final!

As per Debezium’s quarterly release cadence, this wraps up the work of the last three months. Overall, the community has fixed 138 issues during that time, including the following key features and changes:

  • A new incubating LogMiner-based implementation for ingesting change events from Oracle

  • Support for Azure Event Hubs in Debezium Server

  • Upgrade to Apache Kafka 2.6

  • Revised filter option names

  • A new SQL Server connector snapshot mode, initial_only

  • Support for database-filtered columns for SQL Server

  • Additional connection options for the MongoDB connector

  • Improvements to ByteBufferConverter for implementing the outbox pattern with Avro as the payload format

I’m very happy to announce the release of Debezium 1.3.0.CR1!

As we approach the final stretch of Debezium 1.3 Final, we took this opportunity to add delegate converter support for the ByteBufferConverter and introduce a debezium-scripting module. In addition, there’s also a range of bug fixes and quite a bit of documentation polish; overall, no fewer than 15 issues have been resolved for this release.

I’m very happy to announce the release of Debezium 1.3.0.Beta2!

In this release we’ve improved support for column filtering for the MySQL and SQL Server connectors, and there’s a brand-new implementation for ingesting change events from Oracle, using the LogMiner package. As we’re on the home stretch towards Debezium 1.3 Final, there’s also a wide range of smaller improvements, bug fixes and documentation clarifications; overall, no fewer than 44 issues have been resolved for this release.

It’s my pleasure to announce the release of Debezium 1.3.0.Beta1!

This release upgrades to the recently released Apache Kafka version 2.6.0, fixes several critical bugs and comes with a renaming of the connector configuration options for selecting the tables to be captured. We’ve also released Debezium 1.2.2.Final, which is a drop-in replacement for all users of earlier 1.2.x releases.

I’m excited to announce the release of Debezium 1.3.0.Alpha1!

This initial pass in the 1.3 release line provides a number of useful new features:

  • A new Debezium Server sink adapter for Azure Event Hubs

  • A new SQL Server connector snapshot mode, initial_only

  • Additional connection timeout options for the MongoDB Connector

Overall, the community fixed no fewer than 31 issues for this release. Let’s take a closer look at some of them in the remainder of this post.

I’m very happy to announce the release of Debezium 1.2.0.Final!

Over the last three months, the community has resolved nearly 200 issues. Key features of this release include:

  • New Kafka Connect single message transforms (SMTs) for content-based event routing and filtering

  • Upgrade to Apache Kafka 2.5

  • Schema change topics for the Debezium connectors for SQL Server, Db2 and Oracle

  • Support for SMTs and message converters in the Debezium embedded engine

  • Debezium Server, a brand-new runtime which allows propagating data change events to a range of messaging infrastructures like Amazon Kinesis, Google Cloud Pub/Sub, and Apache Pulsar

  • A new column masking mode "consistent hashing", which anonymizes column values while still keeping them correlatable

  • New metrics for the MongoDB connector

  • Improved re-connect capability for the SQL Server connector
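
The "consistent hashing" masking mode from the list above can be pictured with a short Python sketch. This is a conceptual illustration under stated assumptions, not the connector’s code: the function name `mask_value`, the way the salt is combined with the value, and the choice of SHA-256 are all made up for this example. The point it shows is that equal inputs map to equal masked outputs, so masked columns can still be joined or grouped.

```python
import hashlib


def mask_value(value, salt, algorithm="sha256"):
    """Return a salted hex digest of a column value (hypothetical helper).

    The same (value, salt) pair always yields the same digest, so masked
    values remain correlatable without exposing the original data.
    """
    digest = hashlib.new(algorithm)
    digest.update(salt.encode("utf-8"))
    digest.update(value.encode("utf-8"))
    return digest.hexdigest()


orders_email = mask_value("alice@example.com", salt="s3cret")
customers_email = mask_value("alice@example.com", salt="s3cret")
other_email = mask_value("bob@example.com", salt="s3cret")

print(orders_email == customers_email)  # same input, same masked value
print(orders_email == other_email)      # different input, different masked value
```

Using a salt that is kept private makes it harder to recover original values by hashing guesses, while the determinism keeps records for the same person linkable across tables.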

It’s my pleasure to announce the release of Debezium 1.2.0.CR1!

This release includes several notable features, enhancements, and fixes:

  • PostgreSQL can restrict the set of tables with a publication while using pgoutput (DBZ-1813).

  • Metrics MBean registration is skipped if a platform MBean server does not exist (DBZ-2089).

  • SQL Server reconnection improved during shutdown and connection resets (DBZ-2106).

  • EventRouter SMT can now pass non-String based keys (DBZ-2152).

  • PostgreSQL include.unknown.datatypes can now return strings rather than hashes (DBZ-1266).

  • Debezium Server now supports Google Cloud PubSub (DBZ-2092).

  • Debezium Server now supports Apache Pulsar sink (DBZ-2112).

You can find the complete list of addressed issues, upgrade procedures, and notes on any backward compatibility changes in the release notes.

I’m very happy to share the news that Debezium 1.2.0.Beta2 has been released!

The core feature of this release is Debezium Server, a dedicated stand-alone runtime for Debezium, opening up its open-source change data capture capabilities towards messaging infrastructure like Amazon Kinesis.

Overall, the community has fixed 25 issues since the Beta1 release, some of which we’re going to explore in more depth in the remainder of this post.

With great happiness I’m announcing the release of Debezium 1.2.0.Beta1!

This release brings user-facing schema change topics for the SQL Server, Db2 and Oracle connectors, a new message transformation for content-based change event routing, support for a range of array column types in Postgres and much more. We also upgraded the Debezium container images for Apache Kafka and Kafka Connect to version 2.5.0.

As it’s the answer to all questions in life, the number of issues fixed for this release is exactly 42!
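
As a rough illustration of what content-based routing of change events means, the following Python sketch picks a destination topic by evaluating conditions against the event payload. It does not use Debezium’s transformation API; the function `route`, the rule format, and the topic names are hypothetical, and the event dictionaries only loosely mimic a change event with `op` and `after` fields.

```python
def route(event, default_topic, rules):
    """Return the topic for an event; the first matching rule wins.

    rules is an ordered list of (predicate, topic) pairs, where predicate
    is a function that takes the event and returns a truthy value on match.
    """
    for predicate, topic in rules:
        if predicate(event):
            return topic
    return default_topic


# Hypothetical rules: deletes go to one topic, German customers to another.
rules = [
    (lambda e: e["op"] == "d", "customers_deleted"),
    (lambda e: e["after"] and e["after"].get("country") == "DE", "customers_de"),
]

events = [
    {"op": "c", "after": {"id": 1, "country": "DE"}},
    {"op": "u", "after": {"id": 2, "country": "US"}},
    {"op": "d", "after": None},
]
for event in events:
    print(route(event, "customers", rules))
```

Filtering can be seen as the special case where a non-matching event is dropped instead of being sent to a default topic.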

I’m very happy to announce the release of Debezium 1.2.0.Alpha1!

This first drop of the 1.2 release line provides a number of useful new features:

  • Support for message transformations (SMTs) and converters in the Debezium embedded engine API

  • A new SMT for filtering out change events using scripting languages

  • Automatic reconnects for the SQL Server connector

  • A new column masking mode using consistent hash values

Overall, the community fixed no fewer than 41 issues for this release. Let’s take a closer look at some of them in the remainder of this post.

It’s with great excitement that I’m announcing the release of Debezium 1.1.0.Final!

About three months after the 1.0 release, this new version comes with many exciting new features such as:

It’s my pleasure to announce the release of Debezium 1.1.0.Beta1!

This release adds support for transaction marker events, an incubating connector for the IBM Db2 database as well as a wide range of bug fixes. As the 1.1 release is still under active development, we’ve backported an assorted set of bug fixes to the 1.0 branch and released Debezium 1.0.1.Final, too.

At the time of writing this, not all connector archives have been synched to Maven Central yet; this should be the case within the next few hours.

Today it’s my great pleasure to announce the availability of Debezium 1.0.0.Final!

Since the initial commit in November 2015, the Debezium community has worked tirelessly to realize the vision of building a comprehensive open-source low-latency platform for change data capture (CDC) for a variety of databases.

Within those four years, Debezium’s feature set has grown tremendously: stable, highly configurable CDC connectors for MySQL, Postgres, MongoDB and SQL Server, incubating connectors for Apache Cassandra and Oracle, facilities for transforming and routing change data events, support for design patterns such as the outbox pattern and much more. A very active and welcoming community of users, contributors and committers has formed around the project. Debezium is deployed to production at lots of organizations from all kinds of industries, some with huge installations, using hundreds of connectors to stream data changes out of thousands of databases.

The 1.0 release marks an important milestone for the project: based on all the production feedback we got from the users of the 0.x versions, we figured it’s about time to express the maturity of the four stable connectors in the version number, too.

While fall weather is in full swing, the Debezium community is not letting the unusually low, frigid temperatures get the best of us. It is my pleasure to announce the release of Debezium 1.0.0.Beta3!

This new Debezium release includes several notable new features, enhancements, and fixes:

  • Built against Kafka Connect 2.3.1 (DBZ-1612)

  • Renamed drop_on_stop configuration parameter to drop.on.stop (DBZ-1595)

  • Standardized source information for Cassandra connector (DBZ-1408)

  • Propagate MongoDB replicator exceptions so they are visible from Kafka Connect’s status endpoint (DBZ-1583)

  • Envelope methods should accept Instant rather than long values for timestamps (DBZ-1607)

  • Erroneously reporting no tables captured (DBZ-1519)

  • Avoid Oracle connector attempting to analyze tables (DBZ-1569)

  • Toasted columns should contain null in before rather than __debezium_unavailable_value (DBZ-1570)

  • Support PostgreSQL 11+ TRUNCATE operations using pgoutput decoder (DBZ-1576)

  • PostgreSQL connector times out in schema discovery for databases with many tables (DBZ-1579)

  • Value of ts_ms is not correct during snapshot processing (DBZ-1588)

  • Heartbeats are not generated for non-whitelisted tables (DBZ-1592)

It is my pleasure to announce the release of Debezium 1.0.0.Beta2!

This new Debezium release includes several notable new features, enhancements, and fixes:

  • Support PostgreSQL LTREE columns with a logical data type (DBZ-1336)

  • Support for PostgreSQL 12 (DBZ-1542)

  • Validate that the configured PostgreSQL replication slot name contains no invalid characters (DBZ-1525)

  • Add MySQL DDL parser support for index creation VISIBLE and INVISIBLE keywords (DBZ-1534)

  • Add MySQL DDL parser support for granting SESSION_VARIABLES_ADMIN (DBZ-1535)

  • Fix MongoDB collection source struct field when collection name contains a dot (DBZ-1563)

  • Close idle transactions after performing a PostgreSQL snapshot (DBZ-1564)

History is in the making as Debezium begins to sprint to its 1.0 milestone. It’s my pleasure to announce the release of Debezium 1.0.0.Beta1!

This new Debezium release includes several notable new features, enhancements, and fixes:

  • ExtractNewDocumentState and EventRouter SMTs propagate heartbeat & schema change messages (DBZ-1513)

  • Provides alternative mapping for INTERVAL columns via interval.handling.mode (DBZ-1498)

  • Ensure message keys have the right column order (DBZ-1507)

  • Warn of table locking problems in connector logs (DBZ-1280)

On behalf of the Debezium community it’s my great pleasure to announce the release of Debezium 0.10.0.Final!

As you’d expect, there were not many changes since last week’s CR2, one exception being a performance fix for the pgoutput plug-in of the Postgres connector, which may have suffered from slow processing when dealing with many small transactions in a short period of time (DBZ-1515).

This release finalizes the work of eight preview releases overall. We have discussed the new features and changes in depth in earlier announcements, but here are some highlights of Debezium 0.10:

I’m very happy to announce the release of Debezium 0.10.0.CR2!

After the CR1 release we decided to do another candidate release, as there was not only a good number of bug fixes coming in, but also a few very useful feature implementations were provided by the community, which we didn’t want to delay. So we adjusted the original plan a bit and now aim for Debezium 0.10 Final in the course of next week, barring any unforeseen regressions.

As usual, let’s take a closer look at some of the new features and resolved bugs.

The Debezium community is on the homestretch towards the 0.10 release and we’re happy to announce the availability of Debezium 0.10.0.CR1!

Besides a number of bugfixes to the different connectors, this release also brings a substantial improvement to the way initial snapshots can be done with Postgres. Unless any major regressions show up, the final 0.10 release should follow very soon.

This post originally appeared on the WePay Engineering blog.

In the first half of this blog post series, we explained our decision-making process of designing a streaming data pipeline for Cassandra at WePay. In this post, we will break down the pipeline into three sections and discuss each of them in more detail:

  1. Cassandra to Kafka with CDC agent

  2. Kafka to BigQuery with KCBQ

  3. Transformation with BigQuery view

This post originally appeared on the WePay Engineering blog.

Historically, MySQL had been the de-facto database of choice for microservices at WePay. As WePay scales, the sheer volume of data written into some of our microservice databases required us to make a scaling decision between sharded MySQL (i.e. Vitess) and switching to a natively sharded NoSQL database. After a series of evaluations, we picked Cassandra, a NoSQL database, primarily because of its high availability, horizontal scalability, and ability to handle high write throughput.