The Debezium release cadence is in full swing as I’m excited to announce Debezium 2.1.2.Final!
This release focuses primarily on bug fixes and stability, and it is the recommended update for all users of earlier versions. This release contains 28 resolved issues. Let’s take a moment to discuss a critical breaking change.
It’s my pleasure to announce not only the first release of the Debezium 2.2 series, but also the first release of Debezium in 2023, 2.2.0.Alpha1!
The Debezium 2.2.0.Alpha1 release includes some breaking changes, a number of bug fixes, and some noteworthy improvements and features, including but not limited to:
[Breaking Change] - ZonedTimestamp values will no longer truncate fractional seconds.
[New] - Support ingesting changes from an Oracle logical stand-by database
[New] - Support Amazon S3 buckets using the Debezium Storage API
[New] - Support retrying database connections during connector start-up
[New] - Debezium Server sink connector support for Apache RocketMQ and Infinispan
Today it’s my great pleasure to announce the availability of Debezium 2.1.0.Final!
You might have noticed recently that Debezium went a bit silent over the last few weeks. No, we are not going away. In fact, the elves at Google worked furiously to bring you a present under the Christmas tree: the Debezium Spanner connector.
It’s my pleasure to announce the first release of the Debezium 2.1 series, 2.1.0.Alpha1!
The Debezium 2.1.0.Alpha1 release includes quite a number of bug fixes but also some noteworthy improvements and new features including but not limited to:
Support for PostgreSQL 15
Single Message Transformation (SMT) predicate support in Debezium engine
Capturing TRUNCATE as change event in MySQL table topics
Oracle LogMiner performance improvements
New Redis-based storage module
I’m excited to announce the release of Debezium 1.9.7.Final!
This release focuses on bug fixes and stability, and it is the recommended update for all users of earlier versions. This release contains 22 resolved issues overall.
Today it’s my great pleasure to announce the availability of Debezium 2.0.0.Final!
Since our 1.0 release in December 2019, the community has worked vigorously to build a comprehensive open-source low-latency platform for change data capture (CDC). Over the past three years, we have extended Debezium’s portfolio to include a stable connector for Oracle, a community-led connector for Vitess, the introduction of incremental snapshots, multi-partition support, and so much more. With the help of our active community of contributors and committers, Debezium is the de facto leader in the CDC space, deployed to production within lots of organizations from across multiple industries, using hundreds of connectors to stream data changes out of thousands of database platforms.
The 2.0 release marks a new milestone for Debezium, one that we are proud to share with each of you.
I am excited to announce the release of Debezium 2.0.0.CR1!
This release contains breaking changes, stability fixes, and bug fixes, all to inch us closer to 2.0.0.Final. Overall, this release contains a total of 53 issues that were fixed.
I’m excited to announce the release of Debezium 1.9.6.Final!
This release focuses on bug fixes and stability, and it is the recommended update for all users of earlier versions. This release contains 78 resolved issues overall.
I am excited to announce the release of Debezium 2.0.0.Beta2!
This release contains several breaking changes, stability fixes, and bug fixes, all to inch us closer to 2.0.0.Final. Overall, this release contains a total of 107 issues that were fixed.
I am thrilled to share that Debezium 2.0.0.Beta1 has been released!
This release contains several new features, including a pluggable topic selector, the inclusion of the database user who committed changes in Oracle change events, and improved handling of table unique indices as primary keys. In addition, there are several breaking changes, such as the move to multi-partition mode as the default and the introduction of the debezium-storage module and its implementations. So let’s take a look at all of these in closer detail.
With the summer in full swing, the team is pleased to announce the release of Debezium 1.9.5.Final!
This release primarily focuses on bugfixes and stability, and it is the recommended update for all users of earlier versions. This release contains 24 resolved issues overall.
I am thrilled to share that Debezium 2.0.0.Alpha3 has been released!
While this release contains a plethora of bugfixes, there are a few noteworthy improvements, which include providing a timestamp in transaction metadata events, the addition of several new fields in Oracle’s change event source block, and a non-backward compatible change to the Oracle connector’s offsets.
Let’s take a look at these in closer detail.
I’m pleased to announce the release of Debezium 1.9.4.Final!
This release primarily focuses on bugfixes and stability, and it is the recommended update for all users of earlier versions. This release contains 32 resolved issues overall.
I am thrilled to share that Debezium 2.0.0.Alpha2 has been released!
This release is packed with tons of bugfixes and improvements, 110 issues resolved in total. Just, WOW!
A few noteworthy changes include incremental snapshots gaining support for regular expressions and a new stop signal. We also did some housekeeping and removed a number of deprecated configuration options as well as the legacy MongoDB oplog implementation.
Let’s take a look at these in closer detail.
As the summer nears, I’m excited to announce the release of Debezium 1.9.3.Final!
This release primarily focuses on bugfixes and stability; however, there are some notable feature enhancements. Let’s take a moment to cool off and "dive" into these new features in a bit of detail :).
I am excited to share that Debezium 2.0.0.Alpha1 has been released!
This release is the first of several planned pre-releases of Debezium 2.0 over the next five months. Each pre-release plans to focus on strategic changes in the hope that as we move forward, changes can be easily tested and regressions addressed quickly.
In this release, some of the most notable changes include requiring Java 11 to use Debezium or any of its components, the removal of wal2json support for PostgreSQL and the legacy MySQL connector implementation, as well as some notable features such as improved Debezium Server Google Pub/Sub sink support, and a multitude of bugfixes. Let’s take a look at a few of these.
I’m excited to announce the release of Debezium 1.9.1.Final!
This release primarily focuses on bugfixes and stability concerns after the 1.9.0.Final release.
The engineering team at Shopify recently improved the Debezium MySQL connector so that it supports incremental snapshotting for databases without write access by the connector, which is required when pointing Debezium to read-only replicas. In addition, the Debezium MySQL connector now also allows schema changes during an incremental snapshot. This blog post explains the implementation details of those features.
I am very happy to share the news that Debezium 1.9.0.Final has been released!
Besides the usual set of bug fixes and improvements, key features of this release are support for Apache Cassandra 4, multi-database support for the Debezium connector for SQL Server, the ability to use Debezium Server as a Knative event source, as well as many improvements to the integration of Debezium Server with Redis Streams.
Exactly 276 issues have been fixed by the community for the 1.9 release; a big thank you to each and every one of you who helped to make this happen!
I am happy to announce the release of Debezium 1.9.0.CR1!
Besides a range of bugfixes, this release brings the long-awaited support for Apache Cassandra 4! Overall, 52 issues have been fixed for this release.
Let’s take a closer look at both the Cassandra 3 changes & Cassandra 4 support.
I am happy to announce the release of Debezium 1.9.0.Beta1!
This release includes many new features for Debezium Server, including Knative Eventing support and offset storage management with the Redis sink, multi-partitioned scaling for the SQL Server connector, and various bugfixes and improvements. Overall, 56 issues have been fixed for this release.
Let’s take a closer look at a couple of them.
It’s my pleasure to announce the second release of the Debezium 1.9 series, 1.9.0.Alpha2!
This release includes support for Oracle 21c, improvements around Redis for Debezium Server, configuring the kafka.query.timeout.ms option, and a number of bug fixes around DDL parsers, build infrastructure, etc.
Overall, the community fixed 51 issues for this release. Let’s take a closer look at some of the highlights.
It’s my pleasure to announce the first release of the Debezium 1.9 series, 1.9.0.Alpha1!
With the new year comes a new release! The Debezium 1.9.0.Alpha1 release comes with quite a number of fixes and improvements, most notably improved metrics and Oracle ROWID data type support.
It’s my great pleasure to announce the release of Debezium 1.8.0.Final!
Besides a strong focus on the Debezium connector for MongoDB (more on that below), the 1.8 release brings support for Postgres' logical decoding messages, support for configuring SMTs and topic creation settings in the Debezium UI, and much more.
Overall, the community has fixed 242 issues for this release. A big thank you to everyone who helped to make this release happen on time, sticking to our quarterly release cadence!
I’m very excited to announce the release of Debezium 1.8.0.CR1!
As we near the final release, due out next week, this release focused heavily on bugfixes. Yet it also includes incremental snapshot support for MongoDB! Overall, no fewer than 34 issues have been fixed for this release.
Let’s take a closer look at some of them.
The Debezium UI team continues to add support for more features, allowing users to more easily configure connectors. In this article, we’ll describe and demonstrate the UI support for topic automatic creation. Read further for more information, including a video demo!
I’m very happy to announce the release of Debezium 1.8.0.Beta1!
This release is packed with exciting new features like support for MongoDB 5.0, an outbox event router for the MongoDB connector and support for Postgres logical decoding messages, as well as tons of bugfixes and other improvements. Overall, no fewer than 63 issues have been fixed for this release.
Let’s take a closer look at some of them.
The Debezium UI team is pleased to announce support for Single Message Transformations (SMTs) in the Debezium UI!
Our goal with the Debezium graphical user interface is to allow users to set up and operate connectors more easily. To that end, we have added support for Kafka Connect single message transformations to the UI. Read further for more information, and for a video demo of the new feature!
It’s my pleasure to announce the second release of the Debezium 1.8 series, 1.8.0.Alpha2!
With the holiday season just around the corner, the team’s release schedule remains steadfast. While Debezium 1.8.0.Alpha2 delivers quite a lot of bugfixes and minor changes, there are a few notable changes:
MySQL support for heartbeat action queries
Configurable transaction topic name
It’s my pleasure to announce the first release of the Debezium 1.8 series, 1.8.0.Alpha1!
With the colors of Autumn upon us, the team has been hard at work painting lines of code for this release. With Debezium 1.8.0.Alpha1 come quite a number of improvements, but most notable is the new native MongoDB 4.0 change streams support!
One of the major improvements in Debezium starting in version 1.6 is support for incremental snapshots. In this blog post we are going to explain the motivation for this feature, we will do a deep dive into the implementation details, and we will also show a demo of it.
It’s with great pleasure that I am announcing the release of Debezium 1.7.0.Final!
Key features of this release include substantial improvements to the notion of incremental snapshotting (as introduced in Debezium 1.6), a web-based Debezium user interface, NATS support in Debezium Server, and support for running Apache Kafka without ZooKeeper via the Debezium Kafka container image.
Some exciting things also happened in the wider Debezium community over the last few months. For instance, we saw a CDC connector for ScyllaDB based on the Debezium connector framework, and there’s work happening towards a Debezium Server connector for Apache Iceberg (details about this coming soon in a guest post on this blog).
We are very happy to announce the release of Debezium 1.7.0.CR2!
As we are moving ahead towards the final release we include mostly bugfixes. Yet this release contains important performance improvements and a new feature for read-only MySQL incremental snapshots.
I am very happy to announce the release of Debezium 1.7.0.CR1!
For this release, we’ve reworked how column filters are handled during snapshotting, the Debezium container images have been updated to use Fedora 34 as their base, there’s support for MySQL INVISIBLE columns, and much more.
It’s my pleasure to announce the second release of the Debezium 1.7 series, 1.7.0.Beta1!
This release brings NATS Streaming support for Debezium Server along with many other fixes and enhancements. Also this release is the first one tested with Apache Kafka 2.8.
We are pleased to announce the first official release of the Debezium graphical user interface!
As announced a few months back, our team has been working on a Debezium UI proof-of-concept. The goal of the PoC was to explore ways in which a graphical UI could facilitate the getting started and operational experience of Debezium users.
Debezium is very flexible - each connector can be configured and fine-tuned in a variety of ways. It provides metrics which give the user insight into the state of the running Debezium connectors, allowing the customer to safely operate CDC pipelines in huge installations with thousands of connectors. This flexibility, however, comes with a learning curve for the user to understand all of the different settings and options.
To that end, we have produced a UI which will allow users to set up and operate connectors more easily. The UI is now available as part of the Debezium releases for our community!
It’s my pleasure to announce the first release of the Debezium 1.7 series, 1.7.0.Alpha1!
With the summer in full swing, this release brings additional improvements to the Debezium Oracle connector as well as to the other connectors.
I’m pleased to announce the release of Debezium 1.6.0.Final!
This release is packed with tons of new features, including support for incremental snapshotting that can be toggled using the new Signal API. Based on the excellent paper DBLog: A Watermark Based Change-Data-Capture Framework by Netflix engineers Andreas Andreakis and Ioannis Papapanagiotou, the notion of incremental snapshotting addresses several requirements around snapshotting that came up repeatedly in the Debezium community:
It’s my pleasure to announce the release of Debezium 1.6.0.CR1!
This release adds skipped operations optimizations for SQL Server, introduces Heartbeat support to the Oracle connector, makes Oracle BLOB/CLOB support opt-in only, and provides a range of bug fixes and other improvements across different Debezium connectors.
It’s my pleasure to announce the release of Debezium 1.6.0.Beta2!
This release adds support for Pravega to Debezium Server, expands the snapshotting options of the Debezium Oracle connector, and provides a range of bug fixes and other improvements across different Debezium connectors.
Let me announce the bugfix release of Debezium 1.5, 1.5.2.Final!
This release is a rebuild of 1.5.1.Final using Java 8.
Let me announce the bugfix release of Debezium 1.5, 1.5.1.Final!
This release fixes a small set of issues discovered since the original release and brings a few improvements to the documentation.
I’m pleased to announce the release of Debezium 1.6.0.Beta1!
This release introduces incremental snapshot support for SQL Server and Db2, performance improvements for SQL Server, support for BLOB/CLOB for Oracle, and much more. Let’s take a few moments to explore some of these new features.
It’s my pleasure to announce the first release of the Debezium 1.6 series, 1.6.0.Alpha1!
This release brings the brand new feature called incremental snapshots for MySQL and PostgreSQL connectors, a Kafka sink for Debezium Server, as well as a wide range of bug fixes and other small feature additions.
I’m thrilled to announce the release of Debezium 1.5.0.Final!
With Debezium 1.5, the LogMiner-based CDC implementation for Oracle moves from Incubating to Stable state, and there’s a brand-new implementation of the MySQL connector, which brings features like transaction metadata support. Other key features include support for a new "signalling table", which for instance can be used to implement schema changes with the Oracle connector, and support for TRUNCATE events with Postgres. There are also many improvements to the community-led connectors for Vitess and Apache Cassandra, as well as a wide range of bug fixes and other smaller improvements.
It’s my pleasure to announce the release of Debezium 1.5.0.CR1!
As we begin moving toward finalizing the Debezium 1.5 release stream, the Oracle connector has been promoted to stable and there were some TLS improvements for the Cassandra connector, as well as numerous bugfixes. Overall, 50 issues have been addressed for this release.
We are very happy to announce the release of Debezium 1.5.0.Beta2!
The main features of this release are the new Debezium Signaling Table support, Vitess SET type support, and a continued focus on minor improvements, bugfixes, and polish as we sprint to the finish line for the 1.5 release.
Overall, the community fixed 54 issues since the Beta1 release, some of which we’ll explore more in-depth below.
I’m very happy to announce the release of Debezium 1.5.0.Beta1!
This release adds a brand-new component, the web-based Debezium UI, along with transaction metadata support for the MySQL connector, a large number of improvements to the LogMiner-based capture implementation for the Debezium Oracle connector, support for Vitess 9.0, and much more. Let’s explore some of the new features in the following.
It’s my pleasure to announce the first release of the Debezium 1.5 series, 1.5.0.Alpha1!
This release brings many improvements to the LogMiner-based capture implementation for the Debezium Oracle connector, a large overhaul of the MySQL connector, as well as a wide range of bug fixes and other small feature additions.
I’m pleased to announce the release of Debezium 1.4.1.Final!
We highly recommend upgrading from 1.4.0.Final and earlier versions, as this release includes bug fixes and enhancements to several Debezium connectors, including the following:
I am pleased to announce the release of Debezium 1.4.0.Final!
This release concludes the major work put into Debezium over the last three months. Overall, the community fixed 117 issues during that time, including the following key features and changes:
Fine-grained selection of snapshotted tables
MySQL support for create or read records emitted during snapshot
Many Oracle Logminer adapter improvements
Full support for Oracle JDBC connection strings
Improved reporting of DDL errors
I’m pleased to announce the release of Debezium 1.4.0.CR1!
This release focuses primarily on polishing the 1.4 release.
I’m pleased to announce the release of Debezium 1.4.0.Beta1!
This release includes support for distributed tracing, lowercase table and schema naming for Db2, specifying MySQL snapshot records as create or read operations, and enhancements to Vitess for nullable and primary key columns.
I’m excited to announce the release of Debezium 1.4.0.Alpha2!
This second pass of the 1.4 release line provides a few useful new features:
New API hook for the PostgreSQL
Field renaming using
I’m excited to announce the release of Debezium 1.3.1.Final!
This release primarily focuses on bugs that were reported after the 1.3 release. Most importantly, the following bugs related to the LogMiner adapter of the Debezium connector for Oracle were fixed, thanks to the continued feedback from the Debezium community.
I am excited to announce the release of Debezium 1.4.0.Alpha1!
This first pass of the 1.4 release line provides a few useful new features:
New Vitess connector
Allow fine-grained selection of snapshotted tables
Overall, the community fixed 41 issues for this release. Let’s take a closer look at some of the highlights.
It’s with great pleasure that I’m announcing the release of Debezium 1.3.0.Final!
As per Debezium’s quarterly release cadence, this wraps up the work of the last three months. Overall, the community has fixed 138 issues during that time, including the following key features and changes:
A new incubating LogMiner-based implementation for ingesting change events from Oracle
Support for Azure Event Hubs in Debezium Server
Upgrade to Apache Kafka 2.6
Revised filter option names
A new SQL Server connector snapshot mode,
Support for database-filtered columns for SQL Server
Additional connection options for the MongoDB connector
ByteBufferConverter for implementing the outbox pattern with Avro as the payload format
I’m very happy to announce the release of Debezium 1.3.0.CR1!
As we approach the final stretch of Debezium 1.3 Final, we took this opportunity to add delegate converter support for the ByteBufferConverter and introduce a debezium-scripting module. In addition, there’s also a range of bug fixes and quite a bit of documentation polish; overall, no fewer than 15 issues have been resolved for this release.
I’m very happy to announce the release of Debezium 1.3.0.Beta2!
In this release we’ve improved support for column filtering for the MySQL and SQL Server connectors, and there’s a brand-new implementation for ingesting change events from Oracle, using the LogMiner package. As we’re on the home stretch towards Debezium 1.3 Final, there’s also a wide range of smaller improvements, bug fixes and documentation clarifications; overall, no fewer than 44 issues have been resolved for this release.
It’s my pleasure to announce the release of Debezium 1.3.0.Beta1!
This release upgrades to the recently released Apache Kafka version 2.6.0, fixes several critical bugs and comes with a renaming of the connector configuration options for selecting the tables to be captured. We’ve also released Debezium 1.2.2.Final, which is a drop-in replacement for all users of earlier 1.2.x releases.
I’m excited to announce the release of Debezium 1.3.0.Alpha1!
This initial pass in the 1.3 release line provides a number of useful new features:
A new Debezium Server sink adapter for Azure Event Hubs
A new SQL Server connector snapshot mode,
Additional connection timeout options for the MongoDB Connector
Overall, the community fixed no fewer than 31 issues for this release. Let’s take a closer look at some of them in the remainder of this post.
I am happy to announce the release of Debezium 1.2.1.Final!
This release includes several bug fixes to different Debezium connectors, and we highly recommend the upgrade from 1.2.0.Final and earlier versions:
The Debezium Postgres connector may have missed events from concurrent transactions when transitioning from snapshotting to streaming events from the WAL (DBZ-2288); this is fixed now when using the exported snapshotting mode; this mode should preferably be used, and for Debezium 1.3 we’re planning for this to be the basis for all the existing snapshotting modes
The Postgres JDBC driver got upgraded to 42.2.14 (DBZ-2317), which fixes a CVE in the driver related to processing XML column values sourced from untrusted XML input
The MySQL connector automatically filters out specific DML binlog entries from internal tables when using it with Amazon RDS (DBZ-2275)
The Debezium MongoDB connector got more resilient against connection losses (DBZ-2141)
I’m very happy to announce the release of Debezium 1.2.0.Final!
Over the last three months, the community has resolved nearly 200 issues. Key features of this release include:
Support for SMTs and message converters in the Debezium embedded engine
Debezium Server, a brand-new runtime which propagates data change events to a range of messaging infrastructures like Amazon Kinesis, Google Cloud Pub/Sub, and Apache Pulsar
A new column masking mode "consistent hashing", which anonymizes column values while still keeping them correlatable
New metrics for the MongoDB connector
Improved re-connect capability for the SQL Server connector
It’s my pleasure to announce the release of Debezium 1.2.0.CR1!
This release includes several notable features, enhancements, and fixes:
PostgreSQL can restrict the set of tables with a publication while using pgoutput (DBZ-1813).
Metrics MBean registration is skipped if a platform MBean server does not exist (DBZ-2089).
SQL Server reconnection improved during shutdown and connection resets (DBZ-2106).
EventRouter SMT can now pass non-String based keys (DBZ-2152).
include.unknown.datatypes can now return strings rather than hashes (DBZ-1266).
Debezium Server now supports Google Cloud PubSub (DBZ-2092).
Debezium Server now supports Apache Pulsar sink (DBZ-2112).
You can find the complete list of addressed issues, upgrade procedures, and notes on any backward compatibility changes in the release notes.
Many thanks to all the community members contributing to this release: Andy Teijelo Pérez, Balázs Németh, Bingqin Zhou, Brandon Brown, cobolbaby, Dave Cumberland, Ed Laur, Emmanuel Brard, Fabian Aussems, Ivan Trusov, Justin Hiza, Jeremy Finzel, Kewei Shang, Lukas Krejci, and Robert B. Hanviriyapunt.
I’m very happy to share the news that Debezium 1.2.0.Beta2 has been released!
Core feature of this release is Debezium Server, a dedicated stand-alone runtime for Debezium, opening up its open-source change data capture capabilities towards messaging infrastructure like Amazon Kinesis.
Overall, the community has fixed 25 issues since the Beta1 release, some of which we’re going to explore in more depth in the remainder of this post.
With great happiness I’m announcing the release of Debezium 1.2.0.Beta1!
This release brings user-facing schema change topics for the SQL Server, Db2 and Oracle connectors, a new message transformation for content-based change event routing, support for a range of array column types in Postgres and much more. We also upgraded the Debezium container images for Apache Kafka and Kafka Connect to version 2.5.0.
As it’s the answer to all questions in life, the number of issues fixed for this release is exactly 42!
I’m very happy to announce the release of Debezium 1.2.0.Alpha1!
This first drop of the 1.2 release line provides a number of useful new features:
Support for message transformations (SMTs) and converters in the Debezium embedded engine API
A new SMT for filtering out change events using scripting languages
Automatic reconnects for the SQL Server connector
A new column masking mode using consistent hash values
Overall, the community fixed no fewer than 41 issues for this release. Let’s take a closer look at some of them in the remainder of this post.
It’s with great excitement that I’m announcing the release of Debezium 1.1.0.Final!
About three months after the 1.0 release, this new version comes with many exciting new features such as:
a Quarkus extension facilitating the outbox pattern
support for the CloudEvents specification
an incubating connector for the IBM Db2 database
transaction marker events
support for CDC integration testing via Testcontainers
It’s my pleasure to announce the release of Debezium 1.1.0.CR1!
This release brings a brand-new API module, including a facility for overriding the schema and value conversion of specific columns. The Postgres connector gained the ability to reconnect to the database after a connection loss, and the MongoDB connector supports the metrics known from other connectors now.
It’s my pleasure to announce the release of Debezium 1.1.0.Beta1!
This release adds support for transaction marker events, an incubating connector for the IBM Db2 database as well as a wide range of bug fixes. As the 1.1 release still is under active development, we’ve backported an assorted set of bug fixes to the 1.0 branch and released Debezium 1.0.1.Final, too.
At the time of writing this, not all connector archives have been synched to Maven Central yet; this should be the case within the next few hours.
Did you know January 16th is National Nothing Day? It’s the one day in the year without celebrating, observing or honoring anything.
Well, normally, that is. Because we couldn’t stop ourselves from sharing the news of the Debezium 1.1.0.Alpha1 release with you! It’s the first release after Debezium 1.0, and there are some really useful features coming with it. Let’s take a closer look.
Today it’s my great pleasure to announce the availability of Debezium 1.0.0.Final!
Since the initial commit in November 2015, the Debezium community has worked tirelessly to realize the vision of building a comprehensive open-source low-latency platform for change data capture (CDC) for a variety of databases.
Within those four years, Debezium’s feature set has grown tremendously: stable, highly configurable CDC connectors for MySQL, Postgres, MongoDB and SQL Server, incubating connectors for Apache Cassandra and Oracle, facilities for transforming and routing change data events, support for design patterns such as the outbox pattern and much more. A very active and welcoming community of users, contributors and committers has formed around the project. Debezium is deployed to production at lots of organizations from all kinds of industries, some with huge installations, using hundreds of connectors to stream data changes out of thousands of databases.
The 1.0 release marks an important milestone for the project: based on all the production feedback we got from the users of the 0.x versions, we figured it’s about time to express the maturity of the four stable connectors in the version number, too.
When a Debezium connector is deployed to a Kafka Connect instance it is sometimes necessary to keep database credentials hidden from other users of the Connect API.
Let’s recall what a connector registration request looks like for the Debezium MySQL connector:
Did you know December 12th is National Ding-a-Ling Day? It’s the day to call old friends you haven’t heard from in a while. So we thought we’d get in touch (not that it has been that long) with our friends, i.e. you, and share the news about the release of Debezium 1.0.0.CR1!
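As a rough illustration only, the sketch below builds such a registration payload in Python. The connector name, host, user, password, server id and topic names are hypothetical placeholder values, and the property names are assumed to follow the MySQL connector configuration of the Debezium 1.0 era. The point to notice is that the database password sits in the request body as plain text.

```python
import json

# Hypothetical example values; only the overall shape of the request matters here.
# Property names are assumed to match the Debezium 1.0-era MySQL connector.
registration_request = {
    "name": "inventory-connector",
    "config": {
        "connector.class": "io.debezium.connector.mysql.MySqlConnector",
        "database.hostname": "mysql",
        "database.port": "3306",
        "database.user": "debezium",
        # The credential is part of the plain connector configuration.
        "database.password": "dbz",
        "database.server.id": "184054",
        "database.server.name": "dbserver1",
        "database.history.kafka.bootstrap.servers": "kafka:9092",
        "database.history.kafka.topic": "schema-changes.inventory",
    },
}

# Serialize the payload as it would be sent to the Kafka Connect REST API.
body = json.dumps(registration_request, indent=2)
print(body)
```

Since the password is an ordinary configuration value, anyone who can read the connector configuration back through the Connect API would see it, which is the situation described above.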
It’s the first, and ideally only, candidate release; so Debezium 1.0 should be out very soon. Quite a few nice features found their way into CR1:
Graceful handling of MongoDB 4.0 transaction events (DBZ-1215)
While fall weather is in full swing, the Debezium community is not letting the unusually low, frigid temperatures get the best of us. It is my pleasure to announce the release of Debezium 1.0.0.Beta3!
This new Debezium release includes several notable new features, enhancements, and fixes:
Built against Kafka Connect 2.3.1 (DBZ-1612)
drop_on_stop configuration parameter to
Standardized source information for Cassandra connector (DBZ-1408)
Propagate MongoDB replicator exceptions so they are visible from Kafka Connect’s status endpoint (DBZ-1583)
Envelope methods should accept long values for timestamps (DBZ-1607)
Erroneously reporting no tables captured (DBZ-1519)
Avoid Oracle connector attempting to analyze tables (DBZ-1569)
Toasted columns should contain null in before rather than
Support PostgreSQL 11+
PostgreSQL connector times out in schema discovery for databases with many tables (DBZ-1579)
ts_ms is not correct during snapshot processing (DBZ-1588)
Heartbeats are not generated for non-whitelisted tables (DBZ-1592)
It is my pleasure to announce the release of Debezium 1.0.0.Beta2!
This new Debezium release includes several notable new features, enhancements, and fixes:
LTREE columns with a logical data type (DBZ-1336)
Support for PostgreSQL 12 (DBZ-1542)
Validate that the configured PostgreSQL replication slot name contains no invalid characters (DBZ-1525)
Add MySQL DDL parser support for index creation
Add MySQL DDL parser support for granting
collection source struct field when collection name contains a dot (DBZ-1563)
Close idle transactions after performing a PostgreSQL snapshot (DBZ-1564)
History is in the making as Debezium begins to sprint to its 1.0 milestone. It’s my pleasure to announce the release of Debezium 1.0.0.Beta1!
This new Debezium release includes several notable new features, enhancements, and fixes:
ExtractNewDocumentState and EventRouter SMTs propagate heartbeat & schema change messages (DBZ-1513)
Provides alternative mapping for
Ensure message keys have the right column order (DBZ-1507)
Warn of table locking problems in connector logs (DBZ-1280)
On behalf of the Debezium community it’s my great pleasure to announce the release of Debezium 0.10.0.Final!
As you’d expect, there were not many changes since last week’s CR2, one exception being a performance fix for the pgoutput plug-in of the Postgres connector, which may have suffered from slow processing when dealing with many small transactions in a short period of time (DBZ-1515).
This release finalizes the work of eight preview releases overall. We have discussed the new features and changes in depth in earlier announcements, but here are some highlights of Debezium 0.10:
I’m very happy to announce the release of Debezium 0.10.0.CR2!
After the CR1 release we decided to do another candidate release, as not only did a good number of bug fixes come in, but the community also provided a few very useful feature implementations which we didn’t want to delay. So we adjusted the original plan a bit and now aim for Debezium 0.10 Final in the course of next week, barring any unforeseen regressions.
As usual, let’s take a closer look at some of the new features and resolved bugs.
The Debezium community is on the homestretch towards the 0.10 release and we’re happy to announce the availability of Debezium 0.10.0.CR1!
Besides a number of bugfixes to the different connectors, this release also brings a substantial improvement to the way initial snapshots can be done with Postgres. Unless any major regressions show up, the final 0.10 release should follow very soon.
The temperatures are slowly cooling off after the biggest summer heat, and the Debezium community is happy to announce the release of Debezium 0.10.0.Beta4. In this release we’re happy to share some news we don’t get to share too often: with Apache Cassandra, another database gets added to the list of databases supported by Debezium!
In addition, we finished our efforts for rebasing the existing Postgres connector to the Debezium framework structure established for the SQL Server and Oracle connectors. This means more shared code between these connectors, and in turn reduced maintenance efforts for the development team going forward; but there’s one immediately tangible advantage for you coming with this, too: the Postgres connector now exposes the same metrics you already know from the other connectors.
Finally, the new release contains a range of bugfixes and other useful improvements. Let’s explore some details below.
Summer is at its peak, but the Debezium community is not relenting in its efforts: Debezium 0.10.0.Beta3 is released.
This version not only continues the incremental improvement of Debezium but also brings shiny new features.
All of you who are using PostgreSQL 10 and higher as a service offered by different cloud providers have definitely felt the complications of deploying the logical decoding plugin necessary to enable streaming. This is no longer necessary. Debezium now supports the pgoutput replication protocol (DBZ-766), which has been available out of the box since PostgreSQL 10.
It’s my pleasure to announce the release of Debezium 0.10.0.Beta2!
This further stabilizes the 0.10 release line, with lots of bug fixes to the different connectors. 23 issues were fixed for this release; a couple of those relate to the DDL parser of the MySQL connector, e.g. around RENAME INDEX (DBZ-1329), SET NEW in triggers (DBZ-1331) and function definitions with the COLLATE keyword (DBZ-1332).
For the Postgres connector we fixed a potential inconsistency when flushing processed LSNs to the database (DBZ-1347). Also the "include.unknown.datatypes" option works as expected now during snapshotting (DBZ-1335) and the connector won’t stumble upon materialized views during snapshotting any longer (DBZ-1345).
Another week, another Debezium release — I’m happy to announce the release of Debezium 0.10.0.Beta1!
A very welcome usability improvement is that the connectors now log a warning if no table at all is actually captured as per the whitelist/blacklist configuration (DBZ-1242). This helps to prevent the accidental exclusion of all tables by means of an incorrect filter expression, in which case the connectors "work as intended", but no events are propagated to the message broker.
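To make the failure mode concrete, here is a minimal sketch of how a slightly wrong filter expression can silently exclude every table. It is not Debezium’s implementation: the function name, the comma-separated regex format and the full-match semantics are assumptions made for illustration.

```python
import logging
import re

log = logging.getLogger("capture-filter")


def captured_tables(tables, whitelist=None, blacklist=None):
    """Return the tables selected by comma-separated regular expressions.

    A table is captured if it fully matches any whitelist expression or,
    when no whitelist is given, if it matches no blacklist expression.
    """
    def compile_all(expressions):
        return [re.compile(p.strip()) for p in expressions.split(",") if p.strip()]

    if whitelist:
        patterns = compile_all(whitelist)
        selected = [t for t in tables if any(p.fullmatch(t) for p in patterns)]
    elif blacklist:
        patterns = compile_all(blacklist)
        selected = [t for t in tables if not any(p.fullmatch(t) for p in patterns)]
    else:
        selected = list(tables)

    if not selected:
        # The kind of hint that makes an empty capture set visible to the operator.
        log.warning("No tables are captured by the current filter configuration")
    return selected


if __name__ == "__main__":
    tables = ["inventory.customers", "inventory.orders"]
    # A missing trailing "s" means nothing matches, so nothing would be streamed.
    print(captured_tables(tables, whitelist="inventory.customer"))
    print(captured_tables(tables, whitelist=r"inventory\.customers"))
```

Without the warning, the first call would look like a healthy connector that simply never emits any events.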
Please see the release notes for the complete list of issues fixed in this release. Also make sure to examine the upgrade guidelines for 0.10.0.Alpha1 and Alpha2 when upgrading from earlier versions.
Release early, release often: less than a week after Alpha1, we are announcing the release of Debezium 0.10.0.Alpha2!
This is an incremental release that completes some of the tasks started in the Alpha1 release and provides a few bugfixes and also quality improvements in our Docker images.
I’m very happy to announce the release of Debezium 0.10.0.Alpha1!
The major theme for Debezium 0.10 will be to do some clean-up (that’s what you do at this time of the year, right?); we’ve planned to remove a few deprecated features and to streamline some details in the structure of the CDC events produced by the different Debezium connectors.
This means that upgrading to Debezium 0.10 from earlier versions might take a bit more planning and consideration compared to earlier upgrades, depending on your usage of features and options already marked as deprecated in 0.9 and before. But no worries, we’re describing all changes in great detail in this blog post and the release notes.
It’s my pleasure to announce the release of Debezium 0.9.5.Final!
This is a recommended update for all users of earlier versions; besides bug fixes, a few new features are provided as well. The release contains 18 resolved issues overall.
It’s my pleasure to announce the release of Debezium 0.9.4.Final!
This is a drop-in replacement for earlier Debezium 0.9.x versions, containing mostly bug fixes and some improvements related to metrics. Overall, 17 issues were resolved.
The Debezium team is happy to announce the release of Debezium 0.9.3.Final!
This is mostly a bug-fix release and a drop-in replacement for earlier Debezium 0.9.x versions, but there are a few significant new features too. Overall, 17 issues were resolved.
Container images will be released with a small delay due to some Docker Hub configuration issues.
The Debezium team is happy to announce the release of Debezium 0.9.2.Final!
This is mostly a bug-fix release and a drop-in replacement for earlier Debezium 0.9.x versions. Overall, 18 issues were resolved.
A couple of fixes relate to the Debezium Postgres connector:
Quickly following up to last week’s release of Debezium 0.9, it’s my pleasure today to announce the release of Debezium 0.9.1.Final!
This release fixes a couple of bugs which were reported after the 0.9 release. Most importantly, there are two fixes to the new Debezium connector for SQL Server, which deal with correct handling of LSNs after connector restarts (DBZ-1128, DBZ-1131). The connector also uses more reasonable defaults for the fetchSize options of the SQL Server JDBC driver (DBZ-1065), which can help to significantly increase throughput and reduce memory consumption of the connector.
I’m delighted to announce the release of Debezium 0.9 Final!
This release only adds a small number of changes since last week’s CR1 release; most prominently there are some more metrics for the SQL Server connector (lag behind master, number of transactions etc.) and two bug fixes related to the handling of partitioned tables in MySQL (DBZ-1113) and Postgres (DBZ-1118).
Having been in the works for six months after the initial Alpha release, Debezium 0.9 comes with a brand new connector for SQL Server, lots of new features and improvements for the existing connectors, updates to the latest versions of Apache Kafka and the supported databases as well as a wide range of bug fixes.
Reaching the home stretch towards Debezium 0.9, it’s with great pleasure that I’m announcing the first release of Debezium in 2019, 0.9.0.CR1!
For this release we’ve mainly focused on sorting out remaining issues in the Debezium connector for SQL Server; the connector comes with greatly improved performance and has received a fair number of bug fixes.
Other changes include a new interface for event handlers of Debezium’s embedded engine, which allows for bulk handling of change events, an option to export the scale of numeric columns as schema parameter, as well as a wide range of bug fixes for the Debezium connectors for MySQL, Postgres and Oracle.
With only a few days left for the year, it’s about time for another Debezium release; so it’s with great pleasure that I’m announcing Debezium 0.9.0.Beta2!
This release comes with support for MySQL 8 and Oracle 11g; it includes a first cut of metrics for monitoring the SQL Server and Oracle connectors, several improvements to the MongoDB event flattening SMT as well as a wide range of bug fixes. Overall, no fewer than 42 issues were addressed; very clearly, there has to be some deeper sense in that ;)
A big shout out goes to the following members of Debezium’s amazing community, who contributed to this release: Eero Koplimets, Grzegorz Kołakowski, Hanlin Liu, Lao Mei, Renato Mefi, Tautvydas Januskevicius, Wout Scheepers and Zheng Wang!
In the following, let’s take a closer look at some of the changes coming with the 0.9 Beta2 release.
It’s my pleasure to announce the release of Debezium 0.9.0.Beta1! Oh, and to those of you who are celebrating it — Happy Thanksgiving!
This new Debezium release comes with several great improvements to our work-in-progress SQL Server connector:
Initial snapshots can be done using the snapshot isolation level if enabled in the DB (DBZ-941)
Changes to the structures of captured tables after the connector has been set up are supported now (DBZ-812)
It’s my pleasure to announce the release of Debezium 0.9.0.Alpha2!
While the work on the connectors for SQL Server and Oracle continues, we decided to do another Alpha release, as lots of fixes and new features - many of them contributed by community members - have piled up, which we wanted to get into your hands as quickly as possible.
This release supports Apache Kafka 2.0, comes with support for Postgres' HSTORE column type, allows renaming and filtering fields from change data messages for MongoDB and contains multiple bug fixes and performance improvements. Overall, this release contains 55 fixes (note that a few of these have been merged back to 0.8.x and are contained in earlier 0.8 releases, too).
As temperatures are cooling off, the Debezium team is getting into full swing again and we’re happy to announce the release of Debezium 0.8.3.Final!
This is a bugfix release to the current stable release line of Debezium, 0.8.x, while the work on Debezium 0.9 goes on in parallel. There are 14 fixes in this release. As in earlier 0.8.x releases, we’ve further improved the new Antlr-based DDL parser used by the MySQL connector (see DBZ-901, DBZ-903 and DBZ-910).
The Postgres connector saw a huge improvement to its start-up time for databases with lots of custom types (DBZ-899). The user reporting this issue had nearly 200K entries in pg_catalog.pg_type, and due to an N + 1 SELECT issue within the Postgres driver itself, this caused the connector to take 24 minutes to start. By using a custom query for obtaining the type metadata, we were able to cut down this time to 5 seconds! Right now we’re working with the maintainers of the Postgres driver to get this issue fixed upstream, too.
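The N + 1 SELECT pattern mentioned above can be illustrated with a small, self-contained sketch. It uses an in-memory SQLite table as a stand-in for a type catalog; the table layout and function names are hypothetical and do not reproduce the Postgres driver’s queries or the connector’s custom query.

```python
import sqlite3


def make_catalog(n_types: int) -> sqlite3.Connection:
    """Create an in-memory stand-in for a type catalog (not pg_catalog itself)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE types (oid INTEGER PRIMARY KEY, name TEXT, base_oid INTEGER)"
    )
    conn.executemany(
        "INSERT INTO types VALUES (?, ?, ?)",
        [(i, f"type_{i}", 0 if i == 1 else 1) for i in range(1, n_types + 1)],
    )
    return conn


def load_n_plus_one(conn):
    """One query for the list of types, then one additional query per type."""
    queries = 1
    result = {}
    for (oid,) in conn.execute("SELECT oid FROM types").fetchall():
        name, base = conn.execute(
            "SELECT name, base_oid FROM types WHERE oid = ?", (oid,)
        ).fetchone()
        queries += 1
        result[oid] = (name, base)
    return result, queries


def load_single_query(conn):
    """Fetch all type metadata in a single round trip."""
    rows = conn.execute("SELECT oid, name, base_oid FROM types").fetchall()
    return {oid: (name, base) for oid, name, base in rows}, 1


if __name__ == "__main__":
    conn = make_catalog(1000)
    _, slow = load_n_plus_one(conn)
    _, fast = load_single_query(conn)
    print(f"N + 1 approach: {slow} queries, single query: {fast}")
```

Both functions return the same metadata; the difference is the number of round trips, which grows linearly with the number of types in the first variant and is what becomes painful with hundreds of thousands of catalog entries over a network connection.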
The Debezium team is back from summer holidays and we’re happy to announce the release of Debezium 0.8.2!
This is a bugfix release to the current stable release line of Debezium, 0.8.x, while the work on Debezium 0.9 is continuing.
Note: By accident the version of the release artifacts is 0.8.2 instead of 0.8.2.Final. This is not in line with our recently established convention of always letting release versions end with qualifiers such as Alpha1, Beta1, CR1 or Final. The next version in the 0.8 line will be 0.8.3.Final and we’ll improve our release pipeline to make sure that this situation doesn’t occur again.
The 0.8.2 release contains 10 fixes overall, most of them dealing with issues related to DDL parsing as done by the Debezium MySQL connector. For instance, implicit non-nullable primary key columns will be handled correctly now using the new Antlr-based DDL parser (DBZ-860). Also the MongoDB connector saw a bug fix (DBZ-838): initial snapshots will be interrupted now if the connector is requested to stop (e.g. when shutting down Kafka Connect). More of a useful improvement than a bug fix is the Postgres connector’s capability to add the table, schema and database names to the source block of emitted CDC events (DBZ-866).
I’m very happy to announce the release of Debezium 0.8.0.Final!
The key features of Debezium 0.8 are the first work-in-progress version of our Oracle connector (based on the XStream API) and a brand-new parser for MySQL DDL statements. Besides that, there are plenty of smaller new features (e.g. propagation of default values to corresponding Connect schemas, optional propagation of source queries in CDC messages and a largely improved SMT for sinking changes from MongoDB into RDBMS) as well as lots of bug fixes (e.g. around temporal and numeric column types, large transactions with Postgres).
Please see the previous announcements (Beta 1, CR 1) to learn about all the changes in more depth. The Final release largely resembles CR1; apart from further improvements to the Oracle connector (DBZ-792) there’s one nice addition to the MySQL connector contributed by Peter Goransson: when doing a snapshot, it will now expose information about the processed rows via JMX (DBZ-789), which is very handy when snapshotting larger tables.
Please take a look at the change log for the complete list of changes in 0.8.0.Final and general upgrade notes.
A fantastic Independence Day to all the Debezium users in the U.S.! But that’s not the only reason to celebrate: it’s also with great happiness that I’m announcing the release of Debezium 0.8.0.CR1!
Following our new release scheme, the focus for this candidate release of Debezium 0.8 has been to fix bugs reported for last week’s Beta release, accompanied by a small number of newly implemented features.
Thanks a lot to everyone testing the new Antlr-based DDL parser for the MySQL connector; based on the issues you reported, we were able to fix a few bugs in it. As announced recently, for 0.8 the legacy parser will remain the default implementation, but you are strongly encouraged to test out the new one (by setting the connector option
antlr) and report any findings you may have. We’ve planned to switch to the new implementation by default in Debezium 0.9.
It’s with great excitement that I’m announcing the release of Debezium 0.8.0.Beta1!
This release brings many exciting new features as well as bug fixes, e.g. the first drop of our new Oracle connector, a brand new DDL parser for the MySQL connector, support for MySQL default values and the update to Apache Kafka 1.1.
Due to the large number of changes (the release contains exactly 42 issues overall), we decided to alter our versioning scheme a little bit: going forward we may do one or more Beta and CR ("candidate release") releases before doing a final one. This will allow us to get feedback from the community early on, while still completing and polishing specific features. Final (stable) releases will be named like 0.8.0.Final etc.
Last updated at Nov 21st 2018 (adjusted to new KSQL Docker images).
Last year saw the inception of a new open-source project in the Apache Kafka universe, KSQL, which is a streaming SQL engine built on top of Kafka Streams. In this post, we are going to try out KSQL querying with data change events generated by Debezium from a MySQL database.
It’s my pleasure to announce the release of Debezium 0.7.5!
This is a bugfix release to the 0.7 release line, which we decided to do while working towards Debezium 0.8. Most notably it fixes an unfortunate bug introduced in 0.7.3 (DBZ-663), where the internal database history topic of the Debezium MySQL connector could be partly deleted under some specific conditions. Please see the dedicated blog post on this issue to find out whether this affects you and what you should do to prevent this issue.
Together with this, we released a couple of other fixes and improvements. Thanks to Maciej Brynski, the performance of the logical table routing SMT has been improved significantly (DBZ-655). Another fix contributed by Maciej is for DBZ-646 which lets the MySQL connector handle CREATE TABLE statements for the TokuDB storage engine now.
A user of the Debezium connector for MySQL informed us about a potential issue with the configuration of the connector’s internal database history topic, which may cause the deletion of parts of that topic (DBZ-663). Please continue reading if you’re using the Debezium MySQL connector in versions 0.7.3 or 0.7.4.
It’s my pleasure to announce the release of Debezium 0.7.4!
Continuing the 0.7 release line, this new version brings several bug fixes and a handful of new features. We recommend this upgrade to all users. When upgrading from earlier versions, please check out the release notes of all versions between the one you’re currently on and 0.7.4 in order to learn about any steps potentially required for upgrading.
I’m very happy to announce the release of Debezium 0.7.3!
This is primarily a bugfix release, but we’ve also added a handful of smaller new features. It’s a recommended upgrade for all users. When upgrading from earlier versions, please check out the release notes of all versions between the one you’re currently on and 0.7.3 in order to learn about any steps potentially required for upgrading.
Let’s take a closer look at some of the new features.
It’s my pleasure to announce the release of Debezium 0.7.2!
Amongst the new features there’s support for geo-spatial types, a new snapshotting mode for recovering a lost DB history topic for the MySQL connector, and a message transformation for converting MongoDB change events into a structure which can be consumed by many more sink connectors. And of course we fixed a whole lot of bugs, too.
Debezium 0.7.2 is a drop-in replacement for previous 0.7.x versions. When upgrading from versions earlier than 0.7.0, please check out the release notes of all 0.7.x releases to learn about any steps potentially required for upgrading.
Now let’s take a closer look at some of the new features.
We wish all the best to the Debezium community for 2018!
While we’re working on the 0.7.2 release, we thought we’d publish another post describing an end-to-end data streaming use case based on Debezium. A few weeks ago we saw how to set up a change data stream to a downstream database. In this blog post we will follow the same approach to stream the data to an Elasticsearch server to leverage its excellent capabilities for full-text search on our data. But to make the matter a little bit more interesting, we will stream the data to both a PostgreSQL database and Elasticsearch, so we will optimize access to the data via the SQL query language as well as via full-text search.
Just a few days before Christmas, we are releasing Debezium 0.7.1! This is a bugfix release that fixes a few annoying issues found by our community during the first rounds of use of Debezium 0.7. All issues relate either to the newly provided wal2json support or to the improvement that reduced the risk of internal race conditions.
Suraj Savita (and others) found an issue where our code failed to correctly detect that it was running with the Amazon RDS wal2json plug-in. We were outsmarted by the JDBC driver internals, so we have included a distinct plugin decoder name, wal2json_rds, which bypasses the detection routine and by default expects to run against an Amazon RDS instance. This mode should be used only with RDS instances.
We have also gathered feedback from first attempts to run with Amazon RDS and included a short section on this topic in our documentation.
It’s not Christmas yet, but we already got a present for you: Debezium 0.7.0 is here, full of new features as well as many bug fixes! A big thank you goes out to all the community members who contributed to this release. It is very encouraging for us to see not only more and more issues and feature requests being reported, but also pull requests coming in.
Note that this release comes with a small number of changes to the default mappings for some data types. We try to avoid this sort of change as far as possible, but in some cases it is required, e.g. if the previous mapping could have caused potential value losses. Please see below for the details and also make sure to check out the full change log which describes these changes in detail.
Now let’s take a closer look at some of the new features.
We are accelerating! Three weeks after the 0.6.1 release, the Debezium team is bringing Debezium 0.6.2 to you!
This release revolves mostly around bug fixes, but there are a few new features, too. Let’s take a closer look at some of the changes.
Just shy of a month after the 0.6.0 release, I’m happy to announce the release of Debezium 0.6.1!
This release contains several bugfixes, dependency upgrades and a new option for controlling how BIGINT UNSIGNED columns are conveyed. We also expanded the set of Docker images and Docker Compose files accompanying our tutorial, so you can run it now with all the databases we support.
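Why such an option is useful can be seen from the value ranges alone: an unsigned 64-bit column can hold values up to 2^64 - 1, while a signed 64-bit integer stops at 2^63 - 1. The sketch below is only an illustration of that trade-off; the function and the mode labels "long" and "precise" are assumptions for this example, not a description of the connector’s code.

```python
from decimal import Decimal

INT64_MAX = 2**63 - 1   # largest value of a signed 64-bit integer
UINT64_MAX = 2**64 - 1  # largest value of a BIGINT UNSIGNED column


def convey_bigint_unsigned(value: int, mode: str = "precise"):
    """Represent an unsigned 64-bit value according to a (hypothetical) mode.

    "precise" keeps every possible value by using an arbitrary-precision decimal;
    "long" uses a plain integer and only works up to the signed 64-bit maximum.
    """
    if not 0 <= value <= UINT64_MAX:
        raise ValueError("not a valid BIGINT UNSIGNED value")
    if mode == "precise":
        return Decimal(value)
    if mode == "long":
        if value > INT64_MAX:
            raise OverflowError("value does not fit into a signed 64-bit integer")
        return value
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    print(convey_bigint_unsigned(42, "long"))
    print(convey_bigint_unsigned(UINT64_MAX, "precise"))
```

A plain integer is more convenient for most consumers, so it is attractive whenever the stored values are known to stay within the signed range; the decimal form is the safe choice otherwise.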
Let’s take a closer look at some of the changes.
In this blog post we will create a simple streaming data pipeline to continuously capture the changes in a MySQL database and replicate them in near real-time into a PostgreSQL database. We’ll show how to do this without writing any code, but instead by using and configuring Kafka Connect, the Debezium MySQL source connector, the Confluent JDBC sink connector, and a few single message transforms (SMTs).
This approach of replicating data through Kafka is really useful on its own, but it becomes even more advantageous when we can combine our near real-time streams of data changes with other streams, connectors, and stream processing applications. A recent Confluent blog post series shows a similar streaming data pipeline but using different connectors and SMTs. What’s great about Kafka Connect is that you can mix and match connectors to move data between multiple systems.
What’s better than getting Java 9? Getting Java 9 and a new version of Debezium at the same time! So it’s with great happiness that I’m announcing the release of Debezium 0.6 today.
I’m very happy to announce the release of Debezium 0.5.2!
The decimal.handling.mode option already known from the MySQL connector is now also supported for PostgreSQL (DBZ-337). It lets you control how DECIMAL columns are represented in change events (either using Kafka’s Decimal type or as
The MongoDB connector supports the options
It’s my pleasure to announce the release of Debezium 0.5.1!
This release fixes several bugs in the MySQL, Postgres and MongoDB connectors. There’s also support for some new datatypes: POINT on MySQL (DBZ-222) and TSTZRANGE on Postgres (DBZ-280). This release is a drop-in replacement for 0.5.0; upgrading is recommended for all users.
Note that in the — rather unlikely — case that you happened to enable Debezium for all the system tables of MySQL, any configured table filters will be applied to these system tables now, too (DBZ-242). This may require an adjustment of your filters if you indeed wanted to capture all system tables but only selected non-system tables.
We’re happy to announce that Debezium 0.5.0 is now available for use with Kafka Connect 0.10.2.0. This release also includes a few fixes for the MySQL connector. See the release notes for specifics on these changes, and be sure to check out the Kafka documentation for compatibility with the version of the Kafka broker that you are using.
Kafka Connect 0.10.2.0 comes with a significant new feature called Single Message Transforms, and you can now use them with Debezium connectors. SMTs allow you to modify the messages produced by Debezium connectors and any other Kafka Connect source connectors, before those messages are written to Kafka. SMTs can also be used with Kafka Connect sink connectors to modify the messages before the sink connectors process them. You can use SMTs to filter out or mask specific fields, add new fields, modify existing fields, change the topic and/or topic partition to which the messages are written, and even more. And you can even chain multiple SMTs together.
Kafka Connect comes with a number of built-in SMTs that you can simply configure and use, but you can also create your own SMT implementations to do more complex and interesting things. For example, although Debezium connectors normally map all of the changes in each table (or collection) to separate topics, you can write a custom SMT that uses a completely different mapping between tables and topics and even add fields to message keys and/or values. Using your new SMT is also very easy - simply put it on the Kafka Connect classpath and update the connector configuration to use it.
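The idea of chaining transforms can be modelled in a few lines. Real SMTs are Java classes configured in the connector properties; the following is only a conceptual sketch in Python, with records simplified to dictionaries and with hypothetical helper names (`mask_field`, `route_topic`, `drop_if`, `chain`).

```python
from typing import Callable, Optional

Record = dict  # simplified message: {"topic": ..., "key": ..., "value": {...}}
Transform = Callable[[Record], Optional[Record]]


def mask_field(field: str) -> Transform:
    """Replace the given value field with a fixed mask, if present."""
    def apply(record):
        value = dict(record["value"])
        if field in value:
            value[field] = "****"
        return {**record, "value": value}
    return apply


def route_topic(prefix: str, target: str) -> Transform:
    """Send every record whose topic starts with prefix to a single target topic."""
    def apply(record):
        if record["topic"].startswith(prefix):
            return {**record, "topic": target}
        return record
    return apply


def drop_if(predicate) -> Transform:
    """Filter out records for which the predicate holds (None means dropped)."""
    def apply(record):
        return None if predicate(record) else record
    return apply


def chain(*transforms: Transform) -> Transform:
    """Apply transforms in order; stop as soon as one of them drops the record."""
    def apply(record):
        for transform in transforms:
            if record is None:
                return None
            record = transform(record)
        return record
    return apply


if __name__ == "__main__":
    pipeline = chain(
        mask_field("email"),
        route_topic("dbserver1.inventory.", "all_inventory_changes"),
        drop_if(lambda r: r["value"].get("internal", False)),
    )
    print(pipeline({
        "topic": "dbserver1.inventory.customers",
        "key": {"id": 1},
        "value": {"id": 1, "email": "a@example.com"},
    }))
```

Each transform sees exactly one message at a time and returns a new one, which is what makes the steps easy to combine and reorder.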
Thanks to Sanjay and everyone in the community for their help with this release, issues, discussions, contributions, and questions!
We’re happy to announce that Debezium 0.4.1 is now available for use with Kafka Connect 0.10.1.1. This release includes several fixes for the MongoDB connector and MySQL connector, including improved support for Amazon RDS and Amazon Aurora (MySQL compatibility). See the release notes for specifics on these changes.
Thanks to Jan, Horia, David, Josh, Johan, Sanjay, Saulius, and everyone in the community for their help with this release, issues, discussions, contributions, and questions!
This post originally appeared on the WePay Engineering blog.
Change data capture has been around for a while, but some recent developments in technology have given it new life. Notably, using Kafka as a backbone to stream your database data in realtime has become increasingly common.
If you’re wondering why you might want to stream database changes into Kafka, I highly suggest reading The Hardest Part About Microservices: Your Data. At WePay, we wanted to integrate our microservices and downstream datastores with each other, so every system could get access to the data that it needed. We use Kafka as our data integration layer, so we needed a way to get our database data into it.
Last year, Yelp’s engineering team published an excellent series of posts on their data pipeline. These included a discussion on how they stream MySQL data into Kafka. Their architecture involves a series of homegrown pieces of software to accomplish the task, notably schematizer and MySQL streamer. The write-up triggered a thoughtful post on Debezium’s blog about a proposed equivalent architecture using Kafka Connect, Debezium, and Confluent’s schema registry. This proposed architecture is what we’ve been implementing at WePay, and this post describes how we leverage Debezium and Kafka Connect to stream our MySQL databases into Kafka.
We’re happy to announce that Debezium 0.4.0 is now available for use with Kafka Connect 0.10.1.1. This release introduces a new PostgreSQL connector, and contains over a dozen fixes combined for the MongoDB connector and MySQL connector, including preliminary support for Amazon RDS and Amazon Aurora (MySQL compatibility). See the release notes for specifics on these changes.
Thanks to Horia, Chris, Akshath, Ramesh, Matthias, Anton, Sagi, barton, and others for their help with this release, issues, discussions, contributions, and questions!
We’re happy to announce that Debezium 0.3.6 is now available for use with Kafka Connect 0.10.0.1. This release contains over a dozen fixes combined for the MySQL and MongoDB connectors. See the release notes for specifics on these changes.
Thanks to Farid, RenZhu, Dongjun, Anton, Chris, Dennis, Sharaf, Rodrigo, Tim, and others for their help with this release, issues, discussions, contributions, and questions!
We’re happy to announce that Debezium 0.3.5 is now available for use with Kafka Connect 0.10.0.1. This release contains several fixes for the MySQL connector and adds the ability to use multi-master MySQL servers as sources. See the release notes for specifics on these changes. We’ve also updated the Debezium Docker images labelled latest, which we use in our tutorial.
One of the fixes is significant, and so we strongly urge all users to upgrade to this release from all earlier versions. In prior versions, the MySQL connector may stop without completing all updates in a transaction, and when the connector restarts it starts with the next transaction and therefore might fail to capture some of the change events in the earlier transaction. This release fixes this issue so that when restarting it will always pick up where it left off, even if that point is in the middle of a transaction. Note that this fix only takes effect once a connector is upgraded and restarted. See the issue for more details.
Thanks to Akshath, Anton, Chris, and others for their help with the release, issues, discussions, contributions, and questions!
We’re happy to announce that Debezium 0.3.4 is now available for use with Kafka Connect 0.10.0.1. This release contains several new features for the MySQL connector: support for MySQL’s JSON datatype, a new snapshot mode called schema_only, and JMX metrics. Also, the Debezium Docker images for Zookeeper, Kafka, and Kafka Connect have all been updated to optionally expose JMX metrics in these services. And, one backward-incompatible fix was made to the change event’s ts_sec field. See the release notes for specifics.
Thanks to Akshath, Chris, Vitalii, Dennis, Prannoy, and others for their help with the release, issues, discussions, contributions, and questions!
MySQL 5.7 introduced a new data type for storing and working with JSON data. Clients can define tables with columns using the new JSON datatype, and they can store and read JSON data using SQL statements and new built-in JSON functions to construct JSON data from other relational columns, introspect the structure of JSON values, and search within and manipulate JSON data. It is possible to define generated columns on tables whose values are computed from the JSON value in another column of the same table, and to then define indexes with those generated columns. Overall, this is really a very powerful feature in MySQL.
Debezium’s MySQL connector will support the JSON datatype starting with the upcoming 0.3.4 release. JSON document, array, and scalar values will appear in change events as strings with io.debezium.data.json for the schema name. This will make it natural for consumers to work with JSON data. BTW, this is the same semantic schema type used by the MongoDB connector to represent JSON data.
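On the consumer side, this means a field can be recognized as JSON from its schema name and then parsed from its string form. The sketch below assumes a simplified schema description (a list of field entries with an optional `name`); the function name is hypothetical, and because the exact casing of the schema name is an assumption here, the comparison ignores case.

```python
import json

# Schema name as given in the text; casing in real events is an assumption,
# so the comparison below is case-insensitive.
JSON_SCHEMA_NAME = "io.debezium.data.json"


def decode_json_fields(schema_fields, row):
    """Parse string fields whose schema name marks them as JSON.

    schema_fields: list of dicts such as {"field": "doc", "type": "string", "name": ...}
    row: dict of column name to value, e.g. the "after" block of a change event
    """
    decoded = dict(row)
    for field in schema_fields:
        schema_name = field.get("name") or ""
        column = field["field"]
        if schema_name.lower() == JSON_SCHEMA_NAME and decoded.get(column) is not None:
            decoded[column] = json.loads(decoded[column])
    return decoded


if __name__ == "__main__":
    fields = [
        {"field": "id", "type": "int32"},
        {"field": "doc", "type": "string", "name": "io.debezium.data.Json"},
    ]
    after = {"id": 1, "doc": '{"tags": ["a", "b"], "active": true}'}
    print(decode_json_fields(fields, after))
```

Fields without the JSON schema name pass through untouched, so ordinary string columns that merely happen to contain braces are not parsed by accident.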
This sounds straightforward, and we hope it is. But implementing this required a fair amount of work. That’s because although MySQL exposes JSON data as strings to client applications, internally it stores all JSON data in a special binary form that allows the MySQL engine to efficiently access the JSON data in queries, JSON functions and generated columns. All JSON data appears in the binlog in this binary form as well, which meant that we had to parse the binary form ourselves if we wanted to extract the more useful string representation. Writing and testing this parser took a bit of time and effort, and ultimately we donated it to the excellent MySQL binlog client library that the connector uses internally to read the binlog events.
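As a rough consumer-side sketch of the behavior described above, the snippet below parses a field only when its schema carries the JSON schema name. The field-schema dictionary shape and the `decode_field` helper are assumptions for illustration, and the schema name is copied from the text above and compared case-insensitively because its exact capitalization is not confirmed here.

```python
import json
from typing import Any

# Schema name as written in the text above; the capitalization is an assumption,
# so the comparison below ignores case.
JSON_SCHEMA_NAME = "io.debezium.data.json"


def decode_field(field_schema: dict, value: Any) -> Any:
    """Parse the value if the field's schema marks it as JSON, else return it as-is."""
    name = field_schema.get("name") or ""
    if value is not None and name.lower() == JSON_SCHEMA_NAME:
        return json.loads(value)
    return value


# Hypothetical field schemas and string values for a single row.
json_schema = {"type": "string", "name": "io.debezium.data.json"}
doc = decode_field(json_schema, '{"sku": 7, "tags": ["a", "b"]}')
scalar = decode_field(json_schema, "42")
plain = decode_field({"type": "string"}, '{"not": "json-typed"}')
```

Documents, arrays, and scalars all arrive as strings, so one check on the schema name is enough to decide whether to parse; an ordinary string column is left untouched.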
We’re happy to announce that Debezium 0.3.3 is now available for use with Kafka Connect 0.10.0.1. This release contains a handful of bug fixes and minor improvements for the MySQL connector, including better handling of
SET values and GTID sets. This release also improves the log messages output by the MySQL connectors to better represent the ongoing activity when consuming the changes from the source database. See the release notes for specifics.
We’ve also updated the Debezium Docker images labelled
latest, which we use in our tutorial. We’ve also updated the tutorial to use the latest Docker installations on Linux, Windows, and OS X.
Thanks to Akshath, Chris, Randy, Prannoy, Umang, Horia, and others for their help with the release, issues, discussions, contributions, and questions!
We’re happy to announce that Debezium 0.3.2 is now available for use with Kafka Connect 0.10.0.1. This release contains a handful of bug fixes and minor improvements for the MySQL connector and MongoDB connector. The MySQL connector better handles
BIT(n) values and zero-value date and timestamp values. This release also improves the log messages output by the MySQL and MongoDB connectors to better represent the ongoing activity when consuming the changes from the source database. See the release notes for specifics.
We’ve also updated the Debezium Docker images labelled
latest, which we use in our tutorial. We’ve also updated the tutorial to use the latest Docker installations on Linux, Windows, and OS X.
Thanks to Akshath, Colum, Emmanuel, Konstantin, Randy, RenZhu, Umang, and others for their help with the release, issues, discussions, contributions, and questions!
We’re happy to announce that Debezium 0.3.1 is now available for use with Kafka Connect 0.10.0.1. This release contains an updated MySQL connector with a handful of bug fixes and two significant but backward-compatible changes. First, the MySQL connector now supports using secure connections to MySQL, adding to the existing ability to connect securely to Kafka. Second, the MySQL connector is able to capture MySQL string values using the proper character sets so that any values stored in the database can be captured correctly in events. See our release notes for details of these changes and for upgrading recommendations.
Thanks to Chris, Akshath, barten, and others for their help with the release, issues, discussions, contributions, and questions!
After a few weeks' delay, Debezium 0.3.0 is now available for use with Kafka Connect 0.10.0.1. This release contains an updated MySQL connector with quite a few bug fixes, and a new MongoDB connector that captures the changes made to a MongoDB replica set or MongoDB sharded cluster. See the documentation for details about how to configure these connectors and how they work.
Thanks to Andrew, Bhupinder, Chris, David, Horia, Konstantin, Tony, and others for their help with the release, issues, discussions, contributions, and questions!
I’m happy to announce that Debezium 0.2.4 is now available for use with Kafka Connect 0.9.0.1. This release adds more verbose logging during MySQL snapshots, enables taking snapshots of very large MySQL databases, and corrects a potential exception during graceful shutdown. See our release notes for details of these changes and for upgrading recommendations.
Thanks to David and wangshao for their help with the release, issues, discussions, contributions, and questions! Stay tuned for our next release, which will be 0.3 and will have a new MongoDB connector and will support Kafka Connect 0.10.0.1.
Change data capture is a hot topic. Debezium’s goal is to make change data capture easy for multiple DBMSes, but admittedly we’re still a young open source project and so far we’ve only released a connector for MySQL with a connector for MongoDB that’s just around the corner. So it’s great to see how others are using and implementing change data capture. In this post, we’ll review Yelp’s approach and see how it is strikingly similar to Debezium’s MySQL connector.
I’m happy to announce that Debezium 0.2.3 is now available for use with Kafka Connect 0.9.0.1. This release corrects the MySQL connector behavior when working with
SMALLINT columns or with
TIMESTAMP columns. See our release notes for details of these changes and for upgrading recommendations.
Thanks to Chris, Christian, Laogang, and Tony for their help with the release, issues, discussions, contributions, and questions! Stay tuned for our next release, which will be 0.3 and will have a new MongoDB connector and will support Kafka Connect 0.10.0.0.
I’m happy to announce that Debezium 0.2.2 is now available. This release fixes several bugs in the MySQL connector that can produce change events with incorrect
source metadata, and eliminates the possibility of a poorly-timed connector crash causing the connector to process only some of the rows in a multi-row MySQL event. See our release notes for details of these changes and for upgrading recommendations.
Also, thanks to a community member for reporting that Debezium 0.2.x can only be used with Kafka Connect 0.9.0.1. Debezium 0.2.x cannot be used with Kafka Connect 0.10.0.0 because of its backward incompatible changes to the consumer API. Our next release of Debezium will support Kafka 0.10.x.
I’m happy to announce that Debezium 0.2.1 is now available. The MySQL connector has been significantly improved: it can now monitor and produce change events for HA MySQL clusters using GTIDs, it performs a consistent snapshot when starting up the first time, and it has a completely redesigned event message structure that provides a ton more information with every event. Our change log has all the details about bugs, enhancements, new features, and backward compatibility notices. We’ve also updated our tutorial.
Update (Oct. 11 2019): An alternative, and much simpler, approach for running Debezium (and Apache Kafka and Kafka Connect in general) on Kubernetes is to use a K8s operator such as Strimzi. You can find instructions for the set-up of Debezium on OpenShift here, and similar steps apply for plain Kubernetes.
Our Debezium Tutorial walks you step by step through using Debezium by installing, starting, and linking together all of the Docker containers running on a single host machine. Of course, you can use things like Docker Compose or your own scripts to make this easier, although that would just automate running all the containers on a single machine. What you really want is to run the containers on a cluster of machines. In this blog, we’ll run Debezium using a container cluster manager from Red Hat and Google called Kubernetes.
Kubernetes is a container (Docker/Rocket/Hyper.sh) cluster management tool. Like many other popular cluster management and compute resource scheduling platforms, Kubernetes' roots are in Google, who is no stranger to running containers at scale. They start, stop, and cluster 2 billion containers per week and they contributed a lot of the Linux kernel underpinnings that make containers possible. One of their famous papers talks about an internal cluster manager named Borg. With Kubernetes, Google got tired of everyone implementing their papers in Java so they decided to implement this one themselves :)
Kubernetes is written in Go and is quickly becoming the de-facto API for scheduling, managing, and clustering containers at scale. This blog isn’t intended to be a primer on Kubernetes, so we recommend heading over to the Getting Started docs to learn more about Kubernetes.
When our MySQL connector is reading the binlog of a MySQL server or cluster, it parses the DDL statements in the log and builds an in-memory model of each table’s schema as it evolves over time. This process is important because the connector generates events for each table using the definition of the table at the time of each event. We can’t use the database’s current schema, since it may have changed since the point in time (or position in the log) where the connector is reading.
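To make the idea concrete, here is a deliberately tiny sketch of an in-memory schema model that evolves as DDL statements appear in a log. It is not Debezium's parser: the `SchemaModel` class, the `replay` helper, and the three regular expressions are hypothetical and cover only a toy subset of DDL (no parenthesized types, no quoting).

```python
import re
from typing import Dict, List

# Toy patterns, assumed for illustration only; real MySQL DDL is far richer.
CREATE = re.compile(r"CREATE TABLE (\w+) \(([^)]*)\)", re.IGNORECASE)
ADD = re.compile(r"ALTER TABLE (\w+) ADD COLUMN (\w+)", re.IGNORECASE)
DROP = re.compile(r"ALTER TABLE (\w+) DROP COLUMN (\w+)", re.IGNORECASE)


class SchemaModel:
    """Tracks each table's column names as DDL statements are applied in order."""

    def __init__(self) -> None:
        self.tables: Dict[str, List[str]] = {}

    def apply(self, ddl: str) -> None:
        if m := CREATE.match(ddl):
            # Keep only the column name from each "name TYPE" definition.
            self.tables[m.group(1)] = [c.split()[0] for c in m.group(2).split(",")]
        elif m := ADD.match(ddl):
            self.tables[m.group(1)].append(m.group(2))
        elif m := DROP.match(ddl):
            self.tables[m.group(1)].remove(m.group(2))


def replay(log: list) -> List[dict]:
    """Build each row event using the table definition in effect at that point in the log."""
    model = SchemaModel()
    events = []
    for kind, payload in log:
        if kind == "ddl":
            model.apply(payload)
        else:
            table, values = payload
            events.append(dict(zip(model.tables[table], values)))
    return events


log = [
    ("ddl", "CREATE TABLE customers (id INT, name TEXT)"),
    ("row", ("customers", (1, "Ann"))),
    ("ddl", "ALTER TABLE customers ADD COLUMN email TEXT"),
    ("row", ("customers", (2, "Bob", "bob@example.com"))),
]
events = replay(log)
```

The first row is described with two columns and the second with three, even though both come from the same table, because each event uses the schema as it stood at that position in the log.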
Parsing DDL of MySQL or any other major relational database can seem to be a daunting task. Usually each DBMS has a highly-customized SQL grammar, and although the data manipulation language (DML) statements are often fairly close to the standards, the data definition language (DDL) statements are usually less so and involve more DBMS-specific features.
So given this, why did we write our own DDL parser for MySQL? Let’s first look at what Debezium needs a DDL parser to do.
Debezium is a distributed platform that turns your existing databases into event streams, so applications can see and respond almost instantly to each committed row-level change in the databases. Debezium is built on top of Kafka and provides Kafka Connect compatible connectors that monitor specific database management systems. Debezium records the history of data changes in Kafka logs, so your application can be stopped and restarted at any time and can easily consume all of the events it missed while it was not running, ensuring that all events are processed correctly and completely. Debezium is open source under the Apache License, Version 2.0.
Now the good news — Debezium 0.1 is now available and includes several significant features:
A connector for MySQL to monitor MySQL databases. It’s a Kafka Connect source connector, so simply install it into a Kafka Connect service (see below) and use the service’s REST API to configure and manage connectors to each DBMS server. The connector reads the MySQL binlog and generates data change events for every committed row-level modification in the monitored databases. The MySQL connector generates events based upon the tables' structure at the time the row is changed, and it automatically handles changes to the table structures.
A small library so applications can embed any Kafka Connect connector and consume data change events read directly from the source system. This provides a much lighter weight system (since Zookeeper, Kafka, and Kafka Connect services are not needed), but as a consequence is not as fault tolerant or reliable since the application must maintain state normally kept inside Kafka’s distributed and replicated logs. Thus the application becomes completely responsible for managing all state.
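Configuring a connector through the Kafka Connect REST API, as mentioned in the first item above, amounts to posting a JSON document with a `name` and a `config` map to the service's `/connectors` endpoint. The sketch below only builds that request. The host names, the connector name, and the specific `database.*` property names are assumptions for illustration and may not match what the 0.1 connector expects, so check the connector documentation for the version you run.

```python
import json
import urllib.request


def build_register_request(
    connect_url: str, name: str, config: dict
) -> urllib.request.Request:
    """Build (but do not send) the POST request that registers a connector."""
    body = json.dumps({"name": name, "config": config}).encode("utf-8")
    return urllib.request.Request(
        url=f"{connect_url.rstrip('/')}/connectors",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder values; the property names are assumed and may differ by version.
config = {
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    "database.hostname": "mysql.example.internal",
    "database.port": "3306",
    "database.user": "replicator",
    "database.password": "change-me",
}

request = build_register_request(
    "http://localhost:8083", "inventory-connector", config
)
# Sending is left to the caller, for example: urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes the payload easy to inspect or log before anything is submitted to the Kafka Connect service.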