Debezium Blog

The summer is at its peak but Debezium community is not relenting in its effort so the Debezium 0.10.0.Beta3 is released.

This version not only continues in incremental improvements of Debezium but also brings new shiny features.

All of you who are using PostgreSQL 10 and higher as a service offered by different cloud providers definitely felt the complications when you needed to deploy logical decoding plugin necessary to enable streaming. This is no longer necessary. Debezium now supports (DBZ-766) pgoutput replication protocol that is available out-of-the-box since PostgreSQL 10.

This post originally appeared on the WePay Engineering blog.

In the first half of this blog post series, we explained our decision-making process of designing a streaming data pipeline for Cassandra at WePay. In this post, we will break down the pipeline into three sections and discuss each of them in more detail:

  1. Cassandra to Kafka with CDC agent

  2. Kafka with BigQuery with KCBQ

  3. Transformation with BigQuery view

This post originally appeared on the WePay Engineering blog.

Historically, MySQL had been the de-facto database of choice for microservices at WePay. As WePay scales, the sheer volume of data written into some of our microservice databases demanded us to make a scaling decision between sharded MySQL (i.e. Vitess) and switching to a natively sharded NoSQL database. After a series of evaluations, we picked Cassandra, a NoSQL database, primarily because of its high availability, horizontal scalability, and ability to handle high write throughput.

Debezium has received a huge improvement to the structure of its container images recently, making it extremely simple to extend its behaviour.

This is a small tutorial showing how you can for instance add Sentry, "an open-source error tracking [software] that helps developers monitor and fix crashes in real time". Here we’ll use it to collect and report any exceptions from Kafka Connect and its connectors. Note that this is only applicable for Debezium 0.9+.

We need a few things to have Sentry working, and we’ll add all of them and later have a Dockerfile which gets it all glued correctly:

  • Configure Log4j

  • SSL certificate for sentry.io, since it’s not by default in the JVM trusted chain

  • The sentry and sentry-log4j libraries

It’s my pleasure to announce the release of Debezium 0.10.0.Beta2!

This further stabilizes the 0.10 release line, with lots of bug fixes to the different connectors. 23 issues were fixed for this release; a couple of those relate to the DDL parser of the MySQL connector, e.g. around RENAME INDEX (DBZ-1329), SET NEW in triggers (DBZ-1331) and function definitions with the COLLATE keyword (DBZ-1332).

For the Postgres connector we fixed a potential inconsistency when flushing processed LSNs to the database (DBZ-1347). Also the "include.unknown.datatypes" option works as expected now during snapshotting (DBZ-1335) and the connector won’t stumple upon materialized views during snapshotting any longer (DBZ-1345).