Debezium Blog

This post originally appeared on the WePay Engineering blog.

In the first half of this blog post series, we explained our decision-making process for designing a streaming data pipeline for Cassandra at WePay. In this post, we will break the pipeline down into three sections and discuss each of them in more detail:

  1. Cassandra to Kafka with CDC agent

  2. Kafka to BigQuery with KCBQ (see the configuration sketch after this list)

  3. Transformation with BigQuery view
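To make the second step concrete, here is a minimal sketch of a KCBQ (Kafka Connect BigQuery) sink connector configuration. The connector class is KCBQ’s published one, but the topic, project, dataset, and keyfile values are placeholders, and the exact option names vary between KCBQ versions, so treat this as an illustration rather than a drop-in config:

    {
      "name": "kcbq-cassandra-sink",
      "config": {
        "connector.class": "com.wepay.kafka.connect.bigquery.BigQuerySinkConnector",
        "tasks.max": "1",
        "topics": "cassandra.my_keyspace.my_table",
        "project": "my-gcp-project",
        "defaultDataset": "cassandra_cdc",
        "keyfile": "/etc/kcbq/service-account.json"
      }
    }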

This post originally appeared on the WePay Engineering blog.

Historically, MySQL had been the de facto database of choice for microservices at WePay. As WePay scaled, the sheer volume of data written into some of our microservice databases forced us to choose between sharded MySQL (e.g. Vitess) and switching to a natively sharded NoSQL database. After a series of evaluations, we picked Cassandra, a NoSQL database, primarily for its high availability, horizontal scalability, and ability to handle high write throughput.

Debezium’s container images have recently received a huge structural improvement, making it extremely simple to extend their behaviour.

This is a small tutorial showing how you can, for instance, add Sentry, "an open-source error tracking [software] that helps developers monitor and fix crashes in real time". Here we’ll use it to collect and report any exceptions from Kafka Connect and its connectors. Note that this applies only to Debezium 0.9 and later.

We need a few things to get Sentry working; we’ll add each of them and then write a Dockerfile that glues everything together (a sketch follows the list):

  • A Log4j configuration

  • The SSL certificate for sentry.io, since it’s not in the JVM’s default trust chain

  • The sentry and sentry-log4j libraries
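The following is a minimal sketch of such a Dockerfile, assuming the debezium/connect base image and Sentry 1.x artifacts from Maven Central; the version number, file locations, and certificate handling are illustrative assumptions, not the finished Dockerfile:

    # Sketch only: versions and paths are assumptions; adjust to your setup.
    FROM debezium/connect:0.9

    ENV SENTRY_VERSION=1.7.30

    # 1. Log4j configuration that wires in the Sentry appender (see below)
    COPY log4j.properties /kafka/config/log4j.properties

    # 2. Import the sentry.io SSL certificate into the JVM trust store
    #    (the cacerts location varies by JDK version and layout, and
    #    importing into it may require running as root in the image)
    COPY sentry-io.crt /tmp/sentry-io.crt
    RUN keytool -importcert -alias sentry -noprompt \
          -keystore "$JAVA_HOME/lib/security/cacerts" -storepass changeit \
          -file /tmp/sentry-io.crt

    # 3. The sentry and sentry-log4j libraries, added to Kafka's classpath
    RUN cd /kafka/libs && \
        curl -sfO "https://repo1.maven.org/maven2/io/sentry/sentry/$SENTRY_VERSION/sentry-$SENTRY_VERSION.jar" && \
        curl -sfO "https://repo1.maven.org/maven2/io/sentry/sentry-log4j/$SENTRY_VERSION/sentry-log4j-$SENTRY_VERSION.jar"

The accompanying log4j.properties can then route ERROR-level events to Sentry through the appender shipped with sentry-log4j (the DSN is typically supplied via the SENTRY_DSN environment variable, and the stdout appender definition from the base image is assumed to be kept):

    # Keep the regular console logging and add a Sentry appender
    log4j.rootLogger=INFO, stdout, Sentry

    log4j.appender.Sentry=io.sentry.log4j.SentryAppender
    log4j.appender.Sentry.threshold=ERROR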

It’s my pleasure to announce the release of Debezium 0.10.0.Beta2!

This release further stabilizes the 0.10 release line, with lots of bug fixes to the different connectors. Altogether, 23 issues were fixed for this release; several of them relate to the DDL parser of the MySQL connector, e.g. around RENAME INDEX (DBZ-1329), SET NEW in triggers (DBZ-1331), and function definitions with the COLLATE keyword (DBZ-1332).

For the Postgres connector we fixed a potential inconsistency when flushing processed LSNs to the database (DBZ-1347). Also, the "include.unknown.datatypes" option now works as expected during snapshotting (DBZ-1335), and the connector no longer stumbles over materialized views during snapshotting (DBZ-1345).

The Debezium project strives to make deploying connectors easy, so users can try out and run connectors of their choice simply by downloading the right connector archive and unpacking it into Kafka Connect’s plug-in path.

This is true for all connectors except the Debezium PostgreSQL connector. This connector is special in that it requires a logical decoding plug-in to be installed inside the PostgreSQL source database(s) themselves. Currently, there are two supported logical decoding plug-ins (a configuration sketch for enabling one of them follows the list):

  • postgres-decoderbufs, which uses Protocol Buffers as a very compact transport format and which is maintained by the Debezium community

  • wal2json, which is based on JSON and which is maintained by its own upstream community
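To illustrate what that installation involves: the plug-in’s shared library must be available on the database server, and logical decoding must be enabled. Here is a minimal postgresql.conf sketch, assuming the decoderbufs plug-in (the sender and slot counts are just example values):

    # postgresql.conf (sketch): load the decoding plug-in and enable logical decoding
    shared_preload_libraries = 'decoderbufs'
    wal_level = logical
    max_wal_senders = 1
    max_replication_slots = 1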
