Debezium Blog

MySQL 5.7 introduced a new data type for storing and working with JSON data. Clients can define tables with columns using the new JSON datatype, and they can store and read JSON data using SQL statements and new built-in JSON functions that construct JSON data from other relational columns, introspect the structure of JSON values, and search within and manipulate JSON data. It is also possible to define generated columns whose values are computed from the JSON value in another column of the same table, and then to define indexes on those generated columns. Overall, this is a very powerful feature in MySQL.

Debezium’s MySQL connector will support the JSON datatype starting with the upcoming 0.3.4 release. JSON document, array, and scalar values will appear in change events as strings with io.debezium.data.json for the schema name. Because the schema name marks these strings as JSON, consumers can recognize them and parse them with whatever JSON library they already use. By the way, this is the same semantic schema type used by the MongoDB connector to represent JSON data.
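To make the consumer side a little more concrete, here is a minimal sketch of how a consumer might spot JSON-typed fields by schema name and parse them. The shape of `field_schemas` and `row` is a simplified, hypothetical stand-in for a change event's field schemas and row values, not the connector's exact event format, and `decode_json_fields` is a name invented for this illustration.

```python
import json

# Schema name as given in this post; everything else below is illustrative.
JSON_SCHEMA_NAME = "io.debezium.data.json"


def decode_json_fields(field_schemas, row):
    """Return a copy of `row` with JSON-typed string fields parsed.

    `field_schemas` is a hypothetical, simplified list of per-field schema
    dicts; a field is treated as JSON when its schema name matches.
    """
    decoded = dict(row)
    for field_schema in field_schemas:
        field = field_schema["field"]
        is_json = field_schema.get("name") == JSON_SCHEMA_NAME
        if is_json and decoded.get(field) is not None:
            decoded[field] = json.loads(decoded[field])
    return decoded


# Hypothetical example data, assumed for this sketch.
field_schemas = [
    {"field": "id", "type": "int32"},
    {"field": "attributes", "type": "string", "name": JSON_SCHEMA_NAME},
]
row = {"id": 1, "attributes": '{"color": "red", "sizes": [1, 2]}'}

decoded = decode_json_fields(field_schemas, row)
```

Fields without the JSON schema name pass through untouched, so ordinary string columns are never mistaken for JSON.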

This sounds straightforward, and we hope it is. But implementing this required a fair amount of work. That’s because although MySQL exposes JSON data as strings to client applications, internally it stores all JSON data in a special binary form that allows the MySQL engine to efficiently access the JSON data in queries, JSON functions, and generated columns. All JSON data appears in the binlog in this binary form as well, which meant that we had to parse the binary form ourselves to extract the more useful string representation. Writing and testing this parser took a bit of time and effort, and ultimately we donated it to the excellent MySQL binlog client library that the connector uses internally to read the binlog events.
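To give a feel for what parsing the binary form involves, here is a small sketch that turns a few binary-encoded scalar values into JSON text. It is not the parser we wrote: it handles only a handful of scalar types, and the type codes and layout (a leading type byte, little-endian numbers, a variable-length size prefix on strings) are assumptions based on our reading of the format described in MySQL's `json_binary.h`. Objects, arrays, and the remaining types are left out entirely.

```python
import json
import struct


def read_variable_length(data, pos):
    """Read a length stored 7 bits per byte, low bits first.

    A set high bit means the length continues in the next byte
    (assumed encoding). Returns (length, next_position).
    """
    length, shift = 0, 0
    while True:
        byte = data[pos]
        pos += 1
        length |= (byte & 0x7F) << shift
        if (byte & 0x80) == 0:
            return length, pos
        shift += 7


def scalar_to_json_text(data):
    """Convert one binary-encoded JSON scalar to its JSON string form."""
    type_code = data[0]
    if type_code == 0x04:  # literal: null / true / false (assumed codes)
        return {0x00: "null", 0x01: "true", 0x02: "false"}[data[1]]
    if type_code == 0x05:  # signed 16-bit integer, little-endian
        return str(struct.unpack_from("<h", data, 1)[0])
    if type_code == 0x07:  # signed 32-bit integer, little-endian
        return str(struct.unpack_from("<i", data, 1)[0])
    if type_code == 0x0B:  # 64-bit double, little-endian
        return json.dumps(struct.unpack_from("<d", data, 1)[0])
    if type_code == 0x0C:  # UTF-8 string with variable-length size prefix
        length, pos = read_variable_length(data, 1)
        return json.dumps(data[pos:pos + length].decode("utf-8"))
    raise ValueError(f"unsupported type code: {type_code:#04x}")
```

Even this toy version shows why testing took effort: every type has its own layout, and documents and arrays add offset tables and nesting on top of that.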