[Solved] Canal Error: CanalParseException: column size is not match, parse row data failed

1、 Background phenomenon

Background: the company's Flink job had a problem: data was no longer being written to the result database.

So I immediately checked the Flink job. On the web UI there were no exceptions, checkpoints were succeeding, and there was no backpressure, so the problem was not in my program. Suspicion turned to the environment.

2、 Environmental investigation

First, I checked the logs printed by Flink's TaskManager and found that data had been consumed up to a certain point in time, with nothing coming in afterwards. That meant no data was being delivered to the Flink program, so the problem was upstream. Checking Kafka showed no message backlog and a normal consumption rate, so the problem was not Kafka either. That left an even more upstream component: Canal.
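As an aside, one way to confirm there is no backlog is Kafka's own consumer-groups CLI (a sketch: the broker address and consumer group name below are placeholders, not from the original post):

kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
  --describe --group my-flink-consumer

# A LAG column at or near 0 on every partition means the consumer
# is keeping up and there is no backlog.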

3、 Culprit canal

The ops engineer checked the Canal log:

com.alibaba.otter.canal.parse.exception.CanalParseException: com.alibaba.otter.canal.parse.exception.CanalParseException: com.alibaba.otter.canal.parse.exception.CanalParseException: parse row data failed.
Caused by: com.alibaba.otter.canal.parse.exception.CanalParseException: com.alibaba.otter.canal.parse.exception.CanalParseException: parse row data failed.
Caused by: com.alibaba.otter.canal.parse.exception.CanalParseException: parse row data failed.
Caused by: com.alibaba.otter.canal.parse.exception.CanalParseException: column size is not match for table:xxxx.xxx,22 vs 21

This log says it plainly: the table now has 22 columns, while Canal still expects 21. A new column had been added to the database table, the column count no longer matched Canal's cached table schema, Canal failed to parse the row event, and no more messages were sent.
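For illustration, a schema change like the following (the column name is hypothetical; the table name is elided as in the log) takes the table from 21 to 22 columns, while Canal's cached schema still has 21, hence the "22 vs 21":

ALTER TABLE xxxx.xxx ADD COLUMN new_col VARCHAR(64) DEFAULT NULL;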

At that time, I figured Canal must support DDL changes, so this had to be a problem with one of Canal's settings. I went to Canal's GitHub repository to look through the documentation, and during the search I found an issue describing a situation very similar to mine. The key point was:

 canal.instance.filter.query.ddl = true 

Semantically, this setting makes Canal filter out MySQL's DDL statements, so Canal never sees the ALTER TABLE and cannot learn that a new column was added. When the first row arrives after the column is added, the column count no longer matches Canal's cached schema and Canal throws the error above.

4、 Solution

canal.instance.filter.query.ddl = false

With this setting, Canal receives the DDL statements and updates its table schema when columns are added.
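For context, the setting lives in the instance's instance.properties. A minimal excerpt might look like this (a sketch: the surrounding values are typical placeholders, not the author's actual configuration), and the instance generally needs a restart to pick up the change:

canal.instance.master.address = 127.0.0.1:3306
canal.instance.dbUsername = canal
canal.instance.dbPassword = canal
# do not filter DDL statements, so Canal can refresh its table
# metadata when columns are added or dropped
canal.instance.filter.query.ddl = false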
