Spring Boot reports an error when querying Doris
ERROR [http-nio-10020-exec-12] [http-nio-10020-exec-12raceId] [] [5] @@GlobalExceptionAdvice@@ | server error
org.springframework.dao.RecoverableDataAccessException:
### Error querying database. Cause: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
The last packet successfully received from the server was 426 milliseconds ago. The last packet sent successfully to the server was 0 milliseconds ago.
; Communications link failure
The last packet successfully received from the server was 426 milliseconds ago. The last packet sent successfully to the server was 0 milliseconds ago.; nested exception is com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
The last packet successfully received from the server was 426 milliseconds ago. The last packet sent successfully to the server was 0 milliseconds ago.
The INSERT INTO ... SELECT task scheduled on Doris also reports an error:
ERROR 2013 (HY000) at line 7: Lost connection to MySQL server during query
Analysis
Slow queries may be putting heavy pressure on the cluster.
Several slow queries run for 120s-400s, which the Doris cluster cannot sustain. The global query_timeout parameter is 60, so presumably someone set the session variable for those tasks to 600s or higher.
We asked the developers to take the slow query tasks offline and tune the SQL.
After the slow query tasks running longer than 100 seconds were taken offline, everything worked normally.
But after a while, the Spring Boot service raised alarms and the errors appeared again.
Doris parameters
interactive_timeout=3880000
wait_timeout=3880000
Doris FE node warning log
2021-06-03 16:00:08,398 WARN (Connect-Scheduler-Check-Timer-0|79) [ConnectContext.checkTimeout():365] kill wait timeout connection, remote: 1.1.1.1:57399, wait timeout: 3880000
2021-06-03 16:00:08,398 WARN (Connect-Scheduler-Check-Timer-0|79) [ConnectContext.kill():339] kill timeout query, 1.1.1.1.1:57399, kill connection: true
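To count how many connections the FE killed and which timeout was in effect, the warning lines can be parsed mechanically. The following is only a minimal sketch: the class name WaitTimeoutKillParser and the regular expression are my own, written against the message text in the log above rather than taken from the Doris source.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Doris: extracts the client address and the
// wait timeout from the "kill wait timeout connection" warning shown above.
class WaitTimeoutKillParser {
    // Pattern written by hand from the log message; adjust if your FE version differs.
    private static final Pattern KILL_LINE = Pattern.compile(
            "kill wait timeout connection, remote: ([^,\\s]+), wait timeout: (\\d+)");

    static final class KillEvent {
        final String remote;            // client address, e.g. host:port
        final long waitTimeoutSeconds;  // wait timeout printed in the warning

        KillEvent(String remote, long waitTimeoutSeconds) {
            this.remote = remote;
            this.waitTimeoutSeconds = waitTimeoutSeconds;
        }
    }

    // Returns the parsed event, or empty if the line is not a kill warning.
    static Optional<KillEvent> parse(String logLine) {
        Matcher m = KILL_LINE.matcher(logLine);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new KillEvent(m.group(1), Long.parseLong(m.group(2))));
    }

    public static void main(String[] args) {
        String line = "2021-06-03 16:00:08,398 WARN (Connect-Scheduler-Check-Timer-0|79) "
                + "[ConnectContext.checkTimeout():365] kill wait timeout connection, "
                + "remote: 1.1.1.1:57399, wait timeout: 3880000";
        parse(line).ifPresent(e -> System.out.println(
                e.remote + " killed, wait timeout " + e.waitTimeoutSeconds));
    }
}
```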
Doris monitoring
The monitoring shows that the number of connections drops sharply at 15:44.
ELK logs
The alarms and error messages from the Spring Boot service querying Doris also start at 15:44.
So what operation or variable affected the cluster at 15:44?
According to the error log, wait_timeout is 3880000s, which is about 44.9 days, but the default in the source code is 28800s (8 hours).
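The unit conversion is easy to check with java.time.Duration. This is just a small sketch; the class and method names (TimeoutUnits, wholeDays, wholeHours) are made up for illustration.

```java
import java.time.Duration;

// Illustrative helper: converts the timeout values from the article into
// whole days and whole hours.
class TimeoutUnits {
    // Number of complete days contained in the given number of seconds.
    static long wholeDays(long seconds) {
        return Duration.ofSeconds(seconds).toDays();
    }

    // Number of complete hours contained in the given number of seconds.
    static long wholeHours(long seconds) {
        return Duration.ofSeconds(seconds).toHours();
    }

    public static void main(String[] args) {
        long modified = 3880000L;   // value found on the cluster
        long defaultValue = 28800L; // default mentioned in the article

        // prints "3880000s = 44 days 21 hours"
        System.out.println(modified + "s = " + wholeDays(modified) + " days "
                + (wholeHours(modified) % 24) + " hours");
        // prints "28800s = 8 hours"
        System.out.println(defaultValue + "s = " + wholeHours(defaultValue) + " hours");
    }
}
```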
interactive_timeout=3880000
wait_timeout=3880000
Nothing had been released and nothing had been switched over, and the cluster administrator account is in my hands. I had not changed the parameters, so I could not tell why they had changed. I went to the fe.audit audit log to check the operation records. Sure enough,
someone on the team, using DataGrip 2020.2.3, had modified the global parameters with SET GLOBAL at 15:44:
interactive_timeout=3880000
wait_timeout=3880000
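Searching the audit log for this kind of change can be scripted. The sketch below simply filters lines that contain a SET GLOBAL statement. It assumes nothing about the audit log layout beyond the statement text appearing on the line, and the class name AuditLogScan is hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: lists audit log lines that contain "set global".
class AuditLogScan {
    // Case-insensitive, tolerant of extra whitespace between the two keywords.
    private static final Pattern SET_GLOBAL =
            Pattern.compile("\\bset\\s+global\\b", Pattern.CASE_INSENSITIVE);

    // Keeps only the lines in which a SET GLOBAL statement appears.
    static List<String> findSetGlobal(Stream<String> lines) {
        return lines.filter(l -> SET_GLOBAL.matcher(l).find())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: AuditLogScan <path-to-audit-log>");
            return;
        }
        // The path is supplied by the caller; no default location is assumed.
        try (Stream<String> lines = Files.lines(Paths.get(args[0]))) {
            findSetGlobal(lines).forEach(System.out::println);
        }
    }
}
```

The matching lines still have to be read by hand to see who issued the statement and when.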
After setting the two parameters back to 28800s, the cluster's connections recovered immediately.
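For reference, the rollback can be issued over the same MySQL protocol the service uses. This is a minimal sketch under a few assumptions: a MySQL JDBC driver is on the classpath, the account is allowed to run SET GLOBAL, and the JDBC URL, user and password are passed in as arguments. The class name ResetWaitTimeout is my own.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: sets both timeout variables back to a given value.
class ResetWaitTimeout {
    static final int DEFAULT_SECONDS = 28800; // default mentioned in the article

    // Builds the statements to run; kept separate so it can be checked without a cluster.
    static List<String> resetStatements(int seconds) {
        if (seconds <= 0) {
            throw new IllegalArgumentException("timeout must be positive");
        }
        return Arrays.asList(
                "SET GLOBAL wait_timeout = " + seconds,
                "SET GLOBAL interactive_timeout = " + seconds);
    }

    // Runs the statements, then prints the timeout variables visible to this session.
    static void apply(Connection conn, int seconds) throws SQLException {
        try (Statement st = conn.createStatement()) {
            for (String sql : resetStatements(seconds)) {
                st.execute(sql);
            }
            try (ResultSet rs = st.executeQuery("SHOW VARIABLES LIKE '%timeout%'")) {
                while (rs.next()) {
                    System.out.println(rs.getString(1) + " = " + rs.getString(2));
                }
            }
        }
    }

    public static void main(String[] args) throws SQLException {
        if (args.length != 3) {
            System.err.println("usage: ResetWaitTimeout <jdbc-url> <user> <password>");
            return;
        }
        try (Connection conn = DriverManager.getConnection(args[0], args[1], args[2])) {
            apply(conn, DEFAULT_SECONDS);
        }
    }
}
```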
Note that, according to a discussion with the community, only wait_timeout actually takes effect in Doris. The other parameter, interactive_timeout, exists only for MySQL compatibility and has no effect.
Question: why does an overly large wait_timeout in Doris cause the connection error "Communications link failure", while reducing it brings things back to normal? Answering this requires going through the code and working out the logic.
For details, see: connection Doris error communications link failure