Description of the bug
Technology stack
nginx, uwsgi, bottle
Error details
The alerting bot frequently reports the following warnings:
<27>1 2018-xx-xxT06:59:03.038Z 660ece0ebaad admin/admin 14 - - Socket Error: 104
<31>1 2018-xx-xxT06:59:03.038Z 660ece0ebaad admin/admin 14 - - Removing timeout for next heartbeat interval
<28>1 2018-xx-xxT06:59:03.039Z 660ece0ebaad admin/admin 14 - - Socket closed when connection was open
<31>1 2018-xx-xxT06:59:03.039Z 660ece0ebaad admin/admin 14 - - Added: {'callback': <bound method SelectConnection._on_connection_start of <pika.adapters.select_connection.SelectConnection object at 0x7f74752525d0>>, 'only': None, 'one_shot': True, 'arguments': None, 'calls': 1}
<28>1 2018-xx-xxT06:59:03.039Z 660ece0ebaad admin/admin 14 - - Disconnected from RabbitMQ at xx_host:5672 (0): Not specified
<31>1 2018-xx-xxT06:59:03.039Z 660ece0ebaad admin/admin 14 - - Processing 0:_on_connection_closed
<31>1 2018-xx-xxT06:59:03.040Z 660ece0ebaad admin/admin 14 - - Calling <bound method _CallbackResult.set_value_once of <pika.adapters.blocking_connection._CallbackResult object at 0x7f74752513f8>> for "0:_on_connection_closed"
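The number in angle brackets at the start of each line is the syslog priority, which packs facility and severity into one value (facility * 8 + severity). As a minimal sketch, it can be decoded to tell the real error apart from debug noise. The helper name `decode_pri` is my own and not part of any library.

```python
# Severity names as defined by the syslog protocol; list index = numeric severity.
SEVERITIES = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]


def decode_pri(pri):
    """Split a syslog PRI value into (facility number, severity name)."""
    facility, severity = divmod(pri, 8)
    return facility, SEVERITIES[severity]


for pri in (27, 28, 31):
    print(pri, decode_pri(pri))
# 27 (3, 'err')
# 28 (3, 'warning')
# 31 (3, 'debug')
```

So the `<27>` line (Socket Error: 104) is the only error-level entry here; the `<28>` lines are warnings and the `<31>` lines are debug output.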
The debugging process
Determine the error location
With a log message in hand, this part is easy. First, find out which code emits the log line.
Our own code
Not found there.
Uwsgi code
root@660ece0ebaad:/# uwsgi --version
2.0.14
Checked out the source from GitHub and searched it: not found there either.
Python Library code
Execute in the container
>>> import sys
>>> sys.path
['', '/usr/lib/python2.7', '/usr/lib/python2.7/plat-x86_64-linux-gnu', '/usr/lib/python2.7/lib-tk', '/usr/lib/python2.7/lib-old', '/usr/lib/python2.7/lib-dynload', '/usr/local/lib/python2.7/dist-packages', '/usr/lib/python2.7/dist-packages', '/usr/lib/python2.7/dist-packages/PILcompat', '/usr/lib/python2.7/dist-packages/gtk-2.0']
Grepping under these directories finds the message in pika:
root@660ece0ebaad:/usr/local/lib/python2.7# grep "Socket Error" -R .
Binary file ./dist-packages/pika/adapters/base_connection.pyc matches
./dist-packages/pika/adapters/base_connection.py: LOGGER.error("Fatal Socket Error: %r", error_value)
./dist-packages/pika/adapters/base_connection.py: LOGGER.error("Socket Error: %s", error_code)
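When grep is not available in the container, the same search can be done from Python. This is a rough sketch under my own naming (`find_in_sources` is a hypothetical helper); it walks every directory on `sys.path` and reports the `.py` lines that contain the message.

```python
import os
import sys


def find_in_sources(needle, roots=None):
    """Yield (path, line_number, line) for each .py line containing needle."""
    roots = sys.path if roots is None else roots
    needle_bytes = needle.encode("utf-8")
    seen = set()
    for root in roots:
        # sys.path may contain '' or entries that are not directories.
        if not root or not os.path.isdir(root):
            continue
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                if not name.endswith(".py"):
                    continue
                path = os.path.join(dirpath, name)
                if path in seen:
                    continue  # nested sys.path entries would list a file twice
                seen.add(path)
                try:
                    with open(path, "rb") as handle:
                        for number, raw in enumerate(handle, 1):
                            if needle_bytes in raw:
                                text = raw.decode("utf-8", "replace").strip()
                                yield path, number, text
                except (IOError, OSError):
                    continue  # unreadable file: skip it
```

Calling `list(find_in_sources("Socket Error"))` in the container should point at the same base_connection.py file that grep found.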
Determine the pika version:
>>> import pika
>>> pika.__version__
'0.10.0'
Error determination logic
The code shows that the number after "Socket Error" is an errno value. Looking it up shows that the peer reset the connection, i.e. the other end sent an RST:
>>> import errno
>>> errno.errorcode[104]
'ECONNRESET'
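For repeated lookups, a small helper that returns both the symbolic name and the operating system's message is handy. This is only a sketch; `describe_errno` is a name I made up, and the numeric value of ECONNRESET (104 here) is specific to Linux, so the example uses the constant.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, human-readable message) for an errno value."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


# Using the constant rather than the literal 104 keeps this portable.
print(describe_errno(errno.ECONNRESET))
```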
The first suspicion was a wrong RabbitMQ server address, since a port with no listener answers with an RST. Verification ruled this out.
The next suspicion was that the connection had timed out and been closed without the client being notified. The RabbitMQ server logs contain a large number of entries like these:
=ERROR REPORT==== 7-Dec-2018::20:43:18 ===
closing AMQP connection <0.9753.18> (172.17.0.19:27542 -> 192.168.44.112:5672):
missed heartbeats from client, timeout: 60s
--
=ERROR REPORT==== 7-Dec-2018::20:43:18 ===
closing AMQP connection <0.9768.18> (172.17.0.19:27544 -> 192.168.44.112:5672):
missed heartbeats from client, timeout: 60s
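To get a feel for how many connections are affected and from which client, the log can be summarised with a few lines of Python. This is a sketch that assumes the log layout shown above (a "closing AMQP connection" line followed by the reason); the function name and regular expression are mine.

```python
import re
from collections import Counter

# Captures the client and server IPs from a "closing AMQP connection" line.
CLOSING = re.compile(r"closing AMQP connection <[^>]+> \((\S+):\d+ -> (\S+):\d+\)")


def count_missed_heartbeats(lines):
    """Count connections closed for missed heartbeats, keyed by client IP."""
    counts = Counter()
    client = None
    for line in lines:
        match = CLOSING.search(line)
        if match:
            client = match.group(1)
        elif "missed heartbeats from client" in line and client is not None:
            counts[client] += 1
            client = None
    return counts


sample = """\
=ERROR REPORT==== 7-Dec-2018::20:43:18 ===
closing AMQP connection <0.9753.18> (172.17.0.19:27542 -> 192.168.44.112:5672):
missed heartbeats from client, timeout: 60s
--
=ERROR REPORT==== 7-Dec-2018::20:43:18 ===
closing AMQP connection <0.9768.18> (172.17.0.19:27544 -> 192.168.44.112:5672):
missed heartbeats from client, timeout: 60s
"""
print(count_missed_heartbeats(sample.splitlines()))
```

For the two entries above, the result is a count of 2 for 172.17.0.19.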
It turns out that every connection between the RabbitMQ server and the admin Docker container has been closed:
root@xxxxxxx:/home/dingxinglong# netstat -nap | grep 5672 | grep "172.17.0.19"
So why does the RabbitMQ server close pika's connections? Look at the pika docstring:
:param int heartbeat_interval: How often to send heartbeats.
Min between this value and server's proposal
will be used. Use 0 to deactivate heartbeats
and None to accept server's proposal.
We did not pass heartbeat_interval, so in theory the server's proposal of 60 s applies. In practice, the client never sent a single heartbeat frame.
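The rule in that docstring can be written down as a tiny function. This models only what the docstring says and is not pika's own code; `negotiated_heartbeat` and its parameter names are hypothetical.

```python
def negotiated_heartbeat(client_value, server_value):
    """Model of the docstring rule, in seconds.

    None -> accept the server's proposal
    0    -> heartbeats deactivated
    else -> the smaller of the two values
    """
    if client_value is None:
        return server_value
    if client_value == 0:
        return 0
    return min(client_value, server_value)


print(negotiated_heartbeat(None, 60))  # our case: nothing passed, server proposes 60
print(negotiated_heartbeat(0, 60))     # heartbeats turned off
print(negotiated_heartbeat(30, 60))    # client asks for a shorter interval
```

This prints 60, 0 and 30, which matches the expectation that a client that passes nothing ends up with the server's 60 s interval.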
Adding print statements confirmed that a HeartbeatChecker object and its timer were both created successfully, but the timer callback was never invoked.
Our code uses BlockingConnection. Tracing through it, the add_timeout docstring explains why:
def add_timeout(self, deadline, callback_method):
    """Create a single-shot timer to fire after deadline seconds. Do not
    confuse with Tornado's timeout where you pass in the time you want to
    have your callback called. Only pass in the seconds until it's to be
    called.

    NOTE: the timer callbacks are dispatched only in the scope of
    specially-designated methods: see
    `BlockingConnection.process_data_events` and
    `BlockingChannel.start_consuming`.

    :param float deadline: The number of seconds to wait to call callback
    :param callable callback_method: The callback method with the signature
        callback_method()
Timers are dispatched only from process_data_events (or start_consuming), and we call neither, so the client heartbeat never fires. The simplest fix is to turn heartbeats off.
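The behaviour is easier to see in a toy model. The class below is not pika's implementation; it only mimics what the docstring describes: timers are stored when added and run only when somebody calls process_data_events. The class names and the fake clock are my own assumptions for the sake of a deterministic example.

```python
import heapq
import itertools


class ToyBlockingConnection(object):
    """Toy connection whose timers fire only when it is polled."""

    def __init__(self, clock):
        self._clock = clock                 # callable returning the time in seconds
        self._timers = []                   # heap of (due_time, sequence, callback)
        self._sequence = itertools.count()  # tie-breaker so callbacks are never compared

    def add_timeout(self, deadline, callback_method):
        """Schedule callback_method to run `deadline` seconds from now."""
        due = self._clock() + deadline
        heapq.heappush(self._timers, (due, next(self._sequence), callback_method))

    def process_data_events(self):
        """Run every timer that is due; nothing else ever runs them."""
        now = self._clock()
        while self._timers and self._timers[0][0] <= now:
            _due, _seq, callback = heapq.heappop(self._timers)
            callback()


class FakeClock(object):
    """Manually advanced clock so the example is deterministic."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now


clock = FakeClock()
conn = ToyBlockingConnection(clock)
sent = []
conn.add_timeout(30, lambda: sent.append("heartbeat"))

clock.now = 120.0            # two minutes pass and nobody polls the connection
print(sent)                  # [] : the heartbeat timer never ran

conn.process_data_events()   # only now is the overdue timer dispatched
print(sent)                  # ['heartbeat']
```

A web worker that only publishes now and then sits in the "nobody polls" state most of the time, which is why the server sees no heartbeats for 60 s and closes the connection.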
Specific trigger point
Following the code path of the basic_publish interface: the send receives an RST, and the error is finally logged by the _handle_error function at base_connection.py:452.
The connection code with heartbeats turned off:
import json

import pika


def connect_mq():
    mq_conf = xxxxx
    connection = pika.BlockingConnection(
        pika.ConnectionParameters(mq_conf['host'],
                                  int(mq_conf['port']),
                                  mq_conf['path'],
                                  pika.PlainCredentials(mq_conf['user'],
                                                        mq_conf['pwd']),
                                  # 0 deactivates heartbeats (pika 0.10.0 parameter name)
                                  heartbeat_interval=0))
    channel = connection.channel()
    channel.exchange_declare(exchange=xxxxx, type='direct', durable=True)
    return channel


# Created once at import time and shared by the publish helper below
channel = connect_mq()


def notify_xxxxx():
    global channel

    def _publish(product):
        channel.basic_publish(exchange=xxxxx,
                              routing_key='xxxxx',
                              body=json.dumps({'msg': 'xxxxx'}))