Recently, while writing crawlers in Python, I kept running into URLError: <urlopen error timed out>. The more threads I opened, the more often it occurred. It turned out the program itself was fine; after some searching online, the cause is simply network congestion, and the fix is to retry the request a few times.
For example, the original code looks like this:
import urllib2

# cookie processor so the opener keeps cookies across requests
cookieproc = urllib2.HTTPCookieProcessor()

if headers:  # url and headers are assumed to be set by the caller
    req = urllib2.Request(url, headers=headers)
else:
    req = urllib2.Request(url)
opener = urllib2.build_opener(cookieproc)
urllib2.install_opener(opener)
page = urllib2.urlopen(req, timeout=5).read()
After adding a retry loop, it can be changed to:
import socket
import urllib2

cookieproc = urllib2.HTTPCookieProcessor()

if headers:
    req = urllib2.Request(url, headers=headers)
else:
    req = urllib2.Request(url)
opener = urllib2.build_opener(cookieproc)
urllib2.install_opener(opener)

Max_Num = 6  # maximum number of attempts
for i in range(Max_Num):
    try:
        page = urllib2.urlopen(req, timeout=5).read()
        break  # success, stop retrying
    except (urllib2.URLError, socket.timeout):
        if i < Max_Num - 1:
            continue  # transient failure, try again
        else:
            print 'URLError: <urlopen error timed out> All attempts failed'
This generally resolves the exception.
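If several call sites need the same retry logic, it can be pulled into a small helper. Below is a minimal sketch under the same urllib2 setup; the name fetch_with_retry and the pause between attempts are my own additions, not part of the original code:

import socket
import time
import urllib2

def fetch_with_retry(req, max_num=6, delay=1):
    # Hypothetical helper: retry urlopen up to max_num times,
    # sleeping delay seconds between failed attempts.
    for i in range(max_num):
        try:
            return urllib2.urlopen(req, timeout=5).read()
        except (urllib2.URLError, socket.timeout):
            if i < max_num - 1:
                time.sleep(delay)  # brief pause before retrying
            else:
                raise  # give up after the last attempt

page = fetch_with_retry(req)

Re-raising on the final attempt lets the caller decide how to handle a URL that never succeeds, instead of silently leaving page undefined.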