import requests
url=['www....','www.....',...]
for i in range(0,len(url)):
linkhtml = requests.get(url[i])
The crawler reported the following error:
File "C:\Users\lenovo7\AppData\Local\Programs\Python\Python38\lib\urllib\request.py", line 247, in urlretrieve
with contextlib.closing(urlopen(url, data)) as fp:
File "C:\Users\lenovo7\AppData\Local\Programs\Python\Python38\lib\urllib\request.py", line 222, in urlopen
return opener.open(url, data, timeout)
File "C:\Users\lenovo7\AppData\Local\Programs\Python\Python38\lib\urllib\request.py", line 531, in open
response = meth(req, response)
File "C:\Users\lenovo7\AppData\Local\Programs\Python\Python38\lib\urllib\request.py", line 640, in http_response
response = self.parent.error(
File "C:\Users\lenovo7\AppData\Local\Programs\Python\Python38\lib\urllib\request.py", line 569, in error
return self._call_chain(*args)
File "C:\Users\lenovo7\AppData\Local\Programs\Python\Python38\lib\urllib\request.py", line 502, in _call_chain
result = func(*args)
File "C:\Users\lenovo7\AppData\Local\Programs\Python\Python38\lib\urllib\request.py", line 649, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found
Refer to an article on stack overflow
Python: urllib.error.HTTPError: HTTP Error 404: Not Found – Stack Overflow
In the crawler scenario, the original link may not open. Naturally, it will prompt HTTP Error 404. What you need to do is skip this link and then crawl to the following page
Correction code
import requests
url=['www....','www.....',...]
for i in range(0,len(url)):
try:
linkhtml = requests.get(url[i])
except:
pass
Read More:
- urllib2.HTTPError: HTTP Error 403: Forbidden
- Urllib2.httperror: http error 403: forbidden solution
- Python crawler: urllib.error.HTTPError : HTTP Error 404: Not Found
- HTTPError HTTP Error 500 INTERNAL SERVER ERROR
- Python crawler urllib.error.HTTPError : HTTP Error 418:
- urlopen error unknown url type:httpë/HTTP Error 400:Bad Request
- Resolve the error raise importerror, str (MSG) + ‘, please install the python TK package’ (valid for personal testing)
- HTTP error 404.8 – not found, the request filtering module is configured to reject the path in the URL containing the hiddensegment section
- [ ERROR ] Error loading xmlfile: squeezenet1.1\FP16\squeezenet1.1.xml, File was not found at line: 1
- raise LookupError(resource_not_found)
- python: HTTP Error 505: HTTP Version Not Supported
- Conda HTTP 000 CONNECTION FAILED for url
- Solving attributeerror: module ‘urllib’ has no attribute ‘request’
- Browser prompt: Source mapping error: request failed with status 404 Source URL: http://xxx.js Source mapping URL: jquery.min.map
- Solve geforce error code with super full solution CODE:0x0003 Problem approach
- curl: (22) The requested URL returned error: 404 Not Found Server does not provide clone.bundle; ig
- python root:code for Hash MD5 was not found. Error
- JMeter running error response code: non HTTP response code: java.lang.illegalargumentexception find and solve
- Condahttpererror: http 000 connection failed for URL problem in CONDA installation package
- Alicloud CentOS 5 old version yum/ repomd.xml : [Errno 14] HTTP Error 404: Not Found