urllib.error.HTTPError: HTTP Error 403: Forbidden

When urllib.request.urlopen opens a URL, the server receives only a bare request for the page. The request does not identify a browser, operating system, or hardware platform, and requests like this usually come from automated clients such as crawlers rather than from normal visitors.
To block this kind of access, some websites check the User-Agent header of each request (it describes the hardware platform, system software, application software, and user preferences). If the User-Agent is missing or looks abnormal, the request is rejected, which produces the error message shown above.
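To see what a server might be reacting to, it can help to print the headers urllib attaches when none are supplied. The snippet below is a small sketch: it reads the default header list from a fresh opener, and the variable name default_headers is only illustrative.

```python
from urllib.request import build_opener

# A fresh opener carries the headers urllib adds to every request by default.
default_headers = dict(build_opener().addheaders)

# The value identifies the client as Python's urllib, not as a browser,
# for example Python-urllib/3.x (the suffix depends on the interpreter).
print(default_headers['User-agent'])
```

A site that filters on User-Agent can easily recognise a value like this as a script.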
The fix is to disguise the request as one coming from a browser by adding a User-Agent header [see link for method].

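from urllib.request import Request, urlopen

# Send a browser-like User-Agent so the server does not reject the request
headers = {'User-Agent': 'Mozilla/5.0 3578.98 Safari/537.36'}
req = Request(url, headers=headers)  # url is the address of the page to fetch
content = urlopen(req, timeout=15).read()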
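The same idea can be wrapped in a small helper that also handles the case where the server still answers 403. This is a minimal sketch rather than a definitive implementation: the names build_request, fetch and DEFAULT_USER_AGENT are hypothetical, and the shortened User-Agent value is an assumption that should be replaced with the string of a real browser.

```python
from urllib.error import HTTPError
from urllib.request import Request, urlopen

# Hypothetical default; substitute the User-Agent string of a real browser.
DEFAULT_USER_AGENT = 'Mozilla/5.0'


def build_request(url, user_agent=DEFAULT_USER_AGENT):
    """Return a Request that carries a browser-like User-Agent header."""
    return Request(url, headers={'User-Agent': user_agent})


def fetch(url, timeout=15):
    """Return the response body as bytes, or None if the server answers 403."""
    try:
        with urlopen(build_request(url), timeout=timeout) as response:
            return response.read()
    except HTTPError as err:
        if err.code == 403:
            # The server still refused; it may check more than the User-Agent.
            return None
        raise  # Other HTTP errors are passed on to the caller.
```

Calling fetch(url) then returns the page bytes on success. A None result suggests the site checks something besides the User-Agent, such as cookies or other headers.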