This error occurs because the server rejected a request made with wget or curl. To keep crawlers from consuming server resources, the server screens incoming requests by their headers and blocks the ones it identifies as automated tools. The User-Agent (UA) that wget and curl send therefore needs to be changed so that it resembles a browser's.
I. Modify the User-Agent of wget
1. Temporarily change wget's UA
Add the -U parameter (uppercase; the short form of --user-agent) to the wget command to set the User-Agent:
wget www.google.com -U "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36"
For an explanation of what a User-Agent is and how to obtain one, see the blog post "What is the UserAgent and how to view the UserAgent using the browser". You can also simply reuse the string above.
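To avoid retyping the long string, the one-off command can be wrapped in a small shell function. This is only a sketch: the names BROWSER_UA and wget_as_browser are made up for illustration, and it uses wget's long option --user-agent, which is equivalent to -U.

```shell
# Browser-like UA string copied from the command above.
BROWSER_UA='Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36'

# wget_as_browser ARGS...
# Hypothetical helper: runs wget with the browser-like UA and passes
# every other argument through unchanged.
wget_as_browser() {
    wget --user-agent="$BROWSER_UA" "$@"
}

# Examples (require network access):
#   wget_as_browser www.google.com
#   wget_as_browser --spider --server-response www.google.com
```

The second example uses --spider so that nothing is downloaded and --server-response so that the response headers are printed, which is a quick way to see whether the server still rejects the request.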
2. Permanently change wget's UA
Edit the configuration file /etc/wgetrc and add the following line:
header = User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36
This setting applies to all users. To apply it to the current user only, add the same line to ~/.wgetrc instead; create the file if it does not exist.
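The per-user edit can also be scripted. The following is a minimal sketch, assuming a POSIX shell; the function name add_wget_ua is hypothetical. It creates the file when it is missing and appends the header line only if no User-Agent header line is already there, so it is safe to run more than once.

```shell
# add_wget_ua FILE UA
# Hypothetical helper: appends "header = User-Agent: UA" to FILE
# unless FILE already contains a User-Agent header line.
add_wget_ua() {
    rc_file=$1
    ua=$2
    touch "$rc_file"    # create the file if it does not exist yet
    if ! grep -q '^header *= *User-Agent:' "$rc_file"; then
        printf 'header = User-Agent: %s\n' "$ua" >> "$rc_file"
    fi
}

# Example:
#   add_wget_ua "$HOME/.wgetrc" 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36'
```

Note that the helper does not replace an existing User-Agent line; to change the value later, edit the file by hand.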
II. Modify the User-Agent of curl
1. Temporarily change curl's UA
Pass the --user-agent parameter (short form: -A):
curl https://www.google.com --user-agent "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36"
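To check whether the User-Agent is really what the server objects to, it can help to compare the HTTP status code returned with and without the browser-like string. The sketch below is one way to do that; the names BROWSER_UA and http_status are made up for illustration, and the curl options used are --silent, --output, --write-out and --user-agent.

```shell
# Browser-like UA string copied from the command above.
BROWSER_UA='Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36'

# http_status URL [UA]
# Hypothetical helper: prints only the HTTP status code of the response.
# Without a second argument, curl sends its default User-Agent.
http_status() {
    if [ -n "${2:-}" ]; then
        curl --silent --output /dev/null --write-out '%{http_code}\n' \
             --user-agent "$2" "$1"
    else
        curl --silent --output /dev/null --write-out '%{http_code}\n' "$1"
    fi
}

# Examples (require network access):
#   http_status https://www.google.com
#   http_status https://www.google.com "$BROWSER_UA"
```

If the first call prints an error status and the second does not, the User-Agent check is the likely cause; if both fail, the server is probably rejecting the request for some other reason.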
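2. Permanently change curl's UA
Edit the configuration file ~/.curlrc and add the following line. In a curl config file, the long option name is written without the leading dashes when = is used as the separator, and a value that contains spaces must be enclosed in double quotes:
user-agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36"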
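As with wget, the change to the curl config file can be scripted. This is a minimal sketch, assuming a POSIX shell; the function name add_curl_ua is hypothetical. It writes the option in the dash-less form with the value in double quotes, and skips the write if a user-agent line (with or without leading dashes) is already present.

```shell
# add_curl_ua FILE UA
# Hypothetical helper: appends a quoted user-agent line to a curl config
# file unless the file already sets a user agent.
add_curl_ua() {
    rc_file=$1
    ua=$2
    touch "$rc_file"    # create the file if it does not exist yet
    if ! grep -Eq '^(--)?user-agent' "$rc_file"; then
        printf 'user-agent = "%s"\n' "$ua" >> "$rc_file"
    fi
}

# Example:
#   add_curl_ua "$HOME/.curlrc" 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36'
```

To try the result without touching ~/.curlrc, the helper can be pointed at a scratch file, which is then passed to curl with -K (--config).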
References:
1. https://www.linpx.com/p/on-an-interesting-play-wget-use.html
2. https://chaifeng.com/_curl_wget_user-agent/