We'll start by importing the libraries used in the code:
import re  # regular expressions
import time  # pause code execution
from selenium import webdriver  # open the website to visit
from PIL import Image  # image handling; PIL is installed via the Pillow package
import pytesseract  # image to text (OCR)
(If any of these libraries are not installed, install them with pip in a terminal or through PyCharm. The Selenium installation method described in https://blog.csdn.net/YuanLiYin079/article/details/108726138 can serve as a reference.)
To get the captcha, we first need to open the page in a browser (Google Chrome in this case):
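# Path to chromedriver.exe (adjust to match your own installation)
chrome_driver = r"C:\Users\Admin\AppData\Local\Programs\Python\Python37\Lib\site-packages\selenium\webdriver\chrome\chromedriver.exe"
# executable_path is the Selenium 3 form; Selenium 4 passes the path through a Service object instead
driver = webdriver.Chrome(executable_path=chrome_driver)
driver.maximize_window()
driver.implicitly_wait(3)  # wait up to 3 seconds when locating elements
login_url = 'https://put-the-address-of-the-login-page-here.com'
# Open the login page
driver.get(login_url)
time.sleep(3)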
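The fixed time.sleep(3) above always waits the full three seconds and can still be too short on a slow connection. A common alternative is to poll for a condition until a timeout expires. The sketch below shows the idea in plain Python. The names wait_until and page_ready are hypothetical and used only for illustration; they are not part of Selenium, which ships its own WebDriverWait class for this purpose.

```python
import time


def wait_until(predicate, timeout=3.0, interval=0.1):
    """Poll predicate() until it returns a truthy value or the timeout passes.

    Returns the truthy value, or None if the timeout is reached first.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = predicate()
        if result:
            return result
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)


# Stand-in for a page that becomes ready on the third check (illustration only).
checks = {"count": 0}


def page_ready():
    checks["count"] += 1
    return checks["count"] >= 3


print(wait_until(page_ready, timeout=1.0, interval=0.01))  # True
```

With a real driver, the predicate could be something like lambda: driver.find_elements_by_class_name('verifyCodeImg'), which returns an empty (falsy) list until the captcha image exists.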
Once the page has loaded, we can start getting the captcha:
# Get the image captcha
# 1. Take a screenshot of the window and set the path where the image is saved
driver.save_screenshot(r'D:\Python_work\images\image.png')
# 2. Get the position and size of the captcha image
#    (Selenium 3 API; Selenium 4 uses driver.find_element(By.CLASS_NAME, 'verifyCodeImg'))
code_image = driver.find_element_by_class_name('verifyCodeImg')
code_location = code_image.location
code_image_size = code_image.size
time.sleep(2)
print("Captcha position:", code_location)  # console shows e.g. {'x': 716, 'y': 475}
print("Captcha size:", code_image_size)  # image size, e.g. {'height': 48, 'width': 140}
# 3. Compute the four edges of the crop box
left = code_location['x']  # left edge (x coordinate)
top = code_location['y']  # top edge (y coordinate)
right = left + code_image_size['width']  # right edge
bottom = top + code_image_size['height']  # bottom edge
image = Image.open(r'D:\Python_work\images\image.png')
# 4. Crop the captcha out of the screenshot
captcha_image = image.crop((left, top, right, bottom))
captcha_image.save(r'D:\Python_work\images\image1.png')  # save the cropped captcha as a new file
codeStr = pytesseract.image_to_string(captcha_image)  # image to text
# 5. Remove special characters, keeping only Chinese characters, digits, and ASCII letters
codeStrS = re.sub(u"([^\u4e00-\u9fa5\u0030-\u0039\u0041-\u005a\u0061-\u007a])", "", codeStr)
result_four = codeStrS[0:4]  # keep only the first 4 characters
print(result_four)  # print the recognized captcha
The recognized captcha is now printed in the console. Enter it into the login form and see what happens!
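Steps 3 and 5 are pure computations, so they can be pulled out into small functions and checked without a browser. The sketch below is one way to do that. The names crop_box and clean_captcha are hypothetical helpers, and the scale parameter is an added assumption: on displays with scaling enabled, the screenshot can contain more pixels than the page coordinates Selenium reports, in which case the coordinates need to be multiplied by the device pixel ratio before cropping.

```python
import re

# Matches anything that is not a Chinese character, a digit, or an ASCII letter.
NOT_CAPTCHA_CHAR = re.compile(r"[^\u4e00-\u9fa50-9A-Za-z]")


def crop_box(location, size, scale=1.0):
    """Turn an element's location and size into a (left, top, right, bottom) box.

    scale is an assumed device pixel ratio; 1.0 means screenshot pixels and
    page coordinates match.
    """
    left = location["x"] * scale
    top = location["y"] * scale
    right = (location["x"] + size["width"]) * scale
    bottom = (location["y"] + size["height"]) * scale
    return tuple(round(v) for v in (left, top, right, bottom))


def clean_captcha(raw, length=4):
    """Strip unwanted characters from OCR output and keep the first `length` characters."""
    return NOT_CAPTCHA_CHAR.sub("", raw)[:length]


# Position and size taken from the console output shown in the code above.
print(crop_box({"x": 716, "y": 475}, {"height": 48, "width": 140}))
# (716, 475, 856, 523)

# Made-up OCR output with stray characters, for illustration only.
print(clean_captcha(" a7-K 9\n"))
# a7K9
```

If scaling is suspected, the ratio can usually be read with driver.execute_script('return window.devicePixelRatio') and passed as scale. The resulting tuple can be handed directly to image.crop().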
To install pytesseract:
Download the Tesseract OCR installer from https://github.com/UB-Mannheim/tesseract/wiki and install it.
Remember the installation path, because it will be used later.
If running the code then reports an error, open the pytesseract.py file, find tesseract_cmd, comment out the original line, and add a new one: tesseract_cmd = "path/tesseract.exe", where path is the installation path noted above. Then run the code again, and it should execute successfully.
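Editing pytesseract.py works, but the change is lost whenever the package is reinstalled or upgraded. pytesseract also allows a script to assign pytesseract.pytesseract.tesseract_cmd at runtime. The sketch below only locates the executable, using the standard library. The name find_tesseract is a hypothetical helper, and the candidate path in the example is an assumed install location, not one taken from this article; replace it with the path noted during installation.

```python
import os
import shutil


def find_tesseract(candidates=()):
    """Return the first existing path from candidates, else whatever is on PATH.

    Returns None when no Tesseract executable can be found.
    """
    for path in candidates:
        if os.path.isfile(path):
            return path
    # Fall back to the system PATH; shutil.which returns None if nothing matches.
    return shutil.which("tesseract")


# Assumed install location, for illustration only.
path = find_tesseract([r"C:\Program Files\Tesseract-OCR\tesseract.exe"])
print(path)
```

If a path is found, assigning it with pytesseract.pytesseract.tesseract_cmd = path before calling image_to_string has the same effect as the manual edit, without touching the library file.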