Get picture captcha with Python + Chrome

We’ll start by importing the libraries used in the code:

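import re  # regular expressions
import time  # pause code execution
from selenium import webdriver  # open and drive the website
from PIL import Image  # image handling; PIL is installed through the Pillow package
import pytesseract  # convert the image to text (OCR)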

(If any of the libraries above is not installed, install it with the pip command in a terminal, or from within PyCharm. For the Selenium installation steps, you can refer to https://blog.csdn.net/YuanLiYin079/article/details/108726138.)
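
As a quick check before running the script, a sketch like the following reports which of the imported modules cannot be found. The helper name `missing_modules` is made up for illustration; note that Pillow is imported under the name `PIL`.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Import names used in this article (Pillow is imported as PIL).
    required = ["selenium", "PIL", "pytesseract"]
    missing = missing_modules(required)
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All required modules are available.")
```

Anything reported as missing can usually be installed with `pip install selenium Pillow pytesseract`.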

To get the captcha, we first need to open the page in a browser (in this case, Google Chrome).

# Path to chromedriver.exe (adjust it to match your own installation)
chrome_driver = r"C:\Users\Admin\AppData\Local\Programs\Python\Python37\Lib\site-packages\selenium\webdriver\chrome\chromedriver.exe"
driver = webdriver.Chrome(executable_path=chrome_driver)
driver.maximize_window()
driver.implicitly_wait(3)  # wait up to 3 seconds when locating elements
login_url = 'https://put-the-address-of-the-login-page-here.com'
# Open the login page
driver.get(login_url)
time.sleep(3)
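
The `executable_path` argument used above belongs to the Selenium 3 API; recent Selenium 4 releases expect a `Service` object instead. The following is a minimal sketch of the same step for Selenium 4, assuming Selenium 4 and a matching chromedriver are installed. The function name `open_login_page` and the `timeout` parameter are illustrative, and waiting on `document.readyState` is only one possible replacement for the fixed `time.sleep(3)`, not the method used in this article.

```python
def open_login_page(driver_path, login_url, timeout=10):
    """Open login_url in Chrome and wait until the document has loaded."""
    # Imported lazily so the file can be loaded even if Selenium is missing.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    from selenium.webdriver.support.ui import WebDriverWait

    # Selenium 4 style: the chromedriver path is wrapped in a Service object.
    driver = webdriver.Chrome(service=Service(executable_path=driver_path))
    driver.maximize_window()
    driver.get(login_url)

    # Poll until the browser reports that the page has finished loading.
    WebDriverWait(driver, timeout).until(
        lambda d: d.execute_script("return document.readyState") == "complete"
    )
    return driver
```

It would be called as `driver = open_login_page(chrome_driver, login_url)`. Under Selenium 4 the later element lookup also changes, from `find_element_by_class_name('verifyCodeImg')` to `driver.find_element(By.CLASS_NAME, 'verifyCodeImg')` with `from selenium.webdriver.common.by import By`.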

Once the page has loaded, we can start getting the captcha!

# Get the picture captcha
# 1. Take a full-page screenshot and set the path where the image is saved
driver.save_screenshot(r'D:\Python_work\images\image.png')
# 2. Get the position and size of the captcha image
code_image = driver.find_element_by_class_name('verifyCodeImg')
code_location = code_image.location
code_image_size = code_image.size
time.sleep(2)
print("Captcha location:", code_location)  # shown in the console, e.g. {'x': 716, 'y': 475}
print("Captcha size:", code_image_size)  # image size, e.g. {'height': 48, 'width': 140}

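# 3. Coordinates of the four corners of the image
left = code_image.location['x']  # x coordinate of the left edge
top = code_image.location['y']  # y coordinate of the top edge
right = left + code_image.size['width']  # x coordinate of the right edge
Rdown = top + code_image.size['height']   # y coordinate of the bottom edge
image = Image.open(r'D:\Python_work\images\image.png')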

# 4. Crop the captcha out of the screenshot
code_image = image.crop((left, top, right, Rdown))
code_image.save(r'D:\Python_work\images\image1.png')  # save the cropped captcha as a new file
codeStr = pytesseract.image_to_string(code_image)  # convert the image to text
# 5. Strip special characters from the recognized text
codeStrS = re.sub(u"([^\u4e00-\u9fa5\u0030-\u0039\u0041-\u005a\u0061-\u007a])", "", codeStr)
result_four = codeStrS[0:4]  # keep only the first 4 characters
print(codeStrS)  # print the recognized captcha
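
Steps 3 and 4 can be sketched as a small helper that turns the element's `location` and `size` into the box passed to `Image.crop`. The `scale` parameter is an assumption that is not part of the original code: on displays with scaling enabled, the screenshot may contain more pixels than the page coordinates Selenium reports, in which case the coordinates may need to be multiplied by the device pixel ratio (readable with `driver.execute_script("return window.devicePixelRatio")`). The name `crop_box` is hypothetical.

```python
def crop_box(location, size, scale=1.0):
    """Build a (left, top, right, bottom) box for PIL's Image.crop.

    location: dict with 'x' and 'y', as returned by WebElement.location
    size:     dict with 'width' and 'height', as returned by WebElement.size
    scale:    assumed device pixel ratio of the screenshot (1.0 = no scaling)
    """
    left = location["x"] * scale
    top = location["y"] * scale
    right = (location["x"] + size["width"]) * scale
    bottom = (location["y"] + size["height"]) * scale
    # Image.crop works on whole pixels, so round to integers.
    return tuple(int(round(v)) for v in (left, top, right, bottom))


if __name__ == "__main__":
    # Values taken from the console output shown in the article.
    box = crop_box({"x": 716, "y": 475}, {"height": 48, "width": 140})
    print(box)  # (716, 475, 856, 523)
```

If the cropped image shows the wrong part of the page, trying a `scale` other than 1.0 is a cheap first check.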

Now the recognized captcha is printed in the console. Enter it into the form and see what happens!
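
The clean-up in step 5 can also be wrapped in a function so that it can be tried without a browser. This sketch keeps the character classes from the code above (Chinese characters, digits, and ASCII letters); the function name `clean_captcha` and the `length` parameter are illustrative.

```python
import re

# Anything that is not a Chinese character, a digit, or an ASCII letter.
_NOT_ALLOWED = re.compile(r"[^\u4e00-\u9fa50-9A-Za-z]")


def clean_captcha(raw, length=4):
    """Strip unwanted characters from OCR output and keep the first `length`."""
    cleaned = _NOT_ALLOWED.sub("", raw)
    return cleaned[:length]


if __name__ == "__main__":
    # OCR output often contains stray spaces, punctuation, and newlines.
    print(clean_captcha(" a B-3;d9\n"))  # aB3d
```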

Installing pytesseract:
besides the Python package, download the Tesseract OCR installer from https://github.com/UB-Mannheim/tesseract/wiki and install it.
Remember the installation path, because it will be used later.

If running the code then raises an error, open the pytesseract.py file, find tesseract_cmd, comment out the original line, and add a new one: tesseract_cmd = "path/tesseract.exe", where path is the installation path noted earlier. Then run the code again, and it should execute successfully.
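
Instead of editing the installed pytesseract.py file, the path can also be set from your own script by assigning `pytesseract.pytesseract.tesseract_cmd`. The sketch below first looks for the executable. The helpers `find_tesseract` and `configure_pytesseract` are hypothetical names, and the path in the usage comment is an assumption (it is the usual default location of the Windows installer), so replace it with the path you noted during installation.

```python
import os
import shutil


def find_tesseract(candidates):
    """Return the first existing file in `candidates`, else search PATH."""
    for path in candidates:
        if os.path.isfile(path):
            return path
    return shutil.which("tesseract")  # None if it is not on PATH either


def configure_pytesseract(candidates):
    """Point pytesseract at the tesseract executable and return its path."""
    import pytesseract  # imported lazily; requires the pytesseract package

    exe = find_tesseract(candidates)
    if exe is None:
        raise FileNotFoundError("tesseract executable not found")
    pytesseract.pytesseract.tesseract_cmd = exe
    return exe


# Usage (assumed install location, adjust to your own):
# configure_pytesseract([r"C:\Program Files\Tesseract-OCR\tesseract.exe"])
```

Setting the path in your own script means the change survives upgrades or reinstalls of the pytesseract package.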
