python selenium-webdriver: Handling Login CAPTCHAs (Part 12)

Reposted article. 2017-06-29 17:47. Tags: selenium-webdriver, Python

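To keep bad actors out, many systems add CAPTCHAs in all sorts of forms, and for testers nothing is a bigger headache than dealing with them. There are generally four approaches:

1. Have the developers set up a universal CAPTCHA value for us.

2. Have the developers disable the CAPTCHA.

3. Recognize the odd-looking characters in the image ourselves. The success rate of this approach is not very high, and it does not work for every CAPTCHA, only for simple ones.

4. Call a third-party CAPTCHA recognition API. I found one on the Alibaba Cloud Marketplace: 0.9 yuan buys roughly 60 calls (0.01 yuan buys 20, so I have no idea why I foolishly paid 0.9 yuan for the 60-call plan).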

This article mainly uses two modules, pytesseract and PIL (PIL is provided by the Pillow package). First, set up the environment:

pip install Pillow
pip install pytesseract 
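Python-tesseract (pytesseract) is a standalone wrapper around Google's Tesseract-OCR, so Tesseract-OCR itself must also be downloaded and installed. On Windows, remember to add its install folder to the PATH environment variable.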
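
As a quick sanity check of the setup described above, the sketch below looks for the Tesseract executable on PATH. The helper name `find_tesseract` and the candidate file names are illustrative assumptions, not part of pytesseract. If the binary lives outside PATH, pytesseract can be pointed at it through `pytesseract.pytesseract.tesseract_cmd`, as the commented lines suggest.

```python
import shutil

def find_tesseract(candidates=("tesseract", "tesseract.exe")):
    """Return the full path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None

if __name__ == "__main__":
    path = find_tesseract()
    if path is None:
        print("Tesseract-OCR not found on PATH; install it or add its folder to PATH")
    else:
        print("Tesseract-OCR found at", path)
        # Optional, when pytesseract cannot locate the binary on its own:
        # import pytesseract
        # pytesseract.pytesseract.tesseract_cmd = path
```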

Now let's go straight to a concrete example:

# -*- coding: utf-8 -*-
import os
import time
from selenium import webdriver
from selenium.webdriver.common.by import By
from PIL import Image
import pytesseract

def get_auth_code(driver, codeElement):
    '''Read the CAPTCHA text from the login page'''
    os.makedirs('login', exist_ok=True)  # make sure the output folder exists
    driver.save_screenshot('login/login.png')  # take a screenshot of the login page
    imgSize = codeElement.size  # size of the CAPTCHA image
    imgLocation = codeElement.location  # coordinates of the CAPTCHA element
    # bounding box of the CAPTCHA: (left, upper, right, lower)
    rangle = (int(imgLocation['x']), int(imgLocation['y']),
              int(imgLocation['x'] + imgSize['width']),
              int(imgLocation['y'] + imgSize['height']))
    login = Image.open('login/login.png')
    frame4 = login.crop(rangle)  # crop the CAPTCHA out of the screenshot
    frame4.save('login/authcode.png')
    authcodeImg = Image.open('login/authcode.png')
    authCodeText = pytesseract.image_to_string(authcodeImg).strip()
    return authCodeText

def pandarola_login(driver, account, passwd, authCode):
    '''Log in to the pandarola system'''
    driver.find_element(By.ID, 'loginname').send_keys(account)
    driver.find_element(By.ID, 'password').send_keys(passwd)
    driver.find_element(By.ID, 'code').send_keys(authCode)
    driver.find_element(By.ID, 'to-recover').click()
    time.sleep(2)
    title = driver.find_element(By.ID, 'menuName-h').text  # title shown after login
    # check whether the login succeeded ('桌面' is the page's "Desktop" title)
    if title == u'桌面':
        return 'Login succeeded'
    return 'Login failed'

if __name__ == '__main__':

    driver = webdriver.Chrome()
    try:
        driver.get('http://pandarola.pandadata.cn')
        driver.maximize_window()
        imgElement = driver.find_element(By.ID, 'codeImg')
        authCodeText = get_auth_code(driver, imgElement)
        print(pandarola_login(driver, 'admin', '1', authCodeText))
    finally:
        driver.quit()  # always close the browser, even if a step fails
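
The crop rectangle in the example is simply the element's top-left corner plus its width and height. The sketch below isolates that arithmetic in a small helper. The name `crop_box` and the `scale` parameter are assumptions added for illustration: on some high-DPI displays the screenshot may contain more pixels than the page's CSS coordinates suggest, in which case the coordinates would need scaling (one possible source for the factor is `window.devicePixelRatio`, read through `driver.execute_script`).

```python
def crop_box(location, size, scale=1.0):
    """Return (left, upper, right, lower) in screenshot pixels.

    location: dict with 'x' and 'y', as returned by WebElement.location
    size: dict with 'width' and 'height', as returned by WebElement.size
    scale: assumed ratio of screenshot pixels to CSS pixels (1.0 = no scaling)
    """
    left = int(location['x'] * scale)
    upper = int(location['y'] * scale)
    right = int((location['x'] + size['width']) * scale)
    lower = int((location['y'] + size['height']) * scale)
    return (left, upper, right, lower)

if __name__ == '__main__':
    # Made-up coordinates, only to show the arithmetic.
    print(crop_box({'x': 10, 'y': 20}, {'width': 100, 'height': 40}))           # (10, 20, 110, 60)
    print(crop_box({'x': 10, 'y': 20}, {'width': 100, 'height': 40}, scale=2))  # (20, 40, 220, 120)
```

The returned tuple can be passed directly to `Image.crop`.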

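Because ours is an internal system, its CAPTCHAs are fairly simple and were recognized easily. Occasionally 2 and Z are misread; when that happens, just fetch a new CAPTCHA after the failed login and try again. Still, the success rate of home-grown recognition is fairly low, so I found a company on the Alibaba Cloud Marketplace and used their API instead. At the very least, that solved the CAPTCHA problem for all of our company's systems, and I no longer have to beg the developers.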
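
The "fetch a new CAPTCHA and log in again" idea can be sketched as a small retry loop. This is only an outline under assumptions: `login_with_retry`, `get_code`, and `try_login` are hypothetical names, and in real use the two callables would wrap the article's `get_auth_code` and `pandarola_login` (including whatever step reloads the CAPTCHA image after a failure).

```python
def login_with_retry(get_code, try_login, max_attempts=3):
    """Call get_code() and then try_login(code) until a login succeeds.

    get_code: callable returning the recognized CAPTCHA text
    try_login: callable taking that text and returning True on success
    Returns the 1-based number of the attempt that succeeded, or 0 if none did.
    """
    for attempt in range(1, max_attempts + 1):
        code = get_code()
        # Skip the login call entirely when OCR returned nothing.
        if code and try_login(code):
            return attempt
    return 0

if __name__ == '__main__':
    # Stand-ins for the real OCR and login steps: the first reading is wrong.
    readings = iter(['Z7k9', '27k9'])
    result = login_with_retry(lambda: next(readings), lambda c: c == '27k9')
    print(result)  # 2
```

Capping the number of attempts keeps a consistently misread CAPTCHA from looping forever.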

P.S.

The API documentation is available at https://market.aliyun.com/products/57126001/cmapi014396.html#sku=yuncode839600006 if you want the details. Below I demonstrate recognizing a CAPTCHA through this API.

# -*- coding: utf-8 -*-
import base64
import json
import requests

def read_picture_base64(fileName):
    '''Return the CAPTCHA image file as a Base64-encoded string'''
    with open(fileName, 'rb') as f:
        base64Picture = base64.b64encode(f.read())
    return base64Picture.decode()


def authcode_picture_convert_string(appCode, querys, base64Picture):
    '''Recognize the CAPTCHA through the third-party API.
    appCode: authentication key for the API; querys: CAPTCHA type'''
    header = {
        'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',  # Content-Type required by the API
        'Authorization': 'APPCODE ' + appCode
    }
    url = 'http://jisuyzmsb.market.alicloudapi.com/captcha/recognize'  # API endpoint
    bodys = {'type': querys, 'pic': base64Picture}  # request parameters
    res = requests.post(url, headers=header, data=bodys, timeout=10)
    return res.text

if __name__ == '__main__':
    appCode = '377e5f0fe10146ef9aa88bae756a3904'  # replace with your own AppCode
    querys = 'e4'
    base64Picture = read_picture_base64('login/20170629232535.png')
    text = authcode_picture_convert_string(appCode, querys, base64Picture)
    authCode = json.loads(text)['result']['code']  # parse the returned result
    print(authCode)
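
The last parsing step assumes the response always contains `result` and `code`. As a hedged sketch, the helper below returns `None` instead of raising when the body is not JSON or the fields are missing, for example when the call quota runs out. The name `extract_code` and the sample payloads are made up for illustration; only the `result` -> `code` path comes from the example above.

```python
import json

def extract_code(response_text):
    """Return result.code from the JSON response text, or None if absent."""
    try:
        data = json.loads(response_text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return None
    if not isinstance(data, dict):
        return None
    result = data.get('result')
    if not isinstance(result, dict):
        return None
    return result.get('code')

if __name__ == '__main__':
    # Hypothetical payloads, not captured from the real service.
    print(extract_code('{"result": {"code": "ab12"}}'))
    print(extract_code('{"msg": "quota exhausted"}'))
    print(extract_code('not json'))
```

A `None` result can then be treated like any other failed recognition and retried.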
