混亂的喔喔喔: Recognizing Simple Image CAPTCHAs with Python

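Most of what follows actually comes from this article: http://www.waitalone.cn/python-php-ocr.html
The original author worked on Windows, though, and as a Linux fan I naturally had to try it out on Linux.
To get this working you first need to install three pieces of software: PIL (installed below through its fork, Pillow), tesseract-ocr, and pytesseract.
It's easy. Just run the following commands: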

sudo apt-get build-dep python-imaging
sudo apt-get install libjpeg8 libjpeg62-dev libfreetype6 libfreetype6-dev
sudo apt-get install tesseract-ocr
sudo easy_install pip
sudo pip install Pillow
sudo pip install pytesseract
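
Before moving on, it may be worth checking that everything ended up where Python can find it. The following is only a minimal sketch and not part of the original article: the helper name `missing_requirements` is made up for illustration, and it assumes the OCR engine is available as a `tesseract` executable on the PATH, which is how the apt package normally installs it.

```python
import importlib.util
import shutil


def missing_requirements(modules=("PIL", "pytesseract"), binaries=("tesseract",)):
    """Return the names of Python modules and executables that cannot be found."""
    missing = []
    for name in modules:
        # find_spec returns None when a top-level module is not importable
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    for name in binaries:
        # shutil.which returns None when the executable is not on the PATH
        if shutil.which(name) is None:
            missing.append(name)
    return missing


print(missing_requirements() or "everything found")
```

An empty list means all three pieces were found; anything listed still needs to be installed.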

Once everything is installed you can start writing code. Enter the following:


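try:
    import pytesseract
    from PIL import Image
except ImportError:
    # Stop right away if either library is missing
    print("Error: please install pytesseract and PIL")
    raise SystemExit(1)

image = Image.open('captcha.png')  # image file name
# Run Tesseract OCR on the image and get the recognized text back
number = pytesseract.image_to_string(image)
print("The captcha is:%s" % number)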

If the recognized text is printed, the recognition worked.




So I started running all sorts of experiments and tried the following images:



print: The captcha is:231074



print: The captcha is:6K4L2



print: The captcha is:WMBHP



print: The captcha is:qub JD



print: The captcha is:
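
One of the outputs above (`qub JD`) contains a stray space, which a real CAPTCHA answer would not. A small cleanup step might help in cases like that. This is only a sketch under the assumption that the CAPTCHA consists solely of ASCII letters and digits; the function name `clean_captcha_text` is hypothetical and does not come from pytesseract.

```python
import re


def clean_captcha_text(raw):
    """Keep only ASCII letters and digits from raw OCR output."""
    # Assumption: valid CAPTCHA characters are 0-9, A-Z and a-z only
    return re.sub(r"[^0-9A-Za-z]", "", raw)


for raw in ["231074", "qub JD", ""]:
    print("%r -> %r" % (raw, clean_captcha_text(raw)))
```

This only removes characters that cannot belong to the answer. It cannot repair characters that were misread in the first place.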




Results:

Looks like it doesn't work all that well......( = A = |||
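
A common thing to try when OCR struggles is cleaning up the image before handing it over. The sketch below was not tested in the original experiments, so whether it helps with these particular images is unknown. It converts the image to grayscale and then to pure black and white with Pillow. The function name `binarize` and the threshold of 140 are arbitrary choices for illustration.

```python
from PIL import Image


def binarize(image, threshold=140):
    """Convert an image to pure black and white using a fixed threshold."""
    gray = image.convert("L")  # 8-bit grayscale
    # Pixels brighter than the threshold become white, the rest black
    return gray.point(lambda value: 255 if value > threshold else 0)
```

To use it, pass `binarize(Image.open('captcha.png'))` to `pytesseract.image_to_string` in place of the original image. The right threshold depends on the image, so it will likely need some trial and error.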

What I can say, though, is that the image formats it can read include .jpg, .gif, and .png......I haven't tested any others yet.

See you next time (´・ω・)


Wednesday, March 18, 2015
