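Honestly, the official docs are pretty hard to follow.
nvidia-docker is not an option on Windows, so I had to install through Anaconda.
These notes combine
https://www.paddlepaddle.org.cn/install/quick
and
https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_en/installation_en.md
Configure the conda mirrors hosted in China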
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main/
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/Paddle/
conda config --set show_channel_urls yes
1 Create a Python 3.7 virtual environment
conda create --name paddle python=3.7
activate paddle
You must specify the Python version explicitly.
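To confirm that the activated environment really runs the version asked for, a quick check like the one below can help. This is only a sketch: the function name python_matches is made up for illustration, and the 3.7 default simply mirrors the conda command above.

```python
import sys


def python_matches(required=(3, 7), actual=None):
    """Return True if the interpreter's major.minor version equals `required`."""
    if actual is None:
        actual = sys.version_info
    return tuple(actual[:2]) == tuple(required)


if __name__ == "__main__":
    status = "OK" if python_matches() else "not 3.7, recreate the environment"
    print("Python {}.{}: {}".format(sys.version_info[0], sys.version_info[1], status))
```

Run it with the paddle environment activated; if it does not report OK, the environment was probably created without python=3.7.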
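2 Install paddlepaddle-gpu 2.0.0
It can only be installed with pip from Baidu's mirror; neither conda nor the public PyPI has this version.
Baidu's mirror only carries 2.0.0a0; 2.0.0b0 is available
only in the Docker image.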
python -m pip install paddlepaddle-gpu==2.0.0a0 -i https://mirror.baidu.com/pypi/simple
For Python 3.7 only 2.0.0a0 is available.
3 PaddleOCR
3.1 Download the source and install the dependencies
git clone https://gitee.com/paddlepaddle/PaddleOCR
cd PaddleOCR
pip install -r requirements.txt
3.2 Manually download and install shapely
https://www.lfd.uci.edu/~gohlke/pythonlibs/#shapely
3.3 Manually rename, extract, and copy the DLL!
https://github.com/PaddlePaddle/PaddleOCR/issues/212
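"I downloaded Shapely-1.7.0-cp39-cp39-win_amd64.whl from the URL you provided, but pip install on this whl file failed.
So I renamed it to Shapely-1.7.0-cp39-cp39-win_amd64.rar,
then extracted it, found geos_c.dll in its shapely\DLLs\ subdirectory, and copied geos_c.dll into the conda environment's directory (my environment is named ocr), C:\Users\myusername\Miniconda3\envs\ocr\Library\bin. Problem solved."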
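A .whl file is an ordinary zip archive, so the rename is only there to make an archive tool open it. The same copy can be done from Python without renaming anything. This is a rough sketch, not part of the linked issue: extract_dll is a made-up helper name, and the paths in the usage comment are placeholders to replace with your own.

```python
import os
import posixpath
import zipfile


def extract_dll(wheel_path, dest_dir, dll_name="geos_c.dll"):
    """Copy `dll_name` out of a wheel into `dest_dir` and return the written path."""
    with zipfile.ZipFile(wheel_path) as wheel:
        # Zip member names always use "/", so match on the last path component.
        matches = [
            name for name in wheel.namelist()
            if posixpath.basename(name).lower() == dll_name.lower()
        ]
        if not matches:
            raise FileNotFoundError("{} not found in {}".format(dll_name, wheel_path))
        os.makedirs(dest_dir, exist_ok=True)
        target = os.path.join(dest_dir, dll_name)
        with wheel.open(matches[0]) as src, open(target, "wb") as dst:
            dst.write(src.read())
    return target


# Usage (placeholder paths, adjust to your own download and environment):
# extract_dll(r"<path to the Shapely .whl>", r"<conda env>\Library\bin")
```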
3.4 Modify PaddleOCR/paddleocr.py
Then, following
https://github.com/PaddlePaddle/PaddleOCR/issues/832
find def parse_args(): in PaddleOCR/paddleocr.py
and add the following line inside it, next to the other parser.add_argument calls
parser.add_argument("--use_pdserving", type=bool, default=False)
Otherwise a later run fails with this error:
Namespace(cls=False, cls_batch_num=30, cls_image_shape='3, 48, 192', cls_model_dir='C:\\Users\\xuqinghan/.paddleocr/cls', cls_thresh=0.9, det=True, det_algorithm='DB', det_db_box_thresh=0.5, det_db_thresh=0.3, det_db_unclip_ratio=2.0, det_east_cover_thresh=0.1, det_east_nms_thresh=0.2, det_east_score_thresh=0.8, det_max_side_len=960, det_model_dir='C:\\Users\\xuqinghan/.paddleocr/det', enable_mkldnn=False, gpu_mem=8000, image_dir=None, ir_optim=True, label_list=['0', '180'], lang='ch', max_text_length=25, rec=True, rec_algorithm='CRNN', rec_batch_num=30, rec_char_dict_path='./ppocr/utils/ppocr_keys_v1.txt', rec_char_type='ch', rec_image_shape='3, 32, 320', rec_model_dir='C:\\Users\\xuqinghan/.paddleocr/rec/ch', use_angle_cls=False, use_gpu=True, use_space_char=True, use_tensorrt=False, use_zero_copy_run=False)
Traceback (most recent call last):
File "D:\dev\chem\chart-nmr-spectrum\test_paddle_ocr.py", line 10, in <module>
ocr = PaddleOCR() # need to run only once to download and load model into memory
File "D:\Users\xuqinghan\anaconda3\envs\paddle\lib\site-packages\paddleocr-1.0.0-py3.7.egg\paddleocr\paddleocr.py", line 222, in __init__
super().__init__(postprocess_params)
File "D:\Users\xuqinghan\anaconda3\envs\paddle\lib\site-packages\paddleocr-1.0.0-py3.7.egg\paddleocr\tools\infer\predict_system.py", line 41, in __init__
self.text_detector = predict_det.TextDetector(args)
File "D:\Users\xuqinghan\anaconda3\envs\paddle\lib\site-packages\paddleocr-1.0.0-py3.7.egg\paddleocr\tools\infer\predict_det.py", line 77, in __init__
if args.use_pdserving is False:
AttributeError: 'Namespace' object has no attribute 'use_pdserving'
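The error comes from argparse itself: the Namespace it returns only has attributes for arguments that were registered, so reading args.use_pdserving fails when parse_args() never declared it. The toy reproduction below is a sketch, not PaddleOCR's real code: the with_fix switch and the single --use_gpu argument are stand-ins for illustration.

```python
import argparse


def parse_args(with_fix, argv=None):
    """Toy stand-in for a parse_args() that may or may not declare --use_pdserving."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--use_gpu", type=bool, default=True)
    if with_fix:
        # The line from the fix: registering the argument creates the attribute.
        parser.add_argument("--use_pdserving", type=bool, default=False)
    # Parse an explicit list so the sketch never reads the real command line.
    return parser.parse_args(argv or [])


broken = parse_args(with_fix=False)
try:
    broken.use_pdserving
except AttributeError as err:
    print("without the fix:", err)

fixed = parse_args(with_fix=True)
print("with the fix:", fixed.use_pdserving)
```

Only the default matters here. With type=bool, argparse calls bool() on the raw string, so passing --use_pdserving False on a command line would still yield True.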
3.5 Install
cd PaddleOCR
python setup.py install
Finally, running a demo should no longer raise errors:
import os
import time

from paddleocr import PaddleOCR, draw_ocr

if __name__ == '__main__':
    PATH_IMG_IN = './in'
    filename = os.path.join(PATH_IMG_IN, '1.png')
    ocr = PaddleOCR()  # need to run only once to download and load model into memory
    start = time.perf_counter()
    # rec=False: only detect text regions, skip recognition
    result = ocr.ocr(filename, rec=False)
    end = time.perf_counter()
    print('Text region detection took {} s'.format(end - start))
    # each rectangle's corners are listed clockwise, starting from the top-left
    for rect1 in result:
        print(rect1)
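If you want to crop or draw the detected regions, an axis-aligned box is often easier to work with than four corner points. The helper below is a sketch that assumes each detected rectangle is a list of four [x, y] points, as the comment in the demo suggests. bounding_box is a made-up name and the sample coordinates are invented, not real detector output.

```python
def bounding_box(quad):
    """Return (x_min, y_min, x_max, y_max) for a list of [x, y] corner points."""
    xs = [point[0] for point in quad]
    ys = [point[1] for point in quad]
    return min(xs), min(ys), max(xs), max(ys)


# Invented sample: four corners, clockwise from the top-left.
sample = [[10.0, 20.0], [110.0, 22.0], [109.0, 50.0], [9.0, 48.0]]
print(bounding_box(sample))  # (9.0, 20.0, 110.0, 50.0)
```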
Summary:
The docs are a mess and the install is full of pitfalls. But the results are passable, so I'll make do.