1. Git repository
https://github.com/onesuper/pandasticsearch
2. Establishing a connection
from pandasticsearch import DataFrame

# Credentials are bytes, as required by the patched client.py in section 3
username = b'xxxx'
password = b'xxxx'
df = DataFrame.from_es(url='IP:9200',
                       index='xxxx',
                       username=username,
                       password=password,
                       doc_type='xxxx',
                       compat=5)
[Note] In testing, Python 3 runs into an encoding problem:
TypeError: a bytes-like object is required, not 'str'
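The error can be reproduced without Elasticsearch at all. The sketch below is only an illustration, and its credentials are made-up placeholders. It shows that in Python 3 base64.b64encode() rejects a str and accepts the same text once it is encoded to bytes.

```python
import base64

# Hypothetical placeholder credentials, for illustration only
username = "user"
password = "secret"

try:
    # Python 3: b64encode() needs a bytes-like object, so a str fails
    base64.b64encode("%s:%s" % (username, password))
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)

# Encoding the formatted string to bytes first avoids the error
token = base64.b64encode(("%s:%s" % (username, password)).encode("utf-8"))
print(token)
```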
3. Patching the source
In ~/anaconda3/lib/python3.7/site-packages/pandasticsearch/client.py, change
if username is not None and password is not None:
    base64creds = base64.b64encode('%s:%s' % (username,password))
    req.add_header("Authorization", "Basic %s" % base64creds)
to:
if username is not None and password is not None:
    # username and password must be bytes for the b'%s:%s' formatting to work
    base64creds = bytes.decode(base64.b64encode(b'%s:%s' % (username,password)))
    req.add_header("Authorization", "Basic %s" % base64creds)
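The patch above only works when the credentials are passed as bytes. As a possible alternative, here is a minimal sketch of a helper that accepts either str or bytes. The name basic_auth_header is hypothetical and not part of pandasticsearch. It assumes UTF-8 credentials.

```python
import base64


def basic_auth_header(username, password):
    """Build an HTTP Basic Authorization header value (hypothetical helper)."""
    # Normalise both values to bytes so str and bytes inputs are both accepted
    if isinstance(username, str):
        username = username.encode("utf-8")
    if isinstance(password, str):
        password = password.encode("utf-8")
    creds = base64.b64encode(username + b":" + password)
    # b64encode() returns bytes; decode so the header value is a plain str
    return "Basic " + creds.decode("ascii")


header_from_str = basic_auth_header("user", "secret")
header_from_bytes = basic_auth_header(b"user", b"secret")
```

With a helper like this, the patched line could become req.add_header("Authorization", basic_auth_header(username, password)), and the credentials in section 2 would no longer need the b'' prefix.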
4. Bulk querying data
limit() fetches the first 200,000 rows, and to_pandas() converts the result into a pandas DataFrame:
pd_df = df.limit(200000).to_pandas()