Elasticsearch word segmentation testing


1. Open Kibana and issue an _analyze request from Dev Tools:

GET /scddb/_analyze
{
  "text": "藍瘦香菇",
  "analyzer": "ik_max_word"   // or "ik_smart" for coarser-grained segmentation
}
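
The same request can be issued with curl outside Kibana (a minimal sketch, assuming Elasticsearch is reachable at localhost:9200):

curl -X GET "http://localhost:9200/scddb/_analyze?pretty" \
  -H 'Content-Type: application/json' \
  -d '{"text": "藍瘦香菇", "analyzer": "ik_max_word"}'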

Either way, the segmentation result is as follows, and it is not ideal:

{
  "tokens" : [
    {
      "token" : "藍",
      "start_offset" : 0,
      "end_offset" : 1,
      "type" : "CN_CHAR",
      "position" : 0
    },
    {
      "token" : "瘦",
      "start_offset" : 1,
      "end_offset" : 2,
      "type" : "CN_CHAR",
      "position" : 1
    },
    {
      "token" : "香菇",
      "start_offset" : 2,
      "end_offset" : 4,
      "type" : "CN_WORD",
      "position" : 2
    }
  ]
}
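
The phrase 藍瘦香菇 is internet slang that is not in IK's built-in dictionary, so it falls apart into single characters plus the known word 香菇. Switching to ik_smart alone does not help, since that analyzer can likewise only emit words the dictionary contains; the comparison request is the same call with the other analyzer:

GET /scddb/_analyze
{
  "text": "藍瘦香菇",
  "analyzer": "ik_smart"
}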

To fix this, add a custom dictionary:

See this reference for adding a custom IK dictionary: https://blog.csdn.net/makang456/article/details/79211255
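
In outline, the change touches two files in the IK plugin's config directory (a sketch, assuming the default elasticsearch-analysis-ik layout, e.g. plugins/ik/config/; the file name custom.dic is an example). First, put the new word in a dictionary file, one entry per line:

custom.dic:
藍瘦香菇

Then register the file in IKAnalyzer.cfg.xml under the ext_dict key:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
  <comment>IK Analyzer extension configuration</comment>
  <!-- semicolon-separated list of local extension dictionaries -->
  <entry key="ext_dict">custom.dic</entry>
</properties>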

Restart Elasticsearch: service elasticsearch restart (on systemd-based systems, systemctl restart elasticsearch).

Test the same request again:

{
  "tokens" : [
    {
      "token" : "藍瘦香菇",
      "start_offset" : 0,
      "end_offset" : 4,
      "type" : "CN_WORD",
      "position" : 0
    }
  ]
}
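
The phrase now comes back as a single CN_WORD token. Once segmentation looks right, the analyzer can be applied to index fields so documents are indexed the same way; a sketch (the index name demo and the field title are hypothetical, and the syntax assumes Elasticsearch 7.x, where mapping types are gone):

PUT /demo
{
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "ik_max_word",
        "search_analyzer": "ik_smart"
      }
    }
  }
}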

 

