Elasticsearch Study Notes: Adding a Custom Dictionary to the IK Analyzer

Reposted article. 2021-01-07 15:52. Category: ELK

Prerequisite: the IK analyzer plugin is already installed.

1. Add a dictionary file under the IK analyzer's config directory

~/software/apache/elasticsearch-6.2.4/config/analysis-ik$ ls | grep mydic.dic
mydic.dic

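The file contains a single entry (roughly "I contribute oil to the motherland"); dictionary files hold one entry per line: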

我給祖國獻石油
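
IK extension dictionaries are plain text files, one entry per line, saved as UTF-8. The sketch below appends an entry only when it is not already in the file. The helper name `add_ik_word` is hypothetical and not part of the plugin, and it assumes the file already ends with a newline.

```shell
# add_ik_word DICT_FILE WORD
# Append WORD to DICT_FILE unless an identical line already exists.
add_ik_word() {
    dict_file="$1"
    word="$2"
    # -x matches whole lines only, -F treats the word as a literal string.
    if ! grep -qxF -- "$word" "$dict_file" 2>/dev/null; then
        printf '%s\n' "$word" >> "$dict_file"
    fi
}
```

For the layout used in these notes, a call would look like `add_ik_word ~/software/apache/elasticsearch-6.2.4/config/analysis-ik/mydic.dic '我給祖國獻石油'`.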

2. Configure the dictionary path: edit the IKAnalyzer.cfg.xml configuration file and register the new dictionary
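
In the plugin's stock IKAnalyzer.cfg.xml, extension dictionaries are listed in the `ext_dict` entry. Paths are relative to the directory that holds the config file, and several files are separated by semicolons. For the file from step 1, the entry would read `<entry key="ext_dict">mydic.dic</entry>`. The sketch below makes that edit from the shell. It assumes the entry sits on a single line, as in the default file, and that the dictionary path contains no `|` or `&` characters. `set_ext_dict` is a hypothetical helper name.

```shell
# set_ext_dict CFG_FILE DICT_PATH
# Rewrite the ext_dict entry of IKAnalyzer.cfg.xml to point at DICT_PATH.
set_ext_dict() {
    cfg_file="$1"
    dict_path="$2"
    tmp_file="$(mktemp)"
    # Replace whatever value the ext_dict entry currently holds.
    sed "s|<entry key=\"ext_dict\">[^<]*</entry>|<entry key=\"ext_dict\">$dict_path</entry>|" \
        "$cfg_file" > "$tmp_file" &&
        # Write back in place so the original file keeps its permissions.
        cat "$tmp_file" > "$cfg_file"
    rm -f "$tmp_file"
}
```

Run from the analysis-ik config directory, the call would be `set_ext_dict IKAnalyzer.cfg.xml mydic.dic`. Back up the file first if it has been customized.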

3. Restart Elasticsearch so that the plugin loads the new dictionary

4. Test

data.json

{
        "analyzer":"ik_max_word",
        "text": "我給祖國獻石油"
}

IK tokenization result before adding the dictionary:

curl -H 'Content-Type: application/json' http://localhost:9200/_analyze?pretty=true -d@data.json
{
  "tokens" : [
    {
      "token" : "我",
      "start_offset" : 0,
      "end_offset" : 1,
      "type" : "CN_CHAR",
      "position" : 0
    },
    {
      "token" : "給",
      "start_offset" : 1,
      "end_offset" : 2,
      "type" : "CN_CHAR",
      "position" : 1
    },
    {
      "token" : "祖國",
      "start_offset" : 2,
      "end_offset" : 4,
      "type" : "CN_WORD",
      "position" : 2
    },
    {
      "token" : "獻",
      "start_offset" : 4,
      "end_offset" : 5,
      "type" : "CN_CHAR",
      "position" : 3
    },
    {
      "token" : "石油",
      "start_offset" : 5,
      "end_offset" : 7,
      "type" : "CN_WORD",
      "position" : 4
    }
  ]
}

IK tokenization result after adding the dictionary; the tokens now include "我給祖國獻石油":

curl -H 'Content-Type: application/json' http://localhost:9200/_analyze?pretty=true -d@data.json
{
  "tokens" : [
    {
      "token" : "我給祖國獻石油",
      "start_offset" : 0,
      "end_offset" : 7,
      "type" : "CN_WORD",
      "position" : 0
    },
    {
      "token" : "祖國",
      "start_offset" : 2,
      "end_offset" : 4,
      "type" : "CN_WORD",
      "position" : 1
    },
    {
      "token" : "獻",
      "start_offset" : 4,
      "end_offset" : 5,
      "type" : "CN_CHAR",
      "position" : 2
    },
    {
      "token" : "石油",
      "start_offset" : 5,
      "end_offset" : 7,
      "type" : "CN_WORD",
      "position" : 3
    }
  ]
}
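
To check the result without reading the whole response, the output can be searched for the new token. This is a minimal sketch that assumes the pretty-printed layout shown above (`"token" : "..."`, as produced by `pretty=true`); `has_token` is a hypothetical helper name.

```shell
# has_token WORD
# Read an _analyze response on stdin; succeed if WORD appears as a token.
has_token() {
    # Fixed-string match on the pretty-printed form: "token" : "<word>"
    grep -qF -- "\"token\" : \"$1\""
}

# Example use against a running node (not executed here):
#   curl -s -H 'Content-Type: application/json' \
#        'http://localhost:9200/_analyze?pretty=true' -d@data.json \
#     | has_token '我給祖國獻石油' && echo "dictionary entry is active"
```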
