Installing Elasticsearch plugins: the IK Chinese analyzer


/**

 * Environment: CentOS 7.2 running under VMware 12

 * Installed version: elasticsearch-2.4.0.tar.gz

 */

Elasticsearch ships with many built-in analyzers (standard, english, chinese, and so on), but they segment Chinese text poorly, so this guide uses ik instead.

Installing the ik analyzer

Download: https://github.com/medcl/elasticsearch-analysis-ik/releases

Version compatibility table: https://github.com/medcl/elasticsearch-analysis-ik

 

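Stop Elasticsearch (close elasticsearch.bat on Windows, or stop the bin/elasticsearch process on Linux), extract the downloaded archive, create a folder named ik inside the plugins folder of the ES directory, and copy all of the extracted files into ik.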

unzip elasticsearch-analysis-ik-1.10.0.zip

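  Confirm that the Elasticsearch version listed in plugin-descriptor.properties matches the installed Elasticsearch version; otherwise Elasticsearch reports an exception.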
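The version check can be scripted. The following is a minimal sketch, assuming the descriptor stores the target version under the elasticsearch.version key; the function names plugin_es_version and check_plugin_version are made up for this example.

```shell
# Print the Elasticsearch version a plugin was built for, as recorded in
# its plugin-descriptor.properties (key: elasticsearch.version).
plugin_es_version() {
    # $1: path to plugin-descriptor.properties
    sed -n 's/^elasticsearch\.version[[:space:]]*=[[:space:]]*//p' "$1" | tr -d '\r'
}

# Succeed only if the descriptor's target version equals the given version.
check_plugin_version() {
    # $1: descriptor path, $2: installed Elasticsearch version (e.g. 2.4.0)
    found=$(plugin_es_version "$1")
    if [ "$found" = "$2" ]; then
        echo "OK: plugin targets Elasticsearch $found"
    else
        echo "MISMATCH: plugin targets '$found', installed is '$2'" >&2
        return 1
    fi
}
```

For the setup in this post, the call would look something like: check_plugin_version plugins/ik/plugin-descriptor.properties 2.4.0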

Add the ik settings to elasticsearch.yml:

index.analysis.analyzer.ik.type: "ik"

    Or add:

index:
  analysis:
    analyzer:
      ik:
          alias: [ik_analyzer]
          type: org.elasticsearch.index.analysis.IkAnalyzerProvider
      ik_max_word:
          type: ik
          use_smart: false
      ik_smart:
          type: ik
          use_smart: true

 

Restart Elasticsearch.
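After the restart, one way to confirm that the plugin was picked up is to look for it in the output of the _cat/plugins API. The helper below is only a sketch: plugin_listed is a hypothetical name, and the component name passed to it (analysis-ik in the usage line) is an assumption; the actual value is the name key in plugin-descriptor.properties.

```shell
# Succeed if the given plugin component appears as a whole word in the
# `_cat/plugins` output read from standard input. Every field is checked
# because node names may themselves contain spaces.
plugin_listed() {
    # $1: plugin component name to look for (assumed, see the descriptor)
    awk -v want="$1" '
        { for (i = 1; i <= NF; i++) if ($i == want) found = 1 }
        END { exit !found }
    '
}
```

Usage, assuming Elasticsearch listens on localhost:9200: curl -s 'http://localhost:9200/_cat/plugins' | plugin_listed analysis-ik && echo "ik is loaded"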

 

 

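Note: do not leave the zip archive in the plugins directory alongside the ik directory; otherwise Elasticsearch fails with the following error (the trailing 不是目錄 in the trace is the localized "Not a directory" message):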

Exception in thread "main" java.lang.IllegalStateException: Could not load plugin descriptor for existing plugin [elasticsearch-analysis-ik-1.10.0.zip]. Was the plugin built before 2.0?
Likely root cause: java.nio.file.FileSystemException: /usr/work/elasticsearch/elasticsearch-2.4.0/plugins/elasticsearch-analysis-ik-1.10.0.zip/plugin-descriptor.properties: 不是目錄
    at sun.nio.fs.UnixException.translateToIOException(UnixException.java:91)
    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
    at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
    at java.nio.file.Files.newByteChannel(Files.java:361)
    at java.nio.file.Files.newByteChannel(Files.java:407)
    at java.nio.file.spi.FileSystemProvider.newInputStream(FileSystemProvider.java:384)
    at java.nio.file.Files.newInputStream(Files.java:152)
    at org.elasticsearch.plugins.PluginInfo.readFromProperties(PluginInfo.java:87)
    at org.elasticsearch.plugins.PluginsService.getPluginBundles(PluginsService.java:378)
    at org.elasticsearch.plugins.PluginsService.<init>(PluginsService.java:128)
    at org.elasticsearch.node.Node.<init>(Node.java:158)
    at org.elasticsearch.node.Node.<init>(Node.java:140)
    at org.elasticsearch.node.NodeBuilder.build(NodeBuilder.java:143)
    at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:194)
    at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:286)
    at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:35)
Refer to the log for complete error details.
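The stack trace suggests that Elasticsearch 2.x treats every entry directly under plugins as a plugin and tries to read plugin-descriptor.properties inside it. A small pre-start check along these lines can catch a leftover archive; stray_plugin_files is a hypothetical helper name, not part of Elasticsearch.

```shell
# List every entry directly under the plugins directory that is not a
# directory. Such entries (for example a leftover zip) would be probed
# for a plugin descriptor at startup and trigger the error shown above.
stray_plugin_files() {
    # $1: path to the Elasticsearch plugins directory
    for entry in "$1"/*; do
        [ -e "$entry" ] || continue   # empty directory: the glob matched nothing
        [ -d "$entry" ] || echo "$entry"
    done
}
```

With the paths from the trace, the call would be: stray_plugin_files /usr/work/elasticsearch/elasticsearch-2.4.0/plugins. Any path it prints should be moved out of plugins before starting Elasticsearch.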

 

 

Test:

First, create an index that uses the ik analyzer:

curl -XPUT localhost:9200/local -d '{
    "settings" : {
        "analysis" : {
            "analyzer" : {
                "ik" : {
                    "tokenizer" : "ik"
                }
            }
        }
    },
    "mappings" : {
        "article" : {
            "dynamic" : true,
            "properties" : {
                "title" : {
                    "type" : "string",
                    "analyzer" : "ik"
                }
            }
        }
    }
}'

 

Then test the analyzer against the local index created above:

curl 'http://localhost:9200/local/_analyze?analyzer=ik&pretty=true' -d' 
{ 
    "text":"中華人民共和國國歌" 
} 
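' 

The response follows. Its first token, text, is the JSON key itself, which suggests the request body was analyzed as raw text rather than parsed as JSON:
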
{
  "tokens" : [ {
    "token" : "text",
    "start_offset" : 12,
    "end_offset" : 16,
    "type" : "ENGLISH",
    "position" : 1
  }, {
    "token" : "中華人民共和國",
    "start_offset" : 19,
    "end_offset" : 26,
    "type" : "CN_WORD",
    "position" : 2
  }, {
    "token" : "國歌",
    "start_offset" : 26,
    "end_offset" : 28,
    "type" : "CN_WORD",
    "position" : 3
  } ]
}
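When comparing analyzers it is easier to look at the tokens alone. The filter below is a rough sketch that pulls the token values out of a pretty-printed _analyze response; it relies on plain text matching of the "token" : "..." lines rather than real JSON parsing, and analyze_tokens is a name invented for this example.

```shell
# Print one token per line from a pretty-printed _analyze response on stdin.
# Only lines of the form  "token" : "<value>"  are matched; everything
# else (offsets, types, positions) is dropped.
analyze_tokens() {
    sed -n 's/.*"token" : "\([^"]*\)".*/\1/p'
}
```

To use it, pipe the output of the curl _analyze request shown above (with -s added to silence progress output) into analyze_tokens.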

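To get the finest-grained segmentation, configure the analyzers in elasticsearch.yml as follows and use ik_max_word (use_smart: false):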

index:
  analysis:
    analyzer:
      ik:
          alias: [ik_analyzer]
          type: org.elasticsearch.index.analysis.IkAnalyzerProvider
      ik_smart:
          type: ik
          use_smart: true
      ik_max_word:
          type: ik
          use_smart: false

 

