Solr: synonyms.txt and stopwords.txt


Premise: the filter factories are applied at both index time and query time.

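1. With StandardTokenizerFactory, query-time entries in synonyms.txt can only be single terms or mappings like 植物 => 動物; a mapping like "英雄 => 植物" does not work. The reason is that StandardTokenizerFactory splits Chinese text character by character by default, so you simply control the degree of matching directly. If you need word-level segmentation, use IK.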

2. With WhitespaceTokenizerFactory, on the other hand, "英雄 => 植物" works without any problem (tokenization should be fine as well).

3. For stopwords.txt: if you add an entry such as "一", that token is ignored in every search.

4. In detail, a sample synonyms.txt:

# blank lines and lines starting with pound are comments.

#Explicit mappings match any token sequence on the LHS of "=>"
#and replace with all alternatives on the RHS. These types of mappings
#ignore the expand parameter in the schema.
#Examples:
#-----------------------------------------------------------------------
#some test synonym mappings unlikely to appear in real input text
aaafoo => aaabar
bbbfoo => bbbfoo bbbbar
cccfoo => cccbar cccbaz
fooaaa,baraaa,bazaaa

# Some synonym groups specific to this example
GB,gib,gigabyte,gigabytes
MB,mib,megabyte,megabytes
Television, Televisions, TV, TVs
#notice we use "gib" instead of "GiB" so any WordDelimiterFilter coming
#after us won't split it into two words.
飛利浦刮胡刀,飛利浦剃須刀

# Synonym mappings can be used for spelling correction too
pixima => pixma
a\,a => b\,b
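To make the rule formats and the tokenizer remarks above more concrete, here is a rough Python sketch. It is only a simplified model, not Solr's implementation: `parse_synonyms`, `char_tokens`, `whitespace_tokens` and `analyze` are hypothetical helper names, `char_tokens` is a crude stand-in for per-character splitting of Chinese text, applying the stop filter before the synonym step is an assumption, and the rule "英雄 => 植物" is the example from the notes written in synonyms.txt syntax. Real behavior depends on the analyzer chain configured in the schema.

```python
import re

def split_terms(text):
    """Split on commas that are not escaped as '\\,', then unescape and trim."""
    parts = re.split(r"(?<!\\),", text)
    return [p.replace("\\,", ",").strip() for p in parts if p.strip()]

def parse_synonyms(lines, expand=True):
    """Map each source term to its replacement terms (simplified model)."""
    mapping = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and '#' lines are comments
        if "=>" in line:
            # Explicit mapping: every LHS term is replaced by all RHS alternatives.
            lhs, rhs = line.split("=>", 1)
            targets = split_terms(rhs)
            for source in split_terms(lhs):
                mapping.setdefault(source, []).extend(targets)
        else:
            # Equivalence group: with expand, every term maps to the whole group;
            # without it, every term maps to the first term only.
            group = split_terms(line)
            targets = group if expand else group[:1]
            for term in group:
                mapping.setdefault(term, []).extend(targets)
    return mapping

def char_tokens(text):
    """Stand-in for per-character splitting: one token per non-space character."""
    return [ch for ch in text if not ch.isspace()]

def whitespace_tokens(text):
    """Split on whitespace only, so '英雄' stays a single token."""
    return text.split()

def analyze(text, tokenizer, stopwords, synonyms):
    """Tokenize, drop stopwords, then replace tokens that have a synonym rule."""
    out = []
    for tok in tokenizer(text):
        if tok in stopwords:
            continue
        out.extend(synonyms.get(tok, [tok]))
    return out

SAMPLE = [
    "# comment line",
    "aaafoo => aaabar",
    "bbbfoo => bbbfoo bbbbar",
    "GB,gib,gigabyte,gigabytes",
    "pixima => pixma",
    "a\\,a => b\\,b",
    "英雄 => 植物",
]
STOPWORDS = {"一"}

if __name__ == "__main__":
    syn = parse_synonyms(SAMPLE)
    print(analyze("英雄 一 pixima", whitespace_tokens, STOPWORDS, syn))
    print(analyze("英雄", char_tokens, STOPWORDS, syn))
```

In this model the whitespace run yields `['植物', 'pixma']`: the stopword "一" is dropped and both rules fire. The per-character run yields `['英', '雄']`, because no single-character token equals the left-hand side "英雄", which is one way to read the restriction described in item 1.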

