Lucene--FuzzyQuery and WildcardQuery (wildcards)


FuzzyQuery:

Building the index:

// Written against the Lucene 2.x API (Field.Index.TOKENIZED, Hits)
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;

// create = true builds a new index at path; false would append to an existing one
IndexWriter writer = new IndexWriter(path, new StandardAnalyzer(), true);
writer.setUseCompoundFile(false);

// One document per word, so every hit maps back to exactly one term
Document doc1 = new Document();
Document doc2 = new Document();
Document doc3 = new Document();
Document doc4 = new Document();
Document doc5 = new Document();
Document doc6 = new Document();

Field f1 = new Field("content", "word", Field.Store.YES, Field.Index.TOKENIZED);
Field f2 = new Field("content", "work", Field.Store.YES, Field.Index.TOKENIZED);
Field f3 = new Field("content", "seed", Field.Store.YES, Field.Index.TOKENIZED);
Field f4 = new Field("content", "sword", Field.Store.YES, Field.Index.TOKENIZED);
Field f5 = new Field("content", "world", Field.Store.YES, Field.Index.TOKENIZED);
Field f6 = new Field("content", "ford", Field.Store.YES, Field.Index.TOKENIZED);

doc1.add(f1);
doc2.add(f2);
doc3.add(f3);
doc4.add(f4);
doc5.add(f5);
doc6.add(f6);

writer.addDocument(doc1);
writer.addDocument(doc2);
writer.addDocument(doc3);
writer.addDocument(doc4);
writer.addDocument(doc5);
writer.addDocument(doc6);

// Flush everything to disk and release the index lock
writer.close();

  

 

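Note: the create argument of the IndexWriter constructor is usually set to true, which builds a new index. Passing false opens an existing index and appends to it.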

Searching:

 

IndexSearcher searcher = new IndexSearcher(path);

// Build a Term, then run a fuzzy search on it
Term t = new Term("content", "work");
FuzzyQuery query = new FuzzyQuery(t);

// FuzzyQuery has two more constructors that limit how fuzzy a match may be.
// The default minimum similarity is 0.5. The lower this value, the less similar
// a term has to be in order to match, so more documents come back; the higher
// the value, the fewer.
FuzzyQuery query1 = new FuzzyQuery(t, 0.1f);
FuzzyQuery query2 = new FuzzyQuery(t, 0.1f, 1);

Hits hits = searcher.search(query2);
for (int i = 0; i < hits.length(); i++) {
    System.out.println(hits.doc(i));
}
searcher.close();

  

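FuzzyQuery has three constructors. Their parameters work as follows, taking the third constructor as the example:

The first parameter is the Term to match. The second is the minimum similarity, which is derived from the Levenshtein (edit distance) algorithm. The third is the number of leading characters (the prefix) that must match exactly.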
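To make the two numeric parameters concrete, here is a small stand-alone sketch in plain Java that does not use Lucene at all. It assumes the scoring rule described for older Lucene versions, similarity = 1 - editDistance / length of the shorter term, and a strict "greater than" comparison against the minimum similarity. The class and method names (FuzzySimilaritySketch, distance, similarity, matches) are made up for this illustration and are not part of the Lucene API.

```java
import java.util.ArrayList;
import java.util.List;

class FuzzySimilaritySketch {

    // Levenshtein distance: insertions, deletions and substitutions each cost 1.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Assumed rule: 1 - distance / length of the shorter of the two terms.
    static float similarity(String query, String candidate) {
        int shorter = Math.min(query.length(), candidate.length());
        if (shorter == 0) {
            return query.equals(candidate) ? 1.0f : 0.0f;
        }
        return 1.0f - (float) distance(query, candidate) / shorter;
    }

    // Keep terms that share the first prefixLength characters with the query
    // and whose similarity is strictly above minSimilarity (an assumption).
    static List<String> matches(String query, float minSimilarity, int prefixLength, String... terms) {
        String prefix = query.substring(0, Math.min(prefixLength, query.length()));
        List<String> result = new ArrayList<>();
        for (String term : terms) {
            if (term.startsWith(prefix) && similarity(query, term) > minSimilarity) {
                result.add(term);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String[] terms = {"word", "work", "seed", "sword", "world", "ford"};
        for (String term : terms) {
            System.out.println(term + " -> " + similarity("work", term));
        }
        // Same settings as query2: minimum similarity 0.1, prefix length 1
        System.out.println(matches("work", 0.1f, 1, terms)); // prints [word, work, world]
    }
}
```

With a prefix length of 1, "sword" and "ford" are dropped even though they are only two edits away from "work", because they do not start with "w". Setting the prefix length to 0 in this sketch lets them back in.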

 

WildcardQuery:

Wildcards are even simpler. All you need to know is that "*" matches zero or more characters and "?" matches exactly one character:

 

IndexSearcher searcher = new IndexSearcher(path);

// "?o*": any single character, then "o", then anything
Term t1 = new Term("content", "?o*");
WildcardQuery query = new WildcardQuery(t1);

Hits hits = searcher.search(query);
for (int i = 0; i < hits.length(); i++) {
    System.out.println(hits.doc(i));
}
searcher.close();
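To see which of the six indexed words the pattern "?o*" should pick out, the wildcard rules can be tried without an index. The sketch below translates the pattern into a java.util.regex expression. This is only an illustration of the matching semantics, not how Lucene evaluates a WildcardQuery internally, and the names WildcardSketch, toRegex and matches are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class WildcardSketch {

    // Translate a wildcard pattern: '*' -> ".*", '?' -> ".", everything else literal.
    static Pattern toRegex(String wildcard) {
        StringBuilder regex = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL);
    }

    // Return the terms that match the whole pattern, in input order.
    static List<String> matches(String wildcard, String... terms) {
        Pattern pattern = toRegex(wildcard);
        List<String> result = new ArrayList<>();
        for (String term : terms) {
            if (pattern.matcher(term).matches()) {
                result.add(term);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String[] terms = {"word", "work", "seed", "sword", "world", "ford"};
        System.out.println(matches("?o*", terms)); // prints [word, work, world, ford]
    }
}
```

"seed" fails because its second character is "e", and "sword" fails because its second character is "w".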

  

That's all!

