What is Lucene.Net?

Apache Lucene.Net is a high-performance, full-featured text search engine library written entirely in C#. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform applications.

Lucene.Net Features

Scalable, High-Performance Indexing

√ Indexes more than 20 MB per minute (on a Pentium M 1.5 GHz)

√ Small RAM requirements: only 1 MB of heap

√ Incremental indexing is as fast as batch indexing

√ Index size is roughly 20-30% of the size of the indexed text

Powerful, Accurate and Efficient Search Algorithms

√ Ranked searching: the best results are returned first

√ Many powerful query types: phrase queries, wildcard queries, proximity queries, range queries, and more

√ Fielded searching (for example title, author, body)

√ Date-range searching

√ Sorting by any field

√ Multiple-index searching with merged results

√ Simultaneous update and searching

Lucene.Net Class Overview

using Lucene.Net.Documents;
- Provides a simple Document class. A document is simply a set of named Fields, whose values may be strings or instances of System.IO.TextReader (the .NET counterpart of the java.io.Reader named in the Java documentation).
using Lucene.Net.Analysis;
- Defines an abstract Analyzer API for converting text from a TextReader into a TokenStream, an enumeration of Tokens. A TokenStream is composed by applying TokenFilters to the output of a Tokenizer. A few Analyzer implementations are provided, including StopAnalyzer and the grammar-based StandardAnalyzer.
using Lucene.Net.Index;
- Provides two primary classes: IndexWriter, which creates indexes and adds documents to them, and IndexReader, which accesses the data in the index.
using Lucene.Net.QueryParsers;
- Implements a QueryParser.
using Lucene.Net.Search;
- Provides data structures to represent queries: TermQuery for individual words, PhraseQuery for phrases, and BooleanQuery for boolean combinations of queries. The abstract Searcher turns queries into Hits. IndexSearcher implements search over a single IndexReader.
using Lucene.Net.Store;
- Defines an abstract class for storing persistent data, the Directory: a collection of named files written via an IndexOutput and read via an IndexInput. Two implementations are provided: FSDirectory, which uses a file system directory to store files, and RAMDirectory, which implements files as memory-resident data structures.
using Lucene.Net.Util;
- Contains a few handy data structures, such as BitVector and PriorityQueue.
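
The Lucene.Net.Analysis entry above describes a TokenStream as a Tokenizer's output with TokenFilters applied on top. The sketch below imitates that chain in plain C# so the idea can be tried without the library. Everything in it (ToyAnalysis, Tokenize, LowerCase, RemoveStopWords, the stop-word list) is a hypothetical name chosen for illustration and is not Lucene.Net's API; the real analyzers handle many more cases.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy illustration only: these types are hypothetical and are NOT part of Lucene.Net.
public static class ToyAnalysis
{
    // Stage 1, the "tokenizer": cut the raw text into runs of letters and digits.
    public static IEnumerable<string> Tokenize(string text)
    {
        int start = -1;
        for (int i = 0; i <= text.Length; i++)
        {
            bool inToken = i < text.Length && char.IsLetterOrDigit(text[i]);
            if (inToken && start < 0)
            {
                start = i;                                  // a new token begins here
            }
            else if (!inToken && start >= 0)
            {
                yield return text.Substring(start, i - start); // the token ended just before i
                start = -1;
            }
        }
    }

    // Stage 2, a "token filter": normalize every token to lower case.
    public static IEnumerable<string> LowerCase(IEnumerable<string> tokens)
        => tokens.Select(t => t.ToLowerInvariant());

    // Assumed stop-word list, kept tiny for the example.
    private static readonly HashSet<string> StopWords =
        new HashSet<string> { "a", "an", "and", "the", "of", "to" };

    // Stage 3, another "token filter": drop very common words.
    public static IEnumerable<string> RemoveStopWords(IEnumerable<string> tokens)
        => tokens.Where(t => !StopWords.Contains(t));

    // The "analyzer": the tokenizer with both filters chained on its output.
    public static List<string> Analyze(string text)
        => RemoveStopWords(LowerCase(Tokenize(text))).ToList();
}

public static class AnalysisDemo
{
    public static void Main()
    {
        List<string> tokens = ToyAnalysis.Analyze("The Quick Brown Fox, and the lazy dog");
        Console.WriteLine(string.Join(" | ", tokens));
        // prints: quick | brown | fox | lazy | dog
    }
}
```

Each stage takes and returns a sequence of tokens, so filters can be stacked in any order, which is the design the TokenFilter description points at.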

Getting Ready

1. Download Lucene.Net from http://lucenenet.apache.org/

2. Add a reference to Lucene.Net.dll in your project

Hello World

The following simple program shows how to index and search text with Lucene.Net.

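using System;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;
using Lucene.Net.Store;

// Written against the Lucene.Net 2.x API; Hits and FSDirectory.GetDirectory
// were removed in later releases.
class Program
{
    static void Main(string[] args)
    {
        // Open (or create) the index
        Directory directory = FSDirectory.GetDirectory("LuceneIndex");
        Analyzer analyzer = new StandardAnalyzer();
        IndexWriter writer = new IndexWriter(directory, analyzer);

        // Count the documents already in the index so new ids continue from there
        IndexReader reader = IndexReader.Open(directory);
        int totDocs = reader.MaxDoc();
        reader.Close();

        // Add documents to the index, one per input line; an empty line ends the input
        string text;
        Console.WriteLine("Enter the text you want to add to the index:");
        Console.Write(">");

        int txts = totDocs;
        int j = 0;
        while (!string.IsNullOrEmpty(text = Console.ReadLine()))
        {
            AddTextToIndex(txts++, text, writer);
            j++;
            Console.Write(">");
        }

        // Merge segments, then close; Close() also flushes pending changes
        writer.Optimize();
        writer.Close();

        Console.WriteLine(j + " lines added, " + txts + " documents total");

        // Search the "postBody" field
        IndexSearcher searcher = new IndexSearcher(directory);
        QueryParser parser = new QueryParser("postBody", analyzer);

        Console.WriteLine("Enter the text to search for:");
        Console.Write(">");

        while (!string.IsNullOrEmpty(text = Console.ReadLine()))
        {
            Search(text, searcher, parser);
            Console.Write(">");
        }

        // Release resources
        searcher.Close();
        directory.Close();
    }

    // Run one query and print the hits
    private static void Search(string text, IndexSearcher searcher, QueryParser parser)
    {
        // Build the query from the user's text
        Query query = parser.Parse(text);

        // Execute the search
        Hits hits = searcher.Search(query);

        // Show the results
        Console.WriteLine("Searching for '" + text + "'");
        int results = hits.Length();
        Console.WriteLine("Found {0} result(s)", results);

        for (int i = 0; i < results; i++)
        {
            Document doc = hits.Doc(i);
            float score = hits.Score(i);
            Console.WriteLine("--Result {0}, score {1}", i + 1, score);
            Console.WriteLine("--ID: {0}", doc.Get("id"));
            Console.WriteLine("--Text found: {0}" + Environment.NewLine, doc.Get("postBody"));
        }
    }

    // Add one document to the index
    private static void AddTextToIndex(int txts, string text, IndexWriter writer)
    {
        Document doc = new Document();
        // "id" holds the running document number (the original stored the text here by mistake)
        doc.Add(new Field("id", txts.ToString(), Field.Store.YES, Field.Index.UN_TOKENIZED));
        // "postBody" is analyzed so that its individual words are searchable
        doc.Add(new Field("postBody", text, Field.Store.YES, Field.Index.TOKENIZED));
        writer.AddDocument(doc);
    }
}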

Output (the screenshot of the console run is not included)

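First-Use Summary

Example: Hello World

To use Lucene.Net, an application needs to do the following:

1. Create Document objects by adding a series of Fields to each one.

2. Create an IndexWriter and call its AddDocument() method to add the documents to the index.

3. Call QueryParser.Parse() on a string to build a Query object.

4. Create an IndexSearcher and pass the Query object to its Search() method.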
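
To make the four steps above more concrete, here is a minimal sketch of what an index conceptually does behind them: it maps each term to the documents that contain it, then ranks documents for a query. ToyIndex and its members are hypothetical names invented for this example. The scoring (a plain sum of term counts) is an assumption chosen for simplicity and is not how Lucene.Net scores hits.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy illustration only: ToyIndex is hypothetical and is NOT Lucene.Net's API or its scoring.
public sealed class ToyIndex
{
    private readonly List<string> _docs = new List<string>();

    // term -> (document id -> number of times the term occurs in that document)
    private readonly Dictionary<string, Dictionary<int, int>> _postings =
        new Dictionary<string, Dictionary<int, int>>();

    // Very rough analysis: lower-case, then split on spaces and basic punctuation.
    private static IEnumerable<string> Terms(string text) =>
        text.ToLowerInvariant()
            .Split(new[] { ' ', ',', '.', ';', ':', '!', '?' }, StringSplitOptions.RemoveEmptyEntries);

    // Steps 1 and 2: store the document and record every term it contains.
    public int AddDocument(string body)
    {
        int id = _docs.Count;
        _docs.Add(body);
        foreach (string term in Terms(body))
        {
            if (!_postings.TryGetValue(term, out Dictionary<int, int> counts))
                _postings[term] = counts = new Dictionary<int, int>();
            counts.TryGetValue(id, out int n);   // n is 0 when the term is new for this document
            counts[id] = n + 1;
        }
        return id;
    }

    // Steps 3 and 4: split the query into terms, sum the term counts per document,
    // and return (document id, score) pairs with the highest score first.
    public List<KeyValuePair<int, int>> Search(string queryText)
    {
        var scores = new Dictionary<int, int>();
        foreach (string term in Terms(queryText).Distinct())
        {
            if (!_postings.TryGetValue(term, out Dictionary<int, int> counts)) continue;
            foreach (KeyValuePair<int, int> entry in counts)
            {
                scores.TryGetValue(entry.Key, out int s);
                scores[entry.Key] = s + entry.Value;
            }
        }
        return scores.OrderByDescending(p => p.Value).ThenBy(p => p.Key).ToList();
    }

    public string GetDocument(int id) => _docs[id];
}

public static class ToyIndexDemo
{
    public static void Main()
    {
        var index = new ToyIndex();
        index.AddDocument("Lucene is a search library");
        index.AddDocument("Search, search and search again");
        index.AddDocument("Nothing relevant here");

        foreach (KeyValuePair<int, int> hit in index.Search("search library"))
            Console.WriteLine("doc {0} (score {1}): {2}", hit.Key, hit.Value, index.GetDocument(hit.Key));
        // prints:
        // doc 1 (score 3): Search, search and search again
        // doc 0 (score 2): Lucene is a search library
    }
}
```

The lookup touches only the postings of the query terms rather than scanning every document, which is why an inverted index keeps searches fast as the collection grows.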
