Lucene Tutorial (4): Updating and Deleting Index Documents

Reposted article, 2018-07-24 15:46, Lucene

This article builds on the previous one and reuses its IndexUtil class. The examples below no longer show the whole class, only the relevant methods.

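Version 3.5:

First, here is a check() method for observing how the index changes:

/**
* Inspect the index: print the live, total, and deleted document counts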
*/
public static void check() {
IndexReader indexReader = null;
try {
Directory directory = FSDirectory.open( new File("F:/test/lucene/index"));
indexReader = IndexReader.open(directory);
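// The reader exposes the document counts directly
// Live (non-deleted) documents
System.out.println( "Live documents: " + indexReader.numDocs());
// All document slots, including those marked as deleted
System.out.println( "Total documents: " + indexReader.maxDoc());
// "Deleted" is not quite accurate: these documents are only marked as deleted and still sit in a kind of recycle bin
System.out.println( "Deleted documents: " + indexReader.numDeletedDocs());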
} catch (Exception e) {
e.printStackTrace();
} finally {
try {
if (indexReader != null) {
indexReader.close();
}
} catch (Exception e) {
e.printStackTrace();
}
}
}

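Next, run the index-building method first and then check(). The result:

Live documents: 3
Total documents: 3
Deleted documents: 0

Now delete a document from the index. For example:

/**
* Delete documents from the index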
*/
public static void delete() {
IndexWriter indexWriter = null;
try {
Directory directory = FSDirectory.open( new File("F:/test/lucene/index"));
Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_35);
IndexWriterConfig indexWriterConfig = new IndexWriterConfig(Version.LUCENE_35, analyzer);
indexWriter = new IndexWriter(directory, indexWriterConfig);
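/**
* deleteDocuments() accepts either a Query or a Term; a Term matches one exact value.
*
* The matching documents are not physically removed at this point. They are only
* marked as deleted (a kind of recycle bin) and, in 3.5, can still be restored.
*/
// Option 1: delete by Term
/**
* Term constructor: the first argument is the field name, the second is the field value
*/
indexWriter.deleteDocuments( new Term("id", "1"));
// Option 2: delete by Query
/**
* Build a Query; every document it matches is deleted
*/
QueryParser queryParser = new QueryParser(Version.LUCENE_35, "content", analyzer);
// Query for documents whose content field contains "Lucene"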
Query query = queryParser.parse( "Lucene");
// indexWriter.deleteDocuments(query);
} catch (Exception e) {
e.printStackTrace();
} finally {
try {
if (indexWriter != null) {
indexWriter.close();
}
} catch (Exception e) {
e.printStackTrace();
}
}
}

The test:

@Test
public void testDelete() {
IndexUtil.delete();
IndexUtil.check();
}

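After running it:

Live documents: 2
Total documents: 3
Deleted documents: 1

The deleted document has gone to the recycle bin; it has not been physically removed. The code above deleted by Term. Does deleting by Query work as well? Swap the commented-out lines: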

// indexWriter.deleteDocuments(new Term("id", "1"));
// Option 2: delete by Query
/**
* Build a Query; every document it matches is deleted
*/
QueryParser queryParser = new QueryParser(Version.LUCENE_35, "content", analyzer);
// Query for documents whose content field contains "Lucene"
Query query = queryParser.parse( "Lucene");
indexWriter.deleteDocuments(query);

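Run the test method again:

Live documents: 1
Total documents: 3
Deleted documents: 2

One more document is now marked as deleted, because the document matched by the Query is not the same one as the document with id 1. That covers how to use both ways of deleting.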
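The counts printed so far can be reproduced with a small toy model. The sketch below is plain Java with no Lucene dependency and does not reflect how Lucene is implemented; the class name SoftDeleteModel and the sample contents are made up for illustration. It only shows the bookkeeping: a delete sets a mark, maxDoc keeps counting the marked slots, and numDocs is maxDoc minus the marked ones.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Toy model of mark-as-deleted bookkeeping; NOT Lucene's implementation. */
class SoftDeleteModel {
    private final List<String[]> docs = new ArrayList<>(); // each entry: {id, content}
    private final BitSet deleted = new BitSet();           // bit i set = doc i is marked as deleted

    void add(String id, String content) {
        docs.add(new String[] {id, content});
    }

    /** Marks documents whose id equals the value exactly (stands in for delete-by-Term). */
    void deleteById(String id) {
        for (int i = 0; i < docs.size(); i++) {
            if (docs.get(i)[0].equals(id)) deleted.set(i);
        }
    }

    /** Marks documents whose content contains the word (a crude stand-in for delete-by-Query). */
    void deleteByContent(String word) {
        for (int i = 0; i < docs.size(); i++) {
            if (docs.get(i)[1].toLowerCase().contains(word.toLowerCase())) deleted.set(i);
        }
    }

    int maxDoc() { return docs.size(); }                    // all slots, marked or not
    int numDeletedDocs() { return deleted.cardinality(); }  // marked slots only
    int numDocs() { return maxDoc() - numDeletedDocs(); }   // live documents

    /** Returns "live/total/deleted". */
    String counts() { return numDocs() + "/" + maxDoc() + "/" + numDeletedDocs(); }

    /** Three hypothetical documents; only the one with id 2 mentions Lucene. */
    static SoftDeleteModel sample() {
        SoftDeleteModel m = new SoftDeleteModel();
        m.add("1", "hello world");
        m.add("2", "hello lucene");
        m.add("3", "hello java");
        return m;
    }

    public static void main(String[] args) {
        SoftDeleteModel m = sample();
        System.out.println(m.counts()); // 3/3/0
        m.deleteById("1");
        System.out.println(m.counts()); // 2/3/1
        m.deleteByContent("Lucene");
        System.out.println(m.counts()); // 1/3/2
    }
}
```

The total never shrinks in this model, which matches the output above: only a merge that rewrites the data can bring it down.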
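Now suppose the deletion was a mistake and the documents need to come back. Here is how to restore deleted documents:

/**
* Restore deleted documents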
*/
public static void unDelete() {
// In 3.5 restoring is done through an IndexReader
IndexReader indexReader = null;
try {
Directory directory = FSDirectory.open( new File("F:/test/lucene/index"));
// To restore, the IndexReader must be opened with readOnly set to false
// true is fine when the index is not modified, but undeleteAll() changes the index, so it has to be false here
indexReader = IndexReader.open(directory, false);
indexReader.undeleteAll();
} catch (Exception e) {
e.printStackTrace();
} finally {
try {
if (indexReader != null) {
indexReader.close();
}
} catch (Exception e) {
e.printStackTrace();
}
}
}

Run the test:

@Test
public void testUnDelete() {
IndexUtil.unDelete();
IndexUtil.check();
}

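The result:

Live documents: 3
Total documents: 3
Deleted documents: 0

Everything has been restored.

But now suppose the deletion was right after all and the documents should be removed for good. Going back, uncomment both deletion calls and run the delete method; the result should be:

Live documents: 1
Total documents: 3
Deleted documents: 2

Now the code for permanent deletion:

/**
* Force deletion: physically remove the documents marked as deleted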
*/
public static void forceDelete() {
IndexWriter indexWriter = null;
try {
Directory directory = FSDirectory.open( new File("F:/test/lucene/index"));
Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_35);
IndexWriterConfig indexWriterConfig = new IndexWriterConfig(Version.LUCENE_35, analyzer);
indexWriter = new IndexWriter(directory, indexWriterConfig);
indexWriter.forceMergeDeletes();
} catch (Exception e) {
e.printStackTrace();
} finally {
try {
if (indexWriter != null) {
indexWriter.close();
}
} catch (Exception e) {
e.printStackTrace();
}
}
}

Run the test code:

@Test
public void testForceDelete() {
IndexUtil.forceDelete();
IndexUtil.check();
}

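The result:

Live documents: 1
Total documents: 1
Deleted documents: 0

The two documents are now permanently removed. So much for deletion. Next, how does Lucene update the index?

Note: delete the index files and rebuild the index first.

/**
* Update a document in the index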
*/
public static void update() {
IndexWriter indexWriter = null;
try {
Directory directory = FSDirectory.open( new File("F:/test/lucene/index"));
Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_35);
IndexWriterConfig indexWriterConfig = new IndexWriterConfig(Version.LUCENE_35, analyzer);
indexWriter = new IndexWriter(directory, indexWriterConfig);
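/**
* Lucene has no in-place update: updateDocument() combines two operations, deleting the documents that match the Term and then adding the new document
*/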
Document document = new Document();
document.add( new Field("id", "11", Field.Store.YES, Field.Index.NOT_ANALYZED_NO_NORMS));
document.add( new Field("author", authors[0], Field.Store.YES, Field.Index.NOT_ANALYZED));
document.add( new Field("title", titles[0], Field.Store.YES, Field.Index.ANALYZED));
document.add( new Field("content", contents[1], Field.Store.NO, Field.Index.ANALYZED));
indexWriter.updateDocument( new Term("id", "1"), document);
} catch (Exception e) {
e.printStackTrace();
} finally {
try {
if (indexWriter != null) {
indexWriter.close();
}
} catch (Exception e) {
e.printStackTrace();
}
}
}

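Note that the code above uses the content of the document with id 2, which contains "Lucene". It is used in the search test further down, so compare the results.

Now run the update: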

@Test
public void testUpdate() {
IndexUtil.update();
IndexUtil.check();
}

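The result:

Live documents: 3
Total documents: 4
Deleted documents: 1

Surprising? Work it out: one live document was deleted and one was added, so 3 live documents remain. The total is 3 plus 1, because the deleted document still counts: it has not been physically removed and sits in the recycle bin. Now run the search() method and look at the result:

/**
* Search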
*/
public static void search() {
IndexReader indexReader = null;
try {
// 1. Create the Directory
Directory directory = FSDirectory.open( new File("F:/test/lucene/index"));
// 2. Create the IndexReader
indexReader = IndexReader.open(directory);
// 3. Create the IndexSearcher from the IndexReader
IndexSearcher indexSearcher = new IndexSearcher(indexReader);
// 4. Create the Query to search with
// Use the default standard analyzer
Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_35);
// Search for "Lucene" in content
// The parser determines what to search for; its second argument is the default search field
QueryParser queryParser = new QueryParser(Version.LUCENE_35, "content", analyzer);
// Query for documents whose content field contains "Lucene"
Query query = queryParser.parse( "Lucene");
// 5. Search with the searcher and get the TopDocs
TopDocs topDocs = indexSearcher.search(query, 10);
// 6. Get the ScoreDoc array from the TopDocs
ScoreDoc[] scoreDocs = topDocs.scoreDocs;
for (ScoreDoc scoreDoc : scoreDocs) {
// 7. Load the Document through the searcher and the ScoreDoc
Document document = indexSearcher.doc(scoreDoc.doc);
// 8. Read the needed values from the Document
System.out.println( "id : " + document.get("id"));
System.out.println( "author : " + document.get("author"));
System.out.println( "title : " + document.get("title"));
/**
* Can content be printed? It comes back as null when the field was added with Field.Store.NO (indexed but not stored)
*/
System.out.println( "content : " + document.get("content"));
}
} catch (Exception e) {
e.printStackTrace();
} finally {
try {
if (indexReader != null) {
indexReader.close();
}
} catch (Exception e) {
e.printStackTrace();
}
}
}

@Test
public void testSearch() {
IndexUtil.search();
}

id : 2
author : Tony
title : Hello Lucene
content : null
id : 11
author : Darren
title : Hello World
content : null

Two documents were found, including the new one with id 11, which shows that the update worked.
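The arithmetic behind the update can be sketched with a toy model in plain Java. It has no Lucene dependency and is not Lucene's implementation; UpdateModel and its methods are hypothetical names. It treats an update as "mark every live document with the old id as deleted, then append the new document", which is the behaviour described above.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of "update = delete + add"; NOT Lucene's implementation. */
class UpdateModel {
    private final List<String> ids = new ArrayList<>();    // id held by each document slot
    private final List<Boolean> live = new ArrayList<>();  // false = slot is marked as deleted

    void add(String id) {
        ids.add(id);
        live.add(true);
    }

    /** Marks all live documents carrying oldId as deleted, then appends a document with newId. */
    void update(String oldId, String newId) {
        for (int i = 0; i < ids.size(); i++) {
            if (live.get(i) && ids.get(i).equals(oldId)) {
                live.set(i, false);
            }
        }
        add(newId);
    }

    int maxDoc() { return ids.size(); }

    int numDocs() {
        int n = 0;
        for (boolean isLive : live) {
            if (isLive) n++;
        }
        return n;
    }

    int numDeletedDocs() { return maxDoc() - numDocs(); }

    /** Ids of the documents a search could still return. */
    List<String> liveIds() {
        List<String> result = new ArrayList<>();
        for (int i = 0; i < ids.size(); i++) {
            if (live.get(i)) result.add(ids.get(i));
        }
        return result;
    }

    static UpdateModel sample() {
        UpdateModel m = new UpdateModel();
        m.add("1");
        m.add("2");
        m.add("3");
        return m;
    }

    public static void main(String[] args) {
        UpdateModel m = sample();
        m.update("1", "11");
        System.out.println(m.numDocs() + " " + m.maxDoc() + " " + m.numDeletedDocs()); // 3 4 1
        System.out.println(m.liveIds()); // [2, 3, 11]
        m.update("3", "11");
        System.out.println(m.numDocs() + " " + m.maxDoc() + " " + m.numDeletedDocs()); // 3 5 2
    }
}
```

If the old id matches nothing, the loop marks no slot and update() degenerates into a plain add, so the live count grows by one instead of staying the same.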

Now update the document with id 3 as well:

Document document = new Document();
document.add( new Field("id", "11", Field.Store.YES, Field.Index.NOT_ANALYZED_NO_NORMS));
document.add( new Field("author", authors[0], Field.Store.YES, Field.Index.NOT_ANALYZED));
document.add( new Field("title", titles[0], Field.Store.YES, Field.Index.ANALYZED));
document.add( new Field("content", contents[1], Field.Store.NO, Field.Index.ANALYZED));
indexWriter.updateDocument( new Term("id", "3"), document);

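Run the update() method and look at the result:

Live documents: 3
Total documents: 5
Deleted documents: 2

Here is the problem: the more often the index is updated, the more index files pile up. Is there a way to merge and optimize them? Here is how Lucene merges index files:

/**
* Merge index segments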
*/
public static void merge() {
IndexWriter indexWriter = null;
try {
Directory directory = FSDirectory.open( new File("F:/test/lucene/index"));
Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_35);
IndexWriterConfig indexWriterConfig = new IndexWriterConfig(Version.LUCENE_35, analyzer);
indexWriter = new IndexWriter(directory, indexWriterConfig);
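// Merges the index down to at most 2 segments; documents marked as deleted in the merged segments are dropped
/**
* Important:
*
* Since Lucene 3.5 calling this is discouraged because it is very expensive; Lucene merges segments on its own as needed
*/
// Merge the index into two segments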
indexWriter.forceMerge( 2);
} catch (Exception e) {
e.printStackTrace();
} finally {
try {
if (indexWriter != null) {
indexWriter.close();
}
} catch (Exception e) {
e.printStackTrace();
}
}
}

Run the test:

@Test
public void testMerge() {
IndexUtil.merge();
IndexUtil.check();
}

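The result:

Live documents: 3
Total documents: 3
Deleted documents: 0

The document counts are back to normal. One caveat: calling Lucene's merge or optimize methods by hand is not recommended because they consume a lot of resources. Lucene merges the index automatically, so there is no need to worry about the index files growing in size and number without bound.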
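Why a merge brings the total back down can be pictured with another toy model in plain Java. This is a sketch under stated assumptions, not Lucene's merge code: MergeModel, Doc, and the three-segment layout in sample() are made up, and everything is merged into a single segment rather than the two requested by forceMerge(2). The point is only that a merge copies live documents into a new segment and leaves the marked ones behind.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of merging segments; NOT Lucene's implementation. */
class MergeModel {
    /** One document slot inside a segment. */
    static final class Doc {
        final String id;
        final boolean deleted;

        Doc(String id, boolean deleted) {
            this.id = id;
            this.deleted = deleted;
        }
    }

    /** Total number of slots across all segments, deleted ones included. */
    static int maxDoc(List<List<Doc>> segments) {
        int total = 0;
        for (List<Doc> segment : segments) {
            total += segment.size();
        }
        return total;
    }

    /** Copies only the live documents of every segment into one new segment. */
    static List<Doc> mergeIntoOne(List<List<Doc>> segments) {
        List<Doc> merged = new ArrayList<>();
        for (List<Doc> segment : segments) {
            for (Doc doc : segment) {
                if (!doc.deleted) merged.add(doc);
            }
        }
        return merged;
    }

    /** Hypothetical layout after the two updates: ids 1 and 3 are marked, two new docs were appended. */
    static List<List<Doc>> sample() {
        List<Doc> first = Arrays.asList(new Doc("1", true), new Doc("2", false), new Doc("3", true));
        List<Doc> second = Arrays.asList(new Doc("11", false));
        List<Doc> third = Arrays.asList(new Doc("11", false));
        return Arrays.asList(first, second, third);
    }

    public static void main(String[] args) {
        List<List<Doc>> before = sample();
        System.out.println(maxDoc(before));              // 5
        System.out.println(mergeIntoOne(before).size()); // 3
    }
}
```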

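Version 4.5:

First the check() method, which works the same as in 3.5 except that it uses DirectoryReader:

/**
* Inspect the index: print the live, total, and deleted document counts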
*/
public static void check() {
DirectoryReader directoryReader = null;
try {
Directory directory = FSDirectory.open( new File("F:/test/lucene/index"));
directoryReader = DirectoryReader.open(directory);
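// The reader exposes the document counts directly
// Live (non-deleted) documents
System.out.println( "Live documents: " + directoryReader.numDocs());
// All document slots, including those marked as deleted
System.out.println( "Total documents: " + directoryReader.maxDoc());
// "Deleted" is not quite accurate: these documents are only marked as deleted and still sit in a kind of recycle bin
System.out.println( "Deleted documents: " + directoryReader.numDeletedDocs());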
} catch (Exception e) {
e.printStackTrace();
} finally {
try {
if (directoryReader != null) {
directoryReader.close();
}
} catch (Exception e) {
e.printStackTrace();
}
}
}

Next, the delete method:

/**
* Delete documents from the index
*/
public static void delete() {
IndexWriter indexWriter = null;
try {
Directory directory = FSDirectory.open( new File("F:/test/lucene/index"));
Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_45);
IndexWriterConfig indexWriterConfig = new IndexWriterConfig(Version.LUCENE_45, analyzer);
indexWriter = new IndexWriter(directory, indexWriterConfig);
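/**
* deleteDocuments() accepts either a Query or a Term; a Term matches one exact value.
*
* The matching documents are not physically removed at this point. They are only
* marked as deleted (a kind of recycle bin) until the segments are merged.
*/
// Option 1: delete by Term
/**
* Term constructor: the first argument is the field name, the second is the field value
*/
indexWriter.deleteDocuments( new Term("id", "1"));
// Option 2: delete by Query
/**
* Build a Query; every document it matches is deleted
*/
QueryParser queryParser = new QueryParser(Version.LUCENE_45, "content", analyzer);
// Query for documents whose content field contains "Lucene"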
Query query = queryParser.parse( "Lucene");
// indexWriter.deleteDocuments(query);
} catch (Exception e) {
e.printStackTrace();
} finally {
try {
if (indexWriter != null) {
indexWriter.close();
}
} catch (Exception e) {
e.printStackTrace();
}
}
}

Then run the test and look at the result (remember to run the index-building method first):

@Test
public void testDelete() {
IndexUtil.delete();
IndexUtil.check();
}

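Live documents: 3
Total documents: 3
Deleted documents: 0

Nothing was deleted. Why? A web search shows that other people have run into this as well. The explanation given is that the deletion goes by Term, but the id field was built with a different type at index time, so the Term cannot find anything. Looking at the index-building code below, the likely cause is setIndexed(false): the id value is stored but never indexed, so no Term lookup can match it. Change the index-building method.

Replace this logic: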

FieldType idType = new FieldType();
idType.setStored( true);
idType.setIndexed( false);
idType.setOmitNorms( false);
document.add( new Field("id", ids[i], idType));

with:

document.add(new Field("id", ids[i], StringField.TYPE_STORED));

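Now id is indexed with StringField.TYPE_STORED (indexed as a single untokenized term, and stored), so the Term used for deletion can find it. Try again:

Live documents: 2
Total documents: 3
Deleted documents: 1

Now it works. This does raise a question: the predefined types are frozen and cannot be modified, so it looks as if id can no longer use a custom type, which would make 4.5 less flexible than 3.5. That said, a custom FieldType should work too as long as it is indexed, for example a copy made with new FieldType(StringField.TYPE_STORED) and adjusted before use. Other suggestions are welcome.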
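The difference between a stored field and an indexed field can be illustrated with a toy model in plain Java. It has no Lucene dependency and is not Lucene's implementation; FieldIndexModel and its methods are hypothetical names. Every id is stored, but it only enters the term lookup table when the indexed flag is true, and delete-by-term consults that table alone.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model of stored vs. indexed fields; NOT Lucene's implementation. */
class FieldIndexModel {
    private final List<String> storedIds = new ArrayList<>();            // stored value per doc number
    private final Map<String, List<Integer>> postings = new HashMap<>(); // term -> doc numbers
    private final Set<Integer> deleted = new HashSet<>();                // doc numbers marked as deleted

    /** Always stores the id; adds it to the term lookup table only when indexed is true. */
    void add(String id, boolean indexed) {
        int docNum = storedIds.size();
        storedIds.add(id);
        if (indexed) {
            postings.computeIfAbsent(id, k -> new ArrayList<>()).add(docNum);
        }
    }

    /** Marks the documents found through the term lookup table; returns how many were newly marked. */
    int deleteByTerm(String id) {
        int marked = 0;
        for (int docNum : postings.getOrDefault(id, Collections.<Integer>emptyList())) {
            if (deleted.add(docNum)) marked++;
        }
        return marked;
    }

    /** The stored value is readable whether or not the field was indexed. */
    String storedId(int docNum) {
        return storedIds.get(docNum);
    }

    static FieldIndexModel build(boolean indexed) {
        FieldIndexModel m = new FieldIndexModel();
        m.add("1", indexed);
        m.add("2", indexed);
        m.add("3", indexed);
        return m;
    }

    public static void main(String[] args) {
        System.out.println(build(false).deleteByTerm("1")); // 0: stored only, the term finds nothing
        System.out.println(build(true).deleteByTerm("1"));  // 1: indexed, the term finds the document
    }
}
```

In both cases storedId(0) still returns "1", which is why a stored-only id can be printed from search results yet never matched by a Term.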

Next, the restore method:

/**
* Restore deleted documents
*/
public static void unDelete() {
IndexWriter indexWriter = null;
try {
Directory directory = FSDirectory.open( new File("F:/test/lucene/index"));
Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_45);
IndexWriterConfig indexWriterConfig = new IndexWriterConfig(Version.LUCENE_45, analyzer);
indexWriter = new IndexWriter(directory, indexWriterConfig);
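/**
* Note: unlike 3.5, IndexReader can no longer restore deleted documents (undeleteAll() was removed in 4.0).
* IndexWriter.rollback() is tried here instead, but it only discards changes made since the last commit.
*/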
indexWriter.rollback();
} catch (Exception e) {
e.printStackTrace();
} finally {
try {
if (indexWriter != null) {
indexWriter.close();
}
} catch (Exception e) {
e.printStackTrace();
}
}
}

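This ran into another problem: the documents could not be restored. The likely reason is that rollback() only discards changes made since the last commit, and the deletion had already been committed when delete() closed its writer. IndexReader.undeleteAll() was removed in Lucene 4.0, so there is no direct equivalent of the 3.5 restore.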
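The commit and rollback behaviour can be sketched with a toy model in plain Java. It has no Lucene dependency and is not Lucene's implementation; RollbackModel is a hypothetical name, and the model assumes, as IndexWriter does by default, that closing a writer commits its pending changes. A rollback returns to the last committed state, so it can only undo a delete that has not been committed yet.

```java
import java.util.Set;
import java.util.TreeSet;

/** Toy model of commit/rollback semantics; NOT Lucene's implementation. */
class RollbackModel {
    private Set<String> committed = new TreeSet<>(); // state as of the last commit
    private Set<String> working = new TreeSet<>();   // state including uncommitted changes

    void add(String id) { working.add(id); }

    void delete(String id) { working.remove(id); }

    /** Makes the working state durable. */
    void commit() { committed = new TreeSet<>(working); }

    /** Throws away everything done since the last commit. */
    void rollback() { working = new TreeSet<>(committed); }

    Set<String> docs() { return working; }

    public static void main(String[] args) {
        RollbackModel index = new RollbackModel();
        index.add("1");
        index.add("2");
        index.add("3");
        index.commit();

        index.delete("1");
        index.commit();    // like delete() closing its writer
        index.rollback();  // nothing uncommitted is left to discard
        System.out.println(index.docs()); // [2, 3]

        index.delete("2"); // not committed this time
        index.rollback();  // the pending delete is discarded
        System.out.println(index.docs()); // [2, 3]
    }
}
```

In this model the document with id 1 stays gone after the first rollback, which mirrors the failed restore above, while the uncommitted delete of id 2 is undone.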

Next, forced deletion:

/**
* Force deletion: physically remove the documents marked as deleted
*/
public static void forceDelete() {
IndexWriter indexWriter = null;
try {
Directory directory = FSDirectory.open( new File("F:/test/lucene/index"));
Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_45);
IndexWriterConfig indexWriterConfig = new IndexWriterConfig(Version.LUCENE_45, analyzer);
indexWriter = new IndexWriter(directory, indexWriterConfig);
indexWriter.forceMergeDeletes();
} catch (Exception e) {
e.printStackTrace();
} finally {
try {
if (indexWriter != null) {
indexWriter.close();
}
} catch (Exception e) {
e.printStackTrace();
}
}
}

@Test
public void testForceDelete() {
IndexUtil.forceDelete();
IndexUtil.check();
}

Live documents: 2
Total documents: 2
Deleted documents: 0

The result is correct.

Now the update method:

/**
* Update a document in the index
*/
public static void update() {
IndexWriter indexWriter = null;
try {
Directory directory = FSDirectory.open( new File("F:/test/lucene/index"));
Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_45);
IndexWriterConfig indexWriterConfig = new IndexWriterConfig(Version.LUCENE_45, analyzer);
indexWriter = new IndexWriter(directory, indexWriterConfig);
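/**
* Lucene has no in-place update: updateDocument() combines two operations, deleting the documents that match the Term and then adding the new document
*/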
Document document = new Document();
document.add( new Field("id", "11", StringField.TYPE_STORED));
document.add( new Field("author", authors[0], StringField.TYPE_STORED));
document.add( new Field("title", titles[0], StringField.TYPE_STORED));
document.add( new Field("content", contents[1], TextField.TYPE_NOT_STORED));
indexWriter.updateDocument( new Term("id", "1"), document);
} catch (Exception e) {
e.printStackTrace();
} finally {
try {
if (indexWriter != null) {
indexWriter.close();
}
} catch (Exception e) {
e.printStackTrace();
}
}
}

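Note: the same issue as with deletion applies here. The type used to index id must let the Term match it, just as for deletion. Otherwise nothing is found, and the update only adds a document without deleting the old one.

Again, rebuild the index before testing: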

@Test
public void testUpdate() {
IndexUtil.update();
IndexUtil.check();
}

Live documents: 3
Total documents: 4
Deleted documents: 1

One document was updated. Now update the one with id 3 as well:

Document document = new Document();
document.add( new Field("id", "33", StringField.TYPE_STORED));
document.add( new Field("author", authors[0], StringField.TYPE_STORED));
document.add( new Field("title", titles[0], StringField.TYPE_STORED));
document.add( new Field("content", contents[1], TextField.TYPE_NOT_STORED));
indexWriter.updateDocument( new Term("id", "3"), document);

Test again:

Live documents: 3
Total documents: 5
Deleted documents: 2

The results are all correct. Now merge:

/**
* Merge index segments
*/
public static void merge() {
IndexWriter indexWriter = null;
try {
Directory directory = FSDirectory.open( new File("F:/test/lucene/index"));
Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_45);
IndexWriterConfig indexWriterConfig = new IndexWriterConfig(Version.LUCENE_45, analyzer);
indexWriter = new IndexWriter(directory, indexWriterConfig);
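// Merges the index down to at most 2 segments; documents marked as deleted in the merged segments are dropped
/**
* Important:
*
* Since Lucene 3.5 calling this is discouraged because it is very expensive; Lucene merges segments on its own as needed
*/
// Merge the index into two segments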
indexWriter.forceMerge( 2);
} catch (Exception e) {
e.printStackTrace();
} finally {
try {
if (indexWriter != null) {
indexWriter.close();
}
} catch (Exception e) {
e.printStackTrace();
}
}
}

@Test
public void testMerge() {
IndexUtil.merge();
IndexUtil.check();
}

Live documents: 3
Total documents: 3
Deleted documents: 0

The result matches version 3.5.

Version 5.0:

First the delete method:

/**
* Delete documents from the index
*/
public static void delete() {
IndexWriter indexWriter = null;
try {
Directory directory = FSDirectory.open(FileSystems.getDefault().getPath( "F:/test/lucene/index"));
Analyzer analyzer = new StandardAnalyzer();
IndexWriterConfig indexWriterConfig = new IndexWriterConfig(analyzer);
indexWriter = new IndexWriter(directory, indexWriterConfig);
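/**
* deleteDocuments() accepts either a Query or a Term; a Term matches one exact value.
*
* The matching documents are not physically removed at this point. They are only
* marked as deleted (a kind of recycle bin) until the segments are merged.
*/
// Option 1: delete by Term
/**
* Term constructor: the first argument is the field name, the second is the field value
*/
indexWriter.deleteDocuments( new Term("id", "1"));
// Option 2: delete by Query
/**
* Build a Query; every document it matches is deleted
*/
QueryParser queryParser = new QueryParser("content", analyzer);
// Query for documents whose content field contains "Lucene"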
Query query = queryParser.parse( "Lucene");
// indexWriter.deleteDocuments(query);
} catch (Exception e) {
e.printStackTrace();
} finally {
try {
if (indexWriter != null) {
indexWriter.close();
}
} catch (Exception e) {
e.printStackTrace();
}
}
}

Run the test:

@Test
public void testDelete() {
IndexUtil.delete();
IndexUtil.check();
}

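Live documents: 2
Total documents: 3
Deleted documents: 1

In 5.0 the delete works here without the problem seen in 4.5, where the type used to index id had to match the Term used for deletion.

The restore logic:

/**
* Restore deleted documents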
*/
public static void unDelete() {
IndexWriter indexWriter = null;
try {
Directory directory = FSDirectory.open(FileSystems.getDefault().getPath( "F:/test/lucene/index"));
Analyzer analyzer = new StandardAnalyzer();
IndexWriterConfig indexWriterConfig = new IndexWriterConfig(analyzer);
indexWriter = new IndexWriter(directory, indexWriterConfig);
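/**
* Note: unlike 3.5, IndexReader can no longer restore deleted documents (undeleteAll() was removed in 4.0).
* IndexWriter.rollback() is tried here instead, but it only discards changes made since the last commit.
*/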
indexWriter.rollback();
} catch (Exception e) {
e.printStackTrace();
} finally {
try {
if (indexWriter != null) {
indexWriter.close();
}
} catch (Exception e) {
e.printStackTrace();
}
}
}

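As in 4.5, the documents are not restored: the deletion was already committed when delete() closed its writer, and rollback() only undoes uncommitted changes. This still needs further investigation.

The rest of the code:

/**
* Update a document in the index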
*/
public static void update() {
IndexWriter indexWriter = null;
try {
Directory directory = FSDirectory.open(FileSystems.getDefault().getPath( "F:/test/lucene/index"));
Analyzer analyzer = new StandardAnalyzer();
IndexWriterConfig indexWriterConfig = new IndexWriterConfig(analyzer);
indexWriter = new IndexWriter(directory, indexWriterConfig);
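/**
* Lucene has no in-place update: updateDocument() combines two operations, deleting the documents that match the Term and then adding the new document
*/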
Document document = new Document();
document.add( new Field("id", "33", StringField.TYPE_STORED));
document.add( new Field("author", authors[0], StringField.TYPE_STORED));
document.add( new Field("title", titles[0], StringField.TYPE_STORED));
document.add( new Field("content", contents[1], TextField.TYPE_NOT_STORED));
indexWriter.updateDocument( new Term("id", "1"), document);
} catch (Exception e) {
e.printStackTrace();
} finally {
try {
if (indexWriter != null) {
indexWriter.close();
}
} catch (Exception e) {
e.printStackTrace();
}
}
}
/**
* Inspect the index: print the live, total, and deleted document counts
*/
public static void check() {
DirectoryReader directoryReader = null;
try {
Directory directory = FSDirectory.open(FileSystems.getDefault().getPath( "F:/test/lucene/index"));
directoryReader = DirectoryReader.open(directory);
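// The reader exposes the document counts directly
// Live (non-deleted) documents
System.out.println( "Live documents: " + directoryReader.numDocs());
// All document slots, including those marked as deleted
System.out.println( "Total documents: " + directoryReader.maxDoc());
// "Deleted" is not quite accurate: these documents are only marked as deleted and still sit in a kind of recycle bin
System.out.println( "Deleted documents: " + directoryReader.numDeletedDocs());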
} catch (Exception e) {
e.printStackTrace();
} finally {
try {
if (directoryReader != null) {
directoryReader.close();
}
} catch (Exception e) {
e.printStackTrace();
}
}
}
/**
* Merge index segments
*/
public static void merge() {
IndexWriter indexWriter = null;
try {
Directory directory = FSDirectory.open(FileSystems.getDefault().getPath( "F:/test/lucene/index"));
Analyzer analyzer = new StandardAnalyzer();
IndexWriterConfig indexWriterConfig = new IndexWriterConfig(analyzer);
indexWriter = new IndexWriter(directory, indexWriterConfig);
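// Merges the index down to at most 2 segments; documents marked as deleted in the merged segments are dropped
/**
* Important:
*
* Since Lucene 3.5 calling this is discouraged because it is very expensive; Lucene merges segments on its own as needed
*/
// Merge the index into two segments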
indexWriter.forceMerge( 2);
} catch (Exception e) {
e.printStackTrace();
} finally {
try {
if (indexWriter != null) {
indexWriter.close();
}
} catch (Exception e) {
e.printStackTrace();
}
}
}
/**
* Force deletion: physically remove the documents marked as deleted
*/
public static void forceDelete() {
IndexWriter indexWriter = null;
try {
Directory directory = FSDirectory.open(FileSystems.getDefault().getPath( "F:/test/lucene/index"));
Analyzer analyzer = new StandardAnalyzer();
IndexWriterConfig indexWriterConfig = new IndexWriterConfig(analyzer);
indexWriter = new IndexWriter(directory, indexWriterConfig);
indexWriter.forceMergeDeletes();
} catch (Exception e) {
e.printStackTrace();
} finally {
try {
if (indexWriter != null) {
indexWriter.close();
}
} catch (Exception e) {
e.printStackTrace();
}
}
}

This code differs little from the 4.5 version, and the results are the same as in 4.5, so it is not explained again.
