Orchard: generating the Lucene.Net index


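Orchard is a CMS built by a team at Microsoft. Many people in this community have already analyzed its overall architecture in detail, but not every module has been covered one by one. Since I need Lucene.Net for site search, I studied how Orchard uses it. As an aside, Lucene.Net has reached version 2.9.4; it is still in the Apache Incubator, but it keeps getting updates, which is no small thing.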

Open Modules and enable the Lucene-related features, as shown below.

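First publish a few articles from the admin back end. In fact, as soon as an article is published, Orchard's message-listening mechanism picks the message up from the message queue and generates the index automatically. This post walks through that indexing process.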

Open the Search Index menu under Settings.

The index now contains 4 documents. Clicking Update runs index generation again; the flow is as follows.

 

The entry point is AdminController, under Controllers in the Orchard.Indexing module:

        [HttpPost]
        public ActionResult Update() {
            if (!Services.Authorizer.Authorize(StandardPermissions.SiteOwner, T("Not allowed to manage the search index.")))
                return new HttpUnauthorizedResult();
            // Update the index; DefaultIndexName is the name of the index folder
            _indexingService.UpdateIndex(DefaultIndexName);

            return RedirectToAction("Index");
        }

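Orchard.Indexing.Services.IndexingService does not build the index itself. It only passes the request on to every registered index notifier handler, and index generation happens after a handler has received that notification, as shown here.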

        public void UpdateIndex(string indexName) {
            // Notify every index handler; the index is only generated once a handler has been notified
            foreach(var handler in _indexNotifierHandlers) {
                handler.UpdateIndex(indexName);
            }
            // Afterwards, pass a message through the notifier so it is displayed in the UI
            Services.Notifier.Information(T("The search index has been updated."));
        }
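
The fan-out above can be sketched in isolation. This is a minimal sketch, not Orchard's code: `IIndexHandlerSketch`, `RecordingHandler`, `IndexingServiceSketch` and `FanOutDemo` are hypothetical names standing in for `IIndexNotifierHandler` and `IndexingService`, and the handlers only record that they were called.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for Orchard's IIndexNotifierHandler.
public interface IIndexHandlerSketch {
    void UpdateIndex(string indexName);
}

// A handler that only records the call, so the fan-out is observable.
public sealed class RecordingHandler : IIndexHandlerSketch {
    private readonly string _name;
    private readonly List<string> _log;

    public RecordingHandler(string name, List<string> log) {
        _name = name;
        _log = log;
    }

    public void UpdateIndex(string indexName) {
        _log.Add(_name + ":" + indexName);
    }
}

// The service does no indexing itself; it forwards the request to every handler.
public sealed class IndexingServiceSketch {
    private readonly IEnumerable<IIndexHandlerSketch> _handlers;

    public IndexingServiceSketch(IEnumerable<IIndexHandlerSketch> handlers) {
        _handlers = handlers;
    }

    public void UpdateIndex(string indexName) {
        foreach (var handler in _handlers) {
            handler.UpdateIndex(indexName);
        }
    }
}

public static class FanOutDemo {
    // Wires two handlers to the service and returns what they recorded.
    public static List<string> Run() {
        var log = new List<string>();
        var service = new IndexingServiceSketch(new IIndexHandlerSketch[] {
            new RecordingHandler("scheduler", log),
            new RecordingHandler("audit", log)
        });
        service.UpdateIndex("Search");
        return log;
    }
}
```

Calling `FanOutDemo.Run()` returns one entry per handler, in registration order. The point of the design is that the controller and the service never need to know which component eventually writes the index.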

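Even after the notification has been delivered, the index is still not built on the spot. The notification goes to Orchard.Indexing.Services.UpdateIndexScheduler, which handles index generation as a scheduled task, and the journey continues there.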

        // Add index generation to the scheduled tasks
        public void Schedule(string indexName) {
            var shellDescriptor = _shellDescriptorManager.GetShellDescriptor();
            _processingEngine.AddTask(
                _shellSettings,
                shellDescriptor,
                "IIndexNotifierHandler.UpdateIndex",
                new Dictionary<string, object> { { "indexName", indexName } }
            );
        }

        public void UpdateIndex(string indexName) {
            // Let the executor process one batch; if it returns true, reschedule via Schedule() above
            if (_indexingTaskExecutor.Value.UpdateIndexBatch(indexName)) {
                Schedule(indexName);
            }
        }
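
The "run one batch, then reschedule yourself" loop can be sketched without Orchard. This is a minimal sketch under stated assumptions: `BatchSchedulerSketch` is a hypothetical class, a plain `Queue<Action>` stands in for the processing engine, and the batch method returns true whenever it did any work, which mirrors the BatchIndex listing later in this post (it returns false only when there was nothing to add or delete).

```csharp
using System;
using System.Collections.Generic;

public sealed class BatchSchedulerSketch {
    // Stand-in for the processing engine's task list (assumption, not Orchard's type).
    private readonly Queue<Action> _tasks = new Queue<Action>();
    private readonly int _batchSize;

    public int Remaining { get; private set; }
    public int BatchesRun { get; private set; }

    public BatchSchedulerSketch(int pendingItems, int batchSize) {
        if (pendingItems < 0) throw new ArgumentOutOfRangeException("pendingItems");
        if (batchSize <= 0) throw new ArgumentOutOfRangeException("batchSize");
        Remaining = pendingItems;
        _batchSize = batchSize;
    }

    // Processes at most one batch; returns true if this batch did any work.
    private bool UpdateIndexBatch() {
        BatchesRun++;
        int processed = Math.Min(_batchSize, Remaining);
        Remaining -= processed;
        return processed > 0;
    }

    // Queues one more indexing pass.
    public void Schedule() {
        _tasks.Enqueue(UpdateIndex);
    }

    // One pass: index a batch, and ask for another pass if the batch was not empty.
    public void UpdateIndex() {
        if (UpdateIndexBatch()) {
            Schedule();
        }
    }

    // Drains the queue, the way a background worker would.
    public void RunPending() {
        while (_tasks.Count > 0) {
            _tasks.Dequeue()();
        }
    }
}
```

With 60 pending items and a batch size of 25, `Schedule()` followed by `RunPending()` runs four passes: three that index 25, 25 and 10 items, and a final empty one that stops the loop. Working in small batches keeps any single task short, at the cost of that one extra pass.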

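Once the work has been added as a scheduled task, the only link between the components is the message queue, which has to be read to continue. The core of the process is here.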

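Orchard.Indexing.Services.IndexingTaskExecutor is the class that actually processes indexing tasks. It stays loaded in memory and reads the message queue on a heartbeat; when a new indexing task arrives, it runs the following code.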

        /// <summary>
        /// Indexes a batch of content items
        /// </summary>
        /// <returns>
        /// <c>true</c> if there are more items to process; otherwise, <c>false</c>.
        /// </returns>
        private bool BatchIndex(string indexName, string settingsFilename, IndexSettings indexSettings) {
            var addToIndex = new List<IDocumentIndex>();
            var deleteFromIndex = new List<int>();

            // Rebuilding the index ?
            if (indexSettings.Mode == IndexingMode.Rebuild) {
                Logger.Information("Rebuilding index");
                _indexingStatus = IndexingStatus.Rebuilding;

                // load all content items
                var contentItems = _contentRepository
                    .Fetch(
                        versionRecord => versionRecord.Published && versionRecord.Id > indexSettings.LastContentId,
                        order => order.Asc(versionRecord => versionRecord.Id))
                    .Take(ContentItemsPerLoop)
                    .Select(versionRecord => _contentManager.Get(versionRecord.ContentItemRecord.Id, VersionOptions.VersionRecord(versionRecord.Id)))
                    .Distinct()
                    .ToList();

                // if no more elements to index, switch to update mode
                if (contentItems.Count == 0) {
                    indexSettings.Mode = IndexingMode.Update;
                }

                foreach (var item in contentItems) {
                    try {
                        IDocumentIndex documentIndex = ExtractDocumentIndex(item);

                        if (documentIndex != null && documentIndex.IsDirty) {
                            addToIndex.Add(documentIndex);
                        }

                        indexSettings.LastContentId = item.VersionRecord.Id;
                    }
                    catch (Exception ex) {
                        Logger.Warning(ex, "Unable to index content item #{0} during rebuild", item.Id);
                    }
                }
            }

            if (indexSettings.Mode == IndexingMode.Update) {
                Logger.Information("Updating index");
                _indexingStatus = IndexingStatus.Updating;

                var indexingTasks = _taskRepository
                    .Fetch(x => x.Id > indexSettings.LastIndexedId)
                    .OrderBy(x => x.Id)
                    .Take(ContentItemsPerLoop)
                    .GroupBy(x => x.ContentItemRecord.Id)
                    .Select(group => new {TaskId = group.Max(task => task.Id), Delete = group.Last().Action == IndexingTaskRecord.Delete, Id = group.Key, ContentItem = _contentManager.Get(group.Key, VersionOptions.Published)})
                    .OrderBy(x => x.TaskId)
                    .ToArray();

                foreach (var item in indexingTasks) {
                    try {
                        // item.ContentItem can be null if the content item has been deleted
                        IDocumentIndex documentIndex = ExtractDocumentIndex(item.ContentItem);

                        if (documentIndex == null || item.Delete) {
                            deleteFromIndex.Add(item.Id);
                        }
                        else if (documentIndex.IsDirty) {
                            addToIndex.Add(documentIndex);
                        }

                        indexSettings.LastIndexedId = item.TaskId;
                    }
                    catch (Exception ex) {
                        Logger.Warning(ex, "Unable to index content item #{0} during update", item.Id);
                    }
                }
            }

            // save current state of the index
            indexSettings.LastIndexedUtc = _clock.UtcNow;
            _appDataFolder.CreateFile(settingsFilename, indexSettings.ToXml());

            if (deleteFromIndex.Count == 0 && addToIndex.Count == 0) {
                // nothing more to do
                _indexingStatus = IndexingStatus.Idle;
                return false;
            }

            // save new and updated documents to the index
            try {
                if (addToIndex.Count > 0) {
                    _indexProvider.Store(indexName, addToIndex);
                    Logger.Information("Added content items to index: {0}", addToIndex.Count);
                }
            }
            catch (Exception ex) {
                Logger.Warning(ex, "An error occured while adding a document to the index");
            }

            // removing documents from the index
            try {
                if (deleteFromIndex.Count > 0) {
                    _indexProvider.Delete(indexName, deleteFromIndex);
                    Logger.Information("Removed content items from index: {0}", deleteFromIndex.Count);
                }
            }
            catch (Exception ex) {
                Logger.Warning(ex, "An error occured while removing a document from the index");
            }

            return true;
        }

One important step is pulling the indexing tasks out of the task repository and turning them into Lucene documents:

          var indexingTasks = _taskRepository
                    .Fetch(x => x.Id > indexSettings.LastIndexedId)
                    .OrderBy(x => x.Id)
                    .Take(ContentItemsPerLoop)
                    .GroupBy(x => x.ContentItemRecord.Id)
                    .Select(group => new {TaskId = group.Max(task => task.Id), Delete = group.Last().Action == IndexingTaskRecord.Delete, Id = group.Key, ContentItem = _contentManager.Get(group.Key, VersionOptions.Published)})
                    .OrderBy(x => x.TaskId)
                    .ToArray();

                foreach (var item in indexingTasks) {
                    try {
                        // item.ContentItem can be null if the content item has been deleted
                        IDocumentIndex documentIndex = ExtractDocumentIndex(item.ContentItem);

                        if (documentIndex == null || item.Delete) {
                            deleteFromIndex.Add(item.Id);
                        }
                        else if (documentIndex.IsDirty) {
                            addToIndex.Add(documentIndex);
                        }

                        indexSettings.LastIndexedId = item.TaskId;
                    }
                    catch (Exception ex) {
                        Logger.Warning(ex, "Unable to index content item #{0} during update", item.Id);
                    }
                }
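
The query above collapses several pending tasks for the same content item into a single decision. A minimal in-memory sketch of that step, assuming hypothetical `TaskSketch` and `CollapsedTask` types in place of Orchard's task record and anonymous type, and LINQ to Objects in place of the repository:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for an indexing task row.
public sealed class TaskSketch {
    public int Id;
    public int ContentItemId;
    public bool IsDelete;
}

// One decision per content item after collapsing its tasks.
public sealed class CollapsedTask {
    public int TaskId;
    public int ContentItemId;
    public bool Delete;
}

public static class TaskCollapseSketch {
    public static List<CollapsedTask> Collapse(
        IEnumerable<TaskSketch> tasks, int lastIndexedId, int batchSize) {
        return tasks
            .Where(t => t.Id > lastIndexedId)   // only tasks newer than the checkpoint
            .OrderBy(t => t.Id)                 // oldest first
            .Take(batchSize)                    // one batch at a time
            .GroupBy(t => t.ContentItemId)      // one group per content item
            .Select(g => new CollapsedTask {
                TaskId = g.Max(t => t.Id),      // highest task id becomes the new checkpoint
                ContentItemId = g.Key,
                Delete = g.Last().IsDelete      // the most recent action wins
            })
            .OrderBy(c => c.TaskId)
            .ToList();
    }
}
```

For tasks (1: item 10 update), (2: item 11 update), (3: item 10 delete), (4: item 12 update), the result has three entries, and item 10 is marked for deletion with task id 3, because its latest action was a delete. In LINQ to Objects the elements inside each group keep their source order, which is what makes `g.Last()` the newest task here.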

After the documents have been processed, the code that stores them in the index is:

            // save new and updated documents to the index
            try {
                if (addToIndex.Count > 0) {
                    // Store the documents in the index
                    _indexProvider.Store(indexName, addToIndex);
                    Logger.Information("Added content items to index: {0}", addToIndex.Count);
                }
            }
            catch (Exception ex) {
                Logger.Warning(ex, "An error occured while adding a document to the index");
            }

The final index storage is handled in Lucene.Services.LuceneIndexProvider:

        public void Store(string indexName, IEnumerable<LuceneDocumentIndex> indexDocuments) {
            if (indexDocuments.AsQueryable().Count() == 0) {
                return;
            }

            // Remove any previous document for these content items
            Delete(indexName, indexDocuments.Select(i => i.ContentItemId));

            var writer = new IndexWriter(GetDirectory(indexName), _analyzer, false, IndexWriter.MaxFieldLength.UNLIMITED);
            LuceneDocumentIndex current = null;

            try {

                foreach (var indexDocument in indexDocuments) {
                    current = indexDocument;
                    // Convert the custom indexDocument into a Lucene document
                    var doc = CreateDocument(indexDocument);

                    writer.AddDocument(doc);
                    Logger.Debug("Document [{0}] indexed", indexDocument.ContentItemId);
                }
            }
            catch (Exception ex) {
                Logger.Error(ex, "An unexpected error occured while add the document [{0}] from the index [{1}].", current.ContentItemId, indexName);
            }
            finally {
                writer.Optimize();
                writer.Close();
            }
        }
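
Store first deletes any earlier document for the same content items and only then adds the new ones, so re-indexing an item does not leave duplicates behind. A minimal sketch of that delete-then-add idea, assuming a hypothetical `AppendOnlyIndexSketch` that keeps documents in a plain list rather than in Lucene:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory index: adding never overwrites, just like appending documents.
public sealed class AppendOnlyIndexSketch {
    private readonly List<KeyValuePair<int, string>> _docs =
        new List<KeyValuePair<int, string>>();

    public int Count { get { return _docs.Count; } }

    // Removes every stored document that belongs to one of the given content items.
    public void Delete(IEnumerable<int> contentItemIds) {
        var ids = new HashSet<int>(contentItemIds);
        _docs.RemoveAll(d => ids.Contains(d.Key));
    }

    // Delete-then-add: drop earlier versions first, then append the new documents.
    public void Store(IEnumerable<KeyValuePair<int, string>> documents) {
        var batch = documents.ToList();
        if (batch.Count == 0) {
            return;
        }
        Delete(batch.Select(d => d.Key));
        _docs.AddRange(batch);
    }

    // Returns the stored bodies for one content item.
    public List<string> Find(int contentItemId) {
        return _docs.Where(d => d.Key == contentItemId).Select(d => d.Value).ToList();
    }
}
```

Storing items 1 and 2 and then storing item 1 again leaves two documents in total, and `Find(1)` returns only the newer body. Without the `Delete` call inside `Store`, the second call would leave two documents for item 1.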

 

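At this point the Lucene index has been created. The details of how the messages and tasks are passed along in between still need further study, and corrections from fellow readers are welcome.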

Personal blog: http://www.jqpress.com/ (visitors welcome)

