The _source Field and the store Attribute in Elasticsearch

Reposted article. 2017-05-22 22:57. ELK (Elasticsearch, Logstash, Kibana)

PUT /website/blog/123
{
  "title" : "What is Elasticsearch",
  "author" : "zhangsan",
  "titleScore" : 66.666
}

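After retrieving this document, you will notice a field called _source in the response.

[Note] The _source field is very important at retrieval time.

Besides storing data in the inverted index, Elasticsearch also keeps a copy of the original document.

That original document is what is stored in _source.

In fact, when we search for documents in Elasticsearch and look at their content, what we see is the content of _source.

The _source field can be enabled or disabled when defining the mapping: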

PUT weisite
{
  "mappings": {
    "article": {
      "_source": {"enabled": true},
      "properties": {
        "id": {"type": "text", "store": true},
        "title": {"type": "text", "store": true},
        "readCounts": {"type": "integer", "store": true},
        "times": {"type": "date", "index": false}
      }
    }
  }
}

So what is the _source field for?

ID	_source	Inverted index	ID	Original document
1	{'I love China'}	I love [1,2,3]; China [1]	1	I love China
2	{'I love games'}	games [2]	2	I love games
3	{'I love everything'}	love [1,2,3]	3	I love everything

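1. If _source is disabled (enabled: false), a search for a keyword such as "games" looks the term up in the inverted index, which records the mapping between terms and documents, and finds the matching document IDs. Because the original document was not kept, only the document ID can be returned; the field content cannot be read back from _source.

2. If _source is enabled (enabled: true), the matching IDs are found the same way, but returning the content only requires loading and parsing the stored _source JSON string. The field values do not have to be looked up one by one, so a single read (one I/O) returns the whole document.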
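The two cases can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not how Elasticsearch is implemented: the class ToyIndex, whitespace tokenization, and the three English sample sentences are all hypothetical.

```python
from collections import defaultdict


class ToyIndex:
    """Toy model of an index: an inverted index plus an optional _source store."""

    def __init__(self, source_enabled=True):
        self.source_enabled = source_enabled
        self.inverted = defaultdict(set)  # term -> set of document IDs
        self.source = {}                  # document ID -> original document

    def index(self, doc_id, text):
        # Whitespace tokenization stands in for a real analyzer (assumption).
        for term in text.lower().split():
            self.inverted[term].add(doc_id)
        # The original document is kept only when _source is enabled.
        if self.source_enabled:
            self.source[doc_id] = {"title": text}

    def search(self, term):
        # Step 1: the inverted index yields the matching document IDs.
        ids = sorted(self.inverted.get(term.lower(), set()))
        # Step 2: the original content is attached only if _source was kept.
        hits = []
        for doc_id in ids:
            hit = {"_id": doc_id}
            if self.source_enabled:
                hit["_source"] = self.source[doc_id]
            hits.append(hit)
        return hits


docs = {1: "I love China", 2: "I love games", 3: "I love everything"}

with_source = ToyIndex(source_enabled=True)
without_source = ToyIndex(source_enabled=False)
for doc_id, text in docs.items():
    with_source.index(doc_id, text)
    without_source.index(doc_id, text)

print(with_source.search("games"))     # [{'_id': 2, '_source': {'title': 'I love games'}}]
print(without_source.search("games"))  # [{'_id': 2}]
```

In both cases the inverted index finds document 2; the only difference is whether the original content is available to return alongside the ID.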

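[Note]:

_source is enabled by default. When is it reasonable not to keep it? When a field holds a very large amount of content and the application only needs to search on that field: the search returns the document ID, and the document content is then fetched from MySQL or HBase.

Storing the content of large fields in Elasticsearch only makes the index bigger, and the effect grows with the number of documents. Saving a few KB per document adds up to a substantial amount across hundreds of millions of documents.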
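To get a feel for the scale, here is a back-of-the-envelope calculation. The figures (2 KiB saved per document, 100 million documents) are hypothetical, and the result is the raw size before compression, so the real saving on disk would likely be smaller.

```python
def source_savings_gib(bytes_saved_per_doc, doc_count):
    """Return the raw (pre-compression) bytes saved, expressed in GiB."""
    return bytes_saved_per_doc * doc_count / 1024 ** 3


# Hypothetical figures: 2 KiB saved per document, 100 million documents.
saved = source_savings_gib(2 * 1024, 100_000_000)
print(f"{saved:.1f} GiB")  # 190.7 GiB
```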

To disable the _source field, set the mapping as follows:

PUT weisite
{
  "mappings": {
    "article": {
      "_source": {"enabled": false},
      "properties": {
        "id": {"type": "text", "store": true},
        "title": {"type": "text", "store": true},
        "readCounts": {"type": "integer", "store": true},
        "times": {"type": "date", "index": false}
      }
    }
  }
}

GET /weisite/article/1

GET /weisite/article/_search
{
  "query": {
    "match_phrase": {
      "title": "this"
    }
  }
}

If you only want to keep the original values of some fields, the _source setting has two further parameters, includes and excludes:

PUT weisite
{
  "mappings": {
    "article": {
      "_source": {
        "includes": [
          "title"
        ],
        "excludes": [
          "content"
        ]
      },
      "properties": {
        "id": {"type": "text", "store": true},
        "title": {"type": "text", "store": true},
        "readCounts": {"type": "integer", "store": true},
        "times": {"type": "date", "index": true},
        "content": {"type": "text", "index": true}
      }
    }
  }
}
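
The effect of includes and excludes on what ends up in _source can be sketched as a simple filter. This is a simplified model, not Elasticsearch's own code: the function filter_source and the sample document are hypothetical, and it only looks at top-level field names, whereas the real setting also handles nested paths.

```python
from fnmatch import fnmatchcase


def filter_source(doc, includes=None, excludes=None):
    """Simplified model of _source includes/excludes on top-level field names."""
    includes = includes or []
    excludes = excludes or []
    kept = {}
    for field, value in doc.items():
        # With a non-empty includes list, a field must match one of its patterns.
        if includes and not any(fnmatchcase(field, p) for p in includes):
            continue
        # A field matching any excludes pattern is dropped.
        if any(fnmatchcase(field, p) for p in excludes):
            continue
        kept[field] = value
    return kept


doc = {
    "id": "1",
    "title": "What is Elasticsearch",
    "readCounts": 10,
    "content": "a very long body ...",
}

print(filter_source(doc, includes=["title"], excludes=["content"]))
# {'title': 'What is Elasticsearch'}
print(filter_source(doc, excludes=["content"]))
# {'id': '1', 'title': 'What is Elasticsearch', 'readCounts': 10}
```

Under this model, listing only "title" in includes already drops "content", so the excludes entry in the mapping above is redundant but harmless. Fields left out of _source can still be searched, because they are still indexed.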

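There is also the store attribute:

When store is true, the field's value is stored in the index separately from _source, so it can be retrieved on its own. Reading such a field costs one more I/O than reading it out of _source. The default is false.

In addition, to highlight a field in the search results, its content must be available, so at least one of store and _source has to be kept.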
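
As a sketch of how stored fields might be requested when _source is disabled, the snippet below builds a search body that uses the stored_fields parameter together with a highlight section. It assumes the weisite mapping shown earlier, where "title" and "readCounts" have store set to true; the query text is a made-up example, and the snippet only prints the request rather than sending it.

```python
import json

# Search body for an index whose mapping disables _source
# but sets "store": true on "title" and "readCounts" (assumption).
search_body = {
    "query": {"match": {"title": "elasticsearch"}},
    # Ask for individually stored fields, since _source is not available.
    "stored_fields": ["title", "readCounts"],
    # Highlighting can read the stored "title" value instead of _source.
    "highlight": {"fields": {"title": {}}},
}

print("GET /weisite/article/_search")
print(json.dumps(search_body, indent=2))
```

The stored values come back under a "fields" key in each hit, not under "_source".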
