ELK Query Commands Explained

Reposted article, 2019-12-12 15:58, filed under ELKB logs

[Elasticsearch: The Definitive Guide] https://www.elastic.co/guide/cn/elasticsearch/guide/current/search-in-depth.html

Inverted index

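Elasticsearch uses a structure called an inverted index, which is designed for fast full-text search. An inverted index consists of a list of all the distinct words that appear in any document and, for each word, a list of the documents that contain it.
Example: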

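Suppose the collection contains five documents, with their contents as shown in the figure; the leftmost column of the figure holds each document's ID. Our task is to build an inverted index for this collection.
Unlike English, Chinese has no explicit separator between words, so a word segmenter must first split each document into a sequence of words. Every document thus becomes a stream of words. To ease later processing, each distinct word is given a unique word ID, and the documents containing it are recorded. The result is the simplest form of inverted index.
The "word ID" column holds each word's ID, the second column holds the word itself, and the third column holds the word's posting list.
The index can record more than this. The next figure also records the term frequency (TF), the number of times the word occurs in a given document. It is stored in the posting list because term frequency is an important factor when computing the similarity between a query and a document for ranking.
The posting list can also record the positions at which the word occurs in a document.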

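With this index, a search engine can easily answer user queries. For example, when a user enters the query term "Facebook", the system looks it up in the inverted index and reads off the documents that contain it; these documents are the search results. Term frequency and document frequency can then be used to rank the candidates: the similarity between each document and the query is computed, and the results are output from the highest score to the lowest. This is part of what happens inside a search system.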

How an inverted index works

Sample text:

1.The quick brown fox jumped over the lazy dog
2.Quick brown foxes leap over lazy dogs in summer

Inverted index:

Term   | Doc_1 | Doc_2
Quick  |       | X
The    | X     |
brown  | X     | X
dog    | X     |
dogs   |       | X
fox    | X     |
foxes  |       | X
in     |       | X
jumped | X     |
lazy   | X     | X
leap   |       | X
over   | X     | X
quick  | X     |
summer |       | X
the    | X     |

Searching for quick brown:

Term   | Doc_1 | Doc_2
brown  | X     | X
quick  | X     |
Total  | 2     | 1

When the relevance score is computed, document 1 matches more of the query terms, so it scores higher than document 2.
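The lookup above can be sketched in a few lines of Python. This is only an illustration, not how Elasticsearch stores its index: the function names `build_inverted_index` and `count_matches` are made up for this example, tokenization is a plain whitespace split, and the "score" is just the number of matching query terms.

```python
from collections import defaultdict

docs = {
    1: "The quick brown fox jumped over the lazy dog",
    2: "Quick brown foxes leap over lazy dogs in summer",
}

def build_inverted_index(docs):
    """Map each distinct term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.split():  # naive whitespace tokenizer, no normalization
            index[term].add(doc_id)
    return index

def count_matches(index, query):
    """Count, per document, how many of the query terms it contains."""
    totals = defaultdict(int)
    for term in query.split():
        for doc_id in sorted(index.get(term, ())):
            totals[doc_id] += 1
    return dict(totals)

index = build_inverted_index(docs)
totals = count_matches(index, "quick brown")
print(totals)
```

Document 1 ends up with two matching terms and document 2 with one, which mirrors the Total row of the table.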

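Problems:

Quick and quick appear as separate terms, although users probably regard them as the same word.
fox and foxes are very similar, as are dog and dogs; they share a root.
jumped and leap share no root but are close in meaning; they are synonyms.
A search for documents containing both Quick and fox finds nothing: document 1 has fox but only the lowercase quick, and document 2 has Quick but only foxes.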

Applying normalization rules:

When the inverted index is built, each extracted word is processed so that later searches are more likely to find related documents.

Term   | Doc_1 | Doc_2
brown  | X     | X
dog    | X     | X
fox    | X     | X
in     |       | X
jump   | X     | X
lazy   | X     | X
over   | X     | X
quick  | X     | X
summer |       | X
the    | X     |
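A minimal sketch of such normalization, assuming three steps: lowercasing, stemming and synonym mapping. The `STEMS` and `SYNONYMS` tables are hypothetical lookups that cover only the two sample sentences; a real analyzer would use proper stemmers and synonym filters. The important point is that the same `normalize` function is applied both when indexing and when searching.

```python
from collections import defaultdict

# Hypothetical lookup tables, just large enough for the two sample sentences.
STEMS = {"foxes": "fox", "dogs": "dog", "jumped": "jump"}
SYNONYMS = {"leap": "jump"}

def normalize(token):
    """Lowercase, reduce to the root form, then map synonyms to one term."""
    token = token.lower()
    token = STEMS.get(token, token)
    return SYNONYMS.get(token, token)

def build_index(docs):
    """Build an inverted index over normalized terms."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.split():
            index[normalize(token)].add(doc_id)
    return index

def search_all(index, query):
    """Return ids of documents that contain every normalized query term."""
    postings = [index.get(normalize(t), set()) for t in query.split()]
    return set.intersection(*postings) if postings else set()

docs = {
    1: "The quick brown fox jumped over the lazy dog",
    2: "Quick brown foxes leap over lazy dogs in summer",
}
normalized_index = build_index(docs)
print(sorted(search_all(normalized_index, "Quick fox")))
```

With normalization on both sides, the query Quick fox that found nothing before now matches both documents.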

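Analyzers and the built-in analyzers

An analyzer splits a piece of text into individual terms and normalizes each term.

It consists of three parts:

character filter: preprocessing before tokenization, such as stripping HTML tags or converting special characters
tokenizer: splits the text into tokens
token filter: normalizes the tokens

Built-in analyzers:

standard analyzer (the default): lowercases tokens and removes punctuation; stop-word removal is supported but disabled by default. Chinese text is split into single characters.
simple analyzer: splits text on non-letter characters, then lowercases the tokens. Numeric characters are discarded.
whitespace analyzer: splits on whitespace only; it does not lowercase or otherwise normalize the tokens, and it does not segment Chinese.
language analyzers: analyzers for specific languages; none of them segments Chinese into words.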
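The three-stage pipeline can be imitated in plain Python to see how the stages compose. This is a rough sketch, not Elasticsearch code: `strip_html`, `letter_tokenizer`, `lowercase` and `analyze` are names invented here, and the regular expressions only approximate what the real components do.

```python
import re

def strip_html(text):
    """Character filter: replace HTML tags with a space."""
    return re.sub(r"<[^>]+>", " ", text)

def letter_tokenizer(text):
    """Tokenizer: keep runs of letters, dropping digits and punctuation."""
    return re.findall(r"[^\W\d_]+", text)

def lowercase(tokens):
    """Token filter: lowercase every token."""
    return [t.lower() for t in tokens]

def analyze(text, char_filters, tokenizer, token_filters):
    """Run character filters, then the tokenizer, then token filters."""
    for char_filter in char_filters:
        text = char_filter(text)
    tokens = tokenizer(text)
    for token_filter in token_filters:
        tokens = token_filter(tokens)
    return tokens

# Roughly what the simple analyzer does, with an HTML-stripping step in front.
simple_like = analyze("<b>The 2 QUICK</b> Brown-Foxes",
                      [strip_html], letter_tokenizer, [lowercase])
# Roughly what the whitespace analyzer does: split on whitespace, nothing else.
whitespace_like = analyze("The 2 QUICK Brown-Foxes", [], str.split, [])
print(simple_like)
print(whitespace_like)
```

The first call drops the digit and the hyphen and lowercases everything; the second keeps every whitespace-separated chunk untouched.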

CRUD with the Elasticsearch API

Create an index:

PUT /lib/
{
  "settings":{
  "index":{
    "number_of_shards": 5,
    "number_of_replicas": 1
    }
  }
}

View index settings:

GET /lib/_settings
GET _all/_settings

Add a document:

    PUT /lib/user/1
    {
      "first_name" : "jane",
      "last_name" :   "Smith",
      "age" :         32,
      "about" :       "I like to collect rock albums",
      "interests":  [ "music" ]
    }

Update the document by sending it again in full, changing the age to 22:

    POST /lib/user/1
    {
      "first_name" : "jane",
      "last_name" :   "Smith",
      "age" :         22,
      "about" :       "I like to collect rock albums",
      "interests":  [ "music" ]
    }

View the document:

GET /lib/user/1

The command returns:

#! Deprecation: [types removal] Specifying types in document get requests is deprecated, use the /{index}/_doc/{id} endpoint instead.
{
  "_index" : "lib",
  "_type" : "user",
  "_id" : "1",
  "_version" : 9,
  "_seq_no" : 9,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "first_name" : "jane",
    "last_name" : "Smith",
    "age" : 22,
    "about" : "I like to collect rock albums",
    "interests" : [
      "music"
    ]
  }
}

Return only selected fields of the document:

GET /lib/user/1?_source=age,interests

Overwrite the document:

PUT /lib/user/1
{
    "first_name" :  "Jane",
    "last_name" :   "Smith",
    "age" :         36,
    "about" :       "I like to collect rock albums",
    "interests":  [ "music" ]
}

Partially update the document:

POST /lib/user/1/_update
{
  "doc":{
  "age":33
  }
}

Delete a document

DELETE /lib/user/1

Delete an index

DELETE /lib

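Fetching multiple documents

With the Multi Get API provided by ES, a set of documents can be fetched in one request by index name, type name and document ID. The documents may come from the same index or from different indices.

Using curl (Elasticsearch 6.0 and later require the Content-Type header):

curl -H 'Content-Type: application/json' 'http://192.168.25.131:9200/_mget' -d '{
"docs": [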

   {
"_index": "lib",
"_type": "user",
"_id": 1
   },
   {
 "_index": "lib",
 "_type": "user",
 "_id": 2
   }
  ]
}'

In a client console:

    GET /_mget
    {
    "docs":[
       {
           "_index": "lib",
           "_type": "user",
           "_id": 1
       },
       {
           "_index": "lib",
           "_type": "user",
           "_id": 2
       },
       {
           "_index": "lib",
           "_type": "user",
           "_id": 3
       }
      ]
     }

Specific fields can be requested:

GET /_mget
{
"docs":[
   {
       "_index": "lib",
       "_type": "user",
       "_id": 1,
       "_source": "interests"
   },
   {
       "_index": "lib",
       "_type": "user",
       "_id": 2,
       "_source": ["age","interests"]
   }
 ]
}

Fetch different documents of the same index and type:

GET /lib/user/_mget
{
"docs":[
   {
       "_id": 1
   },
   {
       "_type": "user",
       "_id": 2
   }
 ]
}

Or, when only the IDs differ, list them under ids:

GET /lib/user/_mget
{
   "ids": ["1","2"]
}

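Bulk operations with the Bulk API

Bulk format:

{action: {metadata}}
{request body}

action: the operation to perform

- create: create the document only if it does not exist yet
- update: update a document
- index: create a new document or replace an existing one
- delete: delete a document

metadata: _index, _type, _id

Difference between create and index:
If the document already exists, create fails with a "document already exists" error, while index succeeds.
Example: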

{"delete":{"_index":"lib","_type":"user","_id":"1"}}
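The two-line format can be produced programmatically. Below is a minimal sketch that only builds the request body as newline-delimited JSON; it does not send anything. The helper name `bulk_body` and the tuple layout `(action, metadata, source)` are assumptions made for this example. It encodes two rules: a delete action has no body line, and the body must end with a newline.

```python
import json

def bulk_body(operations):
    """Build a bulk request body from (action, metadata, source) tuples.

    source must be None for delete and a dict for every other action.
    """
    lines = []
    for action, metadata, source in operations:
        lines.append(json.dumps({action: metadata}))  # action and metadata line
        if action == "delete":
            continue  # delete carries no request body line
        if source is None:
            raise ValueError(f"{action} needs a request body")
        lines.append(json.dumps(source))
    return "\n".join(lines) + "\n"  # the bulk body must end with a newline

body = bulk_body([
    ("index", {"_id": 1}, {"title": "Java", "price": 55}),
    ("update", {"_id": 1}, {"doc": {"price": 58}}),
    ("delete", {"_id": 4}, None),
])
print(body, end="")
```

The resulting string would be sent as the body of a POST to a _bulk endpoint such as the ones used in the requests that follow.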

Bulk indexing:

POST /lib2/books/_bulk
{"index":{"_id":1}}
{"title":"Java","price":55}
{"index":{"_id":2}}
{"title":"Html5","price":45}
{"index":{"_id":3}}
{"title":"Php","price":35}
{"index":{"_id":4}}
{"title":"Python","price":50}

Bulk get:

GET /lib2/books/_mget
{
"ids": ["1","2","3","4"]
}

A delete action has no request body line. The following request mixes several actions:

POST /lib2/books/_bulk
{"delete":{"_index":"lib2","_type":"books","_id":4}}
{"create":{"_index":"tt","_type":"ttt","_id":"100"}}
{"name":"lisi"}
{"index":{"_index":"tt","_type":"ttt"}}
{"name":"zhaosi"}
{"update":{"_index":"lib2","_type":"books","_id":"4"}}
{"doc":{"price":58}}

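How much data can one bulk request handle? Bulk loads the data to be processed into memory, so the amount is limited. There is no single optimal size; it depends on your hardware, the size and complexity of your documents, and your indexing and search load.

A common recommendation is 1,000 to 5,000 documents, or 5 to 15 MB, per request. By default a request cannot exceed 100 MB; the limit can be changed in the ES configuration file (elasticsearch.yml in the config directory under $ES_HOME) through the http.max_content_length setting.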

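Version control

Elasticsearch uses optimistic locking to keep data consistent: when operating on a document, the client does not lock and unlock it but only specifies the version it intends to operate on. If the version matches, Elasticsearch lets the operation proceed; if it conflicts, Elasticsearch reports the conflict and throws a VersionConflictEngineException.

Version numbers range from 1 to 2^63-1.
Internal versioning: uses _version.
External versioning: Elasticsearch handles external version numbers a little differently from internal ones. Instead of checking that _version equals the value given in the request, it checks that the current _version is less than the given value. If the request succeeds, the external version number is stored as the document's _version.
To keep _version in step with an externally managed version, pass version_type=external.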
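The two checks can be contrasted with a small in-memory simulation. It is a sketch of the rules as described above, not of Elasticsearch internals: the document is a plain dict, and `VersionConflict`, `write_internal` and `write_external` are names invented for the example.

```python
class VersionConflict(Exception):
    """Stands in for VersionConflictEngineException in this sketch."""

def write_internal(doc, source, expected_version):
    """Internal versioning: the expected version must equal the current one."""
    if doc["_version"] != expected_version:
        raise VersionConflict(
            f"current version {doc['_version']} != expected {expected_version}")
    doc["_source"] = source
    doc["_version"] += 1  # the version is incremented on every write

def write_external(doc, source, external_version):
    """External versioning: the current version must be lower than the given one."""
    if not doc["_version"] < external_version:
        raise VersionConflict(
            f"current version {doc['_version']} >= external {external_version}")
    doc["_source"] = source
    doc["_version"] = external_version  # the external number becomes _version

doc = {"_version": 1, "_source": {"age": 32}}
write_internal(doc, {"age": 22}, expected_version=1)   # succeeds, version is now 2
write_external(doc, {"age": 23}, external_version=10)  # succeeds, version is now 10
try:
    write_external(doc, {"age": 24}, external_version=10)  # not greater: conflict
except VersionConflict as exc:
    print("conflict:", exc)
print(doc)
```

A writer that loses such a race would typically re-read the document, reapply its change and retry with the new version.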

What is a mapping?

PUT /myindex/article/1 
{ 
  "post_date": "2018-05-10", 
  "title": "Java", 
  "content": "java is the best language", 
  "author_id": 119
}

PUT /myindex/article/2
{ 
  "post_date": "2018-05-12", 
  "title": "html", 
  "content": "I like html", 
  "author_id": 120
}

PUT /myindex/article/3
{ 
  "post_date": "2018-05-16", 
  "title": "es", 
  "content": "Es is distributed document store", 
  "author_id": 110
}
GET /myindex/article/_search?q=2018-05
GET /myindex/article/_search?q=2018-05-10
GET /myindex/article/_search?q=html
GET /myindex/article/_search?q=java

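View the mapping that ES created automatically:

GET /myindex/article/_mapping

ES automatically created the index, the type, and the mapping for that type (dynamic mapping).

What is a mapping? A mapping defines the datatype of every field in a type, along with related attributes such as how the field is analyzed.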

{
  "myindex": {
    "mappings": {
      "article": {
        "properties": {
          "author_id": {
            "type": "long"
          },
          "content": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "post_date": {
            "type": "date"
          },
          "title": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          }
        }
      }
    }
  }
}

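When creating an index you can define field types and their attributes in advance, so that date fields are handled as dates, numeric fields as numbers, string fields as strings, and so on.
Supported datatypes:

(1) Core datatypes

String: the string family consists of text and keyword.
	text is used for long text. The text is analyzed into terms before indexing, so that ES can search for those terms. By default, text fields cannot be used for sorting or aggregations.
	keyword is not analyzed and can be used for filtering, sorting and aggregations. A keyword field can only be matched by its exact value.

Numeric: long, integer, short, byte, double, float

Date: date

Boolean: boolean

Binary: binary

(2) Complex datatypes

Array datatype: arrays need no dedicated element type, for example:
Array of strings: [ "one", "two" ]
Array of integers: [ 1, 2 ]
Array of arrays: [ 1, [ 2, 3 ]], equivalent to [ 1, 2, 3 ]
Array of objects: [ { "name": "Mary", "age": 12 }, { "name": "John", "age": 10 }]
Object datatype: object, for a single JSON object
Nested datatype: nested, for arrays of JSON objects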

(3) Geo datatypes

Geo-point datatype: geo_point, for latitude/longitude coordinates;
Geo-shape datatype: geo_shape, for complex shapes such as polygons;

(4) Specialised datatypes

IP datatype: ip, for IPv4 addresses (IPv6 is also accepted from Elasticsearch 5.0);

Completion datatype: completion, provides autocomplete suggestions;

Token count datatype: token_count, counts the tokens that a string field is analyzed into; tokens removed by analysis filters are still counted.

mapper-murmur3 datatype: with the mapper-murmur3 plugin, murmur3 computes a hash of the field values at index time;
Attachment datatype: with the mapper-attachments plugin, attachments indexes documents in formats such as Microsoft Office, Open Document, ePub and HTML.

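"store": false  // whether to store the field value separately from _source; default false: the field can be searched, but its value is only available through _source
"index": true  // whether the field is indexed; when set to false the field is not indexed and cannot be searched
"analyzer": "ik"  // analyzer to use; the default is the standard analyzer
"boost": 1.23  // field-level score boost; default 1.0
"doc_values": false  // enabled by default for non-analyzed fields and unavailable for analyzed fields; greatly speeds up sorting and aggregations and saves memory
"fielddata": {"format": "disabled"}  // for analyzed fields; improves performance when they take part in sorting or aggregations; for non-analyzed fields doc_values is recommended instead
"fields": {"raw": {"type": "string", "index": "not_analyzed"}}  // index one field in several ways, for example the same value both analyzed and not analyzed (pre-5.0 syntax)
"ignore_above": 100  // strings longer than 100 characters are ignored and not indexed
"include_in_all": true  // whether the field is included in the _all field; default true unless index is set to no
"index_options": "docs"  // four options: docs (document numbers), freqs (document numbers + term frequencies), positions (+ positions, used for proximity queries), offsets (+ character offsets, used for highlighting); analyzed fields default to positions, all others to docs
"norms": {"enabled": true, "loading": "lazy"}  // default for analyzed fields; non-analyzed fields default to {"enabled": false}; stores the length norm and index-time boost; recommended for fields that take part in scoring, at the cost of extra memory
"null_value": "NULL"  // value indexed in place of a missing (null) field value; on analyzed fields the replacement is analyzed too
"position_increment_gap": 0  // affects proximity and phrase queries; applies to multi-valued or analyzed fields; a slop can be given at query time; default 100
"search_analyzer": "ik"  // analyzer used at search time; defaults to the value of analyzer; for example, index with standard + ngram and search with standard to implement autocomplete
"similarity": "BM25"  // scoring algorithm for the field; only applies to analyzed string fields; BM25 is the default since Elasticsearch 5.0, earlier versions defaulted to TF/IDF
"term_vector": "no"  // term vectors are not stored by default; options: yes (terms), with_positions (terms + positions), with_offsets (terms + offsets), with_positions_offsets (terms + positions + offsets); speeds up the fast vector highlighter but enlarges the index, so it is a poor fit for large data sets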

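Kinds of mapping:

(1) Dynamic mapping:

When ES meets a field it has not seen before in a document, it uses dynamic mapping to decide the field's type and adds a mapping for it automatically.

The dynamic setting controls this behaviour and accepts the following options:
true: the default; new fields are added dynamically
false: new fields are ignored
strict: an unknown field raises an exception
The dynamic setting can be applied to the root object or to any field of type object.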

Create index lib2 with an explicit mapping (creating an index uses PUT):

PUT /lib2
{
    "settings":{
        "number_of_shards" : 3,
        "number_of_replicas" : 0
    },
    "mappings":{
        "books":{
            "properties":{
                "title":{"type":"text"},
                "name":{"type":"text","index":false},
                "publish_date":{"type":"date","index":false},
                "price":{"type":"double"},
                "number":{"type":"integer"}
            }
        }
    }
}

The same index definition with dynamic set on an object field (delete lib2 first if it already exists):

PUT /lib2
{
    "settings":{
        "number_of_shards" : 3,
        "number_of_replicas" : 0
    },
    "mappings":{
        "books":{
            "properties":{
                "title":{"type":"text"},
                "name":{"type":"text","index":false},
                "publish_date":{"type":"date","index":false},
                "price":{"type":"double"},
                "number":{
                    "type":"object",
                    "dynamic":true
                }
            }
        }
    }
}

Basic queries (query DSL)

Preparing the data

PUT /lib3
{
    "settings":{
    "number_of_shards" : 3,
    "number_of_replicas" : 0
    },
     "mappings":{
      "user":{
        "properties":{
            "name": {"type":"text"},
            "address": {"type":"text"},
            "age": {"type":"integer"},
            "interests": {"type":"text"},
            "birthday": {"type":"date"}
        }
      }
     }
}
GET /lib3/user/_search?q=name:lisi
GET /lib3/user/_search?q=name:zhaoliu&sort=age:desc

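term and terms queries

A term query looks up the exact term in the inverted index and knows nothing about analyzers. It suits keyword, numeric and date fields.

term: find documents whose field contains the given term

GET /lib3/user/_search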
{
  "query": {
      "term": {"interests": "changge"}
  }
}

terms: find documents whose field contains any of several terms

GET /lib3/user/_search
{
    "query":{
        "terms":{
            "interests": ["hejiu","changge"]
        }
    }
}

Controlling the number of results

from: offset of the first document to return
size: number of documents to return

GET /lib3/user/_search
{
    "from":0,
    "size":2,
    "query":{
        "terms":{
            "interests": ["hejiu","changge"]
        }
    }
}

Returning version numbers

GET /lib3/user/_search
{
    "version":true,
    "query":{
        "terms":{
            "interests": ["hejiu","changge"]
        }
    }
}

match queries

A match query is aware of analyzers: it first analyzes the query text with the field's analyzer and then searches.

GET /lib3/user/_search
{
    "query":{
        "match":{
            "name": "zhaoliu"
        }
    }
}
GET /lib3/user/_search
{
    "query":{
        "match":{
            "age": 20
        }
    }
}

match_all: match all documents

GET /lib3/user/_search
{
  "query": {
    "match_all": {}
  }
}

multi_match: search several fields at once

GET /lib3/user/_search
{
    "query":{
        "multi_match": {
            "query": "lvyou",
            "fields": ["interests","name"]
         }
    }
}

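match_phrase: phrase matching

Elasticsearch first analyzes the query string and builds a phrase query from the analyzed text. This means that every term of the phrase must match, and the terms must keep their relative positions: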

GET lib3/user/_search
{
  "query":{  
      "match_phrase":{  
         "interests": "duanlian，shuoxiangsheng"
      }
   }
}
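The two conditions of a phrase match, all terms present and positions preserved, can be sketched with token lists. This is a simplification under stated assumptions: `analyze` is a stand-in analyzer that lowercases and splits on non-word characters, `match_phrase` is a name made up here, and no slop is supported, so the terms must be directly adjacent.

```python
import re

def analyze(text):
    """Stand-in analyzer: lowercase and split on non-word characters."""
    return re.findall(r"\w+", text.lower())

def match_phrase(field_text, phrase):
    """True if the analyzed phrase occurs as consecutive tokens in the field."""
    tokens = analyze(field_text)
    wanted = analyze(phrase)
    n = len(wanted)
    if n == 0:
        return False
    # Slide a window of n tokens over the field and compare position by position.
    return any(tokens[i:i + n] == wanted for i in range(len(tokens) - n + 1))

interests = "duanlian, shuoxiangsheng, changge"
print(match_phrase(interests, "duanlian，shuoxiangsheng"))   # same order
print(match_phrase(interests, "shuoxiangsheng duanlian"))    # reversed order
print(match_phrase(interests, "duanlian changge"))           # not adjacent
```

Only the first phrase matches; the other two contain the right terms but in the wrong relative positions.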

Specifying the fields to return

GET /lib3/user/_search
{
    "_source": ["address","name"],
    "query": {
        "match": {
            "interests": "changge"
        }
    }
}

Controlling which fields are loaded

GET /lib3/user/_search
{
    "query": {
        "match_all": {}
    },
    "_source": {
          "includes": ["name","address"],
          "excludes": ["age","birthday"]
      }
}

Using the wildcard *

GET /lib3/user/_search
{
    "_source": {
          "includes": "addr*",
          "excludes": ["name","bir*"]
    },
    "query": {
        "match_all": {}
    }
}

Sorting

Use sort to order the results:
desc: descending, asc: ascending

GET /lib3/user/_search
{
    "query": {
        "match_all": {}
    },
    "sort": [
        {
           "age": {
               "order":"asc"
           }
        }
    ]
}

GET /lib3/user/_search
{
    "query": {
        "match_all": {}
    },
    "sort": [
        {
           "age": {
               "order":"desc"
           }
        }
    ]
}

Prefix matching

GET /lib3/user/_search
{
  "query": {
    "match_phrase_prefix": {
        "name": {
            "query": "zhao"
        }
    }
  }
}

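Range queries

range: implements range queries

Parameters: from, to, include_lower, include_upper, boost

include_lower: whether the lower bound is included; default true

include_upper: whether the upper bound is included; default true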

GET /lib3/user/_search
{
    "query": {
        "range": {
            "birthday": {
                "from": "1990-10-10",
                "to": "2018-05-01"
            }
        }
    }
}


GET /lib3/user/_search
{
    "query": {
        "range": {
            "age": {
                "from": 20,
                "to": 25,
                "include_lower": true,
                "include_upper": false
            }
        }
    }
}
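The effect of the two include flags is easy to get wrong at the boundaries, so here is a small sketch of the comparison they imply. The function `in_range` and its parameter name `from_` (with a trailing underscore, because `from` is a Python keyword) are invented for this example; the sample ages are made up as well.

```python
def in_range(value, from_=None, to=None, include_lower=True, include_upper=True):
    """Apply the from/to/include_lower/include_upper semantics to one value."""
    if from_ is not None:
        if value < from_ or (value == from_ and not include_lower):
            return False
    if to is not None:
        if value > to or (value == to and not include_upper):
            return False
    return True

ages = [19, 20, 22, 25, 30]
# Same bounds as the age query above: lower bound included, upper bound excluded.
half_open = [a for a in ages if in_range(a, 20, 25, include_upper=False)]
# Defaults: both bounds included.
closed = [a for a in ages if in_range(a, 20, 25)]
print(half_open)
print(closed)
```

With include_upper set to false the value 25 drops out, which corresponds to combining gte with lt in the operator notation.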

wildcard queries

The wildcards * and ? can be used in queries

* matches zero or more characters

? matches any single character

GET /lib3/user/_search
{
    "query": {
        "wildcard": {
             "name": "zhao*"
        }
    }
}


GET /lib3/user/_search
{
    "query": {
        "wildcard": {
             "name": "li?i"
        }
    }
}
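One way to picture the two wildcards is as a translation into a regular expression that must match the whole term. The sketch below does exactly that; `wildcard_to_regex` and `wildcard_match` are illustrative names, and this is not how Elasticsearch executes wildcard queries internally.

```python
import re

def wildcard_to_regex(pattern):
    """Translate * and ? into their regular-expression equivalents."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")   # zero or more characters
        elif ch == "?":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))  # everything else is literal
    return re.compile("".join(parts), re.DOTALL)

def wildcard_match(pattern, term):
    """True if the whole term matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(term) is not None

for pattern, term in [("zhao*", "zhaoliu"), ("zhao*", "zhao"),
                      ("li?i", "lisi"), ("li?i", "lii")]:
    print(pattern, term, wildcard_match(pattern, term))
```

Note that the pattern is compared against indexed terms, so on an analyzed field it is matched against each token rather than the original text.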

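Fuzzy queries with fuzzy

value: the term to search for

boost: weight of the query; default 1.0

min_similarity: minimum similarity required for a match; default 0.5. For strings the value lies between 0 and 1 inclusive; for numbers it may be greater than 1; for dates it takes values such as 1d or 1m, where 1d means one day. Later Elasticsearch versions replace this parameter with fuzziness.

prefix_length: length of the common prefix that must match exactly; default 0

max_expansions: maximum number of terms the query may expand to; unbounded by default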

GET /lib3/user/_search
{
    "query": {
        "fuzzy": {
             "interests": "chagge"
        }
    }
}
GET /lib3/user/_search
{
    "query": {
        "fuzzy": {
             "interests": {
                 "value": "chagge"
             }
        }
    }
}
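Fuzzy matching is based on edit distance: chagge is one inserted letter away from changge, which is why the queries above can find it. The sketch below uses plain Levenshtein distance as a simplification (it does not count a transposition as a single edit), and the names `edit_distance` and `fuzzy_match` as well as the `max_edits` parameter are assumptions for illustration only.

```python
def edit_distance(a, b):
    """Levenshtein distance: insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # substitute or keep
        prev = cur
    return prev[-1]

def fuzzy_match(term, value, max_edits=2, prefix_length=0):
    """True if term is within max_edits of value and shares the required prefix."""
    if term[:prefix_length] != value[:prefix_length]:
        return False
    return edit_distance(term, value) <= max_edits

print(edit_distance("chagge", "changge"))
print(fuzzy_match("changge", "chagge"))
print(fuzzy_match("changge", "xhangge", prefix_length=1))
```

Raising prefix_length narrows the candidate terms, since terms that differ in their first characters are rejected before any distance is computed.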

Highlighting search results

GET /lib3/user/_search
{
    "query":{
        "match":{
            "interests": "changge"
        }
    },
    "highlight": {
        "fields": {
             "interests": {}
        }
    }
}

Filter queries

A filter does not compute relevance and can be cached, so filters run faster than queries.

POST /lib4/items/_bulk
{"index": {"_id": 1}}
{"price": 40,"itemID": "ID100123"}
{"index": {"_id": 2}}
{"price": 50,"itemID": "ID100124"}
{"index": {"_id": 3}}
{"price": 25,"itemID": "ID100124"}
{"index": {"_id": 4}}
{"price": 30,"itemID": "ID100125"}
{"index": {"_id": 5}}
{"price": null,"itemID": "ID100127"}

Simple filter queries

GET /lib4/items/_search
{ 
       "post_filter": {
             "term": {
                 "price": 40
             }
       }
}
GET /lib4/items/_search
{
      "post_filter": {
          "terms": {
                 "price": [25,40]
              }
        }
}

GET /lib4/items/_search
{
"post_filter": {
"term": {
"itemID": "ID100123"
}
}
}


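The last query returns no hits. Check how itemID was mapped:
GET /lib4/_mapping
With the dynamic mapping, itemID is an analyzed text field, so its terms are indexed in lowercase (id100123) and the exact term ID100123 is not found. If the item ID should not be analyzed, recreate the index and map the field as keyword (with "type": "text" and "index": false the field would not be searchable at all), then load the sample data again:
DELETE lib4

PUT /lib4
{
"mappings": {
"items": {
"properties": {
"itemID": {
"type": "keyword"
}
}
}
}
}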

bool filter queries

A bool filter combines several filter conditions.
Format:

{
"bool": {
"must": [],
"should": [],
"must_not": []
}
}

must: conditions that must match (and)
should: conditions that may or may not match (or)
must_not: conditions that must not match (not)

GET /lib4/items/_search
{
"post_filter": {
"bool": {
"should": [
{"term": {"price": 25}},
{"term": {"itemID": "id100123"}}
],
"must_not": {
"term": {"price": 30}
}
}
}
}

Nesting bool filters:

GET /lib4/items/_search
{
"post_filter": {
"bool": {
"should": [
{"term": {"itemID": "id100123"}},
{
"bool": {
"must": [
{"term": {"itemID": "id100124"}},
{"term": {"price": 40}}
]
}
}
]
}
}
}
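The and/or/not reading of must, should and must_not can be checked against the sample items with a small in-memory evaluator. This is a sketch under two assumptions: terms are compared by exact value, as on a keyword field (so the item IDs are written in their original uppercase form), and when there is no must clause at least one should clause has to match. The helpers `term`, `bool_filter` and `search` are invented for the example.

```python
items = [
    {"_id": 1, "price": 40, "itemID": "ID100123"},
    {"_id": 2, "price": 50, "itemID": "ID100124"},
    {"_id": 3, "price": 25, "itemID": "ID100124"},
    {"_id": 4, "price": 30, "itemID": "ID100125"},
    {"_id": 5, "price": None, "itemID": "ID100127"},
]

def term(field, value):
    """Exact-value condition on one field."""
    return lambda doc: doc.get(field) == value

def bool_filter(must=(), should=(), must_not=()):
    """Combine conditions; the result is itself a condition, so it can be nested."""
    def matches(doc):
        if not all(clause(doc) for clause in must):
            return False
        if any(clause(doc) for clause in must_not):
            return False
        if should and not must:
            # Assumption: without must, at least one should clause must match.
            return any(clause(doc) for clause in should)
        return True
    return matches

def search(docs, condition):
    return [doc["_id"] for doc in docs if condition(doc)]

simple = bool_filter(
    should=[term("price", 25), term("itemID", "ID100123")],
    must_not=[term("price", 30)],
)
nested = bool_filter(
    should=[
        term("itemID", "ID100123"),
        bool_filter(must=[term("itemID", "ID100124"), term("price", 40)]),
    ],
)
print(search(items, simple))
print(search(items, nested))
```

The nested filter matches only item 1, because no item has both itemID ID100124 and price 40.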

Range filters

gt: >
lt: <
gte: >=
lte: <=

GET /lib4/items/_search
{
"post_filter": {
"range": {
"price": {
"gt": 25,
"lt": 50
}
}
}
}

Filtering non-null values

GET /lib4/items/_search
{
"query": {
"bool": {
"filter": {
"exists":{
"field":"price"
}
}
}
}
}
GET /lib4/items/_search
{
"query" : {
"constant_score" : {
"filter": {
"exists" : { "field" : "price" }
}
}
}
}

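Filter cache

Elasticsearch provides a special cache, the filter cache, that stores the results of filters. Cached filters do not need much memory (they only record which documents match the filter) and can be reused by all later queries that contain the same filter, which greatly improves query performance.

Note: Elasticsearch does not cache every filter by default.
The following filters are not cached by default:
numeric_range
script
geo_bbox
geo_distance
geo_distance_range
geo_polygon
geo_shape
and
or
not
exists, missing, range, term and terms are cached by default.

To enable caching, add "_cache": true to the filter (this applies to Elasticsearch 1.x; later versions decide automatically which filters to cache).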

Aggregations

(1)sum
GET /lib4/items/_search
{
"size":0,
"aggs": {
"price_of_sum": {
"sum": {
"field": "price"
}
}
}
}

(2)min
GET /lib4/items/_search
{
"size": 0,
"aggs": {
"price_of_min": {
"min": {
"field": "price"
}
}
}
}

(3)max
GET /lib4/items/_search
{
"size": 0,
"aggs": {
"price_of_max": {
"max": {
"field": "price"
}
}
}
}

(4)avg
GET /lib4/items/_search
{
"size":0,
"aggs": {
"price_of_avg": {
"avg": {
"field": "price"
}
}
}
}

(5)cardinality: number of distinct values
GET /lib4/items/_search
{
"size":0,
"aggs": {
"price_of_cardi": {
"cardinality": {
"field": "price"
}
}
}
}

(6)terms: grouping
GET /lib4/items/_search
{
"size":0,
"aggs": {
"price_group_by": {
"terms": {
"field": "price"
}
}
}
}
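To make the six aggregations concrete, the sketch below computes the same quantities in plain Python over the price values of the five sample items. It assumes that the document with a null price is left out of every result, and it counts distinct values exactly, whereas the cardinality aggregation in Elasticsearch returns an approximate count. The variable names are illustrative.

```python
from collections import Counter

# price values of the five documents indexed into lib4 earlier
prices = [40, 50, 25, 30, None]

# Documents without a value for the field do not contribute.
values = [p for p in prices if p is not None]

result = {
    "sum": sum(values),
    "min": min(values),
    "max": max(values),
    "avg": sum(values) / len(values),
    "cardinality": len(set(values)),       # exact here, approximate in ES
    "terms": dict(Counter(values)),        # value -> number of documents
}
print(result)
```

The average is taken over the four documents that have a price, not over all five.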

Group the users who are interested in singing (changge) by age

GET /lib3/user/_search
{
"query": {
"match": {
"interests": "changge"
}
},
"size": 0,
"aggs":{
"age_group_by":{
"terms": {
"field": "age",
"order": {
"avg_of_age": "desc"
}
},
"aggs": {
"avg_of_age": {
"avg": {
"field": "age"
}
}
}
}
}
}

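Compound queries

Queries that combine several basic queries into a single query.

Using the bool query

It accepts the following parameters:

must:
Documents must match these clauses to be included.

must_not:
Documents must not match these clauses to be included.

should:
If any of these clauses match, the _score increases; otherwise they have no effect. They are mainly used to refine each document's relevance score.

filter:
Must match, but runs in non-scoring filter mode. These clauses contribute nothing to the score; they only include or exclude documents according to the filter criteria.

How relevance scores are combined: each sub-query computes a document's relevance score on its own. Once the scores are computed, the bool query merges them and returns a single score that represents the whole boolean operation.

The following query finds documents whose title field matches how to make millions and that are not tagged spam. Documents tagged starred or dated 2014 or later rank higher than the others, and documents that satisfy both rank higher still: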

{
"bool": {
"must": { "match": { "title": "how to make millions" }},
"must_not": { "match": { "tag": "spam" }},
"should": [
{ "match": { "tag": "starred" }},
{ "range": { "date": { "gte": "2014-01-01" }}}
]
}
}

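If there is no must clause, at least one should clause has to match. If there is at least one must clause, however, no should clause is required to match.
If we do not want a document's date to affect its score, we can rewrite the previous example with a filter clause: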

{
"bool": {
"must": { "match": { "title": "how to make millions" }},
"must_not": { "match": { "tag": "spam" }},
"should": [
{ "match": { "tag": "starred" }}
],
"filter": {
"range": { "date": { "gte": "2014-01-01" }}
}
}
}

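By moving the range query into the filter clause, we turn it into a non-scoring query that no longer affects the relevance ranking. Since it is now non-scoring, all the optimizations available to filters can be used to improve performance.

A bool query itself can also be used as a non-scoring query. Simply place it inside a filter clause and build the boolean logic within it: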

{
"bool": {
"must": { "match": { "title": "how to make millions" }},
"must_not": { "match": { "tag": "spam" }},
"should": [
{ "match": { "tag": "starred" }}
],
"filter": {
"bool": {
"must": [
{ "range": { "date": { "gte": "2014-01-01" }}},
{ "range": { "price": { "lte": 29.99 }}}
],
"must_not": [
{ "term": { "category": "ebooks" }}
]
}
}
}
}

constant_score query

It applies a constant score to all matching documents. It is often used when you only need to run a filter and no other (scoring) query.

{
"constant_score": {
"filter": {
"term": { "category": "ebooks" }
}
}
}

Placed inside constant_score, the term query becomes a non-scoring filter. This form can replace a bool query that has only a filter clause.
