Elasticsearch Basic Usage

Reposted article, 2018-04-24 00:07, ElasticSearch series

1. Basic format of Elasticsearch commands

Format of the RESTful API URL:

http://localhost:9200/<index>/<type>/[<id>]

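Both index and type are required. The id is optional; if it is omitted, Elasticsearch generates one automatically (send the request with POST in that case). index and type organize the data in layers, which makes it easier to manage. An index can be thought of as a database, a type as a table, and an id as the primary key of a row in that table, so it is unique.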

Note: appending "?pretty" to the URL pretty-prints the response. It works for every URL that operates on data. The "?" starts the query string and "pretty" is the parameter.
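
The URL layout described above can be sketched as a small helper. This is only an illustration in Python and not part of the original article; the function name doc_url and the default host localhost:9200 are assumptions chosen for the example.

```python
def doc_url(index, doc_type, doc_id=None, host="localhost:9200", pretty=True):
    """Build a document URL from index, type, and an optional id."""
    parts = [index, doc_type]
    if doc_id is not None:  # the id segment is optional
        parts.append(str(doc_id))
    url = "http://" + host + "/" + "/".join(parts)
    return url + "?pretty" if pretty else url


print(doc_url("store", "books", 1))
print(doc_url("store", "books", pretty=False))
```

The helper only assembles strings; it does not contact a server.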

2. Basic create, delete, and update operations

2.1 Adding documents

Add some books to the store index:

curl -H "Content-Type: application/json" -XPUT 'http://192.168.187.201:9200/store/books/1?pretty' -d '{
  "title": "Elasticsearch: The Definitive Guide",
  "name" : {
    "first" : "Zachary",
    "last" : "Tong"
  },
  "publish_date":"2015-02-06",
  "price":"49.99"
}'

Note: curl is a command-line HTTP client on Linux. The -H "Content-Type: application/json" header is required; without it the request fails with {"error":"Content-Type header [application/x-www-form-urlencoded] is not supported","status":406}
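
For readers who script these calls, the same request can be prepared with Python's standard library. The sketch below is an assumption-laden illustration: the helper name build_index_request is made up, and the request is only constructed, never sent, so it runs without a cluster.

```python
import json
import urllib.request


def build_index_request(url, document):
    """Prepare (but do not send) a PUT request carrying a JSON document."""
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},  # the header whose absence causes the 406
        method="PUT",
    )


book = {"title": "Elasticsearch: The Definitive Guide", "price": "49.99"}
req = build_index_request("http://hadoop1:9200/store/books/1?pretty", book)
print(req.get_method(), req.full_url)
# Sending is left out on purpose; urllib.request.urlopen(req) would perform the call.
```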

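The only difference between adding "pretty" and omitting it is whether the response is neatly formatted; other operations behave the same way. To keep the responses readable, every request below appends "pretty" to the URL.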

2.2 Deleting documents

Delete a document:

curl -XDELETE 'http://hadoop1:9200/store/books/1?pretty'

2.3 Updating documents

① Update by overwriting the whole document:

curl -H "Content-Type:application/json" -XPUT 'http://hadoop1:9200/store/books/1?pretty' -d '{
  "title": "Elasticsearch: The Definitive Guide",
  "name" : {
    "first" : "Zachary",
    "last" : "Tong"
  },
  "publish_date":"2016-02-06",
  "price":"99.99"
}'

② Use the _update API to change only the fields you want to update:

curl -H "Content-Type: application/json" -XPOST 'http://hadoop1:9200/store/books/1/_update?pretty' -d '{
  "doc": {
     "price" : 88.88
  }
}'
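
The difference between the two update styles can be modelled locally. The following Python sketch is a simplified stand-in, not Elasticsearch code: the function names are hypothetical, and the merge is shallow, whereas the real _update API also merges nested objects.

```python
def overwrite(stored, new_doc):
    """PUT semantics: the new document replaces the stored one entirely."""
    return dict(new_doc)


def partial_update(stored, doc):
    """_update semantics (simplified): fields in "doc" are merged into the stored document."""
    merged = dict(stored)
    merged.update(doc)
    return merged


stored = {"title": "Elasticsearch: The Definitive Guide", "price": "99.99"}
print(overwrite(stored, {"price": 88.88}))       # title is gone
print(partial_update(stored, {"price": 88.88}))  # title is kept
```

This is why the overwrite example resends every field, while the _update example sends only price.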

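3. Querying Elasticsearch

There are three ways to query Elasticsearch: from a browser, with curl, and with a request body sent via GET or POST.

Note: searches that go through _search (including bool filter queries, nested queries, range filter queries, and so on) do not need a type in the URL; specifying the index is enough. See the examples under "3.2.3 Basic match queries with query".

3.1 Querying from a browser

Query by entering the host and URL in the browser: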

http://hadoop1:9200/store/books/1?pretty

3.2 Querying with curl on Linux

3.2.1 Getting a document by ID

curl -XGET 'http://hadoop1:9200/store/books/1?pretty'

3.2.2 Getting specific fields with _source

curl -XGET 'http://hadoop1:9200/store/books/1?_source=title&pretty'
curl -XGET 'http://hadoop1:9200/store/books/1?_source=title,price&pretty'
curl -XGET 'http://hadoop1:9200/store/books/1?_source&pretty'
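
What the _source parameter does to the returned document can be pictured with a small local model. This Python sketch is illustrative only; source_filter is a hypothetical name, and it handles top-level field names only.

```python
def source_filter(document, fields=None):
    """Keep only the requested top-level fields; with no fields, return everything."""
    if not fields:
        return dict(document)
    return {name: document[name] for name in fields if name in document}


book = {
    "title": "Elasticsearch: The Definitive Guide",
    "publish_date": "2016-02-06",
    "price": 88.88,
}
print(source_filter(book, ["title"]))
print(source_filter(book, ["title", "price"]))
print(source_filter(book))
```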

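3.2.3 Basic match queries with query

Before querying, you can bulk-import a sample data set of 1000 documents into Elasticsearch so there is data to query; see "4.1 Importing a data set" under "4. Elasticsearch bulk commands".

① "q=*" matches every document in the index; by default only the first 10 are returned.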

curl 'hadoop1:9200/bank/_search?q=*&pretty'

# Equivalent to:
curl -H "Content-Type:application/json" -XPOST 'hadoop1:9200/bank/_search?pretty' -d '
{
   "query": { "match_all": {} }
}'

② Match all documents but return only one:

curl -H "Content-Type:application/json" -XPOST 'hadoop1:9200/bank/_search?pretty' -d '{
   "query": {"match_all": {}},
   "size": 1
}'

Note: if size is not specified, 10 documents are returned by default.

③ Return documents 11 through 20 (the from offset is zero-based):

curl -H "Content-Type:application/json" -XPOST 'hadoop1:9200/bank/_search?pretty' -d '{
   "query": {"match_all": {}},
   "from": 10,
   "size": 10
}'
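
The relationship between a page number and the from/size pair can be sketched as follows. This is a minimal Python illustration, assuming 1-based page numbers; page_body is a hypothetical helper name.

```python
import json


def page_body(page, page_size=10):
    """Build a match_all search body for a 1-based page number."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    return {
        "query": {"match_all": {}},
        "from": (page - 1) * page_size,  # zero-based offset of the first hit
        "size": page_size,
    }


print(json.dumps(page_body(2)))  # the body for hits 11 through 20
```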

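④ Match all documents in the index, sort by the balance field in descending order, and return the first 10 (if size is not specified, at most 10 are returned by default):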

curl -H "Content-Type:application/json" -XPOST 'hadoop1:9200/bank/_search?pretty' -d '{
   "query": {"match_all": {}},
   "sort": {"balance":{"order": "desc"}}
}'

⑤ Return only specific fields (account_number and balance), similar to getting specific fields with _source in 3.2.2:

curl -H "Content-Type:application/json" -XPOST 'hadoop1:9200/bank/_search?pretty' -d '{
   "query": {"match_all": {}},
   "_source": ["account_number", "balance"]
}'

⑥ Return the document whose account_number is 20:

curl -H "Content-Type:application/json" -XPOST 'hadoop1:9200/bank/_search?pretty' -d '{
   "query": {"match": {"account_number":20}}
}'

⑦ Return all documents whose address contains mill:

curl -H "Content-Type:application/json" -XPOST 'hadoop1:9200/bank/_search?pretty' -d '{
     "query": {"match":{"address": "mill"}}
}'

⑧ Return all documents whose address contains mill or lane:

curl -H "Content-Type:application/json" -XPOST 'hadoop1:9200/bank/_search?pretty' -d '{
    "query": {"match": {"address": "mill lane"}}
}'

⑨ Unlike ⑧, match_phrase performs phrase matching; return all documents whose address contains the phrase "mill lane":

curl -H "Content-Type:application/json" -XPOST 'hadoop1:9200/bank/_search?pretty' -d '{
    "query": {"match_phrase": {"address": "mill lane"}}
}'
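
The difference between ⑧ and ⑨ can be illustrated with a rough local model. The Python sketch below is not how Elasticsearch analyzes text (it only lowercases and splits on whitespace), and the addresses are made-up examples rather than records from the data set.

```python
def tokens(text):
    """Very rough stand-in for an analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def match(field_value, query):
    """True if the field contains at least one of the query terms (OR semantics)."""
    field_terms = set(tokens(field_value))
    return any(term in field_terms for term in tokens(query))


def match_phrase(field_value, query):
    """True if the query terms appear consecutively and in order."""
    field_terms, query_terms = tokens(field_value), tokens(query)
    n = len(query_terms)
    return n > 0 and any(
        field_terms[i:i + n] == query_terms
        for i in range(len(field_terms) - n + 1)
    )


for address in ["990 Mill Road", "198 Mill Lane", "12 Lane Mill Court"]:
    print(address, match(address, "mill lane"), match_phrase(address, "mill lane"))
```

All three addresses satisfy match, but only "198 Mill Lane" satisfies match_phrase, because the two terms must be adjacent and in the given order.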

3.2.4 bool filter queries for combined filters, nested queries, and more

SELECT * FROM books WHERE (price = 35.99 OR price = 99.99) AND (publish_date != "2016-02-06")

Similarly, Elasticsearch can combine conditions with and, or, and not. The format is:

{
      "bool" : {
             "filter":   [],
             "must" :    [],
             "should":   [],
             "must_not": []
       }
}

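Notes:
filter: filters documents without affecting the score
must: the condition must match, equivalent to and
should: the condition may or may not match, equivalent to or
must_not: the condition must not match, equivalent to not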
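
As a sketch of how the four clauses fit together, the Python helper below assembles a bool query body and omits any clause that was not supplied. The name bool_query is hypothetical; the example body mirrors the SQL statement shown earlier in this section.

```python
import json


def bool_query(must=None, should=None, must_not=None, filter=None):
    """Assemble a bool query, leaving out clauses that were not supplied."""
    clauses = {"must": must, "should": should, "must_not": must_not, "filter": filter}
    return {"query": {"bool": {name: value for name, value in clauses.items() if value}}}


# (price = 35.99 OR price = 99.99) AND publish_date != "2016-02-06"
body = bool_query(
    should=[{"term": {"price": 35.99}}, {"term": {"price": 99.99}}],
    must_not=[{"term": {"publish_date": "2016-02-06"}}],
)
print(json.dumps(body, indent=2))
```

The printed JSON can be passed to curl with -d in the same way as the hand-written bodies in this section.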

3.2.4.1 filter queries

① filter on a single value

# SELECT * FROM books WHERE price = 35.99
# filter for books priced at 35.99
curl -H "Content-Type:application/json" -XGET 'http://hadoop1:9200/store/books/_search?pretty' -d '{
    "query" : {
        "bool" : {
            "must" : {
                "match_all" : {}
            },
            "filter" : {
                "term" : {
                    "price" : 35.99
                  }
             }
        }
    }
}'

Note: every request that sends a JSON body needs -H "Content-Type: application/json".

② filter on multiple values

curl -H "Content-Type:application/json" -XGET 'http://hadoop1:9200/store/books/_search?pretty' -d '{
       "query" : {
             "bool" : {
                  "filter" : {
                       "terms" : {
                              "price" : [35.99, 99.99]
                        }
                  }
             }
        }
}'

3.2.4.2 must, should, and must_not queries

① Combining must, should, and must_not with term:

curl -H "Content-Type:application/json" -XGET 'http://hadoop1:9200/store/books/_search?pretty' -d '{
    "query" : {
         "bool" : {
              "should" : [
                   { "term" : {"price" : 35.99}},
                   { "term" : {"price" : 99.99}}
               ],
               "must_not" : {
                     "term" : {"publish_date" : "2016-02-06"}
                }
         }
    }
}'

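② Combining must, should, and must_not with match

must_not matches a document only when none of the queries in its list match. The following returns all documents whose address contains neither mill nor lane: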

curl -H "Content-Type:application/json" -XPOST 'hadoop1:9200/bank/_search?pretty' -d '{
    "query": {
         "bool": {
              "must_not": [
                    {"match": {"address": "mill"}},
                    {"match": {"address": "lane"}}
               ]
         }
     }
}'

Return all documents where age is 40 and state is not ID:

curl -H "Content-Type:application/json" -XPOST 'hadoop1:9200/bank/_search?pretty' -d '{
     "query": {
          "bool": {
               "must": [
                     {"match": {"age": "40"}}
                ],
                "must_not": [
                      {"match": {"state": "ID"}}
                ]
          }
     }
}'

3.2.4.3 Nested bool queries

# Nested query
# SELECT * FROM books WHERE price = 35.99 OR ( publish_date = "2016-02-06" AND price = 99.99 )
curl -H "Content-Type:application/json" -XGET 'http://hadoop1:9200/store/books/_search?pretty' -d '{
    "query" : {
         "bool" : {
              "should" : [
                   { "term" : {"price" : 35.99 }},
                   { "bool" : {
                           "must" : [
                                { "term" : {"publish_date" : "2016-02-06"}},
                                { "term" : {"price" : 99.99}}
                            ]
                       }
                    }
              ]
         }            
    }
}'

3.2.4.4 range filter queries

First example: find documents whose price is greater than 20:

# SELECT * FROM books WHERE price > 20
# gt :  > greater than
# lt :  < less than
# gte :  >= greater than or equal to
# lte :  <= less than or equal to

curl -H "Content-Type:application/json" -XGET 'http://hadoop1:9200/store/books/_search?pretty' -d '{
    "query" : {
         "bool" : {
              "filter" : {
                   "range" : {
                        "price" : {
                             "gt" : 20.0,
                             "boost" : 4.0
                         }
                   }
              }
         }
    }
}'

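Note: boost sets the boost value of the query; the default is 1.0. Scores are ignored in a filter clause, so it does not change the result here.

Second example: use a bool query to return all documents with balance between 20000 and 30000: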

curl -H "Content-Type:application/json" -XPOST 'hadoop1:9200/bank/_search?pretty' -d '{
     "query": {
         "bool": {
             "must": {"match_all": {}},
             "filter": {
                   "range": {
                         "balance": {
                              "gte": 20000,
                               "lte": 30000
                         }
                    }
             }
         }
     }
}'
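
The four range operators can be summarized in a short Python sketch. Both function names are hypothetical: range_filter builds the request body in the shape used above, and in_range is a local check that mirrors what each operator means.

```python
def range_filter(field, **bounds):
    """Build a bool/filter/range body; bounds may use gt, gte, lt, lte."""
    allowed = {"gt", "gte", "lt", "lte"}
    unknown = set(bounds) - allowed
    if unknown:
        raise ValueError("unsupported range operators: %s" % sorted(unknown))
    return {"query": {"bool": {"filter": {"range": {field: bounds}}}}}


def in_range(value, gt=None, gte=None, lt=None, lte=None):
    """Local check mirroring the meaning of each operator."""
    return all([
        gt is None or value > gt,
        gte is None or value >= gte,
        lt is None or value < lt,
        lte is None or value <= lte,
    ])


print(range_filter("balance", gte=20000, lte=30000))
print(in_range(30000, gte=20000, lte=30000), in_range(20, gt=20))
```

The boundary cases show the distinction: 30000 passes lte 30000, while 20 does not pass gt 20.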

3.2.5 Aggregation queries

First example: group all documents by state, sort the groups by document count in descending order (the default), and return the top ten (the default):

curl -H "Content-Type:application/json" -XPOST  'hadoop1:9200/bank/_search?pretty' -d '{
   "size": 0,
   "aggs": {
       "group_by_state": {
            "terms": {
                 "field": "state"
            }
       }
   }
}'

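A problem you may run into: sorting and aggregating in Elasticsearch normally target numeric and date fields; doing so on a text field fails with: Fielddata is disabled on text fields by default. Set fielddata=true on [state] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead.

Solution: since 5.x, sorting and aggregating on text fields depend on a separate data structure (fielddata) that is cached in memory and has to be enabled explicitly; see the official fielddata documentation. Run the following before aggregating to enable fielddata: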

curl -H "Content-Type:application/json" -XPOST 'hadoop1:9200/bank/_mapping/account?pretty' -d '{
  "properties": {
    "state": {
      "type": "text",
      "fielddata": true
    }
  }
}'

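Explanation: bank is the index, _mapping is the mapping endpoint, and account is the type; all three are required. "state" is the field that the "group_by_state" aggregation operates on.

Second example: group all documents by state, sorted by document count in descending order (the default), and compute the average balance of each group: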

curl -H "Content-Type:application/json" -XPOST 'hadoop1:9200/bank/_search?pretty' -d '{
    "size": 0,
    "aggs": {
        "group_by_state": {
             "terms": {
                  "field": "state"
             },
              "aggs": {
                   "average_balance": {
                        "avg": {
                              "field":"balance"
                         }
                    } 
              }
        }
    }
}'
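
To make the shape of the result easier to picture, the two aggregations can be imitated locally. The Python sketch below is a simplified model, not the Elasticsearch implementation; the function name and the sample records are made up for illustration and are not taken from accounts.json.

```python
from collections import defaultdict


def group_by_state(accounts, size=10):
    """Mimic a terms aggregation on "state" with an avg sub-aggregation on "balance"."""
    balances = defaultdict(list)
    for account in accounts:
        balances[account["state"]].append(account["balance"])
    buckets = [
        {"key": state, "doc_count": len(values), "average_balance": sum(values) / len(values)}
        for state, values in balances.items()
    ]
    # Largest groups first; ties are broken by key so the order is deterministic.
    buckets.sort(key=lambda b: (-b["doc_count"], b["key"]))
    return buckets[:size]


# Made-up sample records.
sample = [
    {"state": "TX", "balance": 1000},
    {"state": "TX", "balance": 3000},
    {"state": "ID", "balance": 500},
]
for bucket in group_by_state(sample):
    print(bucket)
```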

4. Elasticsearch bulk commands

4.1 Importing a data set

You can download the sample data set: accounts.json

Import the sample data set:

curl -H "Content-Type:application/json" -XPOST 'hadoop1:9200/bank/account/_bulk?pretty' --data-binary "@accounts.json"
curl -H "Content-Type:application/json" -XPOST 'hadoop1:9200/bank/account/_bulk?pretty' --data-binary "@/home/hadoop/accounts.json"
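
The _bulk endpoint reads newline-delimited JSON: an action line followed by a source line for each document, with a newline at the very end. The Python sketch below shows how such a body could be generated; bulk_body is a hypothetical helper and the records are made up.

```python
import json


def bulk_body(documents):
    """Turn (id, document) pairs into the newline-delimited body the _bulk API reads."""
    lines = []
    for doc_id, document in documents:
        lines.append(json.dumps({"index": {"_id": str(doc_id)}}))  # action line
        lines.append(json.dumps(document))                          # source line
    return "\n".join(lines) + "\n"  # the body must end with a newline


# Made-up records for illustration.
body = bulk_body([
    (1, {"account_number": 1, "balance": 100}),
    (2, {"account_number": 2, "balance": 200}),
])
print(body, end="")
```

Saving this output to a file and passing it with --data-binary "@file" keeps the newlines intact, which is why the commands above use --data-binary rather than -d.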