Kibana Basics: Working Directly with Elasticsearch

Reposted article, originally published 2020-02-27 16:35

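1. Beginner-level operations

Elasticsearch exposes a REST-style API: every API call is an HTTP request, so you can use any tool that can send HTTP requests.

Request format for creating an index:

Method: PUT
Path: /<index-name>
Body: JSON:
```
{
    "settings": {
        "<setting-name>": "<setting-value>"
     }
}
```
settings: the index settings, where the various properties of the index can be defined. For now we can leave them unset and use the defaults.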
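Since the API is plain HTTP, the create-index call can be assembled with any HTTP client. Below is a minimal Python sketch using only the standard library. The node address `http://localhost:9200` and the helper name `build_create_index_request` are assumptions made for illustration; the request is only built here, not sent.

```python
import json
import urllib.request

# Assumed address of a local Elasticsearch node (not given in the article).
ES_URL = "http://localhost:9200"


def build_create_index_request(index_name, settings=None):
    """Build (but do not send) the PUT request that creates an index."""
    body = {"settings": settings or {}}
    return urllib.request.Request(
        url=f"{ES_URL}/{index_name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


req = build_create_index_request("hema")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# Sending it would be urllib.request.urlopen(req), which needs a running node.
```

Keeping the request construction separate from sending makes it easy to inspect the method, path and body before anything touches the cluster.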

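A successful response means the index has been created.

Creating an index with Kibana

The Kibana console simplifies HTTP requests. For example:

Create an index

PUT /<index-name>

There is no need to type the Elasticsearch server address, and the console offers syntax hints, which is very convenient.

Viewing an index

Syntax

A GET request shows the index information. Format:

GET /<index-name>

Deleting an index uses a DELETE request

DELETE /<index-name>

Example

Viewing hema2 again: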

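REST API: types and mapping operations

Having an index is like having a database in a relational database system. Next we need a type inside the index, which corresponds to a table. (Mapping types belong to older Elasticsearch versions such as 6.x; they were deprecated in 7.x and removed in 8.x.) Creating a table requires defining column constraints, and an index is no different: when creating a type, we need to state which fields it has and what constraints each field carries. This is called field mapping (mapping).

We already met these field constraints when studying Lucene. They include, but are not limited to:

The field's data type
Whether to store it
Whether to index it
Whether to analyze (tokenize) it
Which analyzer to use

Let's look at the creation syntax.

Creating a field mapping

Syntax

The request method is still PUT

PUT /<index-name>/_mapping/<type-name>
{
  "properties": {
    "<field-name>": {
      "type": "<type>",
      "index": true,
      "store": true,
      "analyzer": "<analyzer>"
    }
  }
}

Type name: the type concept mentioned earlier, similar to a table in a database.
Field name: anything you like. Many attributes can be specified under it, listed here; the default values can be checked in the @Field annotation.
- type: the data type, such as text, long, short, date, integer or object
- index: whether the field is indexed; defaults to true
- store: whether the field is stored; defaults to false
- analyzer: the analyzer; the value ik_max_word selects the IK analyzer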
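To make the attribute defaults concrete, here is a small Python sketch that assembles a mapping body and only writes out the attributes that differ from the defaults listed above. The helper `field_mapping` is hypothetical and exists only for this illustration; serializing with `json.dumps` also guarantees valid JSON punctuation and real booleans.

```python
import json


def field_mapping(field_type, index=True, store=False, analyzer=None):
    """Return the mapping for one field, omitting attributes left at their defaults."""
    mapping = {"type": field_type}
    if not index:
        mapping["index"] = False
    if store:
        mapping["store"] = True
    if analyzer is not None:
        mapping["analyzer"] = analyzer
    return mapping


goods_mapping = {
    "properties": {
        "title": field_mapping("text", analyzer="ik_max_word"),
        "images": field_mapping("keyword", index=False),
        "price": field_mapping("float"),
    }
}
print(json.dumps(goods_mapping, indent=2))
```

The printed body can be pasted after a `PUT /<index-name>/_mapping/<type-name>` line in the Kibana console.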

Send the request:

PUT hema/_mapping/goods
{
  "properties": {
    "title": {
      "type": "text",
      "analyzer": "ik_max_word"
   },
    "images": {
      "type": "keyword",
      "index": false
   },
    "price": {
      "type": "float"
   }
 }
}

Response:

{
  "acknowledged": true
}

The example above adds a type named goods to the hema index and defines 3 fields in it:

title: product title
images: product image
price: product price

Viewing the mapping

Syntax:

GET /<index-name>/_mapping

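2. Intermediate operations (a complete set of document CRUD operations)

REST API: document operations

A document is a piece of data under a type in an index. It is indexed according to the mapping rules so that it can be searched later. It is comparable to a row in a database table.

2.1 Adding documents

2.1.1 Adding a document with an auto-generated id

A POST request adds a document to an existing index.

Syntax:

POST /<index-name>/<type-name>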
{
    "key":"value"
}

Example:

POST /hema/goods/
{
    "title":"小米手機",
    "images":"http://image.leyou.com/12479122.jpg",
    "price":2699.00
}

Response:

{
  "_index": "hema",
  "_type": "goods",
  "_id": "r9c1KGMBIhaxtY5rlRKv",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 3,
    "successful": 1,
    "failed": 0
 },
  "_seq_no": 0,
  "_primary_term": 2
}

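The result shows created, so the document was created successfully.

Also note the _id field in the response. It is the unique identifier of this document, and later create, read, update and delete operations all rely on it.

Here the id is r9c1KGMBIhaxtY5rlRKv. We did not specify an id when adding the document, so Elasticsearch generated one for us.

2.1.2 Adding a document with a custom id

To specify the id ourselves when adding a document:

POST /<index-name>/<type-name>/<id>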
{
    ...
}

Example:

POST /hema/goods/2
{
    "title":"大米手機",
    "images":"http://image.leyou.com/12479122.jpg",
    "price":2899.00
}

Resulting data:

{
  "_index": "hema",
  "_type": "goods",
  "_id": "2",
  "_score": 1,
  "_source": {
    "title": "大米手機",
    "images": "http://image.leyou.com/12479122.jpg",
    "price": 2899
 }
}

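2.2 Viewing a document

In REST style, adding is POST and querying is GET. Queries usually need a condition, so here we pass the id of the document we just created.

Viewing the data in Kibana:

GET /hema/goods/r9c1KGMBIhaxtY5rlRKv

Result: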

{
  "_index": "hema",
  "_type": "goods",
  "_id": "r9c1KGMBIhaxtY5rlRKv",
  "_version": 1,
  "found": true,
  "_source": {
    "title": "小米手機",
    "images": "http://image.leyou.com/12479122.jpg",
    "price": 2699
 }
}

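_source: the source document; all of the data is in here.
_id: the unique identifier of this document

2.3 Updating data

Changing the method of the add request above to PUT turns it into an update. An update must specify the id:

If a document with that id exists, it is updated (the whole document is replaced)
If no document with that id exists, it is created

For example, id 3 does not exist yet, so this request should create a document: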

PUT /hema/goods/3
{
    "title":"超米手機",
    "images":"http://image.leyou.com/12479122.jpg",
    "price":3899.00
}

Result:

{
  "_index": "hema",
  "_type": "goods",
  "_id": "3",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
 },
  "_seq_no": 1,
  "_primary_term": 1
}

The result is created, so this was an insert.

Now run the request again with different data:

PUT /hema/goods/3
{
    "title":"超大米手機",
    "images":"http://image.leyou.com/12479122.jpg",
    "price":3299.00,
    "stock": 100,
    "saleable":true
}

Result:

{
  "_index": "hema",
  "_type": "goods",
  "_id": "3",
  "_version": 2,
  "result": "updated",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
 },
  "_seq_no": 2,
  "_primary_term": 1
}

The result is updated, so this time the data was modified.
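The create-or-replace behaviour of PUT with an explicit id can be pictured with a toy in-memory model. This is only a sketch of the observable behaviour (the result and _version fields), not of how Elasticsearch stores data; the class `ToyIndex` is hypothetical.

```python
class ToyIndex:
    """In-memory stand-in for one index; NOT how Elasticsearch stores data."""

    def __init__(self):
        self._docs = {}  # id -> (version, source)

    def put(self, doc_id, source):
        """Create the document if the id is new, otherwise replace it entirely."""
        if doc_id in self._docs:
            version = self._docs[doc_id][0] + 1
            result = "updated"
        else:
            version = 1
            result = "created"
        self._docs[doc_id] = (version, dict(source))
        return {"_id": doc_id, "_version": version, "result": result}

    def get(self, doc_id):
        """Return the stored version and source for an existing id."""
        version, source = self._docs[doc_id]
        return {"_id": doc_id, "_version": version, "_source": source}


index = ToyIndex()
first = index.put("3", {"title": "超米手機", "price": 3899.00})
second = index.put("3", {"title": "超大米手機", "price": 3299.00, "stock": 100, "saleable": True})
print(first["result"], first["_version"])    # created 1
print(second["result"], second["_version"])  # updated 2
```

Note that the second call replaces the whole source rather than merging fields, which matches the full-document semantics of this PUT request.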

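2.4 Deleting data

Deleting uses a DELETE request and, likewise, deletes by id:

Syntax

DELETE /<index-name>/<type-name>/<id>

Example, deleting the document with id 3 created above:

DELETE /hema/goods/3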

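3. Queries

We cover queries in four parts:

Basic queries
Result filtering (_source filtering)
Advanced queries
Sorting

5.1 Basic queries

Basic syntax

GET /<index-name>/_search
{
    "query":{
        "<query-type>":{
            "<query-condition>":"<condition-value>"
       }
   }
}

Here query represents a query object, which can contain different query properties

Query type:
- for example match_all, match, term, range and so on
Query condition: how the condition is written depends on the query type, as explained in detail later

5.1.1 Match all (match_all)

Example: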

GET /hema/_search
{
    "query":{
        "match_all": {}
   }
}

query: the query object
match_all: matches all documents

Result:

{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 3,
    "successful": 3,
    "skipped": 0,
    "failed": 0
 },
  "hits": {
    "total": 2,
    "max_score": 1,
    "hits": [
     {
        "_index": "hema",
        "_type": "goods",
        "_id": "2",
        "_score": 1,
        "_source": {
          "title": "大米手機",
          "images": "http://image.leyou.com/12479122.jpg",
          "price": 2899
       }
     },
     {
        "_index": "hema",
        "_type": "goods",
        "_id": "r9c1KGMBIhaxtY5rlRKv",
        "_score": 1,
        "_source": {
          "title": "小米手機",
          "images": "http://image.leyou.com/12479122.jpg",
          "price": 2699
       }
     }
   ]
 }
}

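took: time the query took, in milliseconds
timed_out: whether the query timed out
_shards: shard information
hits: overview object for the search results
- total: total number of matching documents
- max_score: highest score among all matching documents
- hits: array of result documents; each element is one matching document
  - _index: index
  - _type: document type
  - _id: document id
  - _score: document score
  - _source: the document's source data

5.1.2 Match query (match)

First add one more document to test with: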

PUT /hema/goods/3
{
    "title":"小米電視4A",
    "images":"http://image.leyou.com/12479122.jpg",
    "price":3899.00
}

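The index now holds 2 phones and 1 TV:

or relationship

A match query analyzes (tokenizes) the query text and then searches; by default the resulting terms are combined with or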

GET /hema/_search
{
    "query":{
        "match":{
            "title":"小米電視"
       }
   }
}

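Result:

In the example above, the query returns not only the TV but also everything related to 小米, because the terms are combined with or.

and relationship

Sometimes we need a more precise search and want the terms combined with and. This can be done as follows: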

GET hema/goods/_search
{
    "query":{
        "match":{
            "title":{"query":"小米電視","operator":"and"} 
       }
   }
}

Result:

In this example, only documents that contain both 小米 and 電視 are returned.
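The difference between the or and and operators can be sketched in Python over pre-tokenized titles. The token lists are assumptions made for illustration: a real analyzer such as ik_max_word may split the text differently, and scoring is ignored entirely.

```python
# Hypothetical token lists; a real analyzer such as ik_max_word may split differently.
DOCS = {
    "r9c1KGMBIhaxtY5rlRKv": ["小米", "手機"],
    "2": ["大米", "手機"],
    "3": ["小米", "電視", "4a"],
}


def match(docs, query_tokens, operator="or"):
    """Return ids of documents whose tokens satisfy the match query."""
    wanted = set(query_tokens)
    hits = []
    for doc_id, tokens in docs.items():
        present = wanted & set(tokens)
        if operator == "and":
            ok = present == wanted  # every query term must appear
        else:
            ok = bool(present)      # any query term is enough
        if ok:
            hits.append(doc_id)
    return hits


query = ["小米", "電視"]  # assumed tokenization of "小米電視"
print(match(DOCS, query))                  # ['r9c1KGMBIhaxtY5rlRKv', '3']
print(match(DOCS, query, operator="and"))  # ['3']
```

With or, any document sharing at least one term with the query is a hit; with and, only the TV document contains both terms.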

5.1.3 Term query (term)

The term query is used for exact-value matching. The exact values can be numbers, dates, booleans, or strings that are not analyzed

GET /hema/_search
{
    "query":{
        "term":{
            "price":2699.00
       }
   }
}

Result:

{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 3,
    "successful": 3,
    "skipped": 0,
    "failed": 0
 },
  "hits": {
    "total": 1,
    "max_score": 1,
    "hits": [
     {
        "_index": "hema",
        "_type": "goods",
        "_id": "r9c1KGMBIhaxtY5rlRKv",
        "_score": 1,
        "_source": {
          "title": "小米手機",
          "images": "http://image.leyou.com/12479122.jpg",
          "price": 2699
       }
     }
   ]
 }
}

5.1.4 Boolean combination (bool)

bool combines other queries using must (and), must_not (not) and should (or)

GET /hema/_search
{
    "query":{
        "bool":{
        "must":     { "match": { "title": "大米" }},
        "must_not": { "match": { "title":  "電視" }},
        "should":   { "match": { "title": "手機" }}
       }
   }
}

Result:

{
  "took": 10,
  "timed_out": false,
  "_shards": {
    "total": 3,
    "successful": 3,
    "skipped": 0,
    "failed": 0
 },
  "hits": {
    "total": 1,
    "max_score": 0.5753642,
    "hits": [
     {
        "_index": "hema",
        "_type": "goods",
        "_id": "2",
        "_score": 0.5753642,
        "_source": {
          "title": "大米手機",
          "images": "http://image.leyou.com/12479122.jpg",
          "price": 2899
       }
     }
   ]
 }
}

5.1.5 Range query (range)

The range query finds numbers or dates that fall within a specified interval

GET /hema/_search
{
    "query":{
        "range": {
            "price": {
                "gte":  1000.0,
                "lt":   2800.00
           }
    }
   }
}

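The range query accepts the following operators:

Operator	Description
gt	greater than
gte	greater than or equal to
lt	less than
lte	less than or equal to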
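The four operators map directly onto ordinary comparisons. The following Python sketch applies the range clause from the example above to the three prices that appear in the hema index; the helper `in_range` is hypothetical and only mirrors the operator semantics.

```python
import operator

# Map each range operator to the comparison it stands for.
RANGE_OPS = {
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
}


def in_range(value, conditions):
    """True if value satisfies every operator in a range clause."""
    return all(RANGE_OPS[op](value, bound) for op, bound in conditions.items())


prices = [2699, 2899, 3899]
clause = {"gte": 1000.0, "lt": 2800.00}
print([p for p in prices if in_range(p, clause)])  # [2699]
```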

5.2 Result filtering

By default, Elasticsearch returns every field stored in _source for each hit.

If we only want some of the fields, we can add a _source filter

5.2.1 Specifying fields directly

Example:

GET /hema/_search
{
  "_source": ["title","price"],
  "query": {
    "term": {
      "price": 2699
   }
 }
}

Returned result:

{
  "took": 12,
  "timed_out": false,
  "_shards": {
    "total": 3,
    "successful": 3,
    "skipped": 0,
    "failed": 0
 },
  "hits": {
    "total": 1,
    "max_score": 1,
    "hits": [
     {
        "_index": "hema",
        "_type": "goods",
        "_id": "r9c1KGMBIhaxtY5rlRKv",
        "_score": 1,
        "_source": {
          "price": 2699,
          "title": "小米手機"
       }
     }
   ]
 }
}

5.2.2 Specifying includes and excludes

We can also use:

includes: the fields to show
excludes: the fields not to show

Both are optional.

Example:

GET /hema/_search
{
  "_source": {
    "includes":["title","price"]
 },
  "query": {
    "term": {
      "price": 2699
   }
 }
}

This gives the same result as the following:

GET /hema/_search
{
  "_source": {
     "excludes": ["images"]
 },
  "query": {
    "term": {
      "price": 2699
   }
 }
}
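Why the two requests return the same fields can be checked with a short Python sketch. The function `filter_source` is a simplified, hypothetical model that handles exact field names only; it does not cover wildcard patterns.

```python
def filter_source(source, includes=None, excludes=None):
    """Keep fields named in includes (all if omitted), then drop those in excludes."""
    kept = {k: v for k, v in source.items() if includes is None or k in includes}
    return {k: v for k, v in kept.items() if not excludes or k not in excludes}


doc = {
    "title": "小米手機",
    "images": "http://image.leyou.com/12479122.jpg",
    "price": 2699,
}
by_includes = filter_source(doc, includes=["title", "price"])
by_excludes = filter_source(doc, excludes=["images"])
print(by_includes)
print(by_includes == by_excludes)  # True
```

For a document with exactly these three fields, including title and price is equivalent to excluding images; with more fields the two filters would diverge.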

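5.3 Filtering (filter)

Filtering within a conditional query

Every query clause affects document scoring and ranking. If we need to filter the results without the filter condition affecting the score, the condition should not be written as a query clause; use filter instead: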

GET /hema/_search
{
    "query":{
        "bool":{
        "must":{ "match": { "title": "小米手機" }},
        "filter":{
                "range":{"price":{"gt":2000.00,"lt":3800.00}}
        }
       }
   }
}

5.4 Sorting

5.4.1 Sorting by a single field

sort lets us order results by different fields, with order specifying the direction

GET /hema/_search
{
  "query": {
    "match": {
      "title": "小米手機"
   }
 },
  "sort": [
    {"price": { "order": "desc"}
   }
 ]
}

5.5 Paging

Paging in Elasticsearch is very similar to MySQL. Both take two values:

from: the start offset, counting from 0
size: the page size

GET /hema/_search
{
  "query": {
    "match_all": {}
 },
  "sort": [
   {
      "price": {
        "order": "asc"
     }
   }
 ],
  "from": 3,
  "size": 3
}
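The relationship between a page number and from/size is simple arithmetic, sketched here in Python. The helper names are hypothetical, and the price list stands in for the three documents in the hema index sorted ascending.

```python
def page_to_from(page, size):
    """Convert a 1-based page number into the zero-based from offset."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    return (page - 1) * size


def paginate(sorted_hits, from_, size):
    """Apply from/size to an already sorted hit list."""
    return sorted_hits[from_:from_ + size]


prices = sorted([2899, 2699, 3899])  # ascending, as in the request above
print(page_to_from(2, 3))            # 3
print(paginate(prices, 0, 3))        # [2699, 2899, 3899]
print(paginate(prices, 3, 3))        # []
```

With only three documents, from 3 and size 3 asks for the second page and yields an empty hit list; from 0 returns the first page.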

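5.6 Highlighting

How highlighting works:

The server searches the data and obtains the results
In the results, every occurrence of the search keyword is wrapped in an agreed tag
The front-end page defines CSS for that tag in advance, which produces the highlight

The highlighting syntax in Elasticsearch is fairly simple: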

GET /hema/_search
{
  "query": {
    "match": {
      "title": "手機"
   }
 },
  "highlight": {
    "pre_tags": "<em>",
    "post_tags": "</em>", 
    "fields": {
      "title": {}
   }
 }
}

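Alongside the match query, add a highlight property:

pre_tags: the opening tag
post_tags: the closing tag
fields: the fields to highlight
- title: declares that the title field should be highlighted; field-specific settings can follow, or the object can be left empty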
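The tag-wrapping step of the highlighting principle can be illustrated with a naive Python sketch. It uses plain string replacement over assumed matched terms, which is far simpler than what the Elasticsearch highlighter does; the function `highlight` is hypothetical.

```python
def highlight(text, terms, pre_tag="<em>", post_tag="</em>"):
    """Wrap each occurrence of each matched term in the given tags (naive version)."""
    for term in terms:
        text = text.replace(term, f"{pre_tag}{term}{post_tag}")
    return text


print(highlight("小米手機", ["手機"]))  # 小米<em>手機</em>
```

The front end then only needs a CSS rule for the em tag to render the matched keyword in a different colour.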

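6. Aggregations

Aggregations make it very easy to compute statistics and analyze data. For example:

Which phone brand is the most popular?
What are the average, highest and lowest prices of these phones?
How do these phones sell month by month?

These statistics are much easier to implement than with SQL in a database, and the queries are very fast, giving near-real-time results.

6.1 Basic concepts

Elasticsearch has many kinds of aggregations. The two most common are buckets and metrics:

Bucket

A bucket groups data in some way; each group is called a bucket in Elasticsearch. For example, grouping people by nationality gives a China bucket, a UK bucket, a Japan bucket and so on, while grouping by age range gives 0~10, 10~20, 20~30, 30~40 and so on.

Elasticsearch offers many ways to form buckets:

Date Histogram Aggregation: groups by date interval; for example, with an interval of one week, each week becomes a group
Histogram Aggregation: groups by numeric interval; like the date version, it needs the interval (interval)
Terms Aggregation: groups by term; documents whose term matches exactly fall into the same group
Range Aggregation: groups by numeric or date ranges; specify the start and end of each segment
...

In short, bucket aggregations only group the data and do not compute anything, so a bucket usually has another kind of aggregation nested inside it: metrics aggregations

Metrics

Once grouping is done, we usually run calculations over the data in each group, such as average, maximum, minimum or sum. These are called metrics in Elasticsearch

Some commonly used metric aggregations:

Avg Aggregation: average
Max Aggregation: maximum
Min Aggregation: minimum
Percentiles Aggregation: percentiles
Stats Aggregation: returns avg, max, min, sum and count together
Sum Aggregation: sum
Top hits Aggregation: the top matching documents
Value Count Aggregation: number of values
...

To test aggregations, let's first bulk-import some data

Create the index: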

PUT /car
{
  "mappings": {
    "orders": {
      "properties": {
        "color": {
          "type": "keyword"
       },
        "make": {
          "type": "keyword"
       }
     }
   }
 }
}

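Note: in Elasticsearch, fields used for aggregation, sorting or filtering are handled specially and should not be analyzed, so use keyword or a numeric type. Here the two text fields color and make are set to keyword, which is not analyzed, so they can take part in aggregations later

Import the data. This uses the bulk API; just copy it into Kibana and run it:

_bulk batch operation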

POST /car/orders/_bulk
{ "index": {"_id":"1"}}
{ "price" : 10000, "color" : "紅", "make" : "本田", "sold" : "2014-10-28" }
{ "index": {"_id":"2"}}
{ "price" : 20000, "color" : "紅", "make" : "本田", "sold" : "2014-11-05" }
{ "index": {"_id":"3"}}
{ "price" : 30000, "color" : "綠", "make" : "福特", "sold" : "2014-05-18" }
{ "index": {"_id":"4"}}
{ "price" : 15000, "color" : "藍", "make" : "豐田", "sold" : "2014-07-02" }
{ "index": {"_id":"5"}}
{ "price" : 12000, "color" : "綠", "make" : "豐田", "sold" : "2014-08-19" }
{ "index": {"_id":"6"}}
{ "price" : 20000, "color" : "紅", "make" : "本田", "sold" : "2014-11-05" }
{ "index": {"_id":"7"}}
{ "price" : 80000, "color" : "紅", "make" : "寶馬", "sold" : "2014-01-01" }
{ "index": {"_id":"8"}}
{ "price" : 25000, "color" : "藍", "make" : "福特", "sold" : "2014-02-12" }

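{ "index" : { "_index" : "<index-name>", "_type" : "<type-name>", "_id" : "1" } }
can be shortened to { "index": {} } when the index and type are given in the request path and the id is auto-generated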
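The _bulk body is newline-delimited JSON: one action line followed by one source line per document. Here is a minimal Python sketch that builds such a body; the helper `build_bulk_body` is hypothetical, and only the first two cars are shown.

```python
import json


def build_bulk_body(docs):
    """Serialize (id, source) pairs as index actions in newline-delimited JSON."""
    lines = []
    for doc_id, source in docs:
        lines.append(json.dumps({"index": {"_id": doc_id}}))
        lines.append(json.dumps(source, ensure_ascii=False))
    return "\n".join(lines) + "\n"  # the bulk body must end with a newline


body = build_bulk_body([
    ("1", {"price": 10000, "color": "紅", "make": "本田", "sold": "2014-10-28"}),
    ("2", {"price": 20000, "color": "紅", "make": "本田", "sold": "2014-11-05"}),
])
print(body, end="")
```

Each JSON object must stay on a single line, which is why the helper serializes every line separately instead of pretty-printing.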

6.2 Aggregating into buckets

First, let's bucket the cars by color. For this, the Terms Aggregation is the best fit, bucketing by the color name.

GET /car/_search
{
    "size" : 0,
    "aggs" : { 
        "popular_colors" : { 
            "terms" : { 
              "field" : "color"
           }
       }
   }
}

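size: number of hits to return; set to 0 here because we only care about the aggregation result, not the matched documents, which improves efficiency
aggs: declares an aggregation; short for aggregations
- popular_colors: a name for this aggregation; can be anything
  - terms: the aggregation type; terms groups by term (here, the color)
    - field: the field the buckets are based on

Result: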

{
  "took": 33,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
 },
  "hits": {
    "total": 8,
    "max_score": 0,
    "hits": []
 },
  "aggregations": {
    "popular_colors": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
       {
          "key": "紅",
          "doc_count": 4
       },
       {
          "key": "綠",
          "doc_count": 2
       },
       {
          "key": "藍",
          "doc_count": 2
       }
     ]
   }
 }
}

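hits: empty, because we set size to 0
aggregations: the aggregation results
popular_colors: the aggregation name we defined
buckets: the buckets found; each distinct value of the color field forms one bucket
- key: the value of the color field for this bucket
- doc_count: number of documents in this bucket

The aggregation result shows that red cars are currently the best sellers!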
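The bucket counts can be reproduced locally as a sanity check. This Python sketch groups the color values of the eight imported cars with `collections.Counter`; it mirrors what the terms aggregation reports here, not how Elasticsearch computes it.

```python
from collections import Counter

# Colors of the eight cars loaded with _bulk, in id order.
colors = ["紅", "紅", "綠", "藍", "綠", "紅", "紅", "藍"]

buckets = [
    {"key": key, "doc_count": count}
    for key, count in Counter(colors).most_common()
]
print(buckets)
# [{'key': '紅', 'doc_count': 4}, {'key': '綠', 'doc_count': 2}, {'key': '藍', 'doc_count': 2}]
```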

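6.3 Metrics inside buckets

The previous example tells us how many documents are in each bucket, which is useful. But applications usually need more sophisticated metrics, for example: what is the average price of cars of each color?

So we need to tell Elasticsearch which field to use and which metric to compute. This information is nested inside the bucket, and the metric is computed over the documents in that bucket

Now let's add an average-price metric to the previous aggregation: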

GET /car/_search
{
    "size" : 0,
    "aggs" : { 
        "popular_colors" : { 
            "terms" : { 
              "field" : "color"
           },
            "aggs":{
                "avg_price": { 
                   "avg": {
                      "field": "price" 
                   }
               }
           }
       }
   }
}

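aggs: we add a new aggs inside the previous one (popular_colors), which shows that a metric is also an aggregation
avg_price: the name of the aggregation, chosen freely
avg: the metric type, here the average
field: the field the metric is computed on

Result: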

{
  "took": 23,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
 },
  "hits": {
    "total": 8,
    "max_score": 0,
    "hits": []
 },
  "aggregations": {
    "popular_colors": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
       {
          "key": "紅",
          "doc_count": 4,
          "avg_price": {
            "value": 32500
         }
       },
       {
          "key": "綠",
          "doc_count": 2,
          "avg_price": {
            "value": 21000
         }
       },
       {
          "key": "藍",
          "doc_count": 2,
          "avg_price": {
            "value": 20000
         }
       }
     ]
   }
 }
}

Each bucket now has its own avg_price field, which is the result of the metric aggregation
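The averages in the response can be verified by hand. This Python sketch groups the eight imported cars by color and averages the price within each group, which is the same computation the nested avg metric performs per bucket.

```python
from collections import defaultdict

# (color, price) pairs of the eight imported cars, in id order.
cars = [("紅", 10000), ("紅", 20000), ("綠", 30000), ("藍", 15000),
        ("綠", 12000), ("紅", 20000), ("紅", 80000), ("藍", 25000)]

prices_by_color = defaultdict(list)
for color, price in cars:
    prices_by_color[color].append(price)

# The metric runs over the documents inside each bucket.
avg_price = {color: sum(p) / len(p) for color, p in prices_by_color.items()}
print(avg_price)  # {'紅': 32500.0, '綠': 21000.0, '藍': 20000.0}
```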

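6.4 Buckets nested in buckets

In the example above we nested a metric inside a bucket. Buckets can nest not only metrics but also other buckets, that is, each group can be split into further groups.

For example, to count which manufacturers the cars of each color belong to, we bucket again by the make field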

GET /car/_search
{
    "size" : 0,
    "aggs" : { 
        "popular_colors" : { 
            "terms" : { 
              "field" : "color"
           },
            "aggs":{
                "avg_price": { 
                   "avg": {
                      "field": "price" 
                   }
               },
                "maker":{
                    "terms":{
                        "field":"make"
                   }
               }
           }
       }
   }
}

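The original color bucket and avg calculation stay unchanged
maker: a new bucket added under the nested aggs; the name maker is our own choice
terms: the bucket type is still terms
field: buckets are formed by the make field

// Group by color, then by make, then compute the average price for each make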
GET /car/_search
{
 "size": 0,
 "aggs": {
  "popular_colors": {
   "terms": {
    "field": "color"
   },
   "aggs": {
    "maker": {
     "terms": {
      "field": "make"
     },
     "aggs": {
      "avg_price": {
       "avg": {
        "field": "price"
       }
      }
     }
    }
   }
  }
 }
}
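The nesting of color, then make, then average price can be mirrored with nested dictionaries. This Python sketch runs over the same eight cars as a local model of the request above; it is an illustration of the grouping logic, not of Elasticsearch internals.

```python
from collections import defaultdict

# (color, make, price) triples of the eight imported cars, in id order.
cars = [("紅", "本田", 10000), ("紅", "本田", 20000), ("綠", "福特", 30000),
        ("藍", "豐田", 15000), ("綠", "豐田", 12000), ("紅", "本田", 20000),
        ("紅", "寶馬", 80000), ("藍", "福特", 25000)]

# Outer bucket: color. Inner bucket: make. Metric: average price.
nested = defaultdict(lambda: defaultdict(list))
for color, make, price in cars:
    nested[color][make].append(price)

result = {
    color: {make: sum(p) / len(p) for make, p in makes.items()}
    for color, makes in nested.items()
}
for color, makes in result.items():
    print(color, makes)
```

Each level of aggs in the request corresponds to one level of dictionary nesting here, with the metric evaluated at the innermost level.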

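6.5 Other ways to form buckets

As mentioned earlier, there are many ways to form buckets, for example:

Date Histogram Aggregation: groups by date interval; for example, with an interval of one week, each week becomes a group
Histogram Aggregation: groups by numeric interval, similar to the date version
Terms Aggregation: groups by term; documents whose term matches exactly fall into the same group
Range Aggregation: groups by numeric or date ranges; specify the start and end of each segment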
