Elasticsearch Notes from the Chinese Documentation (Incomplete)


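Note

These notes are incomplete. They record the operations I ran while reading the Chinese documentation.

Documentation

The older Chinese documentation (parts of it are outdated): https://www.elastic.co/guide/cn/elasticsearch/guide/current/index.html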

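1.0.0 Getting Started

1.1.0 You Know, for Search...

1.1.1 Indexing Employee Documents

  • megacorp: the index name (comparable to a database)
  • employee: the type name (comparable to a table)
  • 1: the ID of a specific employee (comparable to a primary key)
  • request body: the JSON document (comparable to a row)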
PUT /megacorp/employee/1
{
    "first_name" : "John",
    "last_name" :  "Smith",
    "age" :        25,
    "about" :      "I love to go rock climbing",
    "interests": [ "sports", "music" ]
}
PUT /megacorp/employee/2
{
    "first_name" :  "Jane",
    "last_name" :   "Smith",
    "age" :         32,
    "about" :       "I like to collect rock albums",
    "interests":  [ "music" ]
}
PUT /megacorp/employee/3
{
    "first_name" :  "Douglas",
    "last_name" :   "Fir",
    "age" :         35,
    "about":        "I like to build cabinets",
    "interests":  [ "forestry" ]
}

Running aggregations ("aggs") on a text field such as "interests" requires "fielddata": true in its mapping.

PUT megacorp/employee/_mapping
{
  "properties": {
    "interests":{
      "type": "text", 
      "fielddata": true
    }
  }
}

1.1.2 Retrieving a Document

Fetch a specific document by index, type, and ID.

GET /megacorp/employee/1

Result

{
  "_index": "megacorp",
  "_type": "employee",
  "_id": "1",
  "_version": 4,
  "found": true,
  "_source": {//the original JSON document
    "first_name": "John",
    "last_name": "Smith",
    "age": 25,
    "about": "I love to go rock climbing",
    "interests": [
      "sports",
      "music"
    ]
  }
}

1.1.3 Search Lite

Fetch all documents of a given index and type.

GET /megacorp/employee/_search

Result

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 3,//3 documents matched
    "max_score": 1,
    "hits": [//the matching documents
      {
        "_index": "megacorp",
        "_type": "employee",
        "_id": "2",
        "_score": 1,
        "_source": {
          "first_name": "Jane",
          "last_name": "Smith",
          "age": 32,
          "about": "I like to collect rock albums",
          "interests": [
            "music"
          ]
        }
      },
      {
        "_index": "megacorp",
        "_type": "employee",
        "_id": "1",
        "_score": 1,
        "_source": {
          "first_name": "John",
          "last_name": "Smith",
          "age": 25,
          "about": "I love to go rock climbing",
          "interests": [
            "sports",
            "music"
          ]
        }
      },
      {
        "_index": "megacorp",
        "_type": "employee",
        "_id": "3",
        "_score": 1,
        "_source": {
          "first_name": "Douglas",
          "last_name": "Fir",
          "age": 35,
          "about": "I like to build cabinets",
          "interests": [
            "forestry"
          ]
        }
      }
    ]
  }
}

1.1.4 Search with Query DSL

Find documents where last_name matches "Smith".

GET /megacorp/employee/_search
{
    "query" : {
        "match" : {"last_name" : "Smith"}
    }
}

Result

...
  "hits": {
    "total": 2,
    "max_score": 0.2876821,
    "hits": [
      {
        "_index": "megacorp",
        "_type": "employee",
        "_id": "2",
        "_score": 0.2876821,
        "_source": {
          "first_name": "Jane",
          "last_name": "Smith",
          "age": 32,
          "about": "I like to collect rock albums",
          "interests": [
            "music"
          ]
        }
      },
      {
        "_index": "megacorp",
        "_type": "employee",
        "_id": "1",
        "_score": 0.2876821,
        "_source": {
          "first_name": "John",
          "last_name": "Smith",
          "age": 25,
          "about": "I love to go rock climbing",
          "interests": [
            "sports",
            "music"
          ]
        }
      }
    ]
  }
...

1.1.5 More-Complicated Searches

Find documents where last_name matches "smith" and age > 30.

GET /megacorp/employee/_search
{
    "query" : {
        "bool": {
            "must": {
                "match" : {
                    "last_name" : "smith" 
                }
            },
            "filter": {
                "range" : {
                    "age" : { "gt" : 30 } 
                }
            }
        }
    }
}

Result

...
  "hits": {
    "total": 1,
    "max_score": 0.2876821,
    "hits": [
      {
        "_index": "megacorp",
        "_type": "employee",
        "_id": "2",
        "_score": 0.2876821,
        "_source": {
          "first_name": "Jane",
          "last_name": "Smith",
          "age": 32,
          "about": "I like to collect rock albums",
          "interests": [
            "music"
          ]
        }
      }
    ]
  }
...

1.1.6 Full-Text Search

Find documents whose "about" field contains the words "rock" or "climbing".
Results are sorted by relevance score ("_score").

GET /megacorp/employee/_search
{
    "query" : {
        "match" : {"about" : "rock climbing"}
    }
}

Result

...
  "hits": {
    "total": 2,//2 documents matched
    "max_score": 0.5753642,//highest score
    "hits": [
      {
        "_index": "megacorp",
        "_type": "employee",
        "_id": "1",
        "_score": 0.5753642,
        "_source": {
          "first_name": "John",
          "last_name": "Smith",
          "age": 25,
          "about": "I love to go rock climbing",
          "interests": [
            "sports",
            "music"
          ]
        }
      },
      {
        "_index": "megacorp",
        "_type": "employee",
        "_id": "2",
        "_score": 0.2876821,//lower score because "about" contains only "rock"
        "_source": {
          "first_name": "Jane",
          "last_name": "Smith",
          "age": 32,
          "about": "I like to collect rock albums",
          "interests": [
            "music"
          ]
        }
      }
    ]
  }
...

1.1.7 Phrase Search

Match only documents whose "about" field contains the exact phrase "rock climbing".
Results are sorted by relevance score ("_score").

GET /megacorp/employee/_search
{
    "query" : {
        "match_phrase" : {"about" : "rock climbing"}
    }
}

Result

...
  "hits": {
    "total": 1,
    "max_score": 0.5753642,
    "hits": [
      {
        "_index": "megacorp",
        "_type": "employee",
        "_id": "1",
        "_score": 0.5753642,
        "_source": {
          "first_name": "John",
          "last_name": "Smith",
          "age": 25,
          "about": "I love to go rock climbing",
          "interests": [
            "sports",
            "music"
          ]
        }
      }
    ]
  }
...

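1.1.8 Highlighting Our Searches

Results are sorted by relevance score ("_score").
Each hit gains a "highlight" section in which the matched words from "about" are wrapped in <em> tags.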

GET /megacorp/employee/_search
{
    "query" : {
        "match_phrase" : {"about" : "rock climbing"}
    },
    "highlight": {
        "fields" : {"about" : {}}
    }
}

Result

...
  "hits": {
    "total": 1,
    "max_score": 0.5753642,
    "hits": [
      {
        "_index": "megacorp",
        "_type": "employee",
        "_id": "1",
        "_score": 0.5753642,
        "_source": {
          "first_name": "John",
          "last_name": "Smith",
          "age": 25,
          "about": "I love to go rock climbing",
          "interests": [
            "sports",
            "music"
          ]
        },
        "highlight": {//matched words are wrapped in <em> tags
          "about": [
            "I love to go <em>rock</em> <em>climbing</em>"
          ]
        }
      }
    ]
  }
...

1.1.9 Analytics

Group documents by "interests".

GET /megacorp/employee/_search
{
  "aggs": {
    "all_interests": {//name chosen for the bucket aggregation
      "terms": { "field": "interests" }
    }
  }
}

Result

...
  "aggregations": {
    "all_interests": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "music",
          "doc_count": 2
        },
        {
          "key": "forestry",
          "doc_count": 1
        },
        {
          "key": "sports",
          "doc_count": 1
        }
      ]
    }
  }
...

Find documents where last_name matches "smith", then group them by "interests".

GET /megacorp/employee/_search
{
  "query": {
    "match": { "last_name": "smith"}
  },
  "aggs": {
    "all_interests": {
      "terms": {"field": "interests" }
    }
  }
}

Result

...
  "aggregations": {
    "all_interests": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "music",
          "doc_count": 2
        },
        {
          "key": "sports",
          "doc_count": 1
        }
      ]
    }
  }
...

Bucket first, then compute a metric per bucket (here the average age for each interest).

GET /megacorp/employee/_search
{
    "aggs" : {
        "all_interests" : {
            "terms" : { "field" : "interests" },
            "aggs" : {
                "avg_age" : {"avg" : { "field" : "age" }}
            }
        }
    }
}

Result

...
  "aggregations": {
    "all_interests": {//name chosen for the bucket aggregation
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "music",
          "doc_count": 2,
          "avg_age": {
            "value": 28.5
          }
        },
        {
          "key": "forestry",
          "doc_count": 1,
          "avg_age": {
            "value": 35
          }
        },
        {
          "key": "sports",
          "doc_count": 1,
          "avg_age": {
            "value": 25
          }
        }
      ]
    }
  }
...
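
To make the bucket-then-metric idea concrete, here is a minimal Python sketch that recomputes the result above in memory. It is not how Elasticsearch executes aggregations; the function name terms_with_avg_age and the tie-breaking rule are assumptions chosen so that the output lines up with the result shown.

```python
from collections import defaultdict

# The three employee documents indexed earlier in these notes.
EMPLOYEES = [
    {"first_name": "John", "age": 25, "interests": ["sports", "music"]},
    {"first_name": "Jane", "age": 32, "interests": ["music"]},
    {"first_name": "Douglas", "age": 35, "interests": ["forestry"]},
]


def terms_with_avg_age(docs):
    """Group docs by each value of `interests`, then average `age` per group."""
    ages_by_interest = defaultdict(list)
    for doc in docs:
        for interest in doc["interests"]:
            ages_by_interest[interest].append(doc["age"])
    buckets = [
        {"key": key, "doc_count": len(ages), "avg_age": sum(ages) / len(ages)}
        for key, ages in ages_by_interest.items()
    ]
    # Assumption: largest bucket first, ties broken alphabetically.
    return sorted(buckets, key=lambda b: (-b["doc_count"], b["key"]))


buckets = terms_with_avg_age(EMPLOYEES)
for bucket in buckets:
    print(bucket)
```

A document with two interests lands in two buckets, which is why the doc_count values add up to 4 even though there are only 3 documents.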

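1.2.0 Life Inside a Cluster

1.2.1 Cluster Health

GET /_cluster/health

Result

  • green: all primary and replica shards are active.
  • yellow: all primary shards are active, but not all replica shards are.
  • red: at least one primary shard is not active.
{
  "cluster_name": "docker-cluster",
  "status": "yellow",//one of green, yellow, or red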
  "timed_out": false,
  "number_of_nodes": 1,
  "number_of_data_nodes": 1,
  "active_primary_shards": 23,
  "active_shards": 23,
  "relocating_shards": 0,
  "initializing_shards": 0,
  "unassigned_shards": 20,
  "delayed_unassigned_shards": 0,
  "number_of_pending_tasks": 0,
  "number_of_in_flight_fetch": 0,
  "task_max_waiting_in_queue_millis": 0,
  "active_shards_percent_as_number": 53.48837209302325
}
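
The three colours can be read as a simple decision rule. The sketch below restates that rule in Python and reproduces the percentage from the result above. Both function names are hypothetical, and the percentage formula is an assumption that fits this sample (where no shards are initializing or relocating), not a statement of how Elasticsearch computes it in general.

```python
def cluster_status(unassigned_primaries, unassigned_replicas):
    """Map shard counts to the health colour described in the list above."""
    if unassigned_primaries > 0:
        return "red"      # at least one primary shard is not active
    if unassigned_replicas > 0:
        return "yellow"   # primaries are fine, some replicas are not
    return "green"        # everything is active


def active_shards_percent(active_shards, unassigned_shards):
    """Assumed formula: share of active shards among active + unassigned."""
    return 100 * active_shards / (active_shards + unassigned_shards)


# Values from the sample response: 23 active shards, 20 unassigned replicas.
print(cluster_status(0, 20))
print(active_shards_percent(23, 20))
```

On a single-node cluster such as the one in the sample, replicas have no second node to live on, which is the usual reason for a yellow status.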

1.2.2 Add an Index

PUT /blogs
{
   "settings" : {
      "number_of_shards" : 3,//primary shards
      "number_of_replicas" : 1//replicas per primary shard
   }
}

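1.3.0 Data In, Data Out

1.3.1 Document Metadata

  • _index: where the document lives (comparable to a database). The name must be lowercase, must not begin with an underscore, and must not contain commas.
  • _type: the class of the document (comparable to a table). The name may be uppercase or lowercase, must not begin with an underscore or a period, should not contain commas, and is limited to 256 characters.
  • _id: the unique identifier of the document. It is a string and is generated automatically when not supplied.

1.3.2 Indexing a Document

Create a document with a specified ID.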

PUT /website/blog/123
{
  "title": "My first blog entry",
  "text":  "Just trying this out...",
  "date":  "2014/01/01"
}

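Create a document without specifying an ID. Use POST instead of PUT so that Elasticsearch generates the ID.

POST /website/blog/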
{
  "title": "My first blog entry",
  "text":  "Just trying this out...",
  "date":  "2014/01/01"
}

1.3.3 Retrieving a Document

GET website/blog/123

Result

{
  "_index": "website",
  "_type": "blog",
  "_id": "123",
  "_version": 1,
  "found": true,
  "_source": {
    "title": "My first blog entry",
    "text": "Just trying this out...",
    "date": "2014/01/01"
  }
}

Return only the original document ("_source").

GET website/blog/123/_source

Result

{
  "title": "My first blog entry",
  "text": "Just trying this out...",
  "date": "2014/01/01"
}

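1.3.4 Checking Whether a Document Exists

Use HEAD instead of GET. The response carries only headers and no body.

HEAD /website/blog/123

Returns 200 - OK

HEAD /website/blog/124

Returns 404 - Not Found

1.3.5 Updating a Whole Document

The old document is deleted and a new one is indexed in its place. If the document does not exist yet, it is simply created.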

PUT /website/blog/123
{
  "title": "My first blog entry",
  "text":  "I am starting to get the hang of this...",
  "date":  "2014/01/02"
}

1.3.6 Creating a New Document

With the op_type=create parameter, the request fails if the document already exists.

POST /website/blog/123?op_type=create
{
 "name":"taopanfeng" 
}

This is equivalent to:

POST /website/blog/123/_create
{
 "name":"taopanfeng1" 
}

Both requests return the following result when the document already exists.

{
  "error": {
    "root_cause": [
      {
        "type": "version_conflict_engine_exception",
        "reason": "[blog][123]: version conflict, document already exists (current version [5])",
        "index_uuid": "reL04BFdQN-YCE3l9THqjA",
        "shard": "0",
        "index": "website"
      }
    ],
    "type": "version_conflict_engine_exception",
    "reason": "[blog][123]: version conflict, document already exists (current version [5])",
    "index_uuid": "reL04BFdQN-YCE3l9THqjA",
    "shard": "0",
    "index": "website"
  },
  "status": 409
}

Without the parameter:

POST /website/blog/111
{
 "name":"taopanfeng" 
}

If the document does not exist, it is created: "result" is "created" and "_version" is 1.

{
  "_index": "website",
  "_type": "blog",
  "_id": "111",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 1,
  "_primary_term": 1
}

If the document exists, it is updated: "result" is "updated" and "_version" increases by 1.

{
  "_index": "website",
  "_type": "blog",
  "_id": "111",
  "_version": 2,
  "result": "updated",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 2,
  "_primary_term": 1
}

1.3.7 Deleting a Document

DELETE /website/blog/123

If the document exists, "result" is "deleted".

{
  "_index": "website",
  "_type": "blog",
  "_id": "123",
  "_version": 6,
  "result": "deleted",//deleted successfully
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 5,
  "_primary_term": 1
}

If the document does not exist, "result" is "not_found".

{
  "_index": "website",
  "_type": "blog",
  "_id": "123",
  "_version": 1,
  "result": "not_found",//not found
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 6,
  "_primary_term": 1
}

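1.3.8 Dealing with Conflicts

Two people buy the same item. The stock is 100. A reads 100 and B reads 100. A buys one item and writes back 99,
but B still holds the stale value 100, so after buying one item B also writes 99. Two items were sold, yet the stock shows 99: this is a conflict.

Pessimistic locking: the resource is locked while I modify it, and others can modify it only after I finish.
Optimistic locking: conflicts are assumed to be unlikely, so nothing is locked. If the data changed between the read and the write, the update fails.
Elasticsearch uses optimistic locking.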
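
The lost update in the stock example can be replayed step by step. This is only an illustration in plain Python with made-up variable names; no Elasticsearch call is involved.

```python
stock = 100                  # shared stock counter

read_by_a = stock            # A reads 100
read_by_b = stock            # B reads 100 before A has written anything

stock = read_by_a - 1        # A buys one item and writes 99
stock = read_by_b - 1        # B buys one item from its stale copy and also writes 99

expected = 100 - 2           # two items were sold, so 98 was expected
missing_decrements = stock - expected

print(stock, missing_decrements)
```

B's write silently overwrites A's, which is exactly the situation a version check is meant to reject.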

1.3.9 Optimistic Concurrency Control

Create a post.

PUT /website/blog/888/_create
{
 "name":"taopanfeng888" 
}

Fetch it.

GET /website/blog/888

Result. "_version" is the version number.

{
  "_index": "website",
  "_type": "blog",
  "_id": "888",
  "_version": 1,
  "found": true,
  "_source": {
    "name": "taopanfeng888"
  }
}

Now modify the document, passing the current version number, 1.

PUT /website/blog/888?version=1
{
 "name":"taopanfeng888-update"
}

Result: the update succeeds, "_version" increases by 1, and "result" is "updated".

{
  "_index": "website",
  "_type": "blog",
  "_id": "888",
  "_version": 2,
  "result": "updated",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 1,
  "_primary_term": 1
}

Running the same request again with version=1
fails, because the supplied version must equal the current version number (now 2).

{
  "error": {
    "root_cause": [
      {
        "type": "version_conflict_engine_exception",
        "reason": "[blog][888]: version conflict, current version [2] is different than the one provided [1]",
        "index_uuid": "reL04BFdQN-YCE3l9THqjA",
        "shard": "2",
        "index": "website"
      }
    ],
    "type": "version_conflict_engine_exception",
    "reason": "[blog][888]: version conflict, current version [2] is different than the one provided [1]",
    "index_uuid": "reL04BFdQN-YCE3l9THqjA",
    "shard": "2",
    "index": "website"
  },
  "status": 409
}

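Alternatively, set version_type=external to supply the version number yourself.
The supplied version must then be greater than the current version number; a value less than or equal to it is rejected.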

PUT /website/blog/888?version=99&version_type=external
{
 "name":"taopanfeng888-v99" 
}

Result: the update succeeds and the version number becomes 99.

{
  "_index": "website",
  "_type": "blog",
  "_id": "888",
  "_version": 99,
  "result": "updated",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 2,
  "_primary_term": 1
}
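
The two version rules seen in this section (internal: the supplied version must equal the current one; external: it must be strictly greater) can be summarised in a toy in-memory store. This is a sketch of the bookkeeping only. VersionedStore and VersionConflict are hypothetical names, and nothing here talks to a real cluster.

```python
class VersionConflict(Exception):
    """Raised when the supplied version is rejected (HTTP 409 in the notes above)."""


class VersionedStore:
    """Toy in-memory store; only the version bookkeeping is modelled."""

    def __init__(self):
        self._docs = {}  # doc id -> (version, source)

    def create(self, doc_id, source):
        if doc_id in self._docs:
            raise VersionConflict("document already exists")
        self._docs[doc_id] = (1, source)
        return 1

    def index(self, doc_id, source, version=None, version_type="internal"):
        current = self._docs.get(doc_id, (0, None))[0]
        if version is None:
            new_version = current + 1
        elif version_type == "internal":
            # Internal: the supplied version must equal the stored one.
            if version != current:
                raise VersionConflict(f"current [{current}] != provided [{version}]")
            new_version = current + 1
        elif version_type == "external":
            # External: the supplied version must be strictly greater and is stored as-is.
            if version <= current:
                raise VersionConflict(f"current [{current}] >= provided [{version}]")
            new_version = version
        else:
            raise ValueError(f"unknown version_type: {version_type}")
        self._docs[doc_id] = (new_version, source)
        return new_version


store = VersionedStore()
store.create("888", {"name": "taopanfeng888"})
v2 = store.index("888", {"name": "taopanfeng888-update"}, version=1)
try:
    store.index("888", {"name": "second attempt"}, version=1)
    conflict = False
except VersionConflict:
    conflict = True
v99 = store.index("888", {"name": "taopanfeng888-v99"}, version=99, version_type="external")
print(v2, conflict, v99)
```

The sequence at the bottom mirrors the requests above: version 1 is accepted once, rejected the second time, and the external version 99 is then stored directly.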

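1.3.10 Partial Updates to Documents

First add a post.

PUT /website/blog/3/_create
{
  "name":"taopanfeng"
}

Send a POST request to the _update endpoint and list the fields to change under "doc".
If a field in "doc" does not exist in the document, it is added.
If the document has age=26 and "doc" sets age=27, the update succeeds.
If the document has age=27 and "doc" sets age=27 again, nothing is written and "result" is "noop".
With several fields, for example age=27 and sex=man, the update succeeds as long as at least one of them changes.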

POST /website/blog/3/_update
{
  "doc": {
    "age":26
  }
}

Result: the update succeeds, the version number increases by 1, and "result" is "updated".

{
  "_index": "website",
  "_type": "blog",
  "_id": "3",
  "_version": 2,
  "result": "updated",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 2,
  "_primary_term": 1
}

Retrieve it with GET /website/blog/3

{
  "_index": "website",
  "_type": "blog",
  "_id": "3",
  "_version": 2,
  "found": true,
  "_source": {
    "name": "taopanfeng",
    "age": 26
  }
}
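
The update rules listed in this section boil down to "merge, then compare". A minimal sketch follows, assuming a shallow merge of top-level fields only; partial_update is a hypothetical helper, not an Elasticsearch client call.

```python
def partial_update(source, doc):
    """Merge `doc` into `source`; report "noop" when nothing would change."""
    merged = {**source, **doc}   # fields in doc are added or overwritten
    if merged == source:
        return source, "noop"
    return merged, "updated"


source = {"name": "taopanfeng"}
source, r1 = partial_update(source, {"age": 26})                 # field added
source, r2 = partial_update(source, {"age": 27})                 # value changed
source, r3 = partial_update(source, {"age": 27})                 # nothing changes
source, r4 = partial_update(source, {"age": 27, "sex": "man"})   # one of two fields changes
print(r1, r2, r3, r4, source)
```

The last call shows the multi-field rule: age is unchanged, but the new sex field is enough for the update to go through.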

1.3.11 Retrieving Multiple Documents

Fetch website/blog/2, website/blog/1 (showing only the "name" field), and website/pageviews/1.

GET /_mget
{
   "docs" : [
      {
         "_index" : "website",
         "_type" :  "blog",
         "_id" :    2
      },
      {
         "_index" : "website",
         "_type" :  "blog",
         "_id" :    1,
         "_source": "name"//return only the "name" field
      },
      {
         "_index" : "website",
         "_type" :  "pageviews",//look in the "pageviews" type
         "_id" :    1
      }
   ]
}

Result

{
  "docs": [
    {
      "_index": "website",
      "_type": "blog",
      "_id": "2",
      "_version": 2,
      "found": true,
      "_source": {
        "name": "taopanfeng",
        "age": 26
      }
    },
    {
      "_index": "website",
      "_type": "blog",
      "_id": "1",
      "_version": 2,
      "found": true,
      "_source": {
        "name": "taopanfeng"//because "_source": "name" was specified
      }
    },
    {
      "_index": "website",
      "_type": "pageviews",
      "_id": "1",
      "found": false//not found
    }
  ]
}

Fetch the documents with IDs 2, 1, and 55.

GET /website/blog/_mget
{
   "ids" : [ "2", "1" ,"55"]
}

Result

{
  "docs": [
    {
      "_index": "website",
      "_type": "blog",
      "_id": "2",
      "_version": 2,
      "found": true,
      "_source": {
        "name": "taopanfeng",
        "age": 26
      }
    },
    {
      "_index": "website",
      "_type": "blog",
      "_id": "1",
      "_version": 2,
      "found": true,
      "_source": {
        "name": "taopanfeng",
        "age": 26
      }
    },
    {
      "_index": "website",
      "_type": "blog",
      "_id": "55",
      "found": false
    }
  ]
}

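1.3.12 Cheaper in Bulk

  • Each operation is independent, and one failing does not affect the others.

A bulk request can combine any number of create, index, update, and delete operations.
create: 201 on success, 409 if the document already exists
index: 201 when a new document is created
update: 200 on success, 404 if the _id does not exist
delete: 200 on success, 404 if the _id does not exist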

POST _bulk
{ "create":  { "_index": "website", "_type": "blog", "_id": "1" }}
{ "title":    "My first blog post" }
{ "create":  { "_index": "website", "_type": "blog", "_id": "2" }}
{ "title":    "My first blog post" }
{ "index":  { "_index": "website", "_type": "blog"}}
{ "title":    "My first blog post" }
{ "delete": { "_index": "website", "_type": "blog", "_id": "1" }}
{ "update":{ "_index": "website", "_type": "blog", "_id": "2" }}
{"doc":{"content":"I'm content...."}}

Simplified form: when every operation targets the same index and type, put them in the URL.

POST /website/blog/_bulk
{ "create":  {"_id": "1" }}
{ "title":    "My first blog post" }
{ "create":  {"_id": "2" }}
{ "title":    "My first blog post" }
{ "index":  {}}
{ "title":    "My first blog post" }
{ "delete": {"_id": "1" }}
{ "update":{"_id": "2" }}
{"doc":{"content":"I'm content...."}}
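
The bulk body is newline-delimited JSON: one action line, optionally followed by one payload line, and the whole body ends with a newline. A small sketch of building such a body in Python follows. build_bulk_body is a hypothetical helper; sending the string to a cluster is left out.

```python
import json


def build_bulk_body(operations):
    """operations: list of (action, metadata, payload_or_None) tuples."""
    lines = []
    for action, metadata, payload in operations:
        lines.append(json.dumps({action: metadata}))
        if payload is not None:          # delete has no payload line
            lines.append(json.dumps(payload))
    return "\n".join(lines) + "\n"       # the body must end with a newline


bulk_body = build_bulk_body([
    ("create", {"_id": "1"}, {"title": "My first blog post"}),
    ("index", {}, {"title": "My first blog post"}),
    ("delete", {"_id": "1"}, None),
    ("update", {"_id": "2"}, {"doc": {"content": "I'm content...."}}),
])
print(bulk_body, end="")
```

Generating the lines with json.dumps keeps each JSON object on a single line, which the format requires.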

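1.4.0 Searching: The Basic Tools

1.4.1 Multi-index, Multitype

//search all documents in all indices
GET _search

//search all documents in the cars index
GET cars/_search

//list several indices separated by commas; * matches zero or more characters
GET website,cars/_search
GET *sit*,*ar*/_search

//all documents in indices whose names begin with "it"
GET it*/_search

//all documents in indices whose names end with "site", begin with "it", or contain "ar"
GET *site,it*,*ar*/_search

//documents of type "blog" in the "item" and "website" indices
GET item,website/blog/_search

//documents of type "blog" or "user" in all indices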
GET _all/blog,user/_search
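
To get a feel for which index names a comma-separated wildcard expression selects, the matching can be imitated with Python's fnmatch module. This is only an approximation: fnmatch also understands ? and [...], which is an assumption beyond the * shown in these notes, and resolve_indices and the index list are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical set of indices present in the cluster.
EXISTING = ["website", "cars", "item", "my_store"]


def resolve_indices(expression, existing):
    """Return the existing index names matched by any comma-separated pattern."""
    patterns = expression.split(",")
    return [name for name in existing
            if any(fnmatchcase(name, pattern) for pattern in patterns)]


print(resolve_indices("*sit*,*ar*", EXISTING))
print(resolve_indices("it*", EXISTING))
print(resolve_indices("*site,it*,*ar*", EXISTING))
```

An index is selected when any one of the patterns matches it, so adding patterns can only widen the selection.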

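1.4.2 Pagination

size (default 10): the number of results returned in the "hits" array. The minimum is 0; a value greater than or equal to "total" returns every match (when from is 0).
from (default 0): the number of results to skip.
For example, with five documents, size=2 and from=1 return only the 2nd and 3rd.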

GET _search
{
 "size":10,
 "from":0
}

//equivalent to

GET _search?size=10&from=0
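
The effect of size and from is the same as slicing a list. A minimal sketch (paginate is a hypothetical helper; the trailing underscore in from_ is needed only because from is a Python keyword):

```python
def paginate(hits, size=10, from_=0):
    """Skip `from_` results, then return at most `size` of them."""
    return hits[from_:from_ + size]


all_hits = ["doc1", "doc2", "doc3", "doc4", "doc5"]
print(paginate(all_hits, size=2, from_=1))   # the 2nd and 3rd documents
print(paginate(all_hits, size=0))            # no hits, only totals would remain
print(paginate(all_hits))                    # defaults return everything here
```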

1.5.0 Mapping and Analysis

1.5.1 Mapping

Get the mapping.

GET cars/transactions/_mapping

Result

{
  "cars": {
    "mappings": {
      "transactions": {
        "properties": {
          "color": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "make": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "price": {
            "type": "long"
          },
          "sold": {
            "type": "date"
          }
        }
      }
    }
  }
}
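  • String types
    • text: analyzed into terms; not aggregatable by default
    • keyword: aggregatable; not analyzed
  • Numeric types
    • long, integer, short, byte, double, float, half_float
  • Date type
    • date (these notes recommend storing dates as long values)

Creating mapped fields
Setting "index": false makes the field unsearchable.
text fields are analyzed by default; a specific analyzer can be chosen with "analyzer": "analyzer_name".
To aggregate on a text field, set "fielddata": true.

PUT index_name/type_name/_mapping/
{
  "properties": {
    "field_name_1": {
      "type": "text",
      "analyzer": "ik_max_word"
    },
    "field_name_2": {
      "type": "keyword",
      "index": false
    },
    "field_name_3": {
      "type": "float"
    }
  }
}

View the mapping

GET index_name/type_name/_mapping

Delete an index

DELETE index_name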

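1.6.0 Full-Body Search

1.6.1 Query DSL

Search every type in every index. The requests below are equivalent:
an empty request body can be omitted,
and _all (all indices) can be omitted.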

GET _search
GET _search
{}
GET _search
{
  "query": {
    "match_all": {}
  }
}
GET _all/_search
{}

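Searching specific indices and types
* stands for zero or more characters, so cars is matched by ca*, *ar*, and *s.

Search all documents of all types in several indices

GET index_1,index_2/_search

Search all documents of several types in several indices

GET index_1,index_2/type_1,type_2/_search

Find documents where price = 15000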

GET cars/transactions/_search
{
  "query": {
    "match": {
      "price": 15000
    }
  }
}

1.6.2 Most Important Queries

match_all is the default query and matches every document.

GET a1/student/_search
{
  "query": {
    "match_all": {}
  }
}

Result

{
  "took": 4,//the query took 4 ms
  "timed_out": false,//did not time out
  "_shards": {//shards
    "total": 5,//5 in total
    "successful": 5,//5 succeeded
    "skipped": 0,//0 skipped
    "failed": 0//0 failed
  },
  "hits": {//the matches
    "total": 3,//total number of matches
    "max_score": 1,//highest score is 1
    "hits": [//all matching documents
      {//one document
        "_index": "a1",//comparable to a database
        "_type": "student",//comparable to a table
        "_id": "2",//unique identifier of the document
        "_score": 1,//score of this hit; the top score equals max_score
        "_source": {//the stored data: field names and values
          "name": "大米手機",
          "age": 22
        }
      },
      {
        "_index": "a1",
        "_type": "student",
        "_id": "CA2Yqm0Bmr19jrNQ7nRL",
        "_score": 1,
        "_source": {
          "name": "小米手機",
          "age": 11
        }
      },
      {
        "_index": "a1",
        "_type": "student",
        "_id": "3",
        "_score": 1,
        "_source": {
          "name": "小米電視4K",
          "age": 33,
          "address": "安徽阜陽小米酒店101"
        }
      }
    ]
  }
}

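match query
On a text field, the query string is analyzed into terms before matching.
On a number, date, boolean, or not_analyzed string field, the value is matched exactly.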

{ "match": { "tweet": "About Search" }}

{ "match": { "age":    26           }}

{ "match": { "date":   "2014-09-01" }}

{ "match": { "public": true         }}

{ "match": { "tag":    "full_text"  }}
//test notes
match
Find documents where name matches 小米電視.
The query string is analyzed and the default operator is or, so this matches 小米 or 電視.
GET a1/_search
{
  "query": {
    "match": {
      "name": "小米電視"
    }
  }
}

With "operator": "and", every term must match: 小米 and 電視.
GET a1/_search
{
  "query": {
    "match": {
      "name": {
        "query": "小米電視",
        "operator": "and"
      }
    }
  }
}

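minimum_should_match sets how many of the analyzed terms must match:
1 -> any one term
2 -> any two terms
3 -> nothing matches, because the query produces fewer than three terms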
GET a1/_search
{
  "query": {
    "match": {
      "name": {
        "query": "小米電視",
        "minimum_should_match": 1
      }
    }
  }
}

3 x 0.66 = 1.98, which is rounded down to 1, so any one of the three terms must match.
GET a1/_search
{
  "query": {
    "match": {
      "name": {
        "query": "小米智能電視",
        "minimum_should_match": "66%"
      }
    }
  }
}

3 x 0.67 = 2.01, which is rounded down to 2, so any two of the three terms must match.
GET a1/_search
{
  "query": {
    "match": {
      "name": {
        "query": "小米智能電視",
        "minimum_should_match": "67%"
      }
    }
  }
}
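
The rounding in the two percentage examples can be checked with a few lines of Python. This sketch covers only plain integers and positive percentages. required_matches is a hypothetical helper, and it assumes, as the arithmetic in the notes does, that 小米智能電視 is analyzed into three terms.

```python
def required_matches(term_count, minimum_should_match):
    """Number of terms that must match, rounding percentages down."""
    if isinstance(minimum_should_match, str) and minimum_should_match.endswith("%"):
        percent = int(minimum_should_match[:-1])
        return (term_count * percent) // 100   # integer division rounds down
    return int(minimum_should_match)


print(required_matches(3, "66%"))   # 3 x 0.66 = 1.98
print(required_matches(3, "67%"))   # 3 x 0.67 = 2.01
print(required_matches(2, 1))       # a plain number is used as-is
```

Because the result is rounded down, a percentage just below a whole number of terms behaves like the next lower count.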

multi_match searches several fields, roughly name like '%大米%' or f1 like '%大米%' in SQL.

GET a1/student/_search
{
  "query": {
    "multi_match": {
      "query": "大米",
      "fields": ["name","f1"]
    }
  }
}

range query: 10 <= age <= 20
lt <
lte <=
gt >
gte >=

GET a1/student/_search
{
  "query": {
    "range": {
      "age": {
        "gte": 10,
        "lte": 20
      }
    }
  }
}
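
The four range operators map directly onto comparison functions. A small sketch that evaluates a range clause against a single value (in_range and RANGE_OPS are hypothetical names; the ages are the ones from the a1 sample documents):

```python
import operator

# The four bounds accepted by a range clause, as listed above.
RANGE_OPS = {
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
}


def in_range(value, bounds):
    """True when `value` satisfies every bound in the range clause."""
    return all(RANGE_OPS[name](value, limit) for name, limit in bounds.items())


ages = [22, 11, 33]
matching = [age for age in ages if in_range(age, {"gte": 10, "lte": 20})]
print(matching)
```

All bounds in one clause must hold at the same time, so gte and lte together describe a closed interval.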

term query: exact match

GET a1/student/_search
{
  "query": {
    "term": {
      "age": {
        "value": 11
      }
    }
  }
}

terms query: matches if the field equals any one of the listed values

GET a1/student/_search
{
  "query": {
    "terms": {
      "age": [11,22,77]
    }
  }
}

exists query: finds documents that contain the given field

GET a1/student/_search
{
  "query": {
    "exists":{
      "field":"address"
    }
  }
}

1.6.3 Combining Queries

must[{1},{2}]: every clause must match
Find documents whose analyzed "name" contains "小米" and whose "age" is 11 or 22.

GET a1/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "name": "小米"
          }
        },
        {
          "terms": {
            "age": [11,22]
          }
        }
      ]
    }
  }
}

must_not[{1},{2}]: no clause may match
Find documents whose analyzed "name" does not contain "小米" and whose "age" is neither 11 nor 22.

GET a1/_search
{
  "query": {
    "bool": {
      "must_not": [
        {
          "match": {
            "name": "小米"
          }
        },
        {
          "terms": {
            "age": [11,22]
          }
        }
      ]
    }
  }
}

should[{1},{2}]: at least one clause must match

GET a1/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "name": "小米"
          }
        },
        {
          "terms": {
            "age": [11,22]
          }
        }
      ]
    }
  }
}
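
The three bool clauses correspond to all, none, and any. The sketch below applies them to the three a1 documents shown earlier. It deliberately simplifies: the match clause is modelled as a substring test, which is an assumption (real matching depends on the analyzer), and scoring is ignored. All names are hypothetical.

```python
DOCS = [
    {"_id": "2", "name": "大米手機", "age": 22},
    {"_id": "CA2Yqm0Bmr19jrNQ7nRL", "name": "小米手機", "age": 11},
    {"_id": "3", "name": "小米電視4K", "age": 33},
]


def name_matches(doc):
    """Stand-in for {"match": {"name": "小米"}} (substring test, an assumption)."""
    return "小米" in doc["name"]


def age_in_terms(doc):
    """Stand-in for {"terms": {"age": [11, 22]}}."""
    return doc["age"] in (11, 22)


CLAUSES = [name_matches, age_in_terms]


def ids(predicate):
    return [doc["_id"] for doc in DOCS if predicate(doc)]


must = ids(lambda d: all(clause(d) for clause in CLAUSES))          # every clause matches
must_not = ids(lambda d: not any(clause(d) for clause in CLAUSES))  # no clause matches
should = ids(lambda d: any(clause(d) for clause in CLAUSES))        # at least one matches
print(must, must_not, should)
```

With this data must_not returns nothing, because every document either contains 小米 or has an age of 11 or 22.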

Source filtering

Return only the name and age fields
for documents where age is in [11,22,77].
GET a1/_search
{
  "_source": ["name","age"],
  "query": {
    "terms": {
      "age": [11,22,77]
    }
  }
}

Match all documents but return only the "address" field; documents without it come back with an empty "_source".
GET a1/_search
{
  "_source": {
    "includes": ["address"]
  }
}

Match all documents and return every field except "address".
GET a1/_search
{
  "_source": {
    "excludes": ["address"]
  }
}

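Filtering: clauses under "filter" do not contribute to the score, so they narrow the results without affecting ranking.
Find documents where "name" matches 小米 and 10 <= age <= 20.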

GET a1/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "name": "小米"
          }
        }
      ],
      "filter": {
        "range": {
          "age": {
            "gte": 10,
            "lte": 20
          }
        }
      }
    }
  }
}

1.7.0 Sorting and Relevance

1.7.1 Sorting

Find documents where "name" matches 小米, sorted by age in descending order.


GET a1/_search
{
  "query": {
    "match": {
      "name": "小米"
    }
  },
  "sort": [
    {
      "age": {
        "order": "desc"
      }
    }
  ]
}

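Multi-field sorting: results are sorted by the first field, and the second field is used only to order results whose first field is equal.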

GET a1/_search
{
  "query": {
    "match": {
      "name": "小米"
    }
  },
  "sort": [
    {"age": {"order": "desc"}},
    {"_score": {"order": "asc"}}
  ]
}
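
Multi-field sorting behaves like sorting on a tuple key. A sketch with made-up hits (the IDs, ages, and scores below are hypothetical illustration data):

```python
# Hypothetical hits; two of them share the same age.
hits = [
    {"_id": "a", "age": 33, "_score": 0.9},
    {"_id": "b", "age": 11, "_score": 0.4},
    {"_id": "c", "age": 33, "_score": 0.2},
]

# age descending first, then _score ascending for equal ages.
ordered = sorted(hits, key=lambda hit: (-hit["age"], hit["_score"]))
print([hit["_id"] for hit in ordered])
```

The score only decides the order between "a" and "c", because they are the only hits with equal age.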

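1.7.2 What Is Relevance?

  • Term frequency: the more often a term appears in the field, the more relevant the document.
  • Inverse document frequency: a term found in only one document carries a lot of weight; a term found in ten documents carries less.
  • Field-length norm: "a" in "ab" is more relevant than "a" in "abcd"; the shorter the field, the higher the relevance.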
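
The three factors can be sketched with the classic TF/IDF shapes: square root of the term frequency, a log-scaled inverse document frequency, and the inverse square root of the field length. Treat this as an illustration of the direction of each effect, under the assumption that these classic formulas apply; actual scores depend on the Elasticsearch version and the configured similarity.

```python
import math


def tf(term_freq):
    """Term frequency: grows with the number of occurrences in the field."""
    return math.sqrt(term_freq)


def idf(num_docs, doc_freq):
    """Inverse document frequency: shrinks as more documents contain the term."""
    return 1 + math.log(num_docs / (doc_freq + 1))


def field_norm(num_terms):
    """Field-length norm: shorter fields weigh more."""
    return 1 / math.sqrt(num_terms)


print(tf(1), tf(4))
print(idf(10, 1), idf(10, 10))
print(field_norm(2), field_norm(4))
```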

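1.8.0 Index Management

1.8.1 Creating an Index

Create an index manually

PUT index_name
{
    "settings":{...},
    "mappings":{
        "type_name_1":{...},
        "type_name_2":{...}
    }
}

Disabling automatic index creation
Open config/elasticsearch.yml
and add action.auto_create_index: false on every node.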

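1.8.2 Deleting an Index

  • Delete a specific index

DELETE /my_index

  • Delete several indices

DELETE /index_one,index_two
DELETE /index_*

  • Delete all indices

DELETE /_all
DELETE /*

To prevent bulk deletion of indices through _all or *,
set action.destructive_requires_name: true in elasticsearch.yml.

1.8.3 Index Settings

number_of_shards: number of primary shards, default 5 in the 6.x release used for these notes
number_of_replicas: number of replicas per primary shard, default 1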

PUT /my_temp_index
{
    "settings": {
        "number_of_shards" :   1,
        "number_of_replicas" : 0
    }
}

The number of replicas can be changed on a live index; the number of primary shards cannot.

PUT /my_temp_index/_settings
{
    "number_of_replicas": 1
}

1.8.4 Index Aliases

Create an index named g

PUT g

Inspect g

GET g

Result: "aliases" is empty at this point

{
  "g": {
    "aliases": {},
    "mappings": {},
    "settings": {
      "index": {
        "creation_date": "1570706049853",
        "number_of_shards": "5",
        "number_of_replicas": "1",
        "uuid": "N0uDV7bmSRGBYG3Vnk51Og",
        "version": {
          "created": "6050099"
        },
        "provided_name": "g"
      }
    }
  }
}

Create an alias g1

PUT g/_alias/g1

GET g and GET g1 now return the same thing:

{
  "g": {
    "aliases": {
      "g1": {}
    },
...

Several alias actions in one request:
remove alias g1 from g,
add alias g2 to g,
add alias g3 to g.

POST _aliases
{
  "actions": [
    {"remove": {"index": "g","alias": "g1"}},
    {"add": {"index": "g","alias": "g2"}},
    {"add": {"index": "g","alias": "g3"}}

  ]
}

Result

{
  "g": {
    "aliases": {
      "g2": {},
      "g3": {}
    },
...
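
The effect of the actions list can be traced with a small in-memory model. This is a simplification: removing an alias that does not exist is silently ignored here, which is an assumption and may differ from the server's behaviour. apply_alias_actions is a hypothetical helper.

```python
def apply_alias_actions(aliases_by_index, actions):
    """Apply add/remove alias actions in order and return the new alias sets."""
    result = {index: set(names) for index, names in aliases_by_index.items()}
    for action in actions:
        (kind, spec), = action.items()            # each action has exactly one key
        names = result.setdefault(spec["index"], set())
        if kind == "add":
            names.add(spec["alias"])
        elif kind == "remove":
            names.discard(spec["alias"])          # assumption: missing alias is ignored
        else:
            raise ValueError(f"unsupported action: {kind}")
    return result


aliases = apply_alias_actions(
    {"g": {"g1"}},
    [
        {"remove": {"index": "g", "alias": "g1"}},
        {"add": {"index": "g", "alias": "g2"}},
        {"add": {"index": "g", "alias": "g3"}},
    ],
)
print(aliases)
```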

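2.0.0 Search in Depth

2.1.0 Structured Search

2.1.1 Finding Exact Values

First define the mapping: productID is a keyword field, so it is not analyzed and can be matched exactly.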

PUT my_store/products/_mapping
{
  "properties": {
    "productID":{
      "type": "keyword"
    }
  }
}

Index some documents

POST /my_store/products/_bulk
{ "index": { "_id": 1 }}
{ "price" : 10, "productID" : "XHDK-A-1293-#fJ3" }
{ "index": { "_id": 2 }}
{ "price" : 20, "productID" : "KDKE-B-9947-#kL5" }
{ "index": { "_id": 3 }}
{ "price" : 30, "productID" : "JODL-X-1937-#pV7" }
{ "index": { "_id": 4 }}
{ "price" : 30, "productID" : "QQPX-R-3956-#aD8" }

Exact match: price = 20

GET my_store/products/_search
{
  "query": {
    "term" : {
      "price" : 20
    }
  }
}

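When relevance scoring is not needed, wrap the query in constant_score with a filter.
Skipping the score calculation is faster, and every match gets a fixed score of 1.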

GET /my_store/products/_search
{
    "query" : {
        "constant_score" : { 
            "filter" : {
                "term" : { 
                    "price" : 20
                }
            }
        }
    }
}

Find documents where productID = XHDK-A-1293-#fJ3

GET /my_store/products/_search
{
    "query" : {
        "constant_score" : {
            "filter" : {
                "term" : {
                    "productID" : "XHDK-A-1293-#fJ3"
                }
            }
        }
    }
}

2.1.2 Combining Filters

  • bool filter
    • must:[{1},{2}] every clause matches (and)
    • must_not:[{1},{2}] no clause matches (not)
    • should:[{1},{2}] at least one clause matches (or)

Combined: (price=20 or productID=XHDK-A-1293-#fJ3) and price!=30

GET /my_store/products/_search
{
   "query" : {
      "bool" : {
        "should" : [
           { "term" : {"price" : 20}}, 
           { "term" : {"productID" : "XHDK-A-1293-#fJ3"}} 
        ],
        "must_not" : {
           "term" : {"price" : 30} 
        }
     }
   }
}

Nested: price=30 and (productID!=JODL-X-1937-#pV7)

GET my_store/products/_search
{
  "query": {
    "bool": {
      "must": [
        {"term": {"price": 30}},
        {"bool": {"must_not": [
          {"term": {"productID": "JODL-X-1937-#pV7"}}
        ]}}
      ]
    }
  }
}

2.1.3 Finding Multiple Exact Values

Find documents where price in (20,30)

GET /my_store/products/_search
{
    "query" : {
        "constant_score" : {
            "filter" : {
                "terms" : { 
                    "price" : [20, 30]
                }
            }
        }
    }
}

2.1.4 Ranges

Numeric range: 20 <= price < 40 (unlike SQL's BETWEEN 20 AND 40, the upper bound is excluded because the query uses lt)

GET /my_store/products/_search
{
    "query" : {
        "constant_score" : {
            "filter" : {
                "range" : {
                    "price" : {
                        "gte" : 20,
                        "lt"  : 40
                    }
                }
            }
        }
    }
}

Date range
Define a date field

PUT my_store
{
  "mappings": {
    "products": {
      "properties": {
        "date":{
          "type": "date",
          "format": "yyyy-MM-dd HH:mm:ss"
        }
      }
    }
  }
}

Insert data

POST /my_store/products/_bulk
{ "index": { "_id": 5 }}
{  "date":"2019-10-10 15:35:20"}
{ "index": { "_id": 6 }}
{ "date":"2019-10-09 16:32:12"}
{ "index": { "_id": 7 }}
{ "date":"2019-09-09 16:32:12"}

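now stands for the current time.
Date math units use the same letters as the format "yyyy-MM-dd HH:mm:ss": y, M, d, H, m, s stand for year, month, day, hour, minute, and second.

Find documents whose date is at or before one month ago (lte now-1M). Use gte instead to find documents from the last month.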

GET my_store/products/_search
{
  "query": {
    "range": {
      "date": {
        "lte": "now-1M"
      }
    }
  }
}

A fixed time: find documents whose date is later than "2019-09-10 11:11:11"

GET my_store/products/_search
{
  "query": {
    "range": {
      "date": {
        "gt": "2019-09-10 11:11:11"
      }
    }
  }
}

2.1.5 Dealing with Null Values

Find documents that have a date field whose value is not null (date IS NOT NULL).

GET my_store/products/_search
{
  "query": {
    "exists":{
      "field":"date"
    }
  }
}

2.2.0 Full-Text Search

The notes stop here. From this point on I switched to reading the official 7.4 documentation.

