Elasticsearch: multiple ways to search

Reposted article. 2018-08-30 14:22, tag: elasticsearch

Overview

1、query string search
2、query DSL
3、query filter
4、full-text search
5、phrase search
6、highlight search

1、query string search

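Search all products: GET /ecommerce/product/_search

The name "query string search" comes from the fact that the search parameters are all carried in the query string of the HTTP request.

Search for products whose name contains yagao, sorted by price in descending order: GET /ecommerce/product/_search?q=name:yagao&sort=price:desc

This style suits ad hoc use from the command line with a tool such as curl, to fire off a request quickly and look up the information you need.

Complex queries, however, are hard to build this way, so query string search is rarely used in production.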
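When such a request is assembled in a script instead of typed by hand, the query string has to be built and encoded. The following is a minimal Python sketch using only the standard library. The helper name search_path is made up for illustration, and it only builds the request path; it does not contact a cluster.

```python
from urllib.parse import urlencode


def search_path(index, doc_type, query, sort=None):
    """Build a query string search path (hypothetical helper, no network I/O)."""
    params = {"q": query}
    if sort is not None:
        params["sort"] = sort
    # Keep ':' readable; other reserved characters are percent-encoded.
    return "/{}/{}/_search?{}".format(index, doc_type, urlencode(params, safe=":"))


path = search_path("ecommerce", "product", "name:yagao", sort="price:desc")
print(path)  # /ecommerce/product/_search?q=name:yagao&sort=price:desc
```

Every extra condition adds another hand-encoded parameter, which is one reason this style becomes awkward for complex queries.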

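took: how many milliseconds the search took
timed_out: whether the search timed out; here it did not
_shards: the data is split into 5 shards, so the search request is sent to every primary shard (or to one of its replica shards instead)
hits.total: the number of results, here 3 documents
hits.max_score: a score measures how relevant a document is to the search; the more relevant the document, the higher its score, and max_score is the highest score among the hits
hits.hits: the detailed data of the documents that matched the search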

{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "2",
        "_score": 1,
        "_source": {
          "name": "jiajieshi yagao",
          "desc": "youxiao fangzhu",
          "price": 25,
          "producer": "jiajieshi producer",
          "tags": [
            "fangzhu"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "1",
        "_score": 1,
        "_source": {
          "name": "gaolujie yagao",
          "desc": "gaoxiao meibai",
          "price": 30,
          "producer": "gaolujie producer",
          "tags": [
            "meibai",
            "fangzhu"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "3",
        "_score": 1,
        "_source": {
          "name": "zhonghua yagao",
          "desc": "caoben zhiwu",
          "price": 40,
          "producer": "zhonghua producer",
          "tags": [
            "qingxin"
          ]
        }
      }
    ]
  }
}
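To show how these fields are typically read by a program, here is a small Python sketch that parses a trimmed copy of the response above. The function name summarize is an assumption made for this example, and the code assumes hits.total is a plain number, as it is in the response shown.

```python
import json

# A trimmed copy of the response above (only the fields used below).
response_text = """
{
  "took": 2,
  "timed_out": false,
  "_shards": {"total": 5, "successful": 5, "failed": 0},
  "hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      {"_id": "2", "_score": 1, "_source": {"name": "jiajieshi yagao", "price": 25}},
      {"_id": "1", "_score": 1, "_source": {"name": "gaolujie yagao", "price": 30}},
      {"_id": "3", "_score": 1, "_source": {"name": "zhonghua yagao", "price": 40}}
    ]
  }
}
"""


def summarize(response):
    """Pull the commonly used fields out of a search response dict."""
    shards = response["_shards"]
    return {
        "took_ms": response["took"],
        "timed_out": response["timed_out"],
        "all_shards_ok": shards["successful"] == shards["total"],
        "total": response["hits"]["total"],
        "products": [
            (hit["_id"], hit["_source"]["name"], hit["_source"]["price"])
            for hit in response["hits"]["hits"]
        ],
    }


summary = summarize(json.loads(response_text))
print(summary["total"], summary["products"])
```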


GET /test_index/test_type/_search?q=test_field:test

GET /test_index/test_type/_search?q=+test_field:test

GET /test_index/test_type/_search?q=-test_field:test

Two things to learn here: the q=field:search content syntax, and the meaning of + and -. + means the term must be present, - means it must not be present.
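One practical detail when sending the + form from code: form-style URL decoders treat a raw + as a space. The Python sketch below shows this with the standard library only. Whether a particular client or server decodes the query string this way is an assumption to verify for your own setup; percent-encoding the operator sidesteps the question.

```python
from urllib.parse import parse_qs, quote_plus

# A raw '+' in a query string is decoded as a space by form-style decoders.
raw = parse_qs("q=+test_field:test")["q"][0]

# Percent-encoding the operator keeps it intact through decoding.
encoded = "q=" + quote_plus("+test_field:test")
decoded = parse_qs(encoded)["q"][0]

print(repr(raw))      # ' test_field:test'
print(encoded)        # q=%2Btest_field%3Atest
print(repr(decoded))  # '+test_field:test'
```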

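How the _all metadata field works and what it is for

GET /test_index/test_type/_search?q=test

This searches all fields directly: a document is returned if any one of its fields contains the given keyword.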

2、query DSL

DSL: Domain Specific Language

HTTP request body: the query is written as JSON in the request body. This is convenient, can express all kinds of complex queries, and is far more powerful than query string search.

Query all products

GET /ecommerce/product/_search
{
  "query": { "match_all": {} }
}

Query products whose name contains yagao, sorted by price in descending order

GET /ecommerce/product/_search
{
  "query": {
    "match": {
      "name": "yagao"
    }
  },
  "sort": [
    { "price": "desc" }
  ]
}
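Because the request body is plain JSON, it can be built as an ordinary data structure and serialized. The sketch below does this in Python for the match-plus-sort request above. The helper name match_sorted is hypothetical, and nothing is sent to a cluster.

```python
import json


def match_sorted(field, text, sort_field, order="desc"):
    """Build a request body: match on one field, sort on another."""
    return {
        "query": {"match": {field: text}},
        "sort": [{sort_field: order}],
    }


body = match_sorted("name", "yagao", "price")
print(json.dumps(body, indent=2))
```

Building the body as a dict means the serializer places braces and commas, which avoids the hand-editing mistakes that are easy to make in nested JSON.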

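Pagination

Paging through products: there are 3 products in total; assuming each page shows 1 product, displaying page 2 returns the second product. from: the position of the first product to return, counting from 0

GET /ecommerce/product/_search
{
  "query": { "match_all": {} },
  "from": 1,
  "size": 1
}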
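The relationship between a page number and the from/size pair is simple arithmetic, sketched here in Python. The function names are made up for this example, and pages are assumed to be numbered from 1.

```python
def page_window(page, page_size):
    """Translate a 1-based page number into the from/size pair."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    return {"from": (page - 1) * page_size, "size": page_size}


def paged_match_all(page, page_size):
    """Build a match_all body for the requested page."""
    body = {"query": {"match_all": {}}}
    body.update(page_window(page, page_size))
    return body


print(paged_match_all(2, 1))  # {'query': {'match_all': {}}, 'from': 1, 'size': 1}
```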

Return only the name and price of each product

GET /ecommerce/product/_search
{
  "query": { "match_all": {} },
  "_source": ["name", "price"]
}

Query DSL is better suited to production use and can express complex queries.

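Scroll search

Fetching, say, 100,000 documents in a single request performs very poorly. In that case a scroll query is normally used to fetch the data batch by batch until everything has been retrieved and processed.

With scroll search you fetch one batch, then the next, and so on, until all the data has been returned.

On the first request, scroll search saves a snapshot of the view at that moment; all later batches are served from that old snapshot, so changes made to the data in the meantime are not visible to the user.

Sorting by _doc gives the best performance.

Every scroll request also has to specify a scroll parameter, which sets a time window; each search request only needs to complete within that window.

Fetch 3 documents at a time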

GET /test_index/test_type/_search?scroll=1m
{
  "query": {
    "match_all": {}
  },
  "sort": [ "_doc" ],
  "size": 3
}

The response contains a _scroll_id; the next scroll request must pass that ID back as scroll_id.

GET /_search/scroll
{
  "scroll": "1m",
  "scroll_id" : "DnF1ZXJ5VGhlbkZldGNoBQAAAAAAACxeFjRvbnNUWVZaVGpHdklqOV9zcFd6MncAAAAAAAAsYBY0b25zVFlWWlRqR3ZJajlfc3BXejJ3AAAAAAAALF8WNG9uc1RZVlpUakd2SWo5X3NwV3oydwAAAAAAACxhFjRvbnNUWVZaVGpHdklqOV9zcFd6MncAAAAAAAAsYhY0b25zVFlWWlRqR3ZJajlfc3BXejJ3"
}

Scroll looks a lot like pagination, but the use cases differ. Pagination is for searching page by page and showing results to a user; scroll is for retrieving data batch by batch so that a system can process it.
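The batch-by-batch loop can be sketched in Python as follows. This is an illustration under stated assumptions, not a client implementation: send(method, path, body) is a hypothetical transport callable that returns the parsed JSON response, and make_fake_send is an in-memory stand-in used only so the loop can run without a cluster.

```python
def scroll_all(send, index_path, batch_size=3, keep_alive="1m"):
    """Yield every hit by following the scroll ID until a batch comes back empty.

    `send(method, path, body)` is a hypothetical transport callable that
    returns the parsed JSON response as a dict; it is not a real client API.
    """
    response = send(
        "GET",
        "{}/_search?scroll={}".format(index_path, keep_alive),
        {"query": {"match_all": {}}, "sort": ["_doc"], "size": batch_size},
    )
    while response["hits"]["hits"]:
        for hit in response["hits"]["hits"]:
            yield hit
        # Each follow-up request passes back the ID from the previous response.
        response = send(
            "GET",
            "/_search/scroll",
            {"scroll": keep_alive, "scroll_id": response["_scroll_id"]},
        )


def make_fake_send(docs, batch_size):
    """In-memory stand-in for a cluster, used only to exercise the loop."""
    state = {"pos": 0}

    def send(method, path, body):
        start = state["pos"]
        state["pos"] = start + batch_size
        return {"_scroll_id": "fake-id", "hits": {"hits": docs[start:start + batch_size]}}

    return send


docs = [{"_id": str(i)} for i in range(7)]
hits = list(scroll_all(make_fake_send(docs, 3), "/test_index/test_type"))
print(len(hits))  # 7
```

Writing the loop as a generator lets the caller process each batch as it arrives instead of holding the full result set in memory.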

Combining multiple search conditions

GET /website/article/_search
{
  "query": {
    "bool": {
      "must": [ // title must contain elasticsearch
        {
          "match": {
            "title": "elasticsearch"
          }
        }
      ],
      "should": [ // content may or may not contain elasticsearch
        {
          "match": {
            "content": "elasticsearch"
          }
        }
      ],
      "must_not": [ // author_id must not be 111
        {
          "match": {
            "author_id": 111
          }
        }
      ]
    }
  }
}
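Deeply nested bool bodies are easy to get wrong by hand, so it can help to assemble them from small pieces. The Python sketch below rebuilds the request above. The helpers bool_query and match are hypothetical names for this example and only produce dicts.

```python
def match(field, text):
    """A single match clause."""
    return {"match": {field: text}}


def bool_query(must=(), should=(), must_not=(), filters=()):
    """Assemble a bool query, omitting clauses that were not supplied."""
    clauses = [
        ("must", must),
        ("should", should),
        ("must_not", must_not),
        ("filter", filters),
    ]
    return {"bool": {name: list(items) for name, items in clauses if items}}


body = {
    "query": bool_query(
        must=[match("title", "elasticsearch")],
        should=[match("content", "elasticsearch")],
        must_not=[match("author_id", 111)],
    )
}
print(body)
```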

1、match all

GET /_search
{
  "query": {
    "match_all": {}
  }
}

2、match

GET /_search
{
  "query": { "match": { "title": "my elasticsearch article" }}
}

3、multi match

GET /test_index/test_type/_search
{
  "query": {
    "multi_match": {
      "query": "test", // the text to search for
      "fields": ["test_field", "test_field1"] // search across several fields
    }
  }
}

4、range query

GET /company/employee/_search
{
  "query": {
    "range": {
      "age": {
        "gte": 30
      }
    }
  }
}

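5、term query

// Query the field as an exact value (prerequisite: when creating the mapping by hand, the field must be indexed as not_analyzed, that is, without analysis; only then can a term query for test hello find it)

GET /test_index/test_type/_search
{
  "query": {
    "term": {
      "test_field": "test hello"
    }
  }
}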
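Why the not-analyzed prerequisite matters can be seen with a toy inverted index. The Python sketch below is a simplified model, not Elasticsearch's implementation: the analyzer is assumed to just lowercase and split on whitespace, and the two sample documents are invented for illustration.

```python
def analyze(text):
    """Very rough stand-in for an analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def build_index(docs, analyzed):
    """Map each indexed term to the set of document ids containing it."""
    index = {}
    for doc_id, value in docs.items():
        terms = analyze(value) if analyzed else [value]
        for term in terms:
            index.setdefault(term, set()).add(doc_id)
    return index


def term_lookup(index, term):
    """A term query looks the search string up as-is, without analyzing it."""
    return sorted(index.get(term, set()))


docs = {"1": "test hello", "2": "test world"}
analyzed_index = build_index(docs, analyzed=True)
exact_index = build_index(docs, analyzed=False)

print(term_lookup(analyzed_index, "test hello"))  # []
print(term_lookup(analyzed_index, "test"))        # ['1', '2']
print(term_lookup(exact_index, "test hello"))     # ['1']
```

In the analyzed index the string "test hello" never exists as a single term, so an exact lookup for it finds nothing; in the not-analyzed index the whole value is the term.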

6、terms query

GET /_search
{
  "query": { "terms": { "tag": [ "search", "full_text", "nosql" ] }} // several search terms for the tag field
}

3、query filter

Search for products whose name contains yagao and whose price is greater than 25 yuan

GET /ecommerce/product/_search
{
  "query" : {
    "bool" : {
      "must" : {
        "match" : {
          "name" : "yagao"
        }
      },
      "filter" : {
        "range" : {
          "price" : { "gt" : 25 }
        }
      }
    }
  }
}
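To make the gt bound concrete, the Python sketch below applies the same two conditions locally to the three sample products from the response in section 1. It only models how the bounds read; in_range is a made-up helper, and no scoring or analysis is simulated.

```python
def in_range(value, gt=None, gte=None, lt=None, lte=None):
    """Evaluate range bounds the way the gt/gte/lt/lte keys read."""
    if gt is not None and not value > gt:
        return False
    if gte is not None and not value >= gte:
        return False
    if lt is not None and not value < lt:
        return False
    if lte is not None and not value <= lte:
        return False
    return True


products = [
    {"name": "jiajieshi yagao", "price": 25},
    {"name": "gaolujie yagao", "price": 30},
    {"name": "zhonghua yagao", "price": 40},
]

# must: the name contains the term "yagao"; filter: price > 25
matched = [
    p["name"]
    for p in products
    if "yagao" in p["name"].split() and in_range(p["price"], gt=25)
]
print(matched)  # ['gaolujie yagao', 'zhonghua yagao']
```

The product priced at exactly 25 is excluded because gt is a strict bound; gte would include it.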

{
  "bool": {
    "must": { "match": { "title": "how to make millions" }},
    "must_not": { "match": { "tag": "spam" }},
    "should": [
      { "match": { "tag": "starred" }}
    ],
    "filter": {
      "range": { "date": { "gte": "2014-01-01" }}
    }
  }
}

{
  "bool": {
    "must": { "match": { "title": "how to make millions" }},
    "must_not": { "match": { "tag": "spam" }},
    "should": [
      { "match": { "tag": "starred" }}
    ],
    "filter": {
      "bool": {
        "must": [
          { "range": { "date": { "gte": "2014-01-01" }}},
          { "range": { "price": { "lte": 29.99 }}}
        ],
        "must_not": [
          { "term": { "category": "ebooks" }}
        ]
      }
    }
  }
}

GET /company/employee/_search
{
  "query": {
    "constant_score": { // constant_score is the fixed syntax required when a filter is used on its own
      "filter": {
        "range": {
          "age": {
            "gte": 30
          }
        }
      }
    }
  }
}

4、full-text search

GET /ecommerce/product/_search
{
  "query" : {
    "match" : {
      "producer" : "yagao producer"
    }
  }
}

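5、phrase search

This is the counterpart of full-text search. Full-text search breaks the search string into terms and matches each one against the inverted index; a document is returned as long as it matches any one of the terms.

Phrase search requires the text of the given field to contain the search string exactly as written; only then does the document count as a match and get returned.

GET /ecommerce/product/_search
{
  "query" : {
    "match_phrase" : {
      "producer" : "yagao producer"
    }
  }
}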
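The difference between the two behaviours can be modelled in a few lines of Python. This is a simplified sketch that assumes whitespace tokenization and ignores scoring; the producer values are the three from the sample response in section 1.

```python
def tokens(text):
    """Rough stand-in for analysis: lowercase and split on whitespace."""
    return text.lower().split()


def match_any(field_text, query):
    """Full-text match: at least one query token appears in the field."""
    field_tokens = set(tokens(field_text))
    return any(t in field_tokens for t in tokens(query))


def match_phrase(field_text, query):
    """Phrase match: the query tokens appear consecutively, in order."""
    field_tokens, query_tokens = tokens(field_text), tokens(query)
    n = len(query_tokens)
    return any(
        field_tokens[i:i + n] == query_tokens
        for i in range(len(field_tokens) - n + 1)
    )


producers = ["gaolujie producer", "jiajieshi producer", "zhonghua producer"]
query = "yagao producer"

print([p for p in producers if match_any(p, query)])     # all three match on "producer"
print([p for p in producers if match_phrase(p, query)])  # []
```

With this data the full-text query matches every product through the shared token "producer", while the phrase query matches none, since no producer field contains "yagao producer" as a consecutive phrase.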

6、highlight search

GET /ecommerce/product/_search
{
  "query" : {
    "match" : {
      "producer" : "producer"
    }
  },
  "highlight": {
    "fields" : {
      "producer" : {}
    }
  }
}
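As a rough picture of what a highlighted fragment looks like, the Python sketch below wraps matching words in the default em tags. It is a local simulation under a whitespace-tokenization assumption, not how the highlighter is implemented, and the function name highlight is chosen for this example.

```python
def highlight(field_text, query, pre="<em>", post="</em>"):
    """Wrap every word of field_text that also occurs in the query."""
    wanted = set(query.lower().split())
    return " ".join(
        pre + word + post if word.lower() in wanted else word
        for word in field_text.split()
    )


print(highlight("gaolujie producer", "producer"))  # gaolujie <em>producer</em>
```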

7、Validating a search

// Check whether a search is valid and, if it is not, where the problem lies

GET /test_index/test_type/_validate/query?explain
{
  "query": {
    "math": {
      "test_field": "test"
    }
  }
}

{
  "valid": false,
  "error": "org.elasticsearch.common.ParsingException: no [query] registered for [math]"
}

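8、Sorting

1、Default sort order

By default, results are sorted by _score in descending order.

In some cases, however, there is no meaningful _score, for example when only a filter is used.

GET /_search
{
  "query" : {
    "bool" : {
      "filter" : {
        "term" : {
          "author_id" : 1
        }
      }
    }
  }
}

A constant_score query works as well.

GET /_search
{
  "query" : {
    "constant_score" : {
      "filter" : {
        "term" : {
          "author_id" : 1
        }
      }
    }
  }
}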

2、Custom sort order

GET /company/employee/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "range": {
          "age": {
            "gte": 30
          }
        }
      }
    }
  },
  "sort": [
    {
      "join_date": {
        "order": "asc"
      }
    }
  ]
}

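Problem: sorting on an analyzed string field usually gives inaccurate results, because after analysis the value becomes several terms, and sorting on those terms is not what we want.

The usual solution is to index the string field twice: once analyzed, for searching, and once not analyzed, for sorting.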
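The effect can be imitated locally. The Python sketch below rests on an assumption that should be checked against the sort documentation for your version: that an analyzed field sorts on a single one of its terms, the smallest for ascending order and the largest for descending. The titles are the three from the sample articles in this section.

```python
titles = {"1": "second article", "2": "first article", "3": "third article"}


def sort_key_analyzed(title, descending):
    """Assumed behaviour: an analyzed field sorts on one of its terms,
    the smallest for ascending order and the largest for descending."""
    terms = title.lower().split()
    return max(terms) if descending else min(terms)


def sort_ids(key_fn, descending=False):
    """Return document ids ordered by key_fn applied to each title."""
    return sorted(titles, key=lambda doc_id: key_fn(titles[doc_id]), reverse=descending)


# Ascending on the analyzed field: every title's smallest term is "article",
# so the three documents tie and simply keep their original order.
print(sort_ids(lambda t: sort_key_analyzed(t, False)))  # ['1', '2', '3']

# Ascending on the whole, unanalyzed value gives the expected order.
print(sort_ids(lambda t: t))  # ['2', '1', '3']
```

Under that assumption the shared word "article" decides the ascending order for every document, which is why a separate not-analyzed copy of the field is used for sorting.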

PUT /website
{
  "mappings": {
    "article": {
      "properties": {
        "title": {
          "type": "text", // analyzed index
          "fields": {
            "raw": { // not-analyzed index
              "type": "keyword"
            }
          },
          "fielddata": true // forward index
        },
        "content": {
          "type": "text"
        },
        "post_date": {
          "type": "date"
        },
        "author_id": {
          "type": "long"
        }
      }
    }
  }
}

PUT /website/article/1
{
  "title": "second article",
  "content": "this is my second article",
  "post_date": "2017-01-01",
  "author_id": 110
}

{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      {
        "_index": "website",
        "_type": "article",
        "_id": "2",
        "_score": 1,
        "_source": {
          "title": "first article",
          "content": "this is my first article",
          "post_date": "2017-02-01",
          "author_id": 110
        }
      },
      {
        "_index": "website",
        "_type": "article",
        "_id": "1",
        "_score": 1,
        "_source": {
          "title": "second article",
          "content": "this is my second article",
          "post_date": "2017-01-01",
          "author_id": 110
        }
      },
      {
        "_index": "website",
        "_type": "article",
        "_id": "3",
        "_score": 1,
        "_source": {
          "title": "third article",
          "content": "this is my third article",
          "post_date": "2017-03-01",
          "author_id": 110
        }
      }
    ]
  }
}

GET /website/article/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "title.raw": { // sort on the not-analyzed sub-field created in the mapping above
        "order": "desc"
      }
    }
  ]
}
