Complex Searches in Elasticsearch (Sorting, Pagination, Highlighting, Fuzzy Queries, Exact Queries)

This article is a repost. 2021-05-20 17:18, Elasticsearch

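If you are not familiar with the basics of Elasticsearch, see the earlier article: Basic Operations on Elasticsearch Indices and Documents.

Before running the queries, you can batch-insert the sample documents with the Bulk API (data source).

Querying data

match query

A match query is analyzed: the query text is first run through the analyzer, and the resulting terms are then matched against the index.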

GET /student/_search
{
  "query": {
    "match": {
      "name": "山西"
    }
  }
}

The same search can also be written as a URI search (the quoted value is treated as a phrase):

GET /student/_search?q=name:"山西"

The result shows three students whose names contain "山西":

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 0.7133499,
    "hits" : [
      {
        "_index" : "student",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.7133499,
        "_source" : {
          "name" : "山西太原-張三",
          "age" : "23",
          "address" : {
            "city" : "太原",
            "province" : "山西"
          }
        }
      },
      {
        "_index" : "student",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 0.7133499,
        "_source" : {
          "name" : "山西長治-李四",
          "age" : "24",
          "address" : {
            "city" : "長治",
            "province" : "山西"
          }
        }
      },
      {
        "_index" : "student",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 0.7133499,
        "_source" : {
          "name" : "山西呂梁-王五",
          "age" : "25",
          "address" : {
            "city" : "呂梁",
            "province" : "山西"
          }
        }
      }
    ]
  }
}

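Field descriptions

query: the query to run.

match: the match condition.

name: the field to search, with the text to search for as its value.

hits --> total

value: the number of matching documents (3 here)
relation: "eq", meaning the count is exact

max_score: the highest score among the hits

hits: the matching documents themselves, each with its index and document metadata.

The _score of each document tells us how well it matches the query relative to the other hits.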
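
To make the field descriptions concrete, here is a minimal Python sketch that reads those fields out of a response body. The helper name `summarize_hits` is an assumption made for illustration, and `sample` is a trimmed copy of the response shown above, not a live result.

```python
from typing import Any


def summarize_hits(response: dict[str, Any]) -> dict[str, Any]:
    """Pull the fields described above out of a search response body."""
    hits = response["hits"]
    return {
        "total": hits["total"]["value"],
        "relation": hits["total"]["relation"],
        "max_score": hits["max_score"],
        # (id, score) pairs, in the order the hits were returned
        "ranking": [(h["_id"], h["_score"]) for h in hits["hits"]],
    }


# Trimmed copy of the response above: only the fields the helper reads.
sample = {
    "hits": {
        "total": {"value": 3, "relation": "eq"},
        "max_score": 0.7133499,
        "hits": [
            {"_id": "1", "_score": 0.7133499},
            {"_id": "2", "_score": 0.7133499},
            {"_id": "3", "_score": 0.7133499},
        ],
    }
}

summary = summarize_hits(sample)
print(summary["total"], summary["relation"], summary["ranking"])
```

All three hits carry the same score here because each name contains the same query terms once.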

When using match, the default operator is OR, so the following two queries return the same results:

GET student/_search
{
    "query": {
        "match": {
            "name": {
                "query": "山西長治",
                "operator": "or"
            }
        }
    }
}

GET student/_search
{
    "query": {
        "match": {
            "name": "山西長治"
        }
    }
}

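Because operator defaults to OR for match, the queries above return any document whose name contains at least one of the terms produced from 山西長治 (here, one term per character).

The minimum_should_match parameter sets the minimum number of terms that must match, for example: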

GET student/_search
{
    "query": {
        "match": {
            "name": {
                "query": "山西長治",
                "operator": "or",
                "minimum_should_match": 3
            }
        }
    }
}

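Only documents that match at least three of the four characters in 山西長治 are returned.

After changing the operator to and, only one document is returned: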

GET student/_search
{
  "query": {
    "match": {
      "name": {
        "query": "山西長治",
        "operator": "and"
      }
    }
  }
}
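
The difference between or, minimum_should_match, and and can be simulated offline. The Python sketch below is not how Elasticsearch is implemented; it assumes the name field uses the standard analyzer, which indexes each CJK character as its own term, and the helper names `char_terms` and `match_name` are made up for illustration.

```python
def char_terms(text: str) -> list[str]:
    """Assumed analysis for these sample names: one term per character,
    with punctuation such as '-' dropped."""
    return [ch for ch in text if ch.isalnum()]


def match_name(doc_name: str, query: str, operator: str = "or",
               minimum_should_match: int = 1) -> bool:
    doc_terms = set(char_terms(doc_name))
    query_terms = char_terms(query)
    matched = sum(1 for term in query_terms if term in doc_terms)
    if operator == "and":
        # every query term must be present
        return matched == len(query_terms)
    # "or": enough query terms must be present
    return matched >= minimum_should_match


names = ["山西太原-張三", "山西長治-李四", "山西呂梁-王五"]
query = "山西長治"

or_hits = [n for n in names if match_name(n, query)]
msm_hits = [n for n in names if match_name(n, query, minimum_should_match=3)]
and_hits = [n for n in names if match_name(n, query, operator="and")]

print(or_hits)
print(msm_hits)
print(and_hits)
```

With these three names, or keeps all of them, while both minimum_should_match of 3 and and keep only 山西長治-李四.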

ids query

Fetch several documents at once by their IDs.

GET student/_search
{
  "query": {
    "ids": {
      "values": [1,2,3]
    }
  }
}

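The query above returns the documents whose IDs are 1, 2, and 3.

multi_match

The multi_match query builds on the match query and allows searching across multiple fields.

The searches above each target a single named field. Often you do not know which field contains the keyword; in that case, use multi_match.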

GET student/_search
{
    "query": {
        "multi_match": {
            "query": "山西長治",
            "fields": [
                "name",
                "address.city^3",
                "address.province"
            ],
            "type": "best_fields"
        }
    }
}

This searches the name, address.city, and address.province fields and boosts the score contribution of address.city by a factor of 3. The result is:

{
    "took" : 0,
    "timed_out" : false,
    "_shards" : {
        "total" : 1,
        "successful" : 1,
        "skipped" : 0,
        "failed" : 0
    },
    "hits" : {
        "total" : {
            "value" : 3,
            "relation" : "eq"
        },
        "max_score" : 7.223837,
        "hits" : [
            {
                "_index" : "student",
                "_type" : "_doc",
                "_id" : "2",
                "_score" : 7.223837,
                "_source" : {
                    "name" : "山西長治-李四",
                    "age" : "24",
                    "address" : {
                        "city" : "長治",
                        "province" : "山西"
                    }
                }
            },
            {
                "_index" : "student",
                "_type" : "_doc",
                "_id" : "1",
                "_score" : 0.7133499,
                "_source" : {
                    "name" : "山西太原-張三",
                    "age" : "23",
                    "address" : {
                        "city" : "太原",
                        "province" : "山西"
                    }
                }
            },
            {
                "_index" : "student",
                "_type" : "_doc",
                "_id" : "3",
                "_score" : 0.7133499,
                "_source" : {
                    "name" : "山西呂梁-王五",
                    "age" : "25",
                    "address" : {
                        "city" : "呂梁",
                        "province" : "山西"
                    }
                }
            }
        ]
    }
}
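
As a rough illustration of how best_fields combines per-field scores, the Python sketch below takes the highest boosted field score and, optionally, adds a tie_breaker fraction of the others. The per-field scores are hypothetical numbers chosen for the example; they are not the scores Elasticsearch computed for the response above, and `best_fields_score` is an illustrative helper, not a library function.

```python
def best_fields_score(field_scores: dict[str, float],
                      boosts: dict[str, float],
                      tie_breaker: float = 0.0) -> float:
    """Best boosted field score, plus tie_breaker times the remaining ones."""
    boosted = [score * boosts.get(field, 1.0)
               for field, score in field_scores.items()]
    if not boosted:
        return 0.0
    best = max(boosted)
    return best + tie_breaker * (sum(boosted) - best)


# Hypothetical per-field scores for one document.
scores = {"name": 1.5, "address.city": 2.0, "address.province": 0.5}
boosts = {"address.city": 3.0}  # mirrors "address.city^3" in the query

default_score = best_fields_score(scores, boosts)
tie_score = best_fields_score(scores, boosts, tie_breaker=0.3)
print(default_score, tie_score)
```

With the default tie_breaker of 0, only the best field counts, which is why a strong match in the boosted address.city field can dominate the ranking.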

Prefix query

Returns documents that contain a term with the given prefix in the specified field.

GET student/_search
{
    "query": {
        "prefix": {
            "address.city": {
                "value": "呂"
            }
        }
    }
}

Find the documents whose city starts with 呂:

{
    "took" : 2,
    "timed_out" : false,
    "_shards" : {
        "total" : 1,
        "successful" : 1,
        "skipped" : 0,
        "failed" : 0
    },
    "hits" : {
        "total" : {
            "value" : 1,
            "relation" : "eq"
        },
        "max_score" : 1.0,
        "hits" : [
            {
                "_index" : "student",
                "_type" : "_doc",
                "_id" : "3",
                "_score" : 1.0,
                "_source" : {
                    "name" : "山西呂梁-王五",
                    "age" : "25",
                    "address" : {
                        "city" : "呂梁",
                        "province" : "山西"
                    }
                }
            }
        ]
    }
}

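Term query

term matches the exact term in the given field. The query value is not analyzed, so it must be supplied exactly as indexed to get correct results.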

GET /student/_search
{
    "query": {
        "term": {
            "name.keyword": "山西太原-張三"
        }
    }
}

Here name.keyword is used to match the value "山西太原-張三" exactly:

{
    "took" : 0,
    "timed_out" : false,
    "_shards" : {
        "total" : 1,
        "successful" : 1,
        "skipped" : 0,
        "failed" : 0
    },
    "hits" : {
        "total" : {
            "value" : 1,
            "relation" : "eq"
        },
        "max_score" : 1.2039728,
        "hits" : [
            {
                "_index" : "student",
                "_type" : "_doc",
                "_id" : "1",
                "_score" : 1.2039728,
                "_source" : {
                    "name" : "山西太原-張三",
                    "age" : "23",
                    "address" : {
                        "city" : "太原",
                        "province" : "山西"
                    }
                }
            }
        ]
    }
}
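
Why the query targets name.keyword rather than name can be shown with a small simulation. This Python sketch assumes that name is a text field analyzed into one term per character and that name.keyword stores the whole value as a single term; both function names are hypothetical.

```python
def analyze(text: str) -> list[str]:
    """Assumed analysis of the text field: one term per character,
    punctuation dropped."""
    return [ch for ch in text if ch.isalnum()]


def term_on_text_field(indexed_value: str, query_value: str) -> bool:
    # term does not analyze the query value; it is compared as-is
    # against the terms stored for the text field.
    return query_value in analyze(indexed_value)


def term_on_keyword_field(indexed_value: str, query_value: str) -> bool:
    # a keyword field keeps the whole value as one term
    return indexed_value == query_value


name = "山西太原-張三"
print(term_on_text_field(name, "山西太原-張三"))     # whole string is not a stored term
print(term_on_text_field(name, "張"))                # a single character is
print(term_on_keyword_field(name, "山西太原-張三"))  # exact value matches
```

Under these assumptions, a term query for the full string finds nothing on name but succeeds on name.keyword.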

Terms query

To match exactly against several values, use terms. It is similar to the IN syntax in SQL.

GET student/_search
{
    "query": {
        "terms": {
            "address.city.keyword": [
                "長治",
                "廣州"
            ]
        }
    }
}

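The query above returns every document whose address.city.keyword is 長治 or 廣州.

Compound queries

A compound query combines the individual queries above into a more complex one.

The general form is: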

POST _search
{
    "query": {
        "bool" : {
            "must" : {
                "term" : { "user" : "kimchy" }
            },
            "filter": {
                "term" : { "tag" : "tech" }
            },
            "must_not" : {
                "range" : {
                    "age" : { "gte" : 10, "lte" : 20 }
                }
            },
            "should" : [
                { "term" : { "tag" : "wow" } },
                { "term" : { "tag" : "elasticsearch" } }
            ],
            "minimum_should_match" : 1,
            "boost" : 1.0
        }
    }
}

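A compound query is built from the must, filter, must_not, and should clauses under bool. minimum_should_match specifies the number or percentage of should clauses a document must match. If the bool query contains at least one should clause and no must or filter clause, the default is 1; otherwise the default is 0.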
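
The matching rules just described, including the default for minimum_should_match, can be sketched in Python. This only models whether a document matches, not scoring; clauses are plain predicates, and `bool_matches` and the sample documents are assumptions made for illustration.

```python
from typing import Callable, Optional, Sequence

Doc = dict
Clause = Callable[[Doc], bool]


def bool_matches(doc: Doc,
                 must: Sequence[Clause] = (),
                 filter_: Sequence[Clause] = (),
                 must_not: Sequence[Clause] = (),
                 should: Sequence[Clause] = (),
                 minimum_should_match: Optional[int] = None) -> bool:
    if minimum_should_match is None:
        # default: 1 when there are only should clauses, otherwise 0
        minimum_should_match = 1 if should and not (must or filter_) else 0
    if not all(clause(doc) for clause in must):
        return False
    if not all(clause(doc) for clause in filter_):
        return False
    if any(clause(doc) for clause in must_not):
        return False
    matched_should = sum(1 for clause in should if clause(doc))
    return matched_should >= minimum_should_match


# Predicates mirroring the general form shown above.
is_kimchy = lambda d: d["user"] == "kimchy"
is_tech = lambda d: "tech" in d["tags"]
age_10_to_20 = lambda d: 10 <= d["age"] <= 20
is_wow = lambda d: "wow" in d["tags"]
is_es = lambda d: "elasticsearch" in d["tags"]

doc_a = {"user": "kimchy", "tags": ["tech", "wow"], "age": 30}
doc_b = {"user": "kimchy", "tags": ["tech"], "age": 30}

full = dict(must=[is_kimchy], filter_=[is_tech], must_not=[age_10_to_20],
            should=[is_wow, is_es], minimum_should_match=1)
print(bool_matches(doc_a, **full), bool_matches(doc_b, **full))
```

Dropping minimum_should_match from `full` would let doc_b match as well, because a must clause is present and the default then becomes 0.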

must

must is equivalent to AND in SQL.

Use a compound query to find documents where the city is 長治 and the age is 24:

GET student/_search
{
    "query": {
        "bool": {
            "must": [
                {
                    "match": {
                        "address.city": "長治"
                    }
                },
                {
                    "match": {
                        "age": "24"
                    }
                }
            ]
        }
    }
}

must_not

Find all documents whose province is not 山西. Only one document, the one for 廣州, remains in the result:

GET student/_search
{
    "query": {
        "bool": {
            "must_not": [
                {
                    "match": {
                        "address.province": "山西"
                    }
                }
            ]
        }
    }
}

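filter

Use filter to keep only documents whose age is between 24 and 25 (inclusive). Filter clauses do not contribute to the score.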

GET student/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "range": {
            "age": {
              "gte": 24,
              "lte": 25
            }
          }
        }
      ]
    }
  }
}

gt: greater than
gte: greater than or equal to
lt: less than
lte: less than or equal to
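
The four bounds map directly onto comparison operators. The short Python sketch below checks a value against any combination of them; `in_range` is a hypothetical helper, and it compares numbers, whereas the behavior in Elasticsearch also depends on how the age field is mapped.

```python
import operator

# range keyword -> comparison applied as compare(value, limit)
_BOUNDS = {
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
}


def in_range(value: float, **bounds: float) -> bool:
    """True if value satisfies every bound given, e.g. gte=24, lte=25."""
    return all(_BOUNDS[name](value, limit) for name, limit in bounds.items())


ages = [23, 24, 25]
kept = [age for age in ages if in_range(age, gte=24, lte=25)]
print(kept)
```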

should

should means "or", equivalent to OR in SQL.

Find documents whose province is 山西; if name contains 李四, the document gets a higher relevance score and ranks earlier.

GET student/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "address.province": "山西"
          }
        }
      ],
      "should": [
        {
          "match_phrase": {
            "name": "李四"
          }
        }
      ]
    }
  }
}

In the result, the document named 山西長治-李四 is ranked first:

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 3.1212955,
    "hits" : [
      {
        "_index" : "student",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 3.1212955,
        "_source" : {
          "name" : "山西長治-李四",
          "age" : "24",
          "address" : {
            "city" : "長治",
            "province" : "山西"
          }
        }
      },
      {
        "_index" : "student",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.7133499,
        "_source" : {
          "name" : "山西太原-張三",
          "age" : "23",
          "address" : {
            "city" : "太原",
            "province" : "山西"
          }
        }
      },
      {
        "_index" : "student",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 0.7133499,
        "_source" : {
          "name" : "山西呂梁-王五",
          "age" : "25",
          "address" : {
            "city" : "呂梁",
            "province" : "山西"
          }
        }
      }
    ]
  }
}
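
The scores in this response are consistent with a bool query adding up the scores of its matching must and should clauses. The Python sketch below works that out from the numbers above; the value attributed to the should clause is inferred by subtraction, it is not reported by Elasticsearch in this response, and `bool_score` is an illustrative helper.

```python
def bool_score(must_scores: list[float],
               matched_should_scores: list[float]) -> float:
    """Assumed model: the final score is the sum of the scoring clauses."""
    return sum(must_scores) + sum(matched_should_scores)


must_only = 0.7133499   # score of the documents that match only the must clause
top_score = 3.1212955   # score of 山西長治-李四 in the response above

# Inferred contribution of the should clause (match_phrase on 李四).
should_part = round(top_score - must_only, 7)
print(should_part)

rebuilt = bool_score([must_only], [should_part])
print(round(rebuilt, 7))
```

The two other documents match only the must clause, so they keep the lower score and rank after 山西長治-李四.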

Wildcard query

Use wildcard to match terms against a pattern, similar to LIKE in SQL.

GET student/_search
{
    "query": {
        "wildcard": {
            "name": {
                "value": "*王"
            }
        }
    }
}

The result is:

{
    "took" : 0,
    "timed_out" : false,
    "_shards" : {
        "total" : 1,
        "successful" : 1,
        "skipped" : 0,
        "failed" : 0
    },
    "hits" : {
        "total" : {
            "value" : 1,
            "relation" : "eq"
        },
        "max_score" : 1.0,
        "hits" : [
            {
                "_index" : "student",
                "_type" : "_doc",
                "_id" : "3",
                "_score" : 1.0,
                "_source" : {
                    "name" : "山西呂梁-王五",
                    "age" : "25",
                    "address" : {
                        "city" : "呂梁",
                        "province" : "山西"
                    }
                }
            }
        ]
    }
}
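
The pattern "*王" finds this document because a wildcard pattern is matched against individual indexed terms, not against the whole field value. The Python sketch below emulates that, assuming the name field is analyzed into one term per character; `wildcard_to_regex` and `wildcard_matches` are hypothetical helpers that support only `*` and `?`.

```python
import re


def wildcard_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate * (any sequence) and ? (any single character) to a regex."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def wildcard_matches(term: str, pattern: str) -> bool:
    # the pattern must cover the whole term
    return wildcard_to_regex(pattern).fullmatch(term) is not None


name = "山西呂梁-王五"
terms = [ch for ch in name if ch.isalnum()]  # assumed per-character terms

print(any(wildcard_matches(t, "*王") for t in terms))  # the term 王 matches
print(wildcard_matches(name, "*王"))                   # the whole value ends in 五
print(wildcard_matches(name, "*王*"))                  # needs a trailing * too
```

Under these assumptions, the same pattern against name.keyword would need to be written as "*王*" to find the document.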

Pagination and sorting

Find documents whose province is 山西, sort by age in descending order, and return one page of results:

GET student/_search
{
    "query": {
        "match": {
            "address.province": "山西"
        }
    },
    "sort": [
        {
            "age.keyword": {
                "order": "desc"
            }
        }
    ],
    "from": 2,
    "size": 2
}

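from: the offset of the first hit to return, starting at 0 (the number of hits to skip, not a page number).

size: the number of hits per page.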
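
Since from is an offset rather than a page number, a page-based UI has to convert. Below is a minimal Python sketch of that conversion; `page_params` is a hypothetical helper, pages are assumed to start at 1, and the limit of 10,000 reflects the default value of the index.max_result_window setting.

```python
MAX_RESULT_WINDOW = 10_000  # default of index.max_result_window


def page_params(page: int, size: int) -> dict[str, int]:
    """Convert a 1-based page number into from/size for a search body."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be at least 1")
    offset = (page - 1) * size
    if offset + size > MAX_RESULT_WINDOW:
        raise ValueError("from + size exceeds the result window")
    return {"from": offset, "size": size}


# Page 2 with two hits per page gives the from/size used in the query above.
params = page_params(page=2, size=2)
print(params)
```

The returned dictionary can be merged into a search body next to query and sort.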

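Highlighting

Use highlight to highlight matches in the fields you choose, and use pre_tags and post_tags to change the markup placed before and after each highlighted fragment.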

GET student/_search
{
    "query": {
        "match": {
            "name": "張三"
        }
    },
    "highlight": {
        "pre_tags": "<br>", 
        "post_tags": "</br>", 
        "fields": {
            "name": {}
        }
    }
}

Result:

{
    "took" : 0,
    "timed_out" : false,
    "_shards" : {
        "total" : 1,
        "successful" : 1,
        "skipped" : 0,
        "failed" : 0
    },
    "hits" : {
        "total" : {
            "value" : 1,
            "relation" : "eq"
        },
        "max_score" : 2.4079456,
        "hits" : [
            {
                "_index" : "student",
                "_type" : "_doc",
                "_id" : "1",
                "_score" : 2.4079456,
                "_source" : {
                    "name" : "山西太原-張三",
                    "age" : 23,
                    "address" : {
                        "city" : "太原",
                        "province" : "山西"
                    }
                },
                "highlight" : {
                    "name" : [
                        "山西太原-<br>張</br><br>三</br>"
                    ]
                }
            }
        ]
    }
}
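
On the client side, the highlighted fragments sit next to _source in each hit. This Python sketch shows one way to prefer the highlighted text and fall back to the stored value when a field has no highlight; `highlighted` is a hypothetical helper, and `hit` is a trimmed copy of the hit shown above.

```python
from typing import Any


def highlighted(hit: dict[str, Any], field: str) -> str:
    """Return the highlight fragments for a field, or the raw _source value."""
    fragments = hit.get("highlight", {}).get(field)
    if fragments:
        return " ... ".join(fragments)
    return str(hit["_source"].get(field, ""))


# Trimmed copy of the hit from the response above.
hit = {
    "_id": "1",
    "_source": {"name": "山西太原-張三", "age": 23},
    "highlight": {"name": ["山西太原-<br>張</br><br>三</br>"]},
}

print(highlighted(hit, "name"))  # highlighted fragment
print(highlighted(hit, "age"))   # no highlight for age, falls back to _source
```

Each matched term is wrapped separately, which is why 張 and 三 carry their own pair of tags in the fragment.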
