Querying with elasticsearch-dsl


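Continuing from the previous article, this one uses Python's elasticsearch-dsl library to run queries against Elasticsearch.

7. Queries

Elasticsearch is a powerful search engine whose main purpose is to find the data you need quickly.

Query categories:

  • Basic queries: use one of Elasticsearch's built-in query types
  • Compound queries: combine several queries into a single compound query
  • Filters: narrow the results with filter conditions that do not affect scoring

7.1 Basic queries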

    • Before querying, create an index with the following mapping
      PUT chaxun
      {
        "mappings": {
          "job": {
            "properties": {
              "title": {
                "store": true,
                "type": "text",
                "analyzer": "ik_max_word"
              },
              "company_name": {
                "store": true,
                "type": "keyword"
              },
              "desc": {
                "type": "text"
              },
              "comments": {
                "type": "integer"
              },
              "add_time": {
                "type": "date",
                "format": "yyyy-MM-dd"
              }
            }
          }
        }
      }


    • match query
      GET chaxun/job/_search
      {
        "query": {
          "match": {
            "title": "python"
          }
        }
      }

      from elasticsearch_dsl import Search  # assumes a default connection is already configured
      s = Search(index='chaxun').query('match', title='python')
      response = s.execute()
    • term query

      A term query does not analyze (tokenize) the search text; the value is looked up in the index exactly as given.

      GET chaxun/job/_search
      {
        "query": {
          "term": {
            "title": "python爬蟲"
          }
        }
      }

      s = Search(index='chaxun').query('term', title='python爬蟲')
      response = s.execute()
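
      To illustrate why a term query behaves differently from a match query on an analyzed text field, here is a minimal sketch that models an inverted index in plain Python. The `toy_analyze` function is a hypothetical stand-in for a real analyzer such as ik_max_word (it only separates ASCII runs from non-ASCII runs), and the two sample titles are made up, so the token lists are illustrative only.

```python
import re

def toy_analyze(text):
    """Hypothetical analyzer: lowercase, then split into ASCII-word and non-ASCII runs."""
    return re.findall(r"[a-z0-9]+|[^\x00-\x7f]+", text.lower())

def build_index(docs):
    """Map each token to the set of document ids whose title contains it."""
    index = {}
    for doc_id, title in docs.items():
        for token in toy_analyze(title):
            index.setdefault(token, set()).add(doc_id)
    return index

def term_query(index, value):
    """term: look the value up as-is, without analyzing it."""
    return index.get(value, set())

def match_query(index, text):
    """match: analyze the text first, then collect documents containing any token."""
    hits = set()
    for token in toy_analyze(text):
        hits |= index.get(token, set())
    return hits

# Made-up sample titles, not taken from the article's index.
docs = {1: "python爬蟲", 2: "python工程師"}
index = build_index(docs)

print(term_query(index, "python爬蟲"))   # the unanalyzed value is not a token in the index
print(match_query(index, "python爬蟲"))  # analyzed into "python" and "爬蟲" first
```

      In this model the term lookup finds nothing because the index only holds the individual tokens, while the match lookup finds both documents through the shared token "python".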
    • terms query
      GET chaxun/job/_search
      {
        "query": {
          "terms": {
            "title": ["工程師", "django", "系統"]
          }
        }
      }

      s = Search(index='chaxun').query('terms', title=['工程師', 'django', '系統'])
      response = s.execute()
    • Limiting the number of results returned
      GET chaxun/job/_search
      {
        "query": {
          "term": {
            "title": "python"
          }
        },
        "from": 1,
        "size": 2
      }

      # Slicing sets "from" and "size": [1:3] means from=1, size=2
      s = Search(index='chaxun').query('term', title='python')[1:3]
      response = s.execute()
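
      Slicing a Search object maps onto the "from" and "size" keys of the request body. The arithmetic can be sketched as follows; `slice_to_paging` is a hypothetical helper written for this illustration, not part of elasticsearch-dsl, and rejecting open-ended slices is a choice made here for simplicity.

```python
def slice_to_paging(window):
    """Translate a Python slice into Elasticsearch "from"/"size" values."""
    start = window.start or 0
    if window.stop is None:
        # Assumption of this sketch: an open-ended slice is rejected.
        raise ValueError("an open-ended slice has no size")
    return {"from": start, "size": max(0, window.stop - start)}

print(slice_to_paging(slice(1, 3)))  # {'from': 1, 'size': 2}
print(slice_to_paging(slice(0, 2)))  # {'from': 0, 'size': 2}
```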
    • match_all: return every document
      GET chaxun/job/_search
      {
        "query": {
          "match_all": {}
        }
      }

      s = Search(index='chaxun').query('match_all')
      response = s.execute()
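    • match_phrase: phrase query
      GET chaxun/job/_search
      {
        "query": {
          "match_phrase": {
            "title": {
              "query": "python系統",
              "slop": 3
            }
          }
        }
      }

      s = Search(index='chaxun').query('match_phrase', title={"query": "python系統", "slop": 3})
      response = s.execute()

      Note: the query text "python系統" is analyzed into ["python", "系統"], and a document must contain every one of these terms. "slop" sets the maximum distance allowed between the terms. For example, with this mapping "python打造推薦引擎系統" does not match when slop is less than 6 (the threshold depends on the token positions the analyzer produces).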
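
      The effect of "slop" can be sketched with a simplified model that counts how many tokens sit between two terms. This is only an approximation under stated assumptions: it handles two terms in their original order (real slop also tolerates reordered terms), and it tokenizes an English sample sentence on whitespace, whereas an analyzer such as ik_max_word may emit overlapping tokens and therefore different positions. `phrase_matches` is a hypothetical helper.

```python
def phrase_matches(tokens, first, second, slop):
    """Return True if `second` follows `first` with at most `slop` tokens in between."""
    first_positions = [i for i, t in enumerate(tokens) if t == first]
    second_positions = [i for i, t in enumerate(tokens) if t == second]
    return any(
        0 <= b - a - 1 <= slop
        for a in first_positions
        for b in second_positions
    )

# Hypothetical whitespace tokenization; four tokens separate "python" and "system".
tokens = "python builds a recommendation engine system".split()

print(phrase_matches(tokens, "python", "system", slop=3))  # False
print(phrase_matches(tokens, "python", "system", slop=4))  # True
```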

    • multi_match query
      GET chaxun/job/_search
      {
        "query": {
          "multi_match": {
            "query": "python",
            "fields": ["title^3", "desc"]
          }
        }
      }

      from elasticsearch_dsl import Q, Search
      q = Q('multi_match', query="python", fields=["title^3", "desc"])
      s = Search(index='chaxun').query(q)
      response = s.execute()

      Note: the query runs against several fields; "^3" gives "title" three times the weight of "desc".
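
      As a small illustration of the "field^boost" notation, the sketch below splits a field spec into its name and weight and applies the weights to some per-field scores. `parse_field`, `boosted_scores`, and the raw scores are hypothetical; real scoring is done inside Elasticsearch and is more involved than a plain multiplication.

```python
def parse_field(spec):
    """Split a "name^boost" field spec into (name, boost); boost defaults to 1.0."""
    name, sep, boost = spec.partition("^")
    return name, (float(boost) if sep else 1.0)

def boosted_scores(raw_scores, specs):
    """Multiply each field's raw score by the boost given in its spec."""
    result = {}
    for spec in specs:
        name, boost = parse_field(spec)
        result[name] = raw_scores.get(name, 0.0) * boost
    return result

# Made-up raw scores for one document.
print(boosted_scores({"title": 1.0, "desc": 2.0}, ["title^3", "desc"]))
# {'title': 3.0, 'desc': 2.0}
```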

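    • Choosing which fields are returned
      GET chaxun/job/_search
      {
        "stored_fields": ["title", "company_name"],
        "query": {
          "match": {
            "title": "python"
          }
        }
      }

      s = Search(index='chaxun').query('match', title='python').source(['title', 'company_name'])
      response = s.execute()

      Note: the request above reads stored fields through "stored_fields", whereas .source() selects fields from _source; both restrict the response to title and company_name.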
    • Sorting results with sort
      GET chaxun/job/_search
      {
        "query": {
          "match_all": {}
        },
        "sort": [
          {
            "comments": {
              "order": "desc"
            }
          }
        ]
      }

      s = Search(index='chaxun').query('match_all').sort({"comments": {"order": "desc"}})
      response = s.execute()
    • range query
      GET chaxun/job/_search
      {
        "query": {
          "range": {
            "comments": {
              "gte": 10,
              "lte": 50,
              "boost": 2.0
            }
          }
        }
      }

      s = Search(index='chaxun').query('range', comments={"gte": 10, "lte": 50, "boost": 2.0})
      response = s.execute()

      Note: "boost" sets the weight of this clause.
    • wildcard query
      GET chaxun/job/_search
      {
        "query": {
          "wildcard": {
            "title": {
              "value": "pyth*n",
              "boost": 2
            }
          }
        }
      }

      s = Search(index='chaxun').query('wildcard', title={"value": "pyth*n", "boost": 2})
      response = s.execute()
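
      The pattern syntax of a wildcard query ("*" for any run of characters, "?" for a single character) can be approximated with Python's standard `fnmatch` module. This is only a rough analogy: `fnmatch` also accepts bracket expressions such as `[abc]`, and Elasticsearch applies the pattern to indexed terms rather than to the whole field text. The term list is made up for the example.

```python
from fnmatch import fnmatchcase

def wildcard_hits(terms, pattern):
    """Return the terms that match a shell-style wildcard pattern (case-sensitive)."""
    return [term for term in terms if fnmatchcase(term, pattern)]

# Hypothetical indexed terms.
terms = ["python", "pythoon", "pythons", "django"]

print(wildcard_hits(terms, "pyth*n"))  # ['python', 'pythoon']
print(wildcard_hits(terms, "pyth?n"))  # ['python']
```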

7.2 Compound queries

    • Create a new index for these queries

    • bool query
  • It has the following form
    "bool": {
        "filter": [],
        "must": [],
        "should": [],
        "must_not": []
    }
    • The simplest filter query
      select * from testdb where salary=20

      GET bool/testdb/_search
      {
        "query": {
          "bool": {
            "must": {
              "match_all": {}
            },
            "filter": {
              "term": {
                "salary": 20
              }
            }
          }
        }
      }

      s = Search(index='bool').query('bool', filter=[Q('term', salary=20)])
      response = s.execute()
    • Inspecting how an analyzer tokenizes text
      GET _analyze
      {
        "analyzer": "ik_max_word",
        "text": "成都電子科技大學"
      }

      Note: "ik_max_word" splits text at the finest granularity; "ik_smart" splits it at the coarsest.

    • Combining filters with bool
      select * from testdb where (salary=20 or title=python) and (salary !=30)

      GET bool/testdb/_search
      {
        "query": {
          "bool": {
            "should": [
              {"term": {"salary": 20}},
              {"term": {"title": "python"}}
            ],
            "must_not": [
              {"term": {"salary": 30}}
            ]
          }
        }
      }

      q = Q('bool', should=[Q('term', salary=20), Q('term', title='python')], must_not=[Q('term', salary=30)])
      s = Search(index='bool').query(q)
      response = s.execute()
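
      The logic of this should/must_not combination can be sketched in plain Python. The model below is deliberately simplified: it covers only a bool query without must or filter clauses, where at least one should clause has to match and no must_not clause may match. The helper names and the sample documents are hypothetical.

```python
def term(field, value):
    """Return a predicate that checks a field for an exact value."""
    return lambda doc: doc.get(field) == value

def bool_match(doc, should=(), must_not=()):
    """Simplified bool semantics: no must_not clause may hold, at least one should clause must."""
    if any(clause(doc) for clause in must_not):
        return False
    return any(clause(doc) for clause in should)

# Made-up documents, not the article's test data.
docs = [
    {"id": 1, "salary": 20, "title": "java"},
    {"id": 2, "salary": 30, "title": "python"},
    {"id": 3, "salary": 10, "title": "python"},
    {"id": 4, "salary": 40, "title": "django"},
]

hits = [
    d["id"]
    for d in docs
    if bool_match(
        d,
        should=[term("salary", 20), term("title", "python")],
        must_not=[term("salary", 30)],
    )
]
print(hits)  # [1, 3]
```

      Document 2 matches a should clause but is dropped by must_not, which mirrors the `and (salary != 30)` part of the SQL statement.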
    • Nested bool queries
      select * from testdb where title=python or (title=django and salary=30)

      GET bool/testdb/_search
      {
        "query": {
          "bool": {
            "should": [
              {"term": {"title": "python"}},
              {"bool": {
                "must": [
                  {"term": {"title": "django"}},
                  {"term": {"salary": 30}}
                ]
              }}
            ]
          }
        }
      }

      q = Q('bool', should=[Q('term', title='python'), Q('bool', must=[Q('term', title='django'), Q('term', salary=30)])])
      s = Search(index='bool').query(q)
      response = s.execute()
    • Filtering on null and non-null values
  • Create the test data
    POST null/testdb2/_bulk
    {"index":{"_id":1}}
    {"tags":["search"]}
    {"index":{"_id":2}}
    {"tags":["search", "python"]}
    {"index":{"_id":3}}
    {"other_field":["some data"]}
    {"index":{"_id":4}}
    {"tags":null}
    {"index":{"_id":5}}
    {"tags":["search", null]}
  • Handling null values
    select tags from testdb2 where tags is not NULL

    GET null/testdb2/_search
    {
      "query": {
        "bool": {
          "filter": {
            "exists": {
              "field": "tags"
            }
          }
        }
      }
    }

    s = Search(index='null').query('bool', filter=[Q('exists', field='tags')])
    response = s.execute()
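
    To make the behaviour of the exists filter on the test data concrete, the following sketch applies the same rule in plain Python: a field counts as existing when it holds at least one non-null value. `has_value` is a hypothetical helper, and the dictionary simply mirrors the five bulk documents above.

```python
def has_value(doc, field):
    """exists-style check: the field is present and holds at least one non-null value."""
    value = doc.get(field)
    values = value if isinstance(value, list) else [value]
    return any(v is not None for v in values)

# The five documents from the bulk request, keyed by _id.
docs = {
    1: {"tags": ["search"]},
    2: {"tags": ["search", "python"]},
    3: {"other_field": ["some data"]},
    4: {"tags": None},
    5: {"tags": ["search", None]},
}

matching_ids = [doc_id for doc_id, doc in docs.items() if has_value(doc, "tags")]
print(matching_ids)  # [1, 2, 5]
```

    Document 5 still counts because one of its values is non-null, while documents 3 and 4 are filtered out.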

7.3 Aggregations

To be continued...

 

