[Original] Big Data Basics: Elasticsearch (2) Common APIs


Fortunately, Elasticsearch provides a very comprehensive and powerful REST API that you can use to interact with your cluster. Among the few things that can be done with the API are as follows:

  • Check your cluster, node, and index health, status, and statistics
  • Administer your cluster, node, and index data and metadata
  • Perform CRUD (Create, Read, Update, and Delete) and search operations against your indexes
  • Execute advanced search operations such as paging, sorting, filtering, scripting, aggregations, and many others

Elasticsearch provides an easy-to-understand yet powerful REST API through which you can interact with the cluster and perform all kinds of operations: check cluster status, administer the cluster, run CRUD operations against indexes, and query them.

 

REST API pattern:

<REST Verb> /<Index>/<Type>/<ID>[?pretty|v]

Note: a Type is roughly the concept of a category or partition; types are deprecated and will be removed in a future version.

The pretty parameter, again, just tells Elasticsearch to return pretty-printed JSON results.

Any endpoint that returns JSON accepts the pretty parameter, which makes the returned JSON pretty-printed.

Each of the commands accepts a query string parameter v to turn on verbose output.

The v parameter turns on verbose output.
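
For example, following the pattern above, fetching a single document with pretty-printed output looks like this (using the testdoc index and testtype type that are created later in this post):

# curl -XGET 'http://$es_server:9200/testdoc/testtype/1?pretty'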

 

The examples below are issued with curl. For more on curl, see: https://www.cnblogs.com/barneywill/p/10279555.html

Part 1: Cluster operations

1 Check cluster health

# curl http://$es_server:9200/_cat/health?v
epoch timestamp cluster status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1547990539 21:22:19 elasticsearch green 3 3 10 5 0 0 0 0 - 100.0%

2 List nodes

# curl http://$es_server:9200/_cat/nodes?v
ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
server1 29 74 1 0.07 0.10 0.13 mdi * 3iLMMxu
server2 45 74 1 0.11 0.11 0.13 mdi - vz1k1MB
server3 47 75 1 0.08 0.07 0.08 mdi - vGUu-b6

3 Show the master node

# curl 'http://$es_server:9200/_cat/master?v'
id host ip node
3iLMMxuCTISHPJaVo6I4SA server1 server1 3iLMMxu

4 List all indexes

# curl http://$es_server:9200/_cat/indices?v
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
green open testdoc GFZhtn6GSMy2pPPj8UK70Q 5 1 1 0 8.9kb 4.4kb

5 View node stats

# curl -XGET 'http://localhost:9200/_nodes/stats?pretty'

Part 2: Index operations

1 Create an index

# curl -XPUT 'http://$es_server:9200/testdoc/'
{"acknowledged":true,"shards_acknowledged":true,"index":"testdoc"}

2 Delete an index

# curl -XDELETE 'http://$es_server:9200/testdoc/'

3 List shards

# curl http://localhost:9200/_cat/shards

Part 3: Document operations

1 Insert (index) a single document

# curl -XPUT 'http://localhost:9200/testdoc/testtype/1' -d '{"name":"test"}'
{"_index":"testdoc","_type":"testtype","_id":"1","_version":1,"result":"created","_shards":{"total":2,"successful":2,"failed":0},"_seq_no":0,"_primary_term":1}

If you get an error like:

{"error":"Incorrect HTTP method for uri [/testdoc/testtype] and method [PUT], allowed: [POST]","status":405}

add the header:

-H 'Content-Type: application/json'
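
so the full command becomes:

# curl -XPUT -H 'Content-Type: application/json' 'http://$es_server:9200/testdoc/testtype/1' -d '{"name":"test"}'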

2 Get a single document

# curl -XGET 'http://$es_server:9200/testdoc/testtype/1'
{"_index":"testdoc","_type":"testtype","_id":"1","_version":1,"found":true,"_source":{"name":"test"}}

3 Update a single document

1) Call the index API again with the same ID and different data

# curl -XPUT 'http://$es_server:9200/testdoc/testtype/1' -d '{"name":"test hello"}'
{"_index":"testdoc","_type":"testtype","_id":"1","_version":2,"result":"updated","_shards":{"total":2,"successful":2,"failed":0},"_seq_no":1,"_primary_term":2}

2) Use the _update endpoint

# curl -XPOST 'http://$es_server:9200/testdoc/testtype/1/_update' -d '{"doc":{"name":"test hello again"}}'
{"_index":"testdoc","_type":"testtype","_id":"1","_version":3,"result":"updated","_shards":{"total":2,"successful":2,"failed":0},"_seq_no":2,"_primary_term":2}

4 Delete a single document

# curl -XDELETE 'http://$es_server:9200/testdoc/testtype/1'

5 Bulk document operations

The following performs two inserts, one update, and one delete in a single request:

# curl -XPOST -H 'Content-Type: application/x-ndjson' 'http://$es_server:9200/testdoc/testtype/_bulk' -d '
{"index":{"_id":"3"}}
{"name": "John Doe" }
{"index":{"_id":"4"}}
{"name": "Jane Doe" }
{"update":{"_id":"1"}}
{"doc": { "name": "John Doe becomes Jane Doe" } }
{"delete":{"_id":"2"}}'

6 Query all documents

The following two requests are equivalent:

# curl -XGET 'http://$es_server:9200/testdoc/_search?q=*'
{"took":2,"timed_out":false,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0},"hits":{"total":1,"max_score":1.0,"hits":[{"_index":"testdoc","_type":"testtype","_id":"1","_score":1.0,"_source":{"name":"test hello again"}}]}}

# curl -XPOST 'http://$es_server:9200/testdoc/_search' -d '{"query":{"match_all":{}}}'

7 Count all documents

# curl http://localhost:9200/testdoc/_count

8 Count documents matching a condition

# curl http://localhost:9200/testdoc/_count?q=name:hello

# curl http://localhost:9200/testdoc/_count?q=name:hello%20AND%20age:10

Note: when passing a query in the URL with multiple fields, join the conditions with AND or OR, and encode spaces as %20.
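
Alternatively, the condition can be sent in a request body, which avoids URL encoding. A minimal sketch, assuming the same name and age fields as above:

# curl -XPOST -H 'Content-Type: application/json' 'http://localhost:9200/testdoc/_count' -d '{"query":{"query_string":{"query":"name:hello AND age:10"}}}'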

9 SQL queries

# curl -XPOST -H 'Content-Type: application/json' 'http://$es_server:9200/_xpack/sql?format=txt' -d '{"query":"select * from testdoc"}'
name
----------------
test hello again

Part 4: Settings

1 View the settings of a single index

# curl -XGET 'http://localhost:9200/testdoc/_settings'

2 View the settings of all indexes

# curl -XGET 'http://localhost:9200/_all/_settings'

Part 5: Mappings

A mapping (index structure definition) is similar to a table schema: it defines all the fields, their data types, whether they are stored, whether they are indexed, the analyzer, and so on.

Mapping is the process of defining how a document, and the fields it contains, are stored and indexed. For instance, use mappings to define:

  • which string fields should be treated as full text fields.
  • which fields contain numbers, dates, or geolocations.
  • whether the values of all fields in the document should be indexed into the catch-all _all field.
  • the format of date values.
  • custom rules to control the mapping for dynamically added fields.

1 View the mapping of a single index

# curl http://localhost:9200/testdoc/_mapping/testtype
{"testdoc":{"mappings":{"testtype":{"properties":{"name":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}}}}}}}

2 View the mappings of all indexes

# curl http://localhost:9200/_mapping
# curl http://localhost:9200/_all/_mapping

3 Add a field to an existing mapping

# curl -XPOST -H 'Content-Type: application/json' http://localhost:9200/testdoc/_mapping/testtype -d '
{
  "properties": {
    "email": {
      "type": "keyword"
    }
  }
}'

4 Define the mapping when creating an index

# curl -XPUT -H 'Content-Type: application/json' http://localhost:9200/testdoc -d '
{
  "mappings": {
    "testtype": { 
      "properties": { 
        "title":    { "type": "text", "analyzer": "standard"}, 
        "name":     { "type": "text"  }, 
        "age":      { "type": "integer" },  
        "created":  {
          "type":   "date", 
          "format": "strict_date_optional_time||epoch_millis"
        }
      }
    }
  }
}'

 

5 Update a mapping

An existing mapping cannot be updated in place; instead, create a new index with the new mapping and reindex the data into it.

Other than where documented, existing field mappings cannot be updated. Changing the mapping would mean invalidating already indexed documents. Instead, you should create a new index with the correct mappings and reindex your data into that index. If you only wish to rename a field and not change its mappings, it may make sense to introduce an alias field.
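
A minimal sketch of that workflow (the new index name testdoc_v2 and the changed field type are purely illustrative; the _reindex API then copies the documents from the old index into the new one):

# curl -XPUT -H 'Content-Type: application/json' 'http://localhost:9200/testdoc_v2' -d '
{
  "mappings": {
    "testtype": {
      "properties": {
        "name": { "type": "keyword" }
      }
    }
  }
}'

# curl -XPOST -H 'Content-Type: application/json' 'http://localhost:9200/_reindex' -d '
{
  "source": { "index": "testdoc" },
  "dest":   { "index": "testdoc_v2" }
}'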

Reference: https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping.html

Part 6: Analyzers

Analysis is the process of converting text, like the body of any email, into tokens or terms which are added to the inverted index for searching. Analysis is performed by an analyzer which can be either a built-in analyzer or a custom analyzer defined per index.

The analyzer is configured in the mapping, for example:

"title": { "type": "text", "analyzer": "standard"}, \

Test an analyzer:

# curl -XPOST -H 'Content-Type: application/json' http://localhost:9200/_analyze?pretty -d '{"tokenizer":"standard","filter":  [ "lowercase", "asciifolding" ],"text":      "Is this chandler?"}'
{
  "tokens" : [
    {
      "token" : "is",
      "start_offset" : 0,
      "end_offset" : 2,
      "type" : "<ALPHANUM>",
      "position" : 0
    },
    {
      "token" : "this",
      "start_offset" : 3,
      "end_offset" : 7,
      "type" : "<ALPHANUM>",
      "position" : 1
    },
    {
      "token" : "chandler",
      "start_offset" : 8,
      "end_offset" : 16,
      "type" : "<ALPHANUM>",
      "position" : 2
    }
  ]
}
# curl -XPOST -H 'Content-Type: application/json' http://localhost:9200/_analyze?pretty -d '{"tokenizer":"standard","text":"聯想是全球最大的筆記本廠商"}'
{
  "tokens" : [
    {
      "token" : "",
      "start_offset" : 0,
      "end_offset" : 1,
      "type" : "<IDEOGRAPHIC>",
      "position" : 0
    },
    {
      "token" : "",
      "start_offset" : 1,
      "end_offset" : 2,
      "type" : "<IDEOGRAPHIC>",
      "position" : 1
    },
    {
      "token" : "",
      "start_offset" : 2,
      "end_offset" : 3,
      "type" : "<IDEOGRAPHIC>",
      "position" : 2
    },
    {
      "token" : "",
      "start_offset" : 3,
      "end_offset" : 4,
      "type" : "<IDEOGRAPHIC>",
      "position" : 3
    },
    {
      "token" : "",
      "start_offset" : 4,
      "end_offset" : 5,
      "type" : "<IDEOGRAPHIC>",
      "position" : 4
    },
    {
      "token" : "",
      "start_offset" : 5,
      "end_offset" : 6,
      "type" : "<IDEOGRAPHIC>",
      "position" : 5
    },
    {
      "token" : "",
      "start_offset" : 6,
      "end_offset" : 7,
      "type" : "<IDEOGRAPHIC>",
      "position" : 6
    },
    {
      "token" : "",
      "start_offset" : 7,
      "end_offset" : 8,
      "type" : "<IDEOGRAPHIC>",
      "position" : 7
    },
    {
      "token" : "",
      "start_offset" : 8,
      "end_offset" : 9,
      "type" : "<IDEOGRAPHIC>",
      "position" : 8
    },
    {
      "token" : "",
      "start_offset" : 9,
      "end_offset" : 10,
      "type" : "<IDEOGRAPHIC>",
      "position" : 9
    },
    {
      "token" : "",
      "start_offset" : 10,
      "end_offset" : 11,
      "type" : "<IDEOGRAPHIC>",
      "position" : 10
    },
    {
      "token" : "",
      "start_offset" : 11,
      "end_offset" : 12,
      "type" : "<IDEOGRAPHIC>",
      "position" : 11
    },
    {
      "token" : "",
      "start_offset" : 12,
      "end_offset" : 13,
      "type" : "<IDEOGRAPHIC>",
      "position" : 12
    }
  ]
}

 

Chinese word segmentation: smartcn

$ bin/elasticsearch-plugin install analysis-smartcn

This plugin can be downloaded for offline install from https://artifacts.elastic.co/downloads/elasticsearch-plugins/analysis-smartcn/analysis-smartcn-6.6.2.zip.

The plugin provides the smartcn analyzer and smartcn_tokenizer tokenizer, which are not configurable.

 

Segmentation result:

# curl -XPOST -H 'Content-Type: application/json' http://localhost:9200/_analyze?pretty -d '{"tokenizer":"smartcn_tokenizer","text":"聯想是全球最大的筆記本廠商"}'
{
  "tokens" : [
    {
      "token" : "聯想",
      "start_offset" : 0,
      "end_offset" : 2,
      "type" : "word",
      "position" : 0
    },
    {
      "token" : "",
      "start_offset" : 2,
      "end_offset" : 3,
      "type" : "word",
      "position" : 1
    },
    {
      "token" : "全球",
      "start_offset" : 3,
      "end_offset" : 5,
      "type" : "word",
      "position" : 2
    },
    {
      "token" : "",
      "start_offset" : 5,
      "end_offset" : 6,
      "type" : "word",
      "position" : 3
    },
    {
      "token" : "",
      "start_offset" : 6,
      "end_offset" : 7,
      "type" : "word",
      "position" : 4
    },
    {
      "token" : "",
      "start_offset" : 7,
      "end_offset" : 8,
      "type" : "word",
      "position" : 5
    },
    {
      "token" : "筆記本",
      "start_offset" : 8,
      "end_offset" : 11,
      "type" : "word",
      "position" : 6
    },
    {
      "token" : "廠商",
      "start_offset" : 11,
      "end_offset" : 13,
      "type" : "word",
      "position" : 7
    }
  ]
}

Reference: https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-smartcn.html

Chinese word segmentation: IK

$ bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.6.2/elasticsearch-analysis-ik-6.6.2.zip

The IK Analysis plugin integrates Lucene IK analyzer (http://code.google.com/p/ik-analyzer/) into elasticsearch, support customized dictionary.
Analyzer: ik_smart, ik_max_word; Tokenizer: ik_smart, ik_max_word

 

Segmentation results:

ik_smart

# curl -XPOST -H 'Content-Type: application/json' http://localhost:9200/_analyze?pretty -d '{"tokenizer":"ik_smart","text":"聯想是全球最大的筆記本廠商"}'         
{
  "tokens" : [
    {
      "token" : "聯想",
      "start_offset" : 0,
      "end_offset" : 2,
      "type" : "CN_WORD",
      "position" : 0
    },
    {
      "token" : "",
      "start_offset" : 2,
      "end_offset" : 3,
      "type" : "CN_CHAR",
      "position" : 1
    },
    {
      "token" : "全球",
      "start_offset" : 3,
      "end_offset" : 5,
      "type" : "CN_WORD",
      "position" : 2
    },
    {
      "token" : "最大",
      "start_offset" : 5,
      "end_offset" : 7,
      "type" : "CN_WORD",
      "position" : 3
    },
    {
      "token" : "",
      "start_offset" : 7,
      "end_offset" : 8,
      "type" : "CN_CHAR",
      "position" : 4
    },
    {
      "token" : "筆記本",
      "start_offset" : 8,
      "end_offset" : 11,
      "type" : "CN_WORD",
      "position" : 5
    },
    {
      "token" : "廠商",
      "start_offset" : 11,
      "end_offset" : 13,
      "type" : "CN_WORD",
      "position" : 6
    }
  ]
}

ik_max_word

# curl -XPOST -H 'Content-Type: application/json' http://localhost:9200/_analyze?pretty -d '{"tokenizer":"ik_max_word","text":"聯想是全球最大的筆記本廠商"}'
{
  "tokens" : [
    {
      "token" : "聯想",
      "start_offset" : 0,
      "end_offset" : 2,
      "type" : "CN_WORD",
      "position" : 0
    },
    {
      "token" : "",
      "start_offset" : 2,
      "end_offset" : 3,
      "type" : "CN_CHAR",
      "position" : 1
    },
    {
      "token" : "全球",
      "start_offset" : 3,
      "end_offset" : 5,
      "type" : "CN_WORD",
      "position" : 2
    },
    {
      "token" : "最大",
      "start_offset" : 5,
      "end_offset" : 7,
      "type" : "CN_WORD",
      "position" : 3
    },
    {
      "token" : "",
      "start_offset" : 7,
      "end_offset" : 8,
      "type" : "CN_CHAR",
      "position" : 4
    },
    {
      "token" : "筆記本",
      "start_offset" : 8,
      "end_offset" : 11,
      "type" : "CN_WORD",
      "position" : 5
    },
    {
      "token" : "筆記",
      "start_offset" : 8,
      "end_offset" : 10,
      "type" : "CN_WORD",
      "position" : 6
    },
    {
      "token" : "本廠",
      "start_offset" : 10,
      "end_offset" : 12,
      "type" : "CN_WORD",
      "position" : 7
    },
    {
      "token" : "廠商",
      "start_offset" : 11,
      "end_offset" : 13,
      "type" : "CN_WORD",
      "position" : 8
    }
  ]
}

Reference: https://github.com/medcl/elasticsearch-analysis-ik
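
To actually use IK on a field, set it as the field's analyzer in the mapping. A sketch (the index name testik and the field content are hypothetical; using ik_max_word at index time and ik_smart at search time is the combination suggested in the plugin's README):

# curl -XPUT -H 'Content-Type: application/json' 'http://localhost:9200/testik' -d '
{
  "mappings": {
    "testtype": {
      "properties": {
        "content": {
          "type": "text",
          "analyzer": "ik_max_word",
          "search_analyzer": "ik_smart"
        }
      }
    }
  }
}'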

Part 7: Complex queries

1 Main parameters of the search API

q
The query string.

stored_fields
The selective stored fields of the document to return for each hit, comma delimited. Not specifying any value will cause no fields to return.

sort
Sorting to perform. Can either be in the form of fieldName, or fieldName:asc/fieldName:desc. The fieldName can either be an actual field within the document, or the special _score name to indicate sorting based on scores. There can be several sort parameters (order is important).

from
The starting from index of the hits to return. Defaults to 0.

size
The number of hits to return. Defaults to 10.

timeout
A search timeout, bounding the search request to be executed within the specified time value and bail with the hits accumulated up to that point when expired. Defaults to no timeout.

default_operator
The default operator to be used, can be AND or OR. Defaults to OR.
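
For example, several of these parameters can be combined in a single URI search against the testdoc index from earlier (a sketch; name.keyword is the keyword sub-field that dynamic mapping created for name, as shown in the mapping output above):

# curl -XGET 'http://$es_server:9200/testdoc/_search?q=name:hello&from=0&size=5&sort=name.keyword:asc&pretty'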

The sort parameter supports many variants; for example, _geo_distance can be used to sort results by geographic distance, and a geo filter can be used to restrict results to an area. See: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-geo-distance-query.html
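
A sketch of geo sorting (hypothetical: it assumes an index whose mapping contains a geo_point field named location, which testdoc does not have):

# curl -XPOST -H 'Content-Type: application/json' 'http://localhost:9200/someindex/_search?pretty' -d '
{
  "query": { "match_all": {} },
  "sort": [
    {
      "_geo_distance": {
        "location": { "lat": 40.0, "lon": 116.0 },
        "order": "asc",
        "unit": "km"
      }
    }
  ]
}'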

 

