Elasticsearch in Practice (1): Getting Started

Reposted article, 2018-10-21 19:27. Tags: Elasticsearch / ELK

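This article uses Elasticsearch 6.2.4 as its example.

Note: as of 2018-09-23, the latest Elasticsearch release is 6.4.1. The 5.x and 6.x series differ in some respects, but basic usage is the same.

Official documentation: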
https://www.elastic.co/guide/en/elasticsearch/reference/6.2/

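Installation

Installation is straightforward and takes two steps:

1. Set up the JDK environment
2. Install Elasticsearch

Elasticsearch depends on a JDK, so first download and install a JDK and set the JAVA_HOME environment variable. Recommended JDK version: the 1.8.0 series. Download: https://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html

Installing the JDK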

Linux:

$ yum install -y java-1.8.0-openjdk

To configure the environment variables, edit /etc/profile and add:

JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.181-3.b13.el6_10.x86_64
PATH=$JAVA_HOME/bin:$PATH
CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
JAVACMD=/usr/bin/java
export JAVA_HOME JAVACMD CLASSPATH PATH

Then apply the changes:

source /etc/profile

Windows:

Installer download:
http://download.oracle.com/otn-pub/java/jdk/8u191-b12/2787e4a523244c269598db4e85c51e0c/jdk-8u191-windows-x64.exe

Download it and configure the JDK environment variables (adjust the JAVA_HOME path to match the version actually installed):

JAVA_HOME=C:\Program Files\Java\jdk1.8.0_101

CLASSPATH=.;%JAVA_HOME%\lib;%JAVA_HOME%\lib\dt.jar;%JAVA_HOME%\lib\tools.jar;

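Installing Elasticsearch

Installing Elasticsearch only requires downloading the binary archive and extracting it. Pay particular attention to the version number: if you also plan to install Kibana and plugins, use the same version for all of them.

Download: https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.2.4.tar.gz

This page lists downloads for every Elasticsearch release: https://www.elastic.co/downloads/past-releases

After downloading, extract the archive to a directory of your choice, change into the bin directory, and run Elasticsearch:

Linux: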

./elasticsearch

Windows:

elasticsearch.bat

On Windows, Elasticsearch can also be installed as a system service:

D:\work\elk\elasticsearch-6.2.4\bin>elasticsearch-service.bat
Usage: elasticsearch-service.bat install|remove|start|stop|manager [SERVICE_ID]

elasticsearch-service.bat install
elasticsearch-service.bat start

elasticsearch-service.bat stop
elasticsearch-service.bat remove

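Open http://127.0.0.1:9200 in a browser. If the response contains version information, the installation succeeded.

Note: on Linux/Mac, Elasticsearch cannot be run as the root user.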
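
The browser check above can also be scripted. The following is a minimal Python sketch (standard library only), assuming the node listens on http://127.0.0.1:9200 as in this article. The helper names parse_version and fetch_root are illustrative and not part of any Elasticsearch client, and the sample payload is hand-trimmed to the fields the code reads.

```python
import json
from urllib.request import urlopen


def parse_version(root_response_text):
    """Extract the version number from the JSON returned by GET /."""
    info = json.loads(root_response_text)
    return info["version"]["number"]


def fetch_root(base_url="http://127.0.0.1:9200"):
    """Fetch the root endpoint of a running node (requires a live server)."""
    with urlopen(base_url, timeout=5) as resp:
        return resp.read().decode("utf-8")


# Trimmed, illustrative payload; a real response carries more fields.
sample = '{"name": "node-1", "version": {"number": "6.2.4"}}'
print(parse_version(sample))
```

Against a live node, parse_version(fetch_root()) would return the running version, which is handy for confirming that it matches the Kibana version you plan to install.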

Dev Tools

We can test the API with curl or with the Dev Tools that Kibana provides.

For example:
With curl:

curl 'localhost:9200/_cat/health?format=json'

[{"epoch":"1537689647","timestamp":"16:00:47","cluster":"elasticsearch","status":"yellow","node.total":"1","node.data":"1","shards":"11","pri":"11","relo":"0","init":"0","unassign":"11","pending_tasks":"0","max_task_wait_time":"-","active_shards_percent":"50.0%"}]

Dev Tools:

GET /_cat/health?format=json

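I personally prefer Kibana's Dev Tools, which are very convenient. If Kibana is not installed, follow these steps:

a. Download Kibana for Windows:
https://artifacts.elastic.co/downloads/kibana/kibana-6.2.4-windows-x86_64.zip

b. Extract the archive, change into the kibana-6.2.4-windows-x86_64\bin directory, and run kibana.bat: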

D:\work\elk\kibana-6.2.4-windows-x86_64\bin>kibana.bat
  log   [02:52:17.243] [info][status][plugin:kibana@6.2.4] Status changed from uninitialized to green - Ready
  log   [02:52:17.869] [info][status][plugin:elasticsearch@6.2.4] Status changed from uninitialized to yellow - Waiting for Elasticsearch
  log   [02:52:17.880] [info][status][plugin:console@6.2.4] Status changed from uninitialized to green - Ready
  log   [02:52:17.888] [info][status][plugin:metrics@6.2.4] Status changed from uninitialized to green - Ready
  log   [02:52:18.165] [info][status][plugin:timelion@6.2.4] Status changed from uninitialized to green - Ready
  log   [02:52:18.200] [info][listening] Server running at http://localhost:5601
  log   [02:52:18.268] [info][status][plugin:elasticsearch@6.2.4] Status changed from yellow to green - Ready

c. Open http://127.0.0.1:5601 in a browser.

List the available _cat commands:

GET _cat

=^.^=
/_cat/allocation
/_cat/shards
/_cat/shards/{index}
/_cat/master
/_cat/nodes
/_cat/tasks
/_cat/indices
/_cat/indices/{index}
/_cat/segments
/_cat/segments/{index}
/_cat/count
/_cat/count/{index}
/_cat/recovery
/_cat/recovery/{index}
/_cat/health
/_cat/pending_tasks
/_cat/aliases
/_cat/aliases/{alias}
/_cat/thread_pool
/_cat/thread_pool/{thread_pools}
/_cat/plugins
/_cat/fielddata
/_cat/fielddata/{fields}
/_cat/nodeattrs
/_cat/repositories
/_cat/snapshots/{repository}
/_cat/templates

All of the following examples are run in Dev Tools.

Node operations

Check cluster health

GET /_cat/health?format=json

format=json requests JSON output; the default is plain text.

Result:

[
  {
    "epoch": "1537689915",
    "timestamp": "16:05:15",
    "cluster": "elasticsearch",
    "status": "yellow",
    "node.total": "1",
    "node.data": "1",
    "shards": "11",
    "pri": "11",
    "relo": "0",
    "init": "0",
    "unassign": "11",
    "pending_tasks": "0",
    "max_task_wait_time": "-",
    "active_shards_percent": "50.0%"
  }
]

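There are three health states:

Green - everything is fine (the cluster is fully functional)
Yellow - all data is available, but some replicas are not yet allocated (the cluster is fully functional)
Red - some data is unavailable for some reason (the cluster is partially functional)

Note: when the cluster is red, it continues to serve search requests from the available shards, but you will likely want to fix it as soon as possible because there are unassigned shards.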
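
To make the three states concrete, here is a small Python sketch that reads the JSON returned by GET /_cat/health?format=json and summarizes it. The function name summarize_health and the wording of the descriptions are illustrative choices; the sample row is a trimmed copy of the output shown above.

```python
import json

# Short description of each status, following the list above.
STATUS_MEANING = {
    "green": "all primary and replica shards are allocated",
    "yellow": "all data is available, some replicas are unassigned",
    "red": "some data is unavailable",
}


def summarize_health(cat_health_json):
    """Summarize the output of GET /_cat/health?format=json."""
    row = json.loads(cat_health_json)[0]  # the API returns a one-element list
    status = row["status"]
    return {
        "status": status,
        "meaning": STATUS_MEANING[status],
        "unassigned_shards": int(row["unassign"]),  # values arrive as strings
        "needs_attention": status == "red",
    }


sample = ('[{"cluster":"elasticsearch","status":"yellow","shards":"11",'
          '"pri":"11","unassign":"11","active_shards_percent":"50.0%"}]')
summary = summarize_health(sample)
print(summary["status"], summary["unassigned_shards"])
```

On a single-node setup like the one in this article, yellow is expected: a replica is never allocated on the same node as its primary, so the replicas stay unassigned until a second node joins.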

List nodes

GET /_cat/nodes?format=json

Indices

Create an index

PUT /customer

Output:

{
  "acknowledged": true,
  "shards_acknowledged": true,
  "index": "customer"
}

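Note: real projects rarely create an index directly like this; it is done here only for demonstration. Usually an index is either defined manually by creating a mapping, or generated automatically.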
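
As an illustration of defining an index through an explicit mapping, the sketch below builds a request body for PUT /customer in Python. The shard counts and the field choices (name as text with a keyword sub-field, age as long) are assumptions chosen to mirror the rest of this article, and build_create_index_body is a hypothetical helper. The layout with a type name nested under mappings is the 6.x format; other major versions differ.

```python
import json


def build_create_index_body(doc_type="doc", shards=5, replicas=1):
    """Return the JSON body for PUT /<index> with explicit settings and mapping (6.x layout)."""
    body = {
        "settings": {
            "number_of_shards": shards,
            "number_of_replicas": replicas,
        },
        "mappings": {
            doc_type: {
                "properties": {
                    "name": {
                        "type": "text",
                        "fields": {
                            "keyword": {"type": "keyword", "ignore_above": 256}
                        },
                    },
                    "age": {"type": "long"},
                }
            }
        },
    }
    return json.dumps(body, indent=2)


print(build_create_index_body())
```

The printed JSON can be pasted under PUT /customer in Dev Tools. Declaring the mapping up front avoids relying on the types Elasticsearch guesses from the first document it sees.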

List all indices

GET /_cat/indices?format=json

Result:

[
  {
    "health": "yellow",
    "status": "open",
    "index": "customer",
    "uuid": "AC4WMuViTguHDFtCRlXLow",
    "pri": "5",
    "rep": "1",
    "docs.count": "0",
    "docs.deleted": "0",
    "store.size": "1.1kb",
    "pri.store.size": "1.1kb"
  }
]

Delete an index

DELETE /customer

Output:

{
  "acknowledged": true
}

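Note: deleting an index also deletes all of its data. Proceed with caution.

Basic CRUD

This article covers only basic create, read, update, and delete operations.

ES documents carry some built-in fields called meta-fields, such as _index, _type, and _id, which are returned when a document is queried.

Index a document by ID

With type doc: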

PUT /customer/doc/1
{
  "name": "John Doe"
}

Equivalent command line:

curl -XPUT http://127.0.0.1:9200/customer/doc/1 -H "Content-Type: application/json" -d '{"name": "John Doe"}'

PUT /customer/doc/2
{
  "name": "yujc",
  "age":22
}

If the index does not exist, indexing a document creates the index as well.

The same operation can also modify data:

PUT /customer/doc/2
{
  "name": "yujc2"
}

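The name field is changed and _version becomes 2. The operation actually overwrites the whole document, which is why the age field no longer appears:

GET /customer/doc/2

Equivalent command line:

curl -XGET http://127.0.0.1:9200/customer/doc/2

Result: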

{
  "_index": "customer",
  "_type": "doc",
  "_id": "2",
  "_version": 2,
  "found": true,
  "_source": {
    "name": "yujc2"
  }
}

Get a document by ID

GET /customer/doc/1

Equivalent command line:

curl -XGET http://127.0.0.1:9200/customer/doc/1

Result:

{
  "_index": "customer",
  "_type": "doc",
  "_id": "1",
  "_version": 1,
  "found": true,
  "_source": {
    "name": "John Doe"
  }
}

Index a document without an ID

We can also add a document without specifying a document ID:

POST /customer/doc
{
  "name": "yujc",
  "age":23
}

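Note that the method here is POST, and Elasticsearch generates the document ID. Adding data with PUT requires an explicit document ID.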
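
The rule can be captured in a few lines of Python. This is only a sketch for choosing the HTTP method and path; index_request is a hypothetical helper, not a client API.

```python
def index_request(index, doc_type, doc_id=None):
    """Return (method, path) for indexing one document."""
    if doc_id is None:
        # No ID given: POST lets Elasticsearch generate one.
        return "POST", f"/{index}/{doc_type}"
    # Explicit ID: PUT creates the document or overwrites an existing one.
    return "PUT", f"/{index}/{doc_type}/{doc_id}"


print(index_request("customer", "doc"))
print(index_request("customer", "doc", 1))
```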

Update a document by ID

Either of the following two forms updates existing data:

PUT /customer/doc/1
{
  "name": "yujc2",
  "age":22
}

POST /customer/doc/1
{
  "name": "yujc2",
  "age":22
}

Both operations overwrite the existing document.

Partial update (`_update`)

To update only specific fields, use POST with the _update endpoint:

POST /customer/doc/1/_update
{
  "doc":{"name": "yujc"}
}

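Here _update denotes an update. For a partial-document update the body must contain a doc object (or a script); otherwise the request fails.

Equivalent command line: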

curl -XPOST http://127.0.0.1:9200/customer/doc/1/_update -H "Content-Type: application/json" -d '{"doc":{"name": "yujc"}}'

Add a field:

POST /customer/doc/1/_update
{
  "doc":{"year": 2018}
}

This adds a year field to the existing document without overwriting the existing data:

GET /customer/doc/1

Result:

{
  "_index": "customer",
  "_type": "doc",
  "_id": "1",
  "_version": 16,
  "found": true,
  "_source": {
    "name": "yujc",
    "age": 22,
    "year": 2018
  }
}
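
The difference between overwriting a document (PUT /customer/doc/1) and a partial update (_update with doc) can be modeled with plain dictionaries. This is only a local illustration of the semantics described above, not Elasticsearch code: index_doc and partial_update are hypothetical names, and the model uses a shallow merge, which is enough for the flat documents in this article.

```python
def index_doc(store, doc_id, body):
    """Model of PUT /index/type/id: the stored document is replaced."""
    store[doc_id] = dict(body)


def partial_update(store, doc_id, doc):
    """Model of POST /index/type/id/_update with {"doc": ...}: fields are merged."""
    if doc_id not in store:
        raise KeyError(f"document {doc_id} does not exist")
    store[doc_id].update(doc)  # shallow merge; enough for flat documents


store = {}
index_doc(store, "1", {"name": "yujc2", "age": 22})
partial_update(store, "1", {"name": "yujc"})   # change one field
partial_update(store, "1", {"year": 2018})     # add a new field
print(store["1"])  # {'name': 'yujc', 'age': 22, 'year': 2018}

index_doc(store, "1", {"name": "John Doe"})    # full overwrite drops age and year
print(store["1"])  # {'name': 'John Doe'}
```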

Updates can also be performed with a simple script. This example uses a script to increase age by 5:

POST /customer/doc/1/_update
{
  "script":"ctx._source.age+=5"
}

Result:

{
  "_index": "customer",
  "_type": "doc",
  "_id": "1",
  "_version": 17,
  "found": true,
  "_source": {
    "name": "yujc",
    "age": 27,
    "year": 2018
  }
}
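
The inline script above hardcodes the increment. A common variant passes the value through params so that it is not embedded in the script text. The sketch below builds such a body in Python; build_increment_body is a hypothetical helper, and the object form of script with source, lang, and params is assumed to be accepted by the 6.x version used here.

```python
import json


def build_increment_body(field, amount):
    """Build an _update body that adds `amount` to a numeric field via a script."""
    if not field.isidentifier():
        # The field name is spliced into the script text, so keep it simple.
        raise ValueError(f"unsupported field name: {field!r}")
    return json.dumps({
        "script": {
            "lang": "painless",
            "source": f"ctx._source.{field} += params.amount",
            "params": {"amount": amount},
        }
    })


print(build_increment_body("age", 5))
```

The printed JSON would be sent as the body of POST /customer/doc/1/_update.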

Delete a document by ID

DELETE /customer/doc/1

View the mapping

GET /customer/_mapping

Output:

{
  "customer": {
    "mappings": {
      "doc": {
        "properties": {
          "age": {
            "type": "long"
          },
          "name": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          }
        }
      }
    }
  }
}

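Explanation: properties lists the fields. There are two fields here (created automatically by ES):

age, of type long (searchable)
name, of type text (searchable and analyzed); in addition, a sub-field name.keyword of type keyword (searchable, not analyzed) is created.

These are covered in detail later.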
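
To see the fields and sub-fields at a glance, the mapping response can be flattened with a short Python function. This is a sketch that handles the flat mapping shown above; list_fields is a hypothetical helper, and fields without a type key are simply reported as object.

```python
def list_fields(mapping_response, index, doc_type):
    """Return {field_name: type}, including sub-fields such as name.keyword."""
    props = mapping_response[index]["mappings"][doc_type]["properties"]
    fields = {}
    for name, spec in props.items():
        fields[name] = spec.get("type", "object")
        # Sub-fields (multi-fields) live under the "fields" key.
        for sub_name, sub_spec in spec.get("fields", {}).items():
            fields[f"{name}.{sub_name}"] = sub_spec["type"]
    return fields


# The mapping output from above, as a Python dict.
mapping = {
    "customer": {
        "mappings": {
            "doc": {
                "properties": {
                    "age": {"type": "long"},
                    "name": {
                        "type": "text",
                        "fields": {
                            "keyword": {"type": "keyword", "ignore_above": 256}
                        },
                    },
                }
            }
        }
    }
}
print(list_fields(mapping, "customer", "doc"))
# {'age': 'long', 'name': 'text', 'name.keyword': 'keyword'}
```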

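Further reading:

Note: Elasticsearch has the concepts of index and type. Originally an index could contain multiple types, and each type could have different fields, much like database and table in a relational database. However, fields with the same name under different types of one index are handled identically in Lucene. The Elasticsearch team therefore decided to remove types: for backward compatibility, 6.x allows only one type per index, and types are expected to be removed completely in 7.x. See: https://www.elastic.co/guide/en/elasticsearch/reference/current/removal-of-types.html

So in practice, keep a single type per index; name it after the index or use a fixed name such as doc.

Bulk API

Bulk create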

POST /customer/doc/_bulk
{"index":{"_id":"1"}}
{"name": "John Doe" }
{"index":{"_id":"2"}}
{"name": "Jane Doe" }

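This operation adds 2 records. Lines 1 and 3 of the body give the ID of the document to act on; lines 2 and 4 are the corresponding source documents, that is, the data. The action here is index; it can also be create. Both create documents, but if the document already exists, index overwrites it while create fails.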
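
The alternating action/source layout is easy to generate programmatically. Below is a minimal Python sketch that builds such a body; build_bulk_body is a hypothetical helper, and it only covers the index and create actions discussed above.

```python
import json


def build_bulk_body(docs, action="index"):
    """Build the newline-delimited body for POST /<index>/<type>/_bulk.

    docs maps document ID -> source document.
    """
    if action not in ("index", "create"):
        raise ValueError("action must be 'index' or 'create'")
    lines = []
    for doc_id, source in docs.items():
        lines.append(json.dumps({action: {"_id": doc_id}}))  # action line
        lines.append(json.dumps(source))                     # source line
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


body = build_bulk_body({"1": {"name": "John Doe"}, "2": {"name": "Jane Doe"}})
print(body, end="")
```

Each JSON object must stay on a single line, which is why json.dumps is used without indentation. When sending the body with curl, use --data-binary rather than -d so the newlines are preserved.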

Query the data:

GET /customer/doc/2

Result:

{
  "_index": "customer",
  "_type": "doc",
  "_id": "2",
  "_version": 2,
  "found": true,
  "_source": {
    "name": "Jane Doe"
  }
}

Bulk update and delete

POST /customer/doc/_bulk
{"update":{"_id":"1"}}
{"doc": { "name": "John Doe becomes Jane Doe" } }
{"delete":{"_id":"2"}}

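This operation updates the document with ID 1 and deletes the document with ID 2. A delete action has no source document after it, because deleting needs only the document ID.

Note: if one action in a bulk request fails, the remaining actions still run.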
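
Because a failed action does not stop the others, the response has to be inspected item by item. The Python sketch below collects the failed entries; failed_items is a hypothetical helper, and the sample response is hand-written and trimmed, assuming the usual bulk response layout with a top-level errors flag and one entry per action in items.

```python
def failed_items(bulk_response):
    """Return (action, _id, status, error_type) for each failed item of a bulk response."""
    if not bulk_response.get("errors"):
        return []
    failures = []
    for item in bulk_response["items"]:
        # Each item is a one-key dict: {"index" | "create" | "update" | "delete": {...}}
        action, result = next(iter(item.items()))
        if "error" in result:
            failures.append(
                (action, result["_id"], result["status"], result["error"]["type"])
            )
    return failures


# Illustrative response: the update failed, the delete succeeded.
sample_response = {
    "took": 5,
    "errors": True,
    "items": [
        {"update": {"_id": "1", "status": 404,
                    "error": {"type": "document_missing_exception"}}},
        {"delete": {"_id": "2", "status": 200, "result": "deleted"}},
    ],
}
print(failed_items(sample_response))
```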

Update by query

curl -X POST http://127.0.0.1:9200/test/doc/_update_by_query -H "Content-Type: application/json" -d '{"script":{"source":"ctx._source[\"is_pub\"]=1"},"query":{"match_all":{}}}'

This example sets the is_pub field to 1 for every document of type doc in the test index.

Delete by query

curl -X POST http://127.0.0.1:9200/test/doc/_delete_by_query -H "Content-Type: application/json" -d '{"query":{"bool":{"filter":{"range":{"id":{"gt":1661208}}}}}}'

This example deletes every document of type doc in the test index whose id field is greater than 1661208.
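
Hand-writing nested JSON inside a shell string is error-prone, so the body can be generated instead. The following Python sketch builds the same range filter; build_delete_by_query_body is a hypothetical helper name.

```python
import json


def build_delete_by_query_body(field, greater_than):
    """Build the body for POST /<index>/<type>/_delete_by_query with a range filter."""
    return json.dumps({
        "query": {
            "bool": {
                "filter": {"range": {field: {"gt": greater_than}}}
            }
        }
    })


body = build_delete_by_query_body("id", 1661208)
print(body)
```

Since the deletion cannot be undone, it may be worth sending the same body to the _count endpoint first to see how many documents match.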

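Copyright notice: this is an original article published on the WeChat official account 飛鴻影的博客 (fhyblog) and on cnblogs (博客園). Reposting requires the author's permission.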

References

1. Getting Started | Elasticsearch Reference [6.2] | Elastic
https://www.elastic.co/guide/en/elasticsearch/reference/6.2/getting-started.html
2. Elasticsearch 5.x: Understanding term query and match query - wangchuanfu - cnblogs (博客園)
https://www.cnblogs.com/wangchuanfu/p/7444253.html
