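Download the latest version of Elasticsearch (6.2.1 at the time of writing) from the Elasticsearch website:
https://www.elastic.co/downloads/elasticsearch
Other versions:
https://www.elastic.co/cn/downloads/past-releases/elasticsearch-6-4-2
If the official download is too slow, try the Huawei mirror:
https://mirrors.huaweicloud.com/elasticsearch/6.4.2/
Chinese documentation:
https://www.elastic.co/guide/cn/elasticsearch/guide/current/index.html
English documentation and Java API usage (the official docs are more trustworthy than any blog):
https://www.elastic.co/guide/en/elasticsearch/client/java-api/current/index.html
Python API usage:
http://elasticsearch-py.readthedocs.io/en/master/
Download the tar package and extract it to /usr/local. Elasticsearch refuses to run as root, so change the owner and group of the extracted directory to a non-root user and start it as that user with:
./bin/elasticsearch
Then visit http://127.0.0.1:9200/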

To make port 9200 reachable from other machines, bind Elasticsearch to a non-loopback address.
Edit config/elasticsearch.yml and add:
network.host: 0.0.0.0
http.port: 9200
Then restart. Binding to a non-loopback address turns on the bootstrap checks, so you may hit errors like these:
[2018-01-28T23:51:35,204][INFO ][o.e.b.BootstrapChecks ] [qR5cyzh] bound or publishing to a non-loopback address, enforcing bootstrap checks
ERROR: [2] bootstrap checks failed
[1]: max file descriptors [4096] for elasticsearch process is too low, increase to at least [65536]
[2]: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
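Fixes
For the first error, open /etc/security/limits.conf (sudo vim /etc/security/limits.conf), add the lines below, then log in again:
* soft nofile 65536
* hard nofile 131072
* soft nproc 2048
* hard nproc 4096
If Elasticsearch is started by supervisor, also edit /etc/supervisor/supervisord.conf (vim /etc/supervisor/supervisord.conf), set minfds in the [supervisord] section, and restart supervisor:
[supervisord]
minfds=65536
For the second error, run the following as root.
Temporary fix (lost on reboot):
sysctl -w vm.max_map_count=262144
Permanent fix: check the current value, then edit /etc/sysctl.conf:
cat /proc/sys/vm/max_map_count
sudo vim /etc/sysctl.conf
Add:
vm.max_map_count=262144
Then apply it:
sysctl -p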
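Both limits can also be checked from a script before starting the node. The following is only a sketch, assuming Linux: the helper names find_limit_problems and read_current_limits are made up for illustration, and the thresholds are the ones quoted in the bootstrap check messages above.

```python
import resource

# Thresholds copied from the bootstrap check messages.
MIN_NOFILE = 65536
MIN_MAX_MAP_COUNT = 262144


def find_limit_problems(nofile, max_map_count):
    """Return a list of problems; an empty list means both limits are high enough."""
    problems = []
    # RLIM_INFINITY means "no limit", which is always sufficient.
    if nofile != resource.RLIM_INFINITY and nofile < MIN_NOFILE:
        problems.append(
            "max file descriptors [%d] is too low, need at least [%d]"
            % (nofile, MIN_NOFILE)
        )
    if max_map_count < MIN_MAX_MAP_COUNT:
        problems.append(
            "vm.max_map_count [%d] is too low, need at least [%d]"
            % (max_map_count, MIN_MAX_MAP_COUNT)
        )
    return problems


def read_current_limits():
    """Read the soft nofile limit of this process and vm.max_map_count (Linux only)."""
    soft_nofile, _hard_nofile = resource.getrlimit(resource.RLIMIT_NOFILE)
    with open("/proc/sys/vm/max_map_count") as f:
        max_map_count = int(f.read().strip())
    return soft_nofile, max_map_count


# The values from the error message above fail both checks.
for problem in find_limit_problems(4096, 65530):
    print(problem)
```

Calling find_limit_problems(*read_current_limits()) reports the limits of the calling process, so run it as the same user (and through the same launcher, such as supervisor) that starts Elasticsearch.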
Next, import data in JSON format. The file looks like this:
{"index":{"_id":"1"}}
{"title":"許寶江","url":"7254863","chineseName":"許寶江","sex":"男","occupation":" 灤縣農業局局長","nationality":"中國"}
{"index":{"_id":"2"}}
{"title":"鮑志成","url":"2074015","chineseName":"鮑志成","occupation":"醫師","nationality":"中國","birthDate":"1901年","deathDate":"1973年","graduatedFrom":"香港大學"}
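Note that the action line, such as {"index":{"_id":"1"}}, before each document and the newline at the very end of the file are both required.
The _id can start from 0, or even be a string such as abc.
If either is missing, the request fails with status 400 and one of these errors, respectively:
Malformed action/metadata line [1], expected START_OBJECT or END_OBJECT but found [VALUE_STRING]
The bulk request must be terminated by a newline [\n]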
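One way to avoid both errors is to generate the file instead of writing it by hand. The sketch below uses a hypothetical helper, build_bulk_body, and shortened sample records; it emits an action line before every document and always ends the body with a newline.

```python
import json


def build_bulk_body(docs):
    """Build a _bulk body from (doc_id, document) pairs.

    Each document becomes two lines: an action line and the document itself.
    """
    lines = []
    for doc_id, doc in docs:
        lines.append(json.dumps({"index": {"_id": str(doc_id)}}, ensure_ascii=False))
        lines.append(json.dumps(doc, ensure_ascii=False))
    # The body must end with a newline, including after the last document.
    return "\n".join(lines) + "\n"


people = [
    (1, {"title": "許寶江", "sex": "男", "nationality": "中國"}),
    (2, {"title": "鮑志成", "occupation": "醫師", "nationality": "中國"}),
]
body = build_bulk_body(people)
print(body, end="")
```

Writing body to people.json with UTF-8 encoding gives a file in the format shown above.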
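Use the command below to import the JSON file.
people.json is the path to the file; it can also be an absolute path such as /home/common/下載/xxx.json
es is the index and people is the type. In Elasticsearch, index and type are roughly analogous to database and table in a relational database, and this request needs both.
curl -H "Content-Type: application/json" -XPOST 'localhost:9200/es/people/_bulk?pretty&refresh' --data-binary "@people.json"
On success the HTTP status is 200 and the response looks like this: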
{
  "took" : 233,
  "errors" : false,
  "items" : [
    {
      "index" : {
        "_index" : "es",
        "_type" : "people",
        "_id" : "1",
        "_version" : 1,
        "result" : "created",
        "forced_refresh" : true,
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 0,
        "_primary_term" : 1,
        "status" : 201
      }
    },
    {
      "index" : {
        "_index" : "es",
        "_type" : "people",
        "_id" : "2",
        "_version" : 1,
        "result" : "created",
        "forced_refresh" : true,
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 0,
        "_primary_term" : 1,
        "status" : 201
      }
    }
  ]
}
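A bulk request can return HTTP 200 even when individual documents were rejected, so it is worth checking the errors flag and the per-item status. The following sketch uses a hypothetical helper, failed_items, on a parsed response; the failing item in the sample is invented for illustration.

```python
def failed_items(bulk_response):
    """Return (id, status, error) for every bulk item that did not succeed."""
    failures = []
    if not bulk_response.get("errors"):
        return failures
    for item in bulk_response.get("items", []):
        # Each item has one key naming the action (index, create, update or delete).
        for _action, result in item.items():
            if result.get("status", 0) >= 300:
                failures.append(
                    (result.get("_id"), result.get("status"), result.get("error"))
                )
    return failures


# Invented sample: the first item succeeded, the second was rejected.
sample = {
    "took": 5,
    "errors": True,
    "items": [
        {"index": {"_id": "1", "status": 201, "result": "created"}},
        {"index": {"_id": "2", "status": 400, "error": {"type": "mapper_parsing_exception"}}},
    ],
}
print(failed_items(sample))
```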
<0> View the field mappings
http://localhost:9200/es/people/_mapping

Next, query the data with the query statements below.
<1> Query by id
http://localhost:9200/es/people/1

<2> Simple match query: find documents where a field contains a keyword (GET)
http://localhost:9200/es/people/_search?q=_id:1
http://localhost:9200/es/people/_search?q=title:許
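A browser encodes the Chinese keyword in the URL for you, but a script or curl call has to percent-encode it. A small sketch, with search_url as a hypothetical helper name:

```python
from urllib.parse import urlencode


def search_url(base, index, doc_type, field, keyword):
    """Build a URI search URL with the q parameter percent-encoded as UTF-8."""
    query = urlencode({"q": "%s:%s" % (field, keyword)})
    return "%s/%s/%s/_search?%s" % (base, index, doc_type, query)


url = search_url("http://localhost:9200", "es", "people", "title", "許")
print(url)  # http://localhost:9200/es/people/_search?q=title%3A%E8%A8%B1
```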

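<3> Multi-field query: search several fields for a keyword (POST)
You can build the POST request with the RESTer add-on for Firefox; the Poster add-on used previously stopped working after the upgrade to Firefox Quantum.
Search the title and sex fields for documents containing 許:
{
  "query": {
    "multi_match": {
      "query": "許",
      "fields": ["title", "sex"]
    }
  }
}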
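The same POST request can be sent without a browser add-on. The sketch below uses only the Python standard library and assumes the node from this walkthrough at localhost:9200; build_search_request and run_search are illustrative names, and run_search is defined but not called here because it needs a running node.

```python
import json
import urllib.request

SEARCH_URL = "http://localhost:9200/es/people/_search"  # assumed local node


def build_search_request(body, url=SEARCH_URL):
    """Wrap a query body in a POST request with a JSON content type."""
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_search(body, url=SEARCH_URL):
    """Send the request and return the parsed JSON response."""
    with urllib.request.urlopen(build_search_request(body, url)) as resp:
        return json.loads(resp.read().decode("utf-8"))


query = {"query": {"multi_match": {"query": "許", "fields": ["title", "sex"]}}}
request = build_search_request(query)
print(request.get_method(), request.full_url)
```

With the node running, run_search(query)["hits"]["hits"] holds the matching documents.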


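You can also control what is returned:
size sets the number of hits to return
from sets the offset of the first hit to return (0-based); it is not an id
_source selects which fields are returned
highlight marks the matched terms in the listed fields; by default only fields that the query actually matched are highlighted, so the example highlights nationality
{
  "query": {
    "multi_match": {
      "query": "中國",
      "fields": ["nationality", "sex"]
    }
  },
  "size": 2,
  "from": 0,
  "_source": ["title", "sex", "nationality"],
  "highlight": {
    "fields": {
      "nationality": {}
    }
  }
}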
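Since from is an offset rather than a page number, paging through results means computing it from the page index. A minimal sketch, where page_body is a hypothetical helper and pages are counted from 0:

```python
def page_body(query, page, page_size, source_fields=None):
    """Build a search body for the given 0-based page."""
    body = {
        "query": query,
        "from": page * page_size,  # offset of the first hit on this page
        "size": page_size,
    }
    if source_fields is not None:
        body["_source"] = source_fields
    return body


query = {"multi_match": {"query": "中國", "fields": ["nationality", "sex"]}}
first = page_body(query, 0, 2, ["title", "sex", "nationality"])
third = page_body(query, 2, 2)
print(first["from"], third["from"])  # 0 4
```

With default index settings, from + size may not exceed 10000, so this approach suits shallow paging only.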
<4> Boosting
Raises the weight of a field: the score contributed by a match in that field is multiplied by the given factor (3 for nationality^3 below).
{
  "query": {
    "multi_match": {
      "query": "中國",
      "fields": ["nationality^3", "sex"]
    }
  },
  "size": 2,
  "from": 0,
  "_source": ["title", "sex", "nationality"],
  "highlight": {
    "fields": {
      "nationality": {}
    }
  }
}

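<5> Compound (bool) query, for more complex conditions
AND -> must
NOT -> must_not
OR -> should
Note that when a bool query also contains must, its should clauses are optional and only raise the score of documents that match them, unless minimum_should_match is set.
{
  "query": {
    "bool": {
      "must": {
        "bool": {
          "should": [
            { "match": { "title": "鮑" }},
            { "match": { "title": "許" }}
          ],
          "must": { "match": { "nationality": "中國" }}
        }
      },
      "must_not": { "match": { "sex": "女" }}
    }
  }
}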
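Nested bool clauses are easy to get wrong by hand, so it can help to assemble them in code. The sketch below uses two hypothetical helpers, bool_query and match, to build a flattened variant of the query above. It sets minimum_should_match to 1 so that the OR on title is mandatory, because should clauses next to a must clause are otherwise optional.

```python
import json


def match(field, text):
    """Build a single match clause."""
    return {"match": {field: text}}


def bool_query(must=None, should=None, must_not=None, minimum_should_match=None):
    """Assemble a bool query, leaving out clauses that were not given."""
    clauses = {}
    if must:
        clauses["must"] = must
    if should:
        clauses["should"] = should
    if must_not:
        clauses["must_not"] = must_not
    if minimum_should_match is not None:
        clauses["minimum_should_match"] = minimum_should_match
    return {"bool": clauses}


# (title matches 鮑 OR 許) AND nationality matches 中國 AND NOT sex matches 女
body = {
    "query": bool_query(
        must=[match("nationality", "中國")],
        should=[match("title", "鮑"), match("title", "許")],
        must_not=[match("sex", "女")],
        minimum_should_match=1,
    )
}
print(json.dumps(body, ensure_ascii=False, indent=2))
```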
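<6> Fuzzy query (POST)
{
  "query": {
    "multi_match": {
      "query": "廠長",
      "fields": ["title", "sex", "occupation"],
      "fuzziness": "AUTO"
    }
  },
  "_source": ["title", "sex", "occupation"],
  "size": 1
}
This query matches 廠長 against 局長.
With AUTO, the allowed edit distance depends on the length of each term: terms longer than 5 characters allow 2 edits, terms of 3 to 5 characters allow 1 edit, and shorter terms must match exactly.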
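The AUTO rule can be written out as a small function. This is only an illustration of the default thresholds (AUTO behaves like AUTO:3,6); auto_fuzziness is a made-up name, and the length is counted per analyzed term, not over the whole query string.

```python
def auto_fuzziness(term):
    """Maximum edit distance that fuzziness AUTO allows for one term."""
    length = len(term)
    if length <= 2:
        return 0  # very short terms must match exactly
    if length <= 5:
        return 1
    return 2


for term in ["廠長", "hello", "elastic"]:
    print(term, auto_fuzziness(term))
```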

<7> Wildcard query (POST)
? matches any single character
* matches zero or more characters
{
  "query": {
    "wildcard": {
      "title": "*寶"
    }
  },
  "_source": ["title", "sex", "occupation"],
  "size": 1
}
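To make the two wildcard characters concrete, the sketch below translates a pattern into a Python regular expression and tests it against whole terms. It only illustrates the pattern semantics and does not call Elasticsearch; wildcard_matches is a hypothetical helper. The wildcard query is compared against the indexed terms, and the pattern has to cover the entire term.

```python
import re


def wildcard_to_regex(pattern):
    """Translate ? and * into their regular expression equivalents."""
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(".")   # exactly one character
        elif ch == "*":
            parts.append(".*")  # zero or more characters
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def wildcard_matches(pattern, term):
    """True if the pattern matches the entire term."""
    return wildcard_to_regex(pattern).fullmatch(term) is not None


print(wildcard_matches("*寶", "寶"))    # True: * may match nothing
print(wildcard_matches("?寶", "寶"))    # False: ? needs one character
print(wildcard_matches("*寶", "寶江"))  # False: the term must end with 寶
```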
<8> Regexp query (POST)
The sample data above has no authors field, so substitute a field that exists in your index.
{
  "query": {
    "regexp": {
      "authors": "t[a-z]*y"
    }
  },
  "_source": ["title", "sex", "occupation"],
  "size": 3
}
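A detail that is easy to miss: the regexp query is anchored, so the pattern has to match an entire indexed term, not a substring. Python's re.fullmatch behaves the same way for a simple pattern like the one above, which makes it a convenient way to try patterns out. Elasticsearch uses Lucene's regular expression syntax, which differs from Python's in places, so treat this as an approximation; term_matches is a made-up helper.

```python
import re


def term_matches(pattern, term):
    """True if the pattern matches the whole term."""
    return re.fullmatch(pattern, term) is not None


print(term_matches("t[a-z]*y", "tommy"))   # True
print(term_matches("t[a-z]*y", "tommys"))  # False: trailing s is not covered
print(term_matches("t[a-z]*y", "ty"))      # True: [a-z]* may match nothing
```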
<9> Match Phrase query (POST)
A match phrase query requires that all terms of the query string appear in the document, in the same order as in the query string, and adjacent to each other.
By default the terms must be directly adjacent, but the slop value sets how far apart they may be while still counting as a match.
{
  "query": {
    "multi_match": {
      "query": "許長江",
      "fields": ["title", "sex", "occupation"],
      "type": "phrase"
    }
  },
  "_source": ["title", "sex", "occupation"],
  "size": 3
}
Note that with slop the distance is cumulative: 灤農局 and 灤縣農業局 are 2 positions apart.
{
  "query": {
    "multi_match": {
      "query": "灤農局",
      "fields": ["title", "sex", "occupation"],
      "type": "phrase",
      "slop": 2
    }
  },
  "_source": ["title", "sex", "occupation"],
  "size": 3
}
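The slop of 2 for 灤農局 against 灤縣農業局 can be worked out by hand. The sketch below is a simplified model, not Lucene's implementation: it assumes one token per character, that every query term occurs exactly once in the document, and that the terms appear in the same order. It compares how far each term has shifted relative to its position in the query; slop_needed is a hypothetical name.

```python
def slop_needed(query_terms, doc_terms):
    """Smallest slop for an in-order match under the simplifying assumptions above."""
    shifts = []
    for offset, term in enumerate(query_terms):
        # How far this term sits from where an exact phrase match would put it.
        shifts.append(doc_terms.index(term) - offset)
    return max(shifts) - min(shifts)


doc = list("灤縣農業局")
print(slop_needed(list("灤農局"), doc))  # 2: 農 shifts by 1, 局 by 2
print(slop_needed(list("農業局"), doc))  # 0: already an exact phrase
```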
<10> Match Phrase Prefix query (POST)
Some more complex DSL examples:
GET index_*/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "publish_date": {
              "gt": "2014-01-01",
              "lt": "2019-01-07"
            }
          }
        },
        {
          "multi_match": {
            "query": "免費",
            "fields": ["name1", "name2", "name3", "name4", "name5", "name6"]
          }
        },
        {
          "multi_match": {
            "query": "英語",
            "fields": ["name1", "name2", "name3", "name4", "name5", "name6"]
          }
        }
      ],
      "must_not": { "match": { "tags": "" }},
      "filter": {
        "range": { "count": { "gte": 30, "lte": 1000 }}
      }
    }
  },
  "aggs": {
    "by_tags": {
      "terms": { "field": "field1" },
      "aggs": {
        "sales": {
          "date_histogram": {
            "field": "date",
            "interval": "day",
            "format": "yyyy-MM-dd"
          }
        }
      }
    }
  },
  "_source": [],
  "size": 1
}
With deduplication (collapse):
GET xxxx_2019-09-10/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "xxxx": {
              "gt": "2014-01-01",
              "lt": "2019-01-07"
            }
          }
        },
        { "terms": { "xxxx": ["xxx", "xxx"] }},
        { "terms": { "xxx": ["xxx", "xxx"] }},
        { "terms": { "xxx": ["xxx"] }},
        {
          "bool": {
            "should": [
              { "range": { "xxx": { "gte": 1, "lte": 2.99 }}},
              { "range": { "xxx": { "gte": 3.99, "lte": 7.99 }}}
            ]
          }
        },
        {
          "bool": {
            "should": [
              { "range": { "xxx": { "gte": 0, "lte": 100 }}},
              { "range": { "xxx": { "gte": 1000, "lte": 10000 }}}
            ]
          }
        }
      ],
      "must_not": { "match": { "xx": "" }}
    }
  },
  "collapse": {
    "field": "xxx"
  },
  "aggs": {
    "by_tags": {
      "terms": { "field": "xxx" },
      "aggs": {
        "sales": {
          "date_histogram": {
            "field": "xxx",
            "interval": "month",
            "format": "yyyy-MM-dd"
          }
        }
      }
    }
  },
  "_source": ["xxx"],
  "size": 10
}
<11> Querying nested objects
Reference: https://www.elastic.co/guide/cn/elasticsearch/guide/current/nested-query.html
Because nested objects are indexed as separate hidden documents, they cannot be queried directly. Instead, use a nested query to reach them.
In other words, a query on a nested object has to be wrapped in a nested clause:
GET /xxxxx/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "nested": {
            "path": "t4",
            "query": {
              "bool": {
                "must": [
                  {
                    "match": {
                      "t4.t1": "HelloWorld"
                    }
                  }
                ]
              }
            }
          }
        }
      ]
    }
  }
}
Or:
GET /xxxxx/_search
{
  "query": {
    "nested": {
      "path": "t4",
      "query": {
        "multi_match": {
          "query": "HelloWorld",
          "fields": ["t4.t1", "sex"]
        }
      }
    }
  }
}
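When such queries are built in code, the nested wrapper and the full field path are the two parts that are easy to forget. A minimal sketch with two hypothetical helpers, nested_query and nested_field, using the same placeholder names t4 and t1 as above:

```python
import json


def nested_field(path, name):
    """Fields inside a nested object are addressed by their full path, e.g. t4.t1."""
    return "%s.%s" % (path, name)


def nested_query(path, inner_query):
    """Wrap a query on fields under path in a nested clause."""
    return {"nested": {"path": path, "query": inner_query}}


body = {
    "query": nested_query(
        "t4", {"match": {nested_field("t4", "t1"): "HelloWorld"}}
    )
}
print(json.dumps(body, indent=2))
```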
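Elasticsearch optimization:
Reference: "Elasticsearch 技術分析(七): Elasticsearch 的性能優化" (Elasticsearch Technical Analysis, Part 7: Performance Optimization)
Check whether an index is closed: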
http://localhost:9200/_cat/indices/index_name?h=status
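The _cat API answers in plain text, and with h=status the body is just the word open or close followed by a newline. A small sketch of interpreting it, where is_index_closed is a made-up helper and the strings stand in for real responses:

```python
def is_index_closed(cat_output):
    """Interpret the body returned by _cat/indices/<index>?h=status."""
    return cat_output.strip() == "close"


print(is_index_closed("close\n"))  # True
print(is_index_closed("open\n"))   # False
```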
