Getting Started with Elasticsearch (2): Mapping + Field Types

Reposted article, originally published 2018-03-12 17:51 (tag: elasticsearch)

Based on the official English documentation, Elasticsearch Reference [6.2] » Mapping: https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping.html

Some content is adapted from: https://www.cnblogs.com/ljhdo/p/4981928.html

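Mapping is the process of defining how a document and the fields it contains are stored and indexed. Each index has one mapping type, which determines how its documents are indexed.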
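
As a rough illustration of what such a definition looks like, the sketch below builds the body of a create-index request in Python. The index name `my_index` and the three field names are made up for this example, and the `_doc` level assumes the 6.x request format with a single mapping type (later major versions drop that level).

```python
import json

# Hypothetical index name, chosen only for illustration.
INDEX_NAME = "my_index"

# Body of a `PUT /my_index` request in the 6.x format, where the
# field definitions sit under a single mapping type (assumed `_doc`).
create_index_body = {
    "mappings": {
        "_doc": {
            "properties": {
                "title": {"type": "text"},      # analyzed for full-text search
                "status": {"type": "keyword"},  # indexed as one exact value
                "created": {"type": "date"},
            }
        }
    }
}

print(f"PUT /{INDEX_NAME}")
print(json.dumps(create_index_body, indent=2))
```

The printed JSON can be sent with any HTTP client; nothing here talks to a cluster.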

1. Meta-fields
These include a document's _index, _type, _id, and _source fields.
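
To make the four meta-fields concrete, here is a small Python sketch of a document as it might come back from a search. The values are invented, and `split_hit` is a hypothetical helper written for this example, not part of any client library.

```python
# Shape of one returned document; all values are made up.
hit = {
    "_index": "my_index",
    "_type": "_doc",
    "_id": "1",
    "_source": {"title": "hello", "status": "published"},
}

META_FIELDS = ("_index", "_type", "_id", "_source")


def split_hit(hit):
    """Separate the meta-fields that identify a document from its body."""
    identity = {key: hit[key] for key in META_FIELDS if key != "_source"}
    return identity, hit["_source"]


identity, body = split_hit(hit)
print(identity)  # where the document lives and how it is addressed
print(body)      # the original JSON that was indexed
```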

2. Elasticsearch field data types:
https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-types.html

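String types
text, keyword
Numeric types
long, integer, short, byte, double, float, half_float, scaled_float
Date type
date
Boolean type
boolean
Binary type
binary
Range types
integer_range, float_range, long_range, double_range, date_range
Array data type (arrays need no dedicated type; any field can hold zero or more values of the same type)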

[ "one", "two" ]
[ 1, 2 ]
[{ "name": "Mary", "age": 12 },{ "name": "John", "age": 10}]
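
Because an array is mapped exactly like a single value, the one rule to keep in mind is that all values in it share one data type. The Python sketch below is a client-side check written only for illustration; `array_kind` is a hypothetical helper and does not reproduce how Elasticsearch itself validates or coerces values.

```python
def array_kind(values):
    """Return the single JSON kind shared by all values, or raise ValueError."""

    def kind(value):
        # bool must be tested first because bool is a subclass of int.
        if isinstance(value, bool):
            return "boolean"
        if isinstance(value, (int, float)):
            return "number"
        if isinstance(value, str):
            return "string"
        if isinstance(value, dict):
            return "object"
        raise ValueError(f"unsupported value: {value!r}")

    # None entries are skipped; they carry no type of their own.
    kinds = {kind(value) for value in values if value is not None}
    if len(kinds) > 1:
        raise ValueError(f"mixed kinds in one array: {sorted(kinds)}")
    return kinds.pop() if kinds else None


samples = [
    ["one", "two"],
    [1, 2],
    [{"name": "Mary", "age": 12}, {"name": "John", "age": 10}],
]
for sample in samples:
    print(array_kind(sample), sample)
```

A field mapped as `{"type": "keyword"}` would accept both `"food"` and `["food", "drink"]` without any change to the mapping.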

Object data type (JSON objects inside JSON objects)

{ 
  "region": "US",
  "manager": { 
    "age":     30,
    "name": { 
      "first": "John",
      "last":  "Smith"
    }
  }
}
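
Internally, an object like the one above is indexed as a flat list of key-value pairs with dotted paths. The Python sketch below imitates that flattening; the `flatten` helper is written for this example and only handles plain nested objects, not arrays of objects.

```python
def flatten(obj, prefix=""):
    """Turn nested dicts into a flat dict keyed by dotted paths."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into inner objects, extending the dotted path.
            flat.update(flatten(value, prefix=f"{path}."))
        else:
            flat[path] = value
    return flat


doc = {
    "region": "US",
    "manager": {
        "age": 30,
        "name": {"first": "John", "last": "Smith"},
    },
}

for path, value in flatten(doc).items():
    print(path, "=", value)
```

The dotted paths (for example `manager.name.first`) are also the names used to refer to inner fields in queries and aggregations.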

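Geo data types
geo_point, geo_shape (geo_shape is more complex, see the official documentation; geo_point is usually enough)
Specialized data types
ip (IPv4 and IPv6 addresses)
completion (auto-complete / search suggestions)
token_count (numeric; analyzes a string and indexes the number of tokens it contains)
murmur3 (computes a hash of the field value at index time and stores it in the index; very helpful when running cardinality aggregations on high-cardinality or large string fields; provided by the mapper-murmur3 plugin)
join (creates parent/child relationships between documents in the same index)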
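
As a sketch of how some of these specialized types might be declared, the Python dict below holds a `properties` section. All field names (`client_ip`, `suggest`, `name`, `qa_relation`) and the `question`/`answer` relation are assumptions made for illustration; murmur3 is left out because it needs a plugin.

```python
import json

# Hypothetical field names; only the "type" values come from the list above.
special_properties = {
    "client_ip": {"type": "ip"},
    "suggest": {"type": "completion"},
    "name": {
        "type": "text",
        "fields": {
            # token_count needs an analyzer to split the string into tokens.
            "length": {"type": "token_count", "analyzer": "standard"}
        },
    },
    "qa_relation": {
        "type": "join",
        # parent name -> child name, both living in the same index
        "relations": {"question": "answer"},
    },
}

print(json.dumps({"properties": special_properties}, indent=2))
```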

Below are definitions and sample values for commonly used types:

Type	Field definition	Sample value
text	"name":{"type":"text"}	"name": "zhangsan"
keyword	"tags":{"type":"keyword"}	"tags": "food"
date	"date":{"type": "date"}	"date":"2015-01-01T12:10:30"
long	"age":{"type":"long"}	"age" :28
double	"score":{"type":"double"}	"score":98.8
boolean	"isgirl": { "type": "boolean" }	"isgirl" :true
ip	"ip_addr":{"type":"ip"}	"ip_addr": "192.168.1.1"
geo_point	"location": {"type":"geo_point"}	"location":{"lat":40.12,"lon":-71.34}
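
The rows of the table can be combined into one mapping and one matching document. The Python sketch below does that and also lists any document field that has no mapping; the `_doc` type level assumes the 6.x request format, and the unmapped-field check is only an illustrative client-side step.

```python
import json

# Field definitions taken from the table above.
properties = {
    "name": {"type": "text"},
    "tags": {"type": "keyword"},
    "date": {"type": "date"},
    "age": {"type": "long"},
    "score": {"type": "double"},
    "isgirl": {"type": "boolean"},
    "ip_addr": {"type": "ip"},
    "location": {"type": "geo_point"},
}

# Sample values taken from the same table.
document = {
    "name": "zhangsan",
    "tags": "food",
    "date": "2015-01-01T12:10:30",
    "age": 28,
    "score": 98.8,
    "isgirl": True,
    "ip_addr": "192.168.1.1",
    "location": {"lat": 40.12, "lon": -71.34},
}

# 6.x style body: properties nested under one mapping type (assumed `_doc`).
mapping_body = {"mappings": {"_doc": {"properties": properties}}}

# Fields present in the document but missing from the mapping.
unmapped = sorted(set(document) - set(properties))

print(json.dumps(mapping_body, indent=2))
print(json.dumps(document, indent=2))
print("unmapped fields:", unmapped)
```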

3. Mapping parameters

https://www.elastic.co/guide/en/elasticsearch/reference/6.2/mapping-params.html Parameters marked with * are the commonly used ones.

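Parameter	Default	Notes
*analyzer	"standard"	Analyzer for text fields: standard/simple/stop/keyword/whitespace/language analyzers (e.g. english). The keyword analyzer does no tokenizing and emits the whole value as a single token.
normalizer	-	Normalizes keyword values (e.g. lowercasing) before indexing; it is defined once in the index settings, so several fields in the mapping can share the same normalizer
boost	1.0	Relative weight of the field when scoring
coerce	true	Coerces values to the field's type, e.g. the string "5" to the number 5
copy_to	-	Copies field values into another field, e.g. firstname and lastname into fullname
doc_values	true	On-disk data structure built at index time for sorting and aggregations; set to false to save disk space if the field is never sorted or aggregated on
*dynamic	true	true/false/strict: whether new fields may be added dynamically; leaving it at true is not recommended
enabled	true	Set to false to keep the value in _source only, without indexing or aggregating it, e.g. for session data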
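fielddata	false	text fields only; loads the term-to-document relationships into memory at query time
eager_global_ordinals	false	Global ordinals give each unique term an incrementing number; set to true to build them at refresh time rather than on first use
*format	"strict_date_optional_time||epoch_millis"	Date format, e.g. "format": "yyyy-MM-dd HH:mm:ss" (HH is the 24-hour clock; hh is the 12-hour clock)
ignore_above	2147483647	Integer; strings longer than this are not indexed or stored. Dynamically mapped keyword sub-fields use 256.
ignore_malformed	false	Set to true to skip values of the wrong data type instead of rejecting the whole document with an exception
index_options	positions/docs	docs (doc number only) / freqs (doc number and term frequency) / positions (doc number, term frequency, and term positions) / offsets (doc number, term frequency, positions, and character offsets). Analyzed string fields default to positions; all other fields default to docs.
*index	true	true/false: whether the field value is indexed; a field set to false cannot be queried. The pre-5.0 values analyzed/not_analyzed/no still appear in older documentation, such as the Chinese docs.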
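fields	-	Multi-fields: index the same field in different ways
norms	true	Used for scoring; takes some disk space and can be disabled if scoring on the field is not needed
null_value	null	A null value cannot be indexed or searched; replace it with a stand-in such as the string "NULL": "null_value": "NULL"
position_increment_gap	100	Gap inserted between the values of a multi-valued text field so that proximity or phrase queries do not match across values
properties	-	Defines the fields of a mapping type or object when the index is created
*search_analyzer	the index-time analyzer	Indexing and searching normally use the same analyzer; set this to use a different one at search time
similarity	"BM25"	BM25/classic/boolean: similarity (scoring) algorithm, mainly for text fields
*store	false	By default a field is indexed and searchable, but its original value is not stored separately and cannot be retrieved on its own; _source already contains all values. Set to true when, for example, a document has a large text body and you want to retrieve a few small fields without loading the whole _source.
term_vector	"no"	no/yes/with_positions/with_offsets/with_positions_offsets: term vectors, i.e. the terms produced by the analysis process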
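
To show how several of these parameters fit together, the Python sketch below assembles one mapping that uses copy_to, analyzer, search_analyzer, fields, ignore_above, null_value, format, coerce, enabled, and dynamic. Every field name is hypothetical, the parameter values are picked only for illustration, and the `_doc` level assumes the 6.x request format.

```python
import json

# Hypothetical fields; each one exercises one or two parameters from the table.
properties = {
    # copy_to: both name parts are also searchable through full_name
    "first_name": {"type": "text", "copy_to": "full_name"},
    "last_name": {"type": "text", "copy_to": "full_name"},
    "full_name": {"type": "text"},
    # analyzer / search_analyzer / fields: full-text plus an exact-value sub-field
    "title": {
        "type": "text",
        "analyzer": "english",
        "search_analyzer": "standard",
        "fields": {"raw": {"type": "keyword", "ignore_above": 256}},
    },
    # null_value: explicit nulls are indexed as the string "NULL"
    "status": {"type": "keyword", "null_value": "NULL"},
    # format: HH is the 24-hour clock
    "created": {"type": "date", "format": "yyyy-MM-dd HH:mm:ss"},
    # coerce: reject "5" instead of silently converting it to 5
    "views": {"type": "integer", "coerce": False},
    # enabled: kept in _source only, never indexed
    "session": {"type": "object", "enabled": False},
}

body = {
    "mappings": {
        "_doc": {
            "dynamic": "strict",  # documents with unknown fields are rejected
            "properties": properties,
        }
    }
}

print(json.dumps(body, indent=2))
```

With `dynamic` set to `strict`, any later field has to be added to the mapping explicitly before documents can use it, which is one way to follow the advice against leaving dynamic mapping on.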
