elasticsearch-Mapping


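What is a mapping

A mapping is similar to a table schema definition in a database. Its main purposes are:

  • Define the field names (Field Name) under an index
  • Define the type of each field, such as numeric, string, or boolean
  • Define the inverted-index settings of each field, such as whether it is indexed and whether positions are recorded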

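Note that defining too many fields in an index can bloat the index, which may lead to out-of-memory errors and situations that are hard to recover from. The following settings limit this:

  • index.mapping.total_fields.limit: the maximum number of fields that can be defined in an index; the default is 1000
  • index.mapping.depth.limit: the maximum depth of a field, measured as the number of inner objects; the default is 20
  • index.mapping.nested_fields.limit: the maximum number of nested fields in an index; the default is 50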
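These limits are index-level settings, so they can be overridden per index. The sketch below only builds a request body that could be sent to PUT /{index}/_settings; it does not talk to a cluster. The helper name build_limit_settings, its short keyword aliases, and the example values are assumptions made for illustration, not recommendations.

```python
import json

# Defaults quoted in the list above.
DEFAULT_LIMITS = {
    "index.mapping.total_fields.limit": 1000,
    "index.mapping.depth.limit": 20,
    "index.mapping.nested_fields.limit": 50,
}


def build_limit_settings(**overrides):
    """Return a settings body holding only the limits that differ from the defaults.

    Keyword names are hypothetical short aliases: total_fields, depth, nested_fields.
    """
    body = {}
    for alias, value in overrides.items():
        key = f"index.mapping.{alias}.limit"
        if key not in DEFAULT_LIMITS:
            raise ValueError(f"unknown limit: {alias}")
        if value != DEFAULT_LIMITS[key]:
            body[key] = value
    return body


# Body for: PUT /{index}/_settings (depth=20 equals the default, so it is omitted)
print(json.dumps(build_limit_settings(total_fields=2000, depth=20), indent=2))
```

Raising a limit only hides the symptom; a field count that keeps growing usually means the mapping design should be revisited.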

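Mapping data types

Basic data types

Type         Description
text         Used for full-text search; the value is split into terms by an analyzer, and the terms are used to build the index
keyword      Not analyzed
long         Signed 64-bit integer: -2^63 ~ 2^63 - 1
integer      Signed 32-bit integer: -2^31 ~ 2^31 - 1
short        Signed 16-bit integer: -32768 ~ 32767
byte         Signed 8-bit integer: -128 ~ 127
double       64-bit IEEE 754 floating point number
float        32-bit IEEE 754 floating point number
half_float   16-bit IEEE 754 floating point number
boolean      true, false
date         See https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-date-format.html
binary       The value is treated as a base64-encoded string; not stored by default and not searchable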

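Mapping range data types

A range type represents a range of values rather than a single value. For example, a document with age: 10~20 is found by searching with {"gte":5,"lte":20}.

Supported type   Description
integer_range    A range of signed 32-bit integers
float_range      A range of 32-bit IEEE 754 floating point numbers
long_range       A range of signed 64-bit integers
double_range     A range of 64-bit IEEE 754 floating point numbers
date_range       A range of dates, stored as 64-bit unsigned integer timestamps (unit: milliseconds)
ip_range         A range of IP addresses, given as strings in IPv4 or IPv6 format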

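Optional parameter:

relation sets the matching mode of a range query:

INTERSECTS: the default mode; the document matches as long as the search range and the field range overlap

WITHIN: the field range must lie entirely inside the search range, that is, the field range is a subset of the search range

CONTAINS: the opposite of WITHIN; only documents whose field range fully contains the search range match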
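The three modes can be read as plain interval comparisons. The following is a minimal sketch of that logic in Python, assuming closed intervals (gte/lte on both sides); the function name matches is made up for illustration and is not part of any Elasticsearch client.

```python
def matches(field_gte, field_lte, query_gte, query_lte, relation="intersects"):
    """Decide whether a stored range [field_gte, field_lte] matches a query range.

    Both ranges are treated as closed intervals.
    """
    relation = relation.lower()
    if relation == "intersects":
        # The two intervals share at least one value.
        return field_gte <= query_lte and query_gte <= field_lte
    if relation == "within":
        # The stored range is a subset of the query range.
        return query_gte <= field_gte and field_lte <= query_lte
    if relation == "contains":
        # The stored range is a superset of the query range.
        return field_gte <= query_gte and query_lte <= field_lte
    raise ValueError(f"unknown relation: {relation}")


# Stored field: age 10~20
for relation in ("intersects", "within", "contains"):
    print(relation, matches(10, 20, 5, 20, relation), matches(10, 20, 12, 15, relation))
```

With the stored range 10~20, the query 5~20 matches under INTERSECTS and WITHIN but not CONTAINS, while the query 12~15 matches under INTERSECTS and CONTAINS but not WITHIN.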

Test

1. Create the index

put:127.0.0.1:9200/range_test

{
  "mappings": {
    "_doc": {
      "properties": {
        "count": {
          "type": "integer_range"
        },
        "create_date": {
          "type": "date_range", 
          "format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis"
        }
      }
    }
  }
}

2. Add test data

post:127.0.0.1:9200/range_test/_doc/1

{
  "count" : { 
    "gte" : 1,
    "lte" : 100
  },
  "create_date" : { 
    "gte" : "2019-02-01 12:00:00", 
    "lte" : "2019-03-30"
  }
}

3. Test the search

get:127.0.0.1:9200/range_test/_doc/_search

{
    "query":{
        "term":{
            "count":5
        }
    }
}

5 lies within 1~100, so the document is returned.

A range query on create_date with relation set to within:

{
  "query" : {
    "range" : {
      "create_date" : { 
        "gte" : "2019-02-01",
        "lte" : "2019-03-30",
        "relation" : "within" 
      }
    }
  }
}

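Mapping complex data types

Array type (Array)

Supports arrays of strings, numbers, and objects. All elements of an array must have the same data type.

Object type (Object)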

{
    "name": "小明",
    "user_info": {
        "student_id": 111,
        "class_info": {
            "class_name": "1年級"
        }
    }
}

Indexed form (inner objects are flattened into dotted field paths)

{
  "name": "小明",
  "user_info.student_id": 111,
  "user_info.class_info.class_name": "1年級"
}
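The flattening of inner objects into dotted paths can be sketched in a few lines of Python. This only illustrates the naming scheme, assuming plain nested dictionaries; it is not how Lucene stores the fields internally, and the function name flatten is made up for this example.

```python
def flatten(obj, prefix=""):
    """Flatten nested dictionaries into a single level keyed by dotted paths."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into inner objects, extending the dotted path.
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


doc = {
    "name": "小明",
    "user_info": {
        "student_id": 111,
        "class_info": {"class_name": "1年級"},
    },
}
print(flatten(doc))
```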

Nested type (Nested)

Allows each element of an array of objects to be indexed separately.

Query API: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-nested-query.html

Aggregation API: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-reverse-nested-aggregation.html

Sort API: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-reverse-nested-aggregation.html

Retrieval and highlighting: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-inner-hits.html#nested-inner-hits

Difference between Nested and Object

put:127.0.0.1:9200/object_test/_doc/1 (subjects defaults to the object type)

{
    "user_name":"小明",
    "subjects":[
        {"subject_name":"地理","id":1},
        {"subject_name":"英語","id":2}
    ]
}

Search for a subject whose name is 英語 and whose id is 1

{
    "query":{
        "bool":{
        "must":[
            {"match":{"subjects.subject_name":"英語"}},
                {"match":{"subjects.id":"1"}}
            ]
            }
    }
}

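Logically this query should match nothing, because no single subject has both the name 英語 and the id 1. In testing, however, the document is returned.

The reason is that the array of objects is indexed in the following flattened form, which loses the association between subject_name and id inside each object:

{
  "user_name": "小明",
  "subjects.subject_name": ["地理", "英語"],
  "subjects.id": [1, 2]
}

If subjects is mapped as nested (and searched with a nested query), each array element is indexed on its own and this false match no longer occurs.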
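The difference can be simulated without a cluster. The sketch below is a simplified model in Python, assuming exact value comparison instead of analyzed matching; object_match and nested_match are hypothetical helper names, not Elasticsearch APIs.

```python
def flatten_objects(items):
    """Object type: merge an array of objects into one multi-valued list per field."""
    merged = {}
    for item in items:
        for key, value in item.items():
            merged.setdefault(key, []).append(value)
    return merged


def object_match(items, **conditions):
    """Each condition is checked against the merged lists, so values may come from different objects."""
    merged = flatten_objects(items)
    return all(value in merged.get(key, []) for key, value in conditions.items())


def nested_match(items, **conditions):
    """All conditions must hold inside the same array element."""
    return any(
        all(item.get(key) == value for key, value in conditions.items())
        for item in items
    )


subjects = [
    {"subject_name": "地理", "id": 1},
    {"subject_name": "英語", "id": 2},
]
print(object_match(subjects, subject_name="英語", id=1))  # cross-object match
print(nested_match(subjects, subject_name="英語", id=1))  # no single element satisfies both
```

The object-style check succeeds because 英語 and 1 are both present somewhere in the merged lists, while the nested-style check fails because no single subject carries both values.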

Geo data types

geo_point

Accepted formats

Object: "location": {"lat": 41.12, "lon": -71.34}

String (lat,lon): "location": "41.12,-71.34"

Geohash: "location": "drm3btev3e86"

Array ([lon, lat], note the reversed order): "location": [ -71.34, 41.12 ]
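Because the string form is "lat,lon" while the array form is [lon, lat], it is easy to swap the two. A small Python sketch that normalises the object, string, and array forms to a (lat, lon) tuple makes the ordering explicit. The helper name to_lat_lon is made up for illustration, and geohash strings are deliberately not decoded here (they raise ValueError).

```python
def to_lat_lon(location):
    """Normalise a geo_point value to a (lat, lon) tuple of floats.

    Handles the object, "lat,lon" string, and [lon, lat] array forms.
    Geohash strings are not supported in this sketch and raise ValueError.
    """
    if isinstance(location, dict):
        return float(location["lat"]), float(location["lon"])
    if isinstance(location, str):
        parts = location.split(",")
        if len(parts) != 2:
            raise ValueError(f"not a 'lat,lon' string: {location!r}")
        return float(parts[0]), float(parts[1])
    if isinstance(location, (list, tuple)):
        lon, lat = location  # array form is longitude first
        return float(lat), float(lon)
    raise TypeError(f"unsupported geo_point value: {location!r}")


for value in ({"lat": 41.12, "lon": -71.34}, "41.12,-71.34", [-71.34, 41.12]):
    print(to_lat_lon(value))
```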

geo_shape

Query API: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-geo-bounding-box-query.html

Specialised data types

  • ip: stores IP addresses
  • completion: provides auto-completion
  • token_count: records the number of tokens
  • murmur3: records the hash of a string value
  • percolator

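Mapping settings

A complete mapping configuration (the # annotations are explanatory only and must be removed before sending the request, because JSON does not allow comments)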

  {
      "settings": {
    "analysis": {
      "analyzer": {
        "ik_pinyin_analyzer": {
          "type": "custom",
          "tokenizer": "ik_smart",
          "filter": ["my_pinyin"]#custom filter
        },
        "pinyin_analyzer": {
          "tokenizer": "shopmall_pinyin"
        },
        "first_py_letter_analyzer": {
          "tokenizer": "first_py_letter"
        },
        "full_pinyin_letter_analyzer": {
          "tokenizer": "full_pinyin_letter"
        },
        "onlyOne_analyzer": {
          "tokenizer": "onlyOne_pinyin"
        }
      },
      "tokenizer": {#custom tokenizers
        "onlyOne_pinyin": {
          "type":"pinyin",
          "keep_separate_first_letter": "false",
          "keep_first_letter":"false"
        },
        "shopmall_pinyin": {
          "keep_joined_full_pinyin": "true",
          "keep_first_letter": "true",
          "keep_separate_first_letter": "false",
          "lowercase": "true",
          "type": "pinyin",
          "limit_first_letter_length": "16",
          "keep_original": "true",
          "keep_full_pinyin": "true",
          "keep_none_chinese_in_joined_full_pinyin": "true"
        },
        "first_py_letter": {
          "type": "pinyin",
          "keep_first_letter": true,
          "keep_full_pinyin": false,
          "keep_original": false,
          "limit_first_letter_length": 16,
          "lowercase": true,
          "trim_whitespace": true,
          "keep_none_chinese_in_first_letter": false,
          "none_chinese_pinyin_tokenize": false,
          "keep_none_chinese": true,
          "keep_none_chinese_in_joined_full_pinyin": true
        },
        "full_pinyin_letter": {
          "type": "pinyin",
          "keep_separate_first_letter": false,
          "keep_full_pinyin": false,
          "keep_original": false,
          "limit_first_letter_length": 16,
          "lowercase": true,
          "keep_first_letter": false,
          "keep_none_chinese_in_first_letter": false,
          "none_chinese_pinyin_tokenize": false,
          "keep_none_chinese": true,
          "keep_joined_full_pinyin": true,
          "keep_none_chinese_in_joined_full_pinyin": true
        }
      },
      "filter": {
        "my_pinyin": {
          "type": "pinyin",
          "keep_joined_full_pinyin": true,
          "keep_separate_first_letter":true
        }
      }
    }

  },
      "mappings": {
    "doc": {#type name
      "properties": {#fields of the mapping
        "productName": {#field name
          "type": "text",#field type
          "analyzer": "ik_pinyin_analyzer",#analyzer
          "fields": {#multi-fields with their own analyzer; query this one as productName.keyword_once_pinyin
            "keyword_once_pinyin": {
              "type": "text",
              "analyzer": "onlyOne_analyzer"#the custom analyzer to use
            }
          }
        },
        "skuNames": {
          "type": "text",
          "analyzer": "ik_pinyin_analyzer",
          "fields": {
            "keyword_once_pinyin": {
              "type": "text",
              "analyzer": "onlyOne_analyzer"
            }
          }
        },
        "regionCode": {
          "type": "keyword"
        },
        "productNameSuggester": {#search suggestions in ES 6.x
          "type": "completion",
          "fields": {
            "pinyin": {
              "type": "completion",
              "analyzer": "pinyin_analyzer"
            },
            "keyword_pinyin": {
              "type": "completion",
              "analyzer": "full_pinyin_letter_analyzer"
            },
            "keyword_first_py": {
              "type": "completion",
              "analyzer": "first_py_letter_analyzer"
            }
          }
        },
        "info": {#parent/child relation in ES 6 (join type)
          "type": "join",
          "relations": {
            "md_product":[ "sl_customer_character_order_list","ic_product_store_account","sl_customer_product_setting"]
          }
        }
      }
    }
  }
  }

Creating a mapping

put:http://127.0.0.1:9200/db

{
    "mappings": {
        "product": {//type name
            "properties": {
                "productName": {//field name
                    "type": "text"//data type
                }
            }
        }
    }
}

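Mapping parameters

Parameter   Description
analyzer    Analyzer to use; default: standard
boost       Field weight, default 1; used to weight this field when scoring, for example when querying through the _all field
dynamic     Controls whether new fields can be added: true (default, new fields are added to the mapping), false (new fields are ignored and not indexed), strict (documents containing new fields are rejected)
index       Controls whether the field is indexed (searchable): true or false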
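The three values of dynamic can be modelled as a small decision function. This is a simplified sketch in Python, assuming flat documents and treating the mapping as a plain set of field names; apply_dynamic is a hypothetical helper, not an Elasticsearch API.

```python
def apply_dynamic(mapped_fields, document, dynamic="true"):
    """Model how a document with unknown fields is handled.

    Returns (mapping fields after indexing, fields of this document that are searchable).
    """
    mapped = set(mapped_fields)
    unknown = [field for field in document if field not in mapped]
    if dynamic == "strict" and unknown:
        # strict: the whole document is rejected.
        raise ValueError(f"strict mapping rejects unknown fields: {unknown}")
    if dynamic == "true":
        # true: unknown fields are added to the mapping.
        mapped.update(unknown)
    # false: unknown fields stay out of the mapping and are not searchable.
    searchable = sorted(field for field in document if field in mapped)
    return sorted(mapped), searchable


doc = {"title": "a", "price": 1}
print(apply_dynamic(["title"], doc, "true"))
print(apply_dynamic(["title"], doc, "false"))
```

With dynamic set to false the document is still accepted; the unknown field simply cannot be searched.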

 

Reference: https://www.jianshu.com/p/e8a9feea683c

Adding fields to a mapping

Once an Elasticsearch mapping has been created, fields can only be added to it; fields that are already mapped cannot be modified.

put http://127.0.0.1:9200/{indexName}/_mapping/{typeName}

{
  "properties": {
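    "productSortItemIds": { #field name
      "type": "text", #type (the old "string" type was replaced by text/keyword in 5.x and later)
      "store": true, #whether to store the field value separately
      "analyzer": "comma", #index-time analyzer
      "search_analyzer": "comma" #search-time analyzer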
    }
  }
}

Adding an analyzer

 post /{index}/_close #close the index
 put  /{index}/_settings #add an analyzer that splits on commas
{
  "settings": {
    "analysis": {
      "analyzer": {
          "comma": {
                 "type": "pattern",
                 "pattern":","
         }
      }
    }
  }
}
post /{index}/_open #reopen the index
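To see what tokens such a comma analyzer would produce, its behaviour can be approximated locally. The sketch below is an approximation in Python, assuming the pattern analyzer splits on the regular expression, drops empty tokens, and lowercases by default; the function name comma_analyze and the sample value are made up for illustration.

```python
import re


def comma_analyze(text, pattern=",", lowercase=True):
    """Approximate a pattern analyzer: split on the pattern, drop empty tokens, optionally lowercase."""
    tokens = [token for token in re.split(pattern, text) if token]
    return [token.lower() for token in tokens] if lowercase else tokens


# A hypothetical productSortItemIds value.
print(comma_analyze("12,34,AB"))
```

On a real cluster the same check can be done with the _analyze API, which returns the tokens the configured analyzer actually emits.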

 

 

Viewing the mapping of an index

http://127.0.0.1:9200/blogs2/product/_mapping (requesting the index without /_mapping shows the whole index configuration)

{
    "blogs2": {
        "mappings": {
            "product": {
                "properties": {
                    "price": { "type": "long" },
                    "productName": { "type": "text", "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } } },
                    "remark": { "type": "text", "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } } },
                    "tags": { "type": "text", "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } } }
                }
            }
        }
    }
}

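Custom mappings

A custom mapping defines data types explicitly. For example, if a number is mapped as text, greater-than and less-than range searches no longer behave as expected. It also makes explicit which full-text fields should be analyzed and which should not.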
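The range problem comes from comparing values as strings instead of as numbers. A minimal Python sketch of the effect, assuming terms of a text field are compared lexicographically (a simplification of what happens inside the index); both function names are made up for illustration.

```python
def range_gt_as_text(values, bound):
    """Greater-than filter when values are compared as strings (lexicographic order)."""
    return [value for value in values if str(value) > str(bound)]


def range_gt_as_number(values, bound):
    """Greater-than filter when values are compared as numbers."""
    return [value for value in values if value > bound]


prices = [5, 10, 200]
print(range_gt_as_text(prices, 15))    # string comparison: "5" > "15", but "10" < "15"
print(range_gt_as_number(prices, 15))  # numeric comparison
```

Compared as strings, 5 is wrongly treated as greater than 15 because "5" sorts after "1", which is why numeric fields should be mapped with a numeric type.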

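Exact values and full text

Elasticsearch supports many data types, but they fall into two broad categories.
An exact value is a value that is fully determined, such as an id or a date; an equality (=) comparison is enough to find the data we want.

Full text, by contrast, requires similarity matching, and the best matches are returned.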
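The contrast can be shown with a toy example in Python. The scoring below is a deliberately naive term-overlap ratio with whitespace tokenisation, chosen only to illustrate the idea of ranking by similarity; Elasticsearch itself uses analyzers and BM25 scoring by default, and both function names are assumptions for this sketch.

```python
def exact_match(stored, query):
    """Exact value: the document matches only if the values are equal."""
    return stored == query


def full_text_score(stored, query):
    """Full text: fraction of query terms that appear in the stored text (toy similarity)."""
    stored_terms = set(stored.lower().split())
    query_terms = set(query.lower().split())
    if not query_terms:
        return 0.0
    return len(stored_terms & query_terms) / len(query_terms)


print(exact_match("2019-02-01", "2019-02-01"))
print(exact_match("Quick brown fox", "quick fox"))
print(full_text_score("Quick brown fox", "quick fox jumps"))
```

An exact comparison gives a yes/no answer, while the full-text comparison gives a score that can be used to order results from best to worst match.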

 

