es: Custom Mapping (Part 5)

Reposted article. 2020-10-08 21:37. Category: Database / ES

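When we insert data into es and the index does not exist, es creates it automatically with a default mapping. Sometimes the default mapping does not meet our needs, in which case we can define a custom mapping.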

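A mapping is the structure of an index, comparable to a table schema in a database. It contains the field names, the field types, and the inverted-index settings.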

Mappings

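Each index has a mapping that determines how its documents are indexed. A mapping consists of:

Meta-fields: customize how the metadata associated with a document is handled, e.g. _index, _type, _source
Fields or properties: the list of fields or properties that belong to the document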

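Field data types

String types: text or keyword
Numeric types: integer, long, short, byte, double, float, etc.
Boolean type: boolean
Date type: date
Binary type: binary
Range types: integer_range, double_range, date_range, float_range
Array type: no dedicated type; any field can hold multiple values of the same type
Object type: object
Nested type: nested
Geo types: geo_point, geo_shape
Specialized types: ip, join, token_count, percolator, etc.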

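A keyword field is not analyzed (tokenized), whereas a text field is. As a result, a keyword field generally takes less space than a text field and is more efficient.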
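To make the difference concrete, here is a toy Python model (not Elasticsearch code). It assumes a simplified analyzer that lowercases and splits on non-alphanumeric characters as a rough stand-in for the standard analyzer; the function names `analyze_text` and `analyze_keyword` are hypothetical.

```python
import re


def analyze_text(value):
    """Rough stand-in for an analyzed text field: lowercase, then split into word tokens."""
    return [tok for tok in re.split(r"[^0-9a-z]+", value.lower()) if tok]


def analyze_keyword(value):
    """A keyword field keeps the whole value as one unmodified term."""
    return [value]


text_terms = analyze_text("Quick Brown Fox")
keyword_terms = analyze_keyword("Quick Brown Fox")
print(text_terms)
print(keyword_terms)
```

The text field yields several terms that can each be matched on their own, while the keyword field yields one term that only matches the exact original value.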

Custom mapping

PUT mapping_test
{
  "mappings": {
    "test1": {
      "properties": {
        "name": {"type": "text"},
        "age": {"type": "long"}
      }
    }
  }
}

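Parameters

mapping_test: index name
mappings: keyword
test1: _type name (the examples in this article use the typed mapping syntax of Elasticsearch 6.x and earlier)
properties: keyword
name, age: field names

The request above creates a new index, mapping_test, with our custom mapping. The following response indicates that it was created successfully: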

{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "index" : "mapping_test"
}

View the mapping:

GET mapping_test/_mapping

Result:

{
  "mapping_test" : {
    "mappings" : {
      "test1" : {
        "properties" : {
          "age" : {
            "type" : "long"
          },
          "name" : {
            "type" : "text"
          }
        }
      }
    }
  }
}

Mapping parameters

analyzer

The analyzer for the field. The default is standard; a third-party analyzer can be specified instead:

# Use the ik Chinese analyzer (a third-party analyzer)
PUT mapping_test
{
  "mappings": {
    "test1": {
      "properties": {
        "name": {
          "type": "text",
          "analyzer": "ik_smart"
        }
      }
    }
  }
}

boost

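Raises the relevance score of a field at query time; the higher a document scores, the earlier it ranks in the result set. boost sets the field's weight and defaults to 1.0. Note that the mapping-level boost parameter has been deprecated since Elasticsearch 5.0 in favor of specifying boost in the query itself.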

PUT mapping_test
{
  "mappings": {
    "test1": {
      "properties": {
        "name": {
          "type": "text",
          "boost": 2
        }
      }
    }
  }
}

copy_to

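copy_to copies the values of multiple fields into a target field, which can then be queried as a single field. The following copies the values of first_name and last_name into full_name: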

# Create the index
PUT my_index
{
  "mappings": {
    "doc": {
      "properties": {
        "first_name": {
          "type": "text",
          "copy_to": "full_name"
        },
        "last_name": {
          "type": "text",
          "copy_to": "full_name"
        },
        "full_name": {
          "type": "text"
        }
      }
    }
  }
}

# Index a document
PUT my_index/doc/1
{
  "first_name": "John",
  "last_name": "Smith"
}

Query:

GET my_index/doc/_search
{
  "query": {
    "match": {
      "full_name": {
        "query": "John"
      }
    }
  }
}

Result:

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.2876821,
    "hits" : [
      {
        "_index" : "my_index",
        "_type" : "doc",
        "_id" : "1",
        "_score" : 0.2876821,
        "_source" : {
          "first_name" : "John",
          "last_name" : "Smith"
        }
      }
    ]
  }
}
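Notice that the hit's _source in the response above contains only first_name and last_name, even though the query matched on full_name. The following toy Python model (not Elasticsearch code; `apply_copy_to` is a hypothetical name) sketches that behavior under the assumption that copied values only affect what is searchable, not the stored source document.

```python
def apply_copy_to(source, copy_to):
    """Return the searchable values per field for one document.

    `copy_to` maps a source field name to the target field that receives a copy.
    The `source` dict itself is never modified.
    """
    indexed = {name: [value] for name, value in source.items()}
    for field, target in copy_to.items():
        if field in source:
            indexed.setdefault(target, []).append(source[field])
    return indexed


source = {"first_name": "John", "last_name": "Smith"}
indexed = apply_copy_to(source, {"first_name": "full_name", "last_name": "full_name"})
print(indexed["full_name"])
print("full_name" in source)
```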

dynamic

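The fields of an index are fixed when the index is created. The dynamic setting decides whether documents may introduce new fields. It has three values:

true: new fields are allowed, and es adds them to the mapping automatically
false: new fields are accepted, but they are not added to the mapping, so they cannot be used as a query condition (searching on the new field finds nothing)
strict: strict mode; new fields are not allowed and cause an error, so the index mapping has to be redesigned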
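The three modes can be sketched with a toy Python model (not Elasticsearch code). The names `apply_dynamic` and `StrictMappingError` are hypothetical, and the mapping is reduced to a plain set of field names.

```python
class StrictMappingError(Exception):
    """Hypothetical stand-in for strict_dynamic_mapping_exception."""


def apply_dynamic(mapping, doc, dynamic):
    """Toy model: return the set of document fields that become searchable.

    mapping: set of mapped field names (extended in place when dynamic is True)
    dynamic: True, False, or "strict"
    """
    unknown = [name for name in doc if name not in mapping]
    if unknown and dynamic == "strict":
        raise StrictMappingError(f"dynamic introduction of {unknown} is not allowed")
    if dynamic is True:
        mapping.update(unknown)  # new fields are added to the mapping
    return {name for name in doc if name in mapping}


doc = {"name": "rose", "age": 19}
searchable_true = apply_dynamic({"name"}, doc, True)
searchable_false = apply_dynamic({"name"}, doc, False)
try:
    apply_dynamic({"name"}, doc, "strict")
    strict_rejected = False
except StrictMappingError:
    strict_rejected = True
```

With true, age becomes searchable; with false, the document is accepted but only name is searchable; with strict, the document is rejected.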

1. dynamic set to true

PUT s1
{
  "mappings": {
    "doc": {
      "dynamic": true,
      "properties": {
        "name": {"type": "text"}
      }
    }
  }
}

# Index a document that introduces a new field, age
PUT s1/doc/1
{
  "name": "rose",
  "age": 19
}

# age can be used as the query condition
GET s1/doc/_search
{
  "query": {
    "match": {
      "age": 19
    }
  }
}

Creating the index, indexing the document, and querying all work without problems.

2. dynamic set to false

PUT s2
{
  "mappings": {
    "doc": {
      "dynamic": false,
      "properties": {
        "name": {"type": "text"}
      }
    }
  }
}

# Index a document that introduces a new field, age
PUT s2/doc/1
{
  "name": "rose",
  "age": 19
}

# Use the age field as the query condition
GET s2/doc/_search
{
  "query": {
    "match": {
      "age": 19
    }
  }
}

Result:

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 0,
    "max_score" : null,
    "hits" : [ ]
  }
}

The index is created and the document is indexed, but querying on the new field returns no hits.

3. dynamic set to strict

PUT s3
{
  "mappings": {
    "doc": {
      "dynamic": "strict",
      "properties": {
        "name": {"type": "text"}
      }
    }
  }
}

PUT s3/doc/1
{
  "name": "rose",
  "age": 19
}

In strict mode the document is rejected with an error:

{
  "error": {
    "root_cause": [
      {
        "type": "strict_dynamic_mapping_exception",
        "reason": "mapping set to strict, dynamic introduction of [age] within [doc] is not allowed"
      }
    ],
    "type": "strict_dynamic_mapping_exception",
    "reason": "mapping set to strict, dynamic introduction of [age] within [doc] is not allowed"
  },
  "status": 400
}

index

index defaults to true. If it is set to false, es does not index the field, so the field cannot be used as a query condition; querying on it raises an error:

PUT s5
{
  "mappings": {
    "doc": {
      "properties": {
        "t1": {
          "type": "text",
          "index": true
        },
        "t2": {
          "type": "text",
          "index": false
        }
      }
    }
  }
}

PUT s5/doc/1
{
  "t1": "論母豬的產前保養",
  "t2": "論母豬的產后護理"
}

GET s5/doc/_search
{
  "query": {
    "match": {
      "t1": "母豬"
    }
  }
}

# t2 has index set to false; use it as the query condition
GET s5/doc/_search
{
  "query": {
    "match": {
      "t2": "母豬"
    }
  }
}

Because t2 has index set to false, using it as the query condition fails with:

{
  "error": {
    "root_cause": [
      {
        "type": "query_shard_exception",
        "reason": "failed to create query: {\n  \"match\" : {\n    \"t2\" : {\n      \"query\" : \"母豬\",\n      \"operator\" : \"OR\",\n      \"prefix_length\" : 0,\n      \"max_expansions\" : 50,\n      \"fuzzy_transpositions\" : true,\n      \"lenient\" : false,\n      \"zero_terms_query\" : \"NONE\",\n      \"auto_generate_synonyms_phrase_query\" : true,\n      \"boost\" : 1.0\n    }\n  }\n}",
        "index_uuid": "jTRViM6SSRSERtEcSTSOFQ",
        "index": "s5"
      }
    ],
    "type": "search_phase_execution_exception",
    "reason": "all shards failed",
    "phase": "query",
    "grouped": true,
    "failed_shards": [
      {
        "shard": 0,
        "index": "s5",
        "node": "d8Q4szIXR8KlHOram-TICA",
        "reason": {
          "type": "query_shard_exception",
          "reason": "failed to create query: {\n  \"match\" : {\n    \"t2\" : {\n      \"query\" : \"母豬\",\n      \"operator\" : \"OR\",\n      \"prefix_length\" : 0,\n      \"max_expansions\" : 50,\n      \"fuzzy_transpositions\" : true,\n      \"lenient\" : false,\n      \"zero_terms_query\" : \"NONE\",\n      \"auto_generate_synonyms_phrase_query\" : true,\n      \"boost\" : 1.0\n    }\n  }\n}",
          "index_uuid": "jTRViM6SSRSERtEcSTSOFQ",
          "index": "s5",
          "caused_by": {
            "type": "illegal_argument_exception",
            "reason": "Cannot search on field [t2] since it is not indexed."
          }
        }
      }
    ]
  },
  "status": 400
}

ignore_above

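Strings longer than the ignore_above setting are not indexed or stored. For an array of strings, ignore_above is applied to each element separately, and the elements longer than ignore_above are not indexed or stored.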

PUT s6
{
  "mappings": {
    "doc": {
      "properties": {
        "t1": {
          "type": "keyword",
          "ignore_above": 10
        }
      }
    }
  }
}

PUT s6/doc/1
{
  "t1": "123456"
}

# Longer than ignore_above (10)
PUT s6/doc/2
{
  "t1": "1234567891011121314151617181920"
}

# The query returns no hits
GET s6/doc/_search
{
  "query": {
    "match": {
      "t1": "1234567891011121314151617181920"
    }
  }
}

Result:

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 0,
    "max_score" : null,
    "hits" : [ ]
  }
}

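Note: a field that uses ignore_above cannot be of type text. A value longer than ignore_above is not indexed, so the document cannot be found by searching for that value.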
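The per-element behavior for arrays can be sketched with a toy Python model (not Elasticsearch code; `indexed_terms` is a hypothetical name). It assumes length is counted in characters and that a string of exactly ignore_above characters is still indexed.

```python
def indexed_terms(values, ignore_above):
    """Keep only the strings whose length does not exceed ignore_above."""
    if isinstance(values, str):
        values = [values]
    # Each array element is checked on its own; long elements are skipped,
    # short ones in the same array are still indexed.
    return [v for v in values if len(v) <= ignore_above]


mixed = indexed_terms(["123456", "1234567891011121314151617181920"], 10)
boundary = indexed_terms("1234567890", 10)
print(mixed)
print(boundary)
```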

index_options

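Controls what information is recorded in the inverted index. Options:

docs: document ids only
freqs: document ids and term frequencies
positions: document ids, term frequencies, and term positions
offsets: document ids, term frequencies, term positions, and character offsets

text fields default to positions, and all other field types default to docs. The more information is recorded, the more space the index takes.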
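The four levels are cumulative, which a toy Python model can illustrate (not Elasticsearch code; `build_postings` is a hypothetical name, and tokens come from a plain whitespace split, which is a simplification of real analysis).

```python
def build_postings(doc_id, text, index_options):
    """Toy postings for one document at a given index_options level."""
    levels = ["docs", "freqs", "positions", "offsets"]
    level = levels.index(index_options)
    postings = {}
    cursor = 0
    for position, token in enumerate(text.split()):
        start = text.index(token, cursor)  # character offset of this occurrence
        end = start + len(token)
        cursor = end
        entry = postings.setdefault(token, {"doc": doc_id})
        if level >= 1:
            entry["freq"] = entry.get("freq", 0) + 1
        if level >= 2:
            entry.setdefault("positions", []).append(position)
        if level >= 3:
            entry.setdefault("offsets", []).append((start, end))
    return postings


docs_only = build_postings(1, "to be or not to be", "docs")
with_positions = build_postings(1, "to be or not to be", "positions")
with_offsets = build_postings(1, "to be or not to be", "offsets")
print(docs_only["to"])
print(with_offsets["to"])
```

Each higher level stores everything the lower levels store plus one more piece of information per term, which is why it costs more space.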

fields

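Allows a field to have one or more sub-fields (multi-fields), for example when a person's name should be searchable in more than one form. A sub-field indexes the same source value as its parent field in a different way, and is addressed as parent.sub_field. Below, name_pinyin is attached under name_cn as a keyword sub-field, so name_cn.name_pinyin matches the exact, unanalyzed name. Producing actual pinyin would additionally require a pinyin analyzer on the sub-field, which this example does not configure.

PUT s7
{
  "mappings": {
    "doc": {
      "properties": {
        "name_cn": {
          "type": "text",
          "fields": {
            "name_pinyin": {
              "type": "keyword"
            }
          }
        }
      }
    }
  }
}

# Only the parent field is supplied; the sub-field is filled from the same value
PUT s7/doc/1
{
  "name_cn": "張三"
}

# Query the sub-field by its full path
GET s7/doc/_search
{
  "query": {
    "term": {
      "name_cn.name_pinyin": "張三"
    }
  }
}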

null_value

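The strategy for handling a null value in a field. A field whose value is null cannot be searched, and text fields cannot use this parameter. When null_value is set, the configured value is indexed in place of null, which is comparable to a "default" value in mysql.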

PUT s8
{
  "mappings": {
    "doc": {
      "properties": {
        "name_cn": {
          "type": "keyword",
          "null_value": "張三"
        }
      }
    }
  }
}
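A toy Python model of the mapping above (not Elasticsearch code; `indexed_value` is a hypothetical name). It assumes the substitution only affects what is indexed for searching and leaves the source document as it was sent.

```python
def indexed_value(source_value, null_value=None):
    """Return the value that gets indexed for a field.

    Returning None means nothing is indexed, so the field cannot be searched.
    """
    if source_value is None:
        return null_value
    return source_value


source = {"name_cn": None}
term = indexed_value(source["name_cn"], null_value="張三")
print(term)
print(source)
```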

search_analyzer

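Specifies the analyzer used at search time. One thing to watch here: as mentioned in the earlier article on es analysis, analysis happens at two points, at index time and at search time. In most cases the index-time analyzer alone is enough, since it is also used at search time when search_analyzer is not set. If you do set both, keep the two analyzers consistent, otherwise searches may fail to match the data.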

PUT s10
{
  "mappings": {
    "doc": {
      "properties": {
        "name": {
          "type": "text",
          "analyzer": "standard",
          "search_analyzer": "standard"
        }
      }
    }
  }
}
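Why mismatched analyzers can miss data is easy to see in a toy Python model (not Elasticsearch code). The functions `standard_like` and `keyword_like` are hypothetical, simplified stand-ins for real analyzers, and `matches` reduces a match query to a term lookup.

```python
import re


def standard_like(text):
    """Rough stand-in for an analyzer that lowercases and splits into words."""
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]


def keyword_like(text):
    """Stand-in for an analyzer that emits the whole input as a single token."""
    return [text]


def matches(indexed_text, query_text, analyzer, search_analyzer):
    """True if any search-time term equals an index-time term."""
    indexed_terms = set(analyzer(indexed_text))
    query_terms = search_analyzer(query_text)
    return any(term in indexed_terms for term in query_terms)


same = matches("John Smith", "John", standard_like, standard_like)
mismatched = matches("John Smith", "John", standard_like, keyword_like)
print(same, mismatched)
```

With the same analyzer on both sides, the query term john equals an indexed term. With the mismatched pair, the search side produces John while the index holds john, so nothing matches.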
