(11) Elasticsearch Mapping: Explanation and Notes

This article is a repost. 2019-08-31 16:25, ElasticSearch

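In Elasticsearch (es), indexing a document with a PUT request makes es automatically create the index and the type under it; it also creates a mapping. A mapping defines the data type of each field in a type, along with related attributes such as how the field is analyzed (tokenized). When you create an index you can define field types and attributes in advance, so that date fields are handled as dates, numeric fields as numbers, and string fields as strings. To explore mappings, first create a document: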

PUT /myindex/article/1
{
  "post_date":"2018-05-10",
  "title":"Java",
  "content":"java is the best language",
  "author_id":119
}

To view the mapping, run GET /myindex/article/_mapping. The result is:

{
  "myindex": {
    "mappings": {
      "article": {
        "properties": {
          "author_id": {
            "type": "long"
          },
          "content": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "post_date": {
            "type": "date"
          },
          "title": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          }
        }
      }
    }
  }
}

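The response shows the index myindex and the type article.

The author_id field has type long, content has type text, post_date has type date, and title has type text. es detected these field types automatically.

Fields in es are typed; a mapping that es creates automatically in this way is called a dynamic mapping.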
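
The type detection described above can be sketched in a few lines of Python. This is only an illustration of the default rules visible in the mapping output, not Elasticsearch's actual implementation: the function names `infer_field` and `infer_mapping` are hypothetical, and the date check is a simplified regular expression standing in for the real date detection.

```python
import re

# Simplified stand-in for date detection (ISO-8601 style strings only; an assumption).
_DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}(T\d{2}:\d{2}(:\d{2})?)?$")


def infer_field(value):
    """Guess a mapping entry for one JSON value, mimicking dynamic mapping defaults."""
    # bool must be tested before int because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return {"type": "boolean"}
    if isinstance(value, int):
        return {"type": "long"}
    if isinstance(value, float):
        return {"type": "float"}
    if isinstance(value, str):
        if _DATE_RE.match(value):
            return {"type": "date"}
        # Other strings become analyzed text plus an exact-value keyword sub-field.
        return {
            "type": "text",
            "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
        }
    raise TypeError(f"unsupported value in this sketch: {value!r}")


def infer_mapping(doc):
    """Build a 'properties' dict for a flat JSON document."""
    return {"properties": {name: infer_field(v) for name, v in doc.items()}}


article = {
    "post_date": "2018-05-10",
    "title": "Java",
    "content": "java is the best language",
    "author_id": 119,
}
mapping = infer_mapping(article)
print(mapping["properties"]["author_id"])  # {'type': 'long'}
```

Running the sketch on the article document gives the same field types as the mapping returned by es above.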

es supports the following data types:

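(1) Core datatypes

String: the string family, which comprises text and keyword. The text type is used to index long text. Before indexing, the text is analyzed, that is, split into individual terms, and those terms are indexed so that es can search on them. By default a text field cannot be used for sorting or aggregations. The keyword type is not analyzed; it can be used for filtering, sorting, and aggregations. A keyword field can only be matched by its exact, whole value.

Numeric: long, integer, short, byte, double, float

Date: date

Boolean: boolean

Binary: binary

Date and numeric fields are not analyzed and can only be matched on the whole value. Strings can be analyzed, so they can be matched on individual words. For example: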

Add the following two documents. Together with the document created at the start, that makes three documents:

PUT /myindex/article/2
{
  "post_date":"2018-05-12",
  "title":"html",
  "content":"i like html",
  "author_id":120
}

PUT /myindex/article/3
{
  "post_date":"2018-05-16",
  "title":"es",
  "content":"Es is distributed document store",
  "author_id":110
}

Run the following queries and compare the results:

GET /myindex/article/_search?q=post_date:2018    returns no documents

GET /myindex/article/_search?q=post_date:2018-05    returns no documents

GET /myindex/article/_search?q=post_date:2018-05-10    returns a document

GET /myindex/article/_search?q=html    returns a document

GET /myindex/article/_search?q=java    returns a document
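
The contrast between analyzed text and exact-value matching can be illustrated with a tiny inverted index. This is a sketch under stated assumptions: `analyze` is a rough stand-in for the standard analyzer (lowercase, split into word tokens), and `search_text` and `search_keyword` are hypothetical helper names, not es APIs.

```python
import re
from collections import defaultdict


def analyze(text):
    """Rough stand-in for the standard analyzer: lowercase, split into word tokens."""
    return re.findall(r"\w+", text.lower())


docs = {
    1: {"title": "Java", "content": "java is the best language"},
    2: {"title": "html", "content": "i like html"},
    3: {"title": "es", "content": "Es is distributed document store"},
}

# Inverted index for the analyzed text field: term -> set of document ids.
content_index = defaultdict(set)
for doc_id, doc in docs.items():
    for term in analyze(doc["content"]):
        content_index[term].add(doc_id)

# Exact-value index, as a keyword field would behave: whole value -> document ids.
title_keyword_index = defaultdict(set)
for doc_id, doc in docs.items():
    title_keyword_index[doc["title"]].add(doc_id)


def search_text(query):
    """Match documents containing any analyzed query term."""
    hits = set()
    for term in analyze(query):
        hits |= content_index.get(term, set())
    return hits


def search_keyword(value):
    """Match documents whose whole value equals the query exactly."""
    return set(title_keyword_index.get(value, set()))


print(search_text("HTML"))     # {2}
print(search_keyword("Java"))  # {1}
print(search_keyword("java"))  # set()
```

The analyzed field matches regardless of case because both the document and the query pass through the same analysis, whereas the exact-value index only matches the stored value character for character.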

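(2) Complex datatypes

Array datatype: arrays need no dedicated mapping type, and the element type does not have to be declared separately. For example:

String array: ["one","two"]

Integer array: [1,2]

Array of arrays: [1,[2,3]], which is equivalent to [1,2,3]

Array of objects: [{"name":"Mary","age":12},{"name":"John","age":10}]

Object datatype: object, used for a single JSON object

Nested datatype: nested, used for arrays of JSON objects

For example: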

PUT /lib/person/1
{
    "name":"Tom",
    "age":25,
    "birthday":"1985-12-12",
    "address":{
        "country":"china",
        "province":"guangdong",
        "city":"shenzhen"
    }
}

Internally this is stored as:

{
    "name":["Tom"],
    "age":[25],
    "birthday":["1985-12-12"],
    "address.country":["china"],
    "address.province":["guangdong"],
    "address.city":["shenzhen"]
}

PUT /lib/person/2
{
    "persons":[
        {"name":"lisi","age":27},
        {"name":"wangwu","age":26},
        {"name":"zhangsan","age":23}
    ]
}

Internally this is stored as:

{
    "persons.name":["lisi","wangwu","zhangsan"],
    "persons.age":[27,26,23]
}
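
The flattening shown in the two storage examples can be reproduced with a short recursive function. This is a minimal sketch of the idea only; `flatten` and `matches` are hypothetical names, and the real storage is handled by Lucene rather than by Python dictionaries.

```python
from collections import defaultdict


def flatten(doc):
    """Flatten a JSON document into dotted field names mapped to lists of values."""
    out = defaultdict(list)

    def walk(value, path):
        if isinstance(value, dict):
            for key, child in value.items():
                walk(child, f"{path}.{key}" if path else key)
        elif isinstance(value, list):
            # Arrays (including nested arrays) add their elements to the same field.
            for item in value:
                walk(item, path)
        else:
            out[path].append(value)

    walk(doc, "")
    return dict(out)


def matches(flat, conditions):
    """True if every field contains the requested value somewhere in its list."""
    return all(value in flat.get(field, []) for field, value in conditions.items())


person2 = {
    "persons": [
        {"name": "lisi", "age": 27},
        {"name": "wangwu", "age": 26},
        {"name": "zhangsan", "age": 23},
    ]
}
flat = flatten(person2)
print(flat)
# {'persons.name': ['lisi', 'wangwu', 'zhangsan'], 'persons.age': [27, 26, 23]}

# No single person is named lisi AND aged 26, yet the flattened form matches.
print(matches(flat, {"persons.name": "lisi", "persons.age": 26}))  # True
```

The last line shows why the flattened form loses the association between name and age inside each object; keeping each array element as its own unit is the problem the nested type is meant to solve.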

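(3) Geo datatypes

Geo-point datatype: geo_point, used for latitude/longitude coordinates

Geo-shape datatype: geo_shape, used for complex shapes such as polygons

(4) Specialised datatypes

IPv4 datatype: ip, used for IPv4 addresses (Elasticsearch 5.0 and later also accept IPv6 addresses)

Completion datatype: completion, provides auto-complete suggestions

Token count datatype: token_count, counts the number of tokens in an analyzed string field. The count is not reduced when token filters remove tokens (for example stop words).

mapper-murmur3 type: with the mapper-murmur3 plugin installed, murmur3 computes a hash of the field's values at index time.

Attachment datatype: with the mapper-attachments plugin, the attachment type can index formats such as Microsoft Office, Open Document, ePub, and HTML.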

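Attributes supported by fields:

"store": whether the field's value is stored separately. If it is not stored, the field can still be searched, but its value cannot be fetched as a stored field (it remains available through _source). The default is false (not stored).

"index": true // the field is indexed and can be searched; false // the field is not indexed and cannot be searched

"analyzer": "ik" // the analyzer to use; the default is the standard analyzer

"boost": 1.23 // field-level score boost; the default is 1.0

"ignore_above": 100 // strings longer than 100 characters are ignored and not indexed

"search_analyzer": "ik" // the analyzer used at search time; by default it is the same as analyzer. For example, index with standard + ngram and search with standard to implement auto-suggest.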

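The reason for pairing an n-gram analyzer at index time with the plain standard analyzer at search time can be sketched as follows. This is an illustration under assumptions: it uses edge n-grams (prefixes of each token) as the n-gram variant, and `index_analyzer`, `search_analyzer`, and `suggest_matches` are hypothetical Python helpers, not es settings.

```python
import re


def standard_tokens(text):
    """Rough stand-in for the standard analyzer: lowercase word tokens."""
    return re.findall(r"\w+", text.lower())


def edge_ngrams(token, min_gram=1, max_gram=10):
    """All prefixes of a token between min_gram and max_gram characters long."""
    return [token[:n] for n in range(min_gram, min(len(token), max_gram) + 1)]


def index_analyzer(text):
    """Index time: standard tokens expanded into their edge n-grams."""
    terms = set()
    for token in standard_tokens(text):
        terms.update(edge_ngrams(token))
    return terms


def search_analyzer(text):
    """Search time: standard tokens only, no n-gram expansion."""
    return set(standard_tokens(text))


indexed_terms = index_analyzer("java is good")


def suggest_matches(user_input):
    """A document matches when every search-time term is among its indexed terms."""
    return search_analyzer(user_input) <= indexed_terms


print(suggest_matches("ja"))   # True
print(suggest_matches("goo"))  # True
print(suggest_matches("av"))   # False

# If the n-gram analyzer were also applied at search time, "jazz" would be
# expanded to j, ja, jaz, jazz and would share terms with the indexed document.
print(bool(index_analyzer("jazz") & indexed_terms))  # True
```

The final line is the motivation for a separate search_analyzer: expanding the query into n-grams as well would let unrelated words match through their shared short prefixes.
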
Creating a mapping manually

PUT /lib
{
    "settings":{
        "number_of_shards":3,
        "number_of_replicas":0
    },
    "mappings":{
        "books":{
            "properties":{
                "title":{"type":"text"},
                "name":{"type":"text","analyzer":"standard"},
                "publish_date":{"type":"date","index":false},
                "price":{"type":"double"},
                "number":{"type":"integer"}
            }
        }
    }
}

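This sets the type to books, the analyzer of the name field to standard, and publish_date to not be indexed. If a document introduces a new field, that field is created with the default attributes, as follows: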

PUT /lib/books/1
{
  "title":"java is good",
  "name":"java",
  "publish_date":"2019-01-12",
  "price":23,
  "number":46,
  "mark":"no"
}

Check the resulting mapping:

GET /lib/books/_mapping

{
  "lib": {
    "mappings": {
      "books": {
        "properties": {
          "mark": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "name": {
            "type": "text",
            "analyzer": "standard"
          },
          "number": {
            "type": "integer"
          },
          "price": {
            "type": "double"
          },
          "publish_date": {
            "type": "date",
            "index": false
          },
          "title": {
            "type": "text"
          }
        }
      }
    }
  }
}
