Update API

Reposted article, 2018-08-03 15:21. Tag: elasticsearch

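The Update API updates a document based on a supplied script. The operation gets the document from the index, runs the script (the script language and parameters are optional), and indexes the result back (the script may also delete the document or ignore the operation). It uses versioning to make sure that no update has happened between the "get" (fetching the document) and the "reindex" (indexing it again).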

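Note that the operation still fully reindexes the document: an update fetches the document, merges the changes into it, deletes the old document, and adds the merged one. What it saves is network round trips, and it lowers the chance of a version conflict between the get and the index steps. The _source field must be enabled for this feature to work.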

For example, index a simple document:

PUT test/_doc/1
{
    "counter" : 1,
    "tags" : ["red"]
}

Scripted updates

The following example runs a script that increments counter:

POST test/_doc/1/_update
{
    "script" : {
        "source": "ctx._source.counter += params.count",
        "lang": "painless",
        "params" : {
            "count" : 4
        }
    }
}

Now we can add a tag to the tags list (note that the tag is added even if it already exists, because tags is a list):

POST test/_doc/1/_update
{
    "script" : {
        "source": "ctx._source.tags.add(params.tag)",
        "lang": "painless",
        "params" : {
            "tag" : "blue"
        }
    }
}

Besides _source, the following variables are also available through ctx: _index, _type, _id, _version, _routing and _now (the current timestamp).

For example, the following script reads _id from ctx and appends it to tags:

POST test/_doc/1/_update
{
    "script" : "ctx._source.tags.add(ctx._id)"
}

You can also add a new field to the document:

POST test/_doc/1/_update
{
    "script" : "ctx._source.new_field = 'value_of_new_field'"
}

Or remove a field from the document:

POST test/_doc/1/_update
{
    "script" : "ctx._source.remove('new_field')"
}

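You can even change the operation that is executed. In the following example, the document is deleted if the tags field contains green; otherwise nothing is done (the operation is ignored and noop is returned):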

POST test/_doc/1/_update
{
    "script" : {
        "source": "if (ctx._source.tags.contains(params.tag)) { ctx.op = 'delete' } else { ctx.op = 'none' }",
        "lang": "painless",
        "params" : {
            "tag" : "green"
        }
    }
}
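
The effect of ctx.op can be pictured with a small Python model. This is only a sketch of the decision logic described above, not Elasticsearch code: the function name apply_scripted_update, the dict-based ctx, and the use of a Python callable in place of a Painless script are all assumptions made for illustration.

```python
def apply_scripted_update(source, script, params):
    """Run `script` against a copy of `source` and report the outcome.

    `script` is a plain Python callable standing in for a Painless script.
    Returns (result, new_source); result is "updated", "deleted" or "noop".
    """
    # Assumption: "index" models the default operation (reindex the document).
    ctx = {"_source": dict(source), "op": "index"}
    script(ctx, params)
    if ctx["op"] == "delete":
        return "deleted", None
    if ctx["op"] == "none":
        return "noop", source
    return "updated", ctx["_source"]


def delete_if_tagged(ctx, params):
    # Mirrors the script above: delete when the tag is present, else do nothing.
    if params["tag"] in ctx["_source"]["tags"]:
        ctx["op"] = "delete"
    else:
        ctx["op"] = "none"


doc = {"counter": 1, "tags": ["red"]}
print(apply_scripted_update(doc, delete_if_tagged, {"tag": "green"})[0])  # noop
print(apply_scripted_update(doc, delete_if_tagged, {"tag": "red"})[0])    # deleted
```

With the sample document from the start of the article (tags contains only red), the green check leaves the document untouched, which matches the noop case described above.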

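Updates with a partial document

The update API also supports passing a partial document, which is merged into the existing document (a simple recursive merge: inner objects are merged, while core "keys/values" and arrays are replaced). To replace the existing document completely, use the index API instead. The following partial update adds a new field to the existing document: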

POST test/_doc/1/_update
{
    "doc" : {
        "name" : "new_name"
    }
}
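
The merge rule (objects merged recursively, everything else replaced, arrays included) can be sketched in a few lines of Python. This is a minimal model of the rule as stated above, not the Elasticsearch implementation; merge_partial and the owner field are hypothetical names used only for the example.

```python
def merge_partial(existing, partial):
    """Return a new dict with `partial` merged into `existing`.

    Nested dicts are merged key by key; every other value, including
    lists, replaces the existing value outright.
    """
    merged = dict(existing)
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_partial(merged[key], value)
        else:
            merged[key] = value
    return merged


existing = {"counter": 1, "tags": ["red"], "owner": {"name": "a", "team": "x"}}
partial = {"tags": ["blue"], "owner": {"name": "b"}, "name": "new_name"}
print(merge_partial(existing, partial))
```

Here tags ends up as ["blue"] rather than ["red", "blue"], while owner keeps its team key and only has name overwritten.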

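If both doc and script are specified, the request is rejected with an error. It is better to put the field pairs of the partial document in the script itself (I have not yet worked out how to do that).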

POST test/_doc/1/_update
{
  "doc" : {
        "age" : "18"
    },
    "script" : {
        "source": "ctx._source.counter += params.count",
        "lang": "painless",
        "params" : {
            "count" : 4
        }
    }
}

The response is:

{
  "error": {
    "root_cause": [
      {
        "type": "action_request_validation_exception",
        "reason": "Validation Failed: 1: can't provide both script and doc;"
      }
    ],
    "type": "action_request_validation_exception",
    "reason": "Validation Failed: 1: can't provide both script and doc;"
  },
  "status": 400
}
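
One possible way to avoid the validation error is to move the doc fields into the script, passing each value as a script parameter and assigning it to ctx._source. The Python helper below only builds such a request body; it is a sketch, not an official recipe. The name fold_doc_into_script and the doc_ parameter prefix are assumptions, and it only handles simple top-level field names without quotes.

```python
import json


def fold_doc_into_script(doc, script):
    """Build an update body that applies the `doc` fields inside the script.

    Each doc field becomes a script parameter (prefixed with "doc_" so it
    does not clash with existing params) plus one assignment statement.
    """
    params = dict(script.get("params", {}))
    statements = []
    for field, value in doc.items():
        param_name = "doc_" + field
        params[param_name] = value
        statements.append("ctx._source['%s'] = params['%s']" % (field, param_name))
    statements.append(script["source"])  # the original script runs last
    return {
        "script": {
            "source": "; ".join(statements),
            "lang": script.get("lang", "painless"),
            "params": params,
        }
    }


body = fold_doc_into_script(
    {"age": "18"},
    {
        "source": "ctx._source.counter += params.count",
        "lang": "painless",
        "params": {"count": 4},
    },
)
print(json.dumps(body, indent=4))
```

The resulting body contains only a script element, so it no longer trips the "can't provide both script and doc" check.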

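Detecting noop updates

If doc is specified, its values are merged with the existing _source. By default, an update that changes nothing is detected and returns "result": "noop", as in the following request: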

POST test/_doc/1/_update
{
    "doc" : {
        "name" : "new_name"
    }
}

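If name was already new_name before the request was sent, the entire update request is ignored. When a request is ignored, the result element in the response is noop: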

{
  "_index": "test",
  "_type": "_doc",
  "_id": "1",
  "_version": 2,
  "result": "noop",
  "_shards": {
    "total": 0,
    "successful": 0,
    "failed": 0
  }
}

Setting "detect_noop": false disables this default behavior:

POST test/_doc/1/_update
{
    "doc" : {
        "name" : "new_name"
    },
    "detect_noop": false
}
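
The noop check amounts to comparing the merged document with the current one. The sketch below models this for flat documents; partial_update is a hypothetical helper, and the shallow merge is a simplification of the recursive merge that the update API performs.

```python
def partial_update(source, doc, detect_noop=True):
    """Merge `doc` into `source` and report "noop" or "updated"."""
    merged = {**source, **doc}  # shallow merge is enough for flat documents
    if detect_noop and merged == source:
        return "noop", source
    return "updated", merged


current = {"counter": 1, "name": "new_name"}
print(partial_update(current, {"name": "new_name"})[0])                     # noop
print(partial_update(current, {"name": "new_name"}, detect_noop=False)[0])  # updated
```

With detect_noop disabled, the document is reported as updated (and would be reindexed) even though its content is unchanged.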

Upserts

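If the document does not already exist, the contents of the upsert element are inserted as a new document. If the document does exist, the script is executed instead: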

POST test/_doc/1/_update
{
    "script" : {
        "source": "ctx._source.counter += params.count",
        "lang": "painless",
        "params" : {
            "count" : 4
        }
    },
    "upsert" : {
        "counter" : 1
    }
}

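A script is not required, of course. The following also works: when the document does not exist, the upsert content is inserted; when it does exist, the doc content is merged in: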

POST test/_doc/1/_update
{
    "doc" : {
        "name" : "new_name"
    },
    "upsert" : {
        "counter" : 10
    }
}
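
The two branches can be modelled with an in-memory dict standing in for the index. This is a sketch under that assumption; update_with_upsert and add_four are hypothetical names, and the script is again a Python callable rather than Painless.

```python
def update_with_upsert(store, doc_id, upsert, doc=None, script=None):
    """Apply an update with an upsert fallback to the in-memory `store`."""
    if doc_id not in store:
        # Missing document: the upsert content is inserted as-is,
        # and neither the script nor the doc is applied.
        store[doc_id] = dict(upsert)
        return "created"
    if script is not None:
        script(store[doc_id])
    else:
        store[doc_id].update(doc)
    return "updated"


def add_four(source):
    # Stand-in for: ctx._source.counter += params.count, with count = 4
    source["counter"] += 4


store = {}
print(update_with_upsert(store, "1", {"counter": 1}, script=add_four), store["1"])
print(update_with_upsert(store, "1", {"counter": 1}, script=add_four), store["1"])
```

The first call creates the document with counter 1; only the second call runs the script and raises counter to 5.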

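scripted_upsert

If you want the script to run whether or not the document exists (that is, the script rather than the upsert element handles initializing the document), set scripted_upsert to true: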

POST sessions/session/dh3sgudg8gsrgl/_update
{
    "scripted_upsert":true,
    "script" : {
        "id": "my_web_session_summariser",
        "params" : {
            "pageViewEvent" : {
                "url":"foo.com/bar",
                "response":404,
                "time":"2014-01-01 12:32"
            }
        }
    },
    "upsert" : {}
}

Now compare this with sending a script without an upsert element. When the document does not exist, the following request fails:

POST test/_doc/1/_update
{
    "scripted_upsert":true,
    "script" : {
        "source": "ctx._source.counter += params.count",
        "lang": "painless",
        "params" : {
            "count" : 4
        }
    }
}

The error response is:

{
  "error": {
    "root_cause": [
      {
        "type": "document_missing_exception",
        "reason": "[_doc][1]: document missing",
        "index_uuid": "YgmlkeEERGm20yUBDJHKtQ",
        "shard": "3",
        "index": "test"
      }
    ],
    "type": "document_missing_exception",
    "reason": "[_doc][1]: document missing",
    "index_uuid": "YgmlkeEERGm20yUBDJHKtQ",
    "shard": "3",
    "index": "test"
  },
  "status": 404
}

With scripted_upsert set to true and an upsert element supplied, send the following request while the document does not exist:

POST test/_doc/1/_update
{
    "scripted_upsert":true,
    "script" : {
        "source": "ctx._source.counter += params.count",
        "lang": "painless",
        "params" : {
            "count" : 4
        }
    },
    "upsert" : {
        "counter" : 10
    }
}

The response is:

{
  "_index": "test",
  "_type": "_doc",
  "_id": "1",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 6,
  "_primary_term": 1
}

The request succeeded. Now fetch the document:

{
  "_index": "test",
  "_type": "_doc",
  "_id": "1",
  "_version": 1,
  "found": true,
  "_source": {
    "counter": 14
  }
}

counter is 14, which shows that the upsert content was used as the initial document and the script was then run against it (10 + 4).
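
The value 14 can be reproduced with a small model in which the script always runs and a missing document starts from a copy of the upsert content. This is a sketch of the behavior observed above, with a dict as the index; scripted_upsert and add_count are hypothetical Python stand-ins.

```python
def scripted_upsert(store, doc_id, script, params, upsert):
    """Run `script` whether or not the document exists.

    When the document is missing, the script starts from a copy of `upsert`.
    """
    existed = doc_id in store
    source = store[doc_id] if existed else dict(upsert)
    script(source, params)
    store[doc_id] = source
    return "updated" if existed else "created"


def add_count(source, params):
    # Stand-in for: ctx._source.counter += params.count
    source["counter"] += params["count"]


store = {}
result = scripted_upsert(store, "1", add_count, {"count": 4}, {"counter": 10})
print(result, store["1"])  # created {'counter': 14}
```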

doc_as_upsert

Instead of sending a partial doc plus an upsert document, you can set doc_as_upsert to true to use the contents of doc as the upsert value:

POST test/_doc/1/_update
{
    "doc" : {
        "name" : "new_name"
    },
    "doc_as_upsert" : true
}

Compare this with sending only doc:

POST test/_doc/1/_update
{
    "doc" : {
        "name" : "new_name"
    }
}

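When the document does not exist, the request with doc_as_upsert set to true succeeds, whereas the request above fails with a document-missing error. What happens with a request like the following?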

POST test/_doc/1/_update
{
    "doc" : {
        "name" : "new_name"
    },
    "upsert" : {
        "counter" : 10
    },
    "doc_as_upsert" : true
}

The upsert content is never used: whether or not the document exists, the doc content is what gets applied.
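
The three cases (doc only, doc with doc_as_upsert, and doc plus upsert with doc_as_upsert) can be compared with one small function. This is a sketch of the behavior described in this section over an in-memory dict; update_doc is a hypothetical name, and KeyError stands in for the document-missing error.

```python
def update_doc(store, doc_id, doc, upsert=None, doc_as_upsert=False):
    """Partial update with optional upsert handling."""
    if doc_as_upsert:
        upsert = doc  # any separate upsert content is ignored
    if doc_id not in store:
        if upsert is None:
            raise KeyError("document missing: %s" % doc_id)
        store[doc_id] = dict(upsert)
        return "created"
    store[doc_id].update(doc)
    return "updated"


store = {}
result = update_doc(store, "1", {"name": "new_name"},
                    upsert={"counter": 10}, doc_as_upsert=True)
print(result, store["1"])  # created {'name': 'new_name'}
```

The created document contains name but no counter, which is the "upsert is never used" outcome.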

Parameters

The update operation supports the following query-string parameters (appended to the request URL):

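retry_on_conflict: Between the get and indexing phases of the update, another process may already have updated the same document. By default, the update then fails with a version conflict exception. The retry_on_conflict parameter controls how many times the update is retried before the exception is finally thrown.

routing: Routing is used to send the update request to the right shard, and to set the routing of the upsert request when the document being updated does not exist. It cannot be used to change the routing of an existing document.

timeout: How long to wait for a shard to become available.

wait_for_active_shards: The number of shard copies that must be active before the update operation proceeds.

refresh: Controls when the changes made by this request become visible to search. See refresh.

_source: Controls whether and how the updated source is returned in the response. By default, the updated source is not returned. See source filtering for details.

version: The update API uses Elasticsearch's versioning support internally to make sure the document does not change during the update. You can use the version parameter to specify that the document should be updated only if its version matches the one given.

The update API supports only the internal version type. External versioning (version types external and external_gte) and forced versioning (version type force) are not supported, because they would leave Elasticsearch version numbers out of sync with the external system. Use the index API instead.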
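
The interplay of versioning and retry_on_conflict can be sketched as a get, change, index-back loop with an optimistic version check. This is a toy model, not the Elasticsearch implementation: Store, VersionConflict, update, bump, and the before_index hook (which simulates a concurrent writer) are all hypothetical names.

```python
class VersionConflict(Exception):
    pass


class Store:
    """A single-document store with optimistic version checks."""

    def __init__(self, source):
        self.source, self.version = dict(source), 1

    def get(self):
        return dict(self.source), self.version

    def index(self, source, expected_version):
        if expected_version != self.version:
            raise VersionConflict(
                "expected %d, found %d" % (expected_version, self.version))
        self.source, self.version = dict(source), self.version + 1


def update(store, change, retry_on_conflict=0, before_index=None):
    """Get, change, and index back; retry the whole cycle on a conflict."""
    attempts = retry_on_conflict + 1
    for attempt in range(attempts):
        source, version = store.get()
        change(source)
        if before_index is not None:
            before_index(attempt)  # hook that lets a "concurrent" writer run
        try:
            store.index(source, version)
            return store.version
        except VersionConflict:
            if attempt == attempts - 1:
                raise  # retries exhausted: surface the conflict


def bump(source):
    source["counter"] += 4


def interfere_once(target):
    """Return a hook that updates `target` during the first attempt only."""
    def hook(attempt):
        if attempt == 0:
            current, version = target.get()
            current["counter"] += 1
            target.index(current, version)
    return hook


store = Store({"counter": 1})
update(store, bump, retry_on_conflict=1, before_index=interfere_once(store))
print(store.version, store.source)  # 3 {'counter': 6}
```

The first attempt loses the race and is discarded; the retry re-reads the document, so the concurrent increment is preserved (1 + 1 + 4 = 6). With retry_on_conflict left at 0, the same scenario raises VersionConflict.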

Official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-update.html
