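Behind the scenes, a data stream can be thought of as a group of automatically created indices.
A data stream lets you store append-only time series data across multiple indices while giving requests a single named resource, much like an alias. Data streams are well suited to logs, events, metrics, and other continuously generated data.
You can submit indexing and search requests directly to a data stream. The stream automatically routes each request to the backing indices that store its data. You can use index lifecycle management (ILM) to automate the management of these backing indices.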
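Reading data
A read request sent to a data stream is routed to all of its backing indices.
Writing data
The most recently created backing index is the stream's write index, and new documents are added only to it. You cannot add documents to the other backing indices, even by sending a request directly to the index by its full name. Operations that could hinder indexing, such as clone, delete, shrink, and split, are not allowed on the current write index.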
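Generation
Each backing index name carries the stream's generation: a six-digit, zero-padded integer that acts as a cumulative count of the stream's rollovers, starting at 000001.
The full name of a backing index has the form
.ds-<data-stream>-<yyyy.MM.dd>-<generation>
for example .ds-my-data-stream-2021.10.27-000001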
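The naming rule can be illustrated with a few lines of Python. This is only a sketch of the format described above, not code from Elasticsearch; the function name backing_index_name is made up for this example.

```python
from datetime import date


def backing_index_name(stream: str, created: date, generation: int) -> str:
    """Build a name of the form .ds-<data-stream>-<yyyy.MM.dd>-<generation>."""
    if generation < 1:
        raise ValueError("generation starts at 1")
    # %Y.%m.%d gives the yyyy.MM.dd date part; 06d zero-pads to six digits.
    return f".ds-{stream}-{created:%Y.%m.%d}-{generation:06d}"


print(backing_index_name("my-data-stream", date(2021, 10, 27), 1))
```

For the stream name, date, and generation used in the example above, this reproduces .ds-my-data-stream-2021.10.27-000001.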
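Append-only: requests to update or delete existing documents cannot be sent directly to a data stream. Use update by query and delete by query instead.
If necessary, you can update or delete a document by sending the request to its backing index, specified by full name.
If you need to update or delete documents frequently, use an index template with an index alias instead of a data stream. See Manage time series data without data streams.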
Creating a data stream
The usual steps:
- Create an index lifecycle policy
- Create component templates (optional)
- Create an index template
- Create the data stream
- Secure the data stream (access control, optional)
Creating the ILM policy
PUT _ilm/policy/my-lifecycle-policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_primary_shard_size": "50gb"
          }
        }
      },
      "warm": {
        "min_age": "30d",
        "actions": {
          "shrink": {
            "number_of_shards": 1
          },
          "forcemerge": {
            "max_num_segments": 1
          }
        }
      },
      "cold": {
        "min_age": "60d",
        "actions": {
          "searchable_snapshot": {
            "snapshot_repository": "found-snapshots"
          }
        }
      },
      "frozen": {
        "min_age": "90d",
        "actions": {
          "searchable_snapshot": {
            "snapshot_repository": "found-snapshots"
          }
        }
      },
      "delete": {
        "min_age": "735d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
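To read the policy at a glance, the min_age thresholds can be laid out as a small lookup. The Python sketch below is a simplification and not how ILM is implemented. It assumes the age is counted in days from the moment the index was rolled over, and it ignores the time that actions such as shrink or forcemerge take to complete. The function name expected_phase is made up for this example.

```python
# min_age thresholds in days, copied from my-lifecycle-policy.
# The hot phase declares no min_age, so it is treated as starting at 0.
PHASE_MIN_AGE_DAYS = [
    ("hot", 0),
    ("warm", 30),
    ("cold", 60),
    ("frozen", 90),
    ("delete", 735),
]


def expected_phase(age_days: float) -> str:
    """Return the latest phase whose min_age has been reached."""
    if age_days < 0:
        raise ValueError("age_days must not be negative")
    current = "hot"
    for phase, min_age in PHASE_MIN_AGE_DAYS:
        if age_days >= min_age:
            current = phase
    return current


for age in (0, 45, 60, 100, 735):
    print(age, expected_phase(age))
```

Under these assumptions a backing index spends roughly its first 30 days after rollover in hot, moves through warm, cold, and frozen at 30, 60, and 90 days, and becomes eligible for deletion at 735 days.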
Create two component templates for the index template to use:
PUT _component_template/my-mappings
{
  "template": {
    "mappings": {
      "properties": {
        "@timestamp": {
          "type": "date",
          "format": "date_optional_time||epoch_millis"
        },
        "message": {
          "type": "wildcard"
        }
      }
    }
  },
  "_meta": {
    "description": "Mappings for @timestamp and message fields",
    "my-custom-meta-field": "More arbitrary metadata"
  }
}

PUT _component_template/my-settings
{
  "template": {
    "settings": {
      "index.lifecycle.name": "my-lifecycle-policy"
    }
  },
  "_meta": {
    "description": "Settings for ILM",
    "my-custom-meta-field": "More arbitrary metadata"
  }
}
Creating the index template
PUT _index_template/my-index-template
{
  "index_patterns": ["my-data-stream*"],
  "data_stream": { },
  "composed_of": [ "my-mappings", "my-settings" ],
  "priority": 500,
  "_meta": {
    "description": "Template for my time series data",
    "my-custom-meta-field": "More arbitrary metadata"
  }
}
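Which template applies to a new stream name can be reasoned about as a two-step choice: keep the templates whose index_patterns match the name, then take the one with the highest priority. The Python sketch below illustrates that rule under simplifying assumptions. fnmatchcase stands in for Elasticsearch's wildcard matching, the template list is a hand-written stand-in rather than an API response, and catch-all-template is a hypothetical second template that is not part of this walkthrough.

```python
from fnmatch import fnmatchcase

TEMPLATES = [
    {"name": "my-index-template", "index_patterns": ["my-data-stream*"], "priority": 500},
    # Hypothetical lower-priority template, added only to show the tie-break.
    {"name": "catch-all-template", "index_patterns": ["my-*"], "priority": 100},
]


def select_template(target, templates):
    """Return the name of the highest-priority template matching target, or None."""
    matches = [
        t for t in templates
        if any(fnmatchcase(target, pattern) for pattern in t["index_patterns"])
    ]
    if not matches:
        return None
    return max(matches, key=lambda t: t["priority"])["name"]


print(select_template("my-data-stream-alt", TEMPLATES))
print(select_template("my-other-index", TEMPLATES))
print(select_template("logs-app", TEMPLATES))
```

In this sketch my-data-stream-alt matches both patterns and resolves to my-index-template because of its higher priority, my-other-index falls through to the hypothetical catch-all, and logs-app matches nothing.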
The data stream can now be created automatically: the first indexing request to a name that matches the template's index pattern creates the stream.
PUT my-data-stream/_bulk
{ "create":{ } }
{ "@timestamp": "2099-05-06T16:21:15.000Z", "message": "192.0.2.42 - - [06/May/2099:16:21:15 +0000] \"GET /images/bg.jpg HTTP/1.0\" 200 24736" }
{ "create":{ } }
{ "@timestamp": "2099-05-06T16:25:42.000Z", "message": "192.0.2.255 - - [06/May/2099:16:25:42 +0000] \"GET /favicon.ico HTTP/1.0\" 200 3638" }

POST my-data-stream/_doc
{
  "@timestamp": "2099-05-06T16:21:15.000Z",
  "message": "192.0.2.42 - - [06/May/2099:16:21:15 +0000] \"GET /images/bg.jpg HTTP/1.0\" 200 24736"
}
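The _bulk body is newline-delimited JSON: one action line, one document line, and a newline after every line including the last. A minimal Python sketch of building such a body for a data stream follows. The helper name bulk_create_body is made up for this example, and the code only assembles the request text without sending it.

```python
import json


def bulk_create_body(docs):
    """Build an NDJSON _bulk body that uses only the create action."""
    lines = []
    for doc in docs:
        # Documents in a data stream need an @timestamp field.
        if "@timestamp" not in doc:
            raise ValueError("document is missing @timestamp")
        lines.append(json.dumps({"create": {}}))
        lines.append(json.dumps(doc))
    # Every line, including the last one, ends with a newline.
    return "".join(line + "\n" for line in lines)


body = bulk_create_body([
    {"@timestamp": "2099-05-06T16:21:15.000Z", "message": "first event"},
    {"@timestamp": "2099-05-06T16:25:42.000Z", "message": "second event"},
])
print(body, end="")
```

The resulting string could be sent as the body of PUT my-data-stream/_bulk with any HTTP client, using the Content-Type application/x-ndjson.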
You can also create the stream explicitly with PUT _data_stream/my-data-stream
Get information about the data stream: GET _data_stream/my-data-stream
Delete the data stream: DELETE _data_stream/my-data-stream
Common operations on a data stream:
- Add documents to a data stream
- Search a data stream
- Get statistics for a data stream
- Manually roll over a data stream
- Open closed backing indices
- Reindex with a data stream
- Update documents in a data stream by query
- Delete documents in a data stream by query
- Update or delete documents in a backing index
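Adding documents
POST /my-data-stream/_doc/
{
  "@timestamp": "2099-03-08T11:06:07.000Z",
  "user": {
    "id": "8a4f500d"
  },
  "message": "Login successful"
}
To index a document with a specific ID, you cannot use PUT /<target>/_doc/<_id>, but you can use PUT /<target>/_create/<_id>.
Likewise, _bulk requests to a data stream support only the create action.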
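The choice between the two endpoints can be captured in a small helper. The Python sketch below only builds the HTTP method, path, and body and sends nothing; the function name index_request is made up for this example.

```python
import json
from urllib.parse import quote


def index_request(target, doc, doc_id=None):
    """Return (method, path, body) for adding one document to a data stream."""
    body = json.dumps(doc)
    if doc_id is None:
        # No ID given: let Elasticsearch generate one.
        return "POST", f"/{target}/_doc/", body
    # Explicit ID: data streams accept _create, not PUT /<target>/_doc/<_id>.
    return "PUT", f"/{target}/_create/{quote(doc_id, safe='')}", body


doc = {"@timestamp": "2099-03-08T11:06:07.000Z", "message": "Login successful"}
print(index_request("my-data-stream", doc)[:2])
print(index_request("my-data-stream", doc, doc_id="event-1")[:2])
```

Percent-encoding the ID keeps characters such as / from being read as path separators.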
Searching documents
Searching a data stream works the same way as searching an index.
Getting data stream statistics
GET /_data_stream/my-data-stream/_stats?human=true
Manual rollover
POST /my-data-stream/_rollover/
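Opening closed backing indices
A closed backing index cannot be searched, and its documents cannot be updated or deleted.
To reopen a single backing index, use POST /.ds-my-data-stream-2099.03.07-000001/_open/. To reopen all closed backing indices of a stream, use POST /my-data-stream/_open/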
Reindexing into a data stream
When the destination is a data stream, op_type must be create.
POST /_reindex
{
  "source": {
    "index": "archive"
  },
  "dest": {
    "index": "my-data-stream",
    "op_type": "create"
  }
}
Updating documents by query
POST /my-data-stream/_update_by_query
{
  "query": {
    "match": {
      "user.id": "l7gk7f82"
    }
  },
  "script": {
    "source": "ctx._source.user.id = params.new_id",
    "params": {
      "new_id": "XgdX0NoX"
    }
  }
}
Deleting documents by query
POST /my-data-stream/_delete_by_query
{
  "query": {
    "match": {
      "user.id": "vlb44hny"
    }
  }
}
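Updating or deleting documents in a specific backing index
First run a search to get the backing index name (_index) and the document ID (_id) of the target document, then send the update or delete request directly to that backing index.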
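The second step can be sketched in Python as building request paths from a search hit. This sketch assumes the search was sent with "seq_no_primary_term": true so that each hit carries _seq_no and _primary_term, which the update passes as if_seq_no and if_primary_term so that it fails if the document changed in the meantime. The hit below is illustrative sample data rather than real output, and both function names are made up for this example.

```python
from urllib.parse import quote, urlencode


def update_request_path(hit: dict) -> str:
    """Path for PUT-ing a new version of the document into its backing index."""
    query = urlencode({
        "if_seq_no": hit["_seq_no"],
        "if_primary_term": hit["_primary_term"],
    })
    return f"/{hit['_index']}/_doc/{quote(hit['_id'], safe='')}?{query}"


def delete_request_path(hit: dict) -> str:
    """Path for DELETE-ing the document from its backing index."""
    return f"/{hit['_index']}/_doc/{quote(hit['_id'], safe='')}"


# Illustrative search hit (sample values, not real output).
hit = {
    "_index": ".ds-my-data-stream-2099.03.08-000003",
    "_id": "bfspvnIBr7VVZlfp2lqX",
    "_seq_no": 0,
    "_primary_term": 1,
}
print("PUT", update_request_path(hit))
print("DELETE", delete_request_path(hit))
```

The PUT request carries the full replacement document as its body; the DELETE request needs no body.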
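Changing mappings and settings
A data stream's backing indices take their mappings and settings from the matching index template, so it pays to think these through before creating the stream.
Changes can still be made later, for example: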
- Add a new field mapping to a data stream
- Change an existing field mapping in a data stream
- Change a dynamic index setting for a data stream
- Change a static index setting for a data stream
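Adding a field
First add the field to the index template so that backing indices created later include it. PUT replaces the whole template, so the request must carry every part of the template you want to keep. The examples here use the template name my-data-stream-template; substitute the name of your own template.
PUT /_index_template/my-data-stream-template
{
  "index_patterns": [ "my-data-stream*" ],
  "data_stream": { },
  "priority": 500,
  "template": {
    "mappings": {
      "properties": {
        "message": {
          "type": "text"
        }
      }
    }
  }
}
Then add the field to the existing backing indices as well. This request applies to all backing indices, including the write index:
PUT /my-data-stream/_mapping
{
  "properties": {
    "message": {
      "type": "text"
    }
  }
}
You can also add the field to the write index only:
PUT /my-data-stream/_mapping?write_index_only=true
{
  "properties": {
    "message": {
      "type": "text"
    }
  }
}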
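Changing an existing field
The type of an existing field cannot be changed in Elasticsearch, but some of its other mapping parameters can be.
First update the index template:
PUT /_index_template/my-data-stream-template
{
  "index_patterns": [ "my-data-stream*" ],
  "data_stream": { },
  "priority": 500,
  "template": {
    "mappings": {
      "properties": {
        "host": {
          "properties": {
            "ip": {
              "type": "ip",
              "ignore_malformed": true
            }
          }
        }
      }
    }
  }
}
The request above sets "ignore_malformed": true on host.ip.
Then apply the same change to the existing backing indices, in the same way as when adding a field.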
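Changing dynamic index settings
The steps are the same as above: update the index template first, then apply the change to the existing backing indices with the update index settings API (PUT /my-data-stream/_settings).
Changing static index settings
Update the settings in the index template. Unlike a dynamic setting, a static setting change applies only to backing indices created afterwards. To make it take effect immediately, roll the stream over manually so that a new write index is created with the new setting.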
Changing a field type with reindex
As with a regular index, a data stream can be reindexed, for example to change @timestamp from the date type to date_nanos.
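One way to approach this is to reindex the old stream into a new data stream whose index template already declares the new field type. The Python sketch below only builds the body of the POST /_reindex request. It assumes a matching index template for the new stream has been created beforehand, and new-data-stream is a hypothetical name that does not appear elsewhere in these notes.

```python
import json


def reindex_body(source_stream: str, dest_stream: str) -> dict:
    """Body for POST /_reindex when the destination is a data stream."""
    return {
        "source": {"index": source_stream},
        # Data streams are append-only, so the destination op_type is create.
        "dest": {"index": dest_stream, "op_type": "create"},
    }


# new-data-stream is a hypothetical destination whose template maps
# @timestamp as date_nanos.
print(json.dumps(reindex_body("my-data-stream", "new-data-stream"), indent=2))
```

Once the reindex has finished and the new stream has been checked, the old stream can be deleted.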