Deleting expired data in Elasticsearch


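I. Overview

When Elasticsearch is used to collect and process logs, old data eventually becomes useless or of little value, and at that point the expired data needs to be cleaned up. This post describes two ways to do that.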

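1. curator

About versions: choose the Curator release that is compatible with your Elasticsearch version.

Installation:

https://www.elastic.co/guide/en/elasticsearch/client/curator/current/installation.html

I am on Ubuntu, so I followed https://www.elastic.co/guide/en/elasticsearch/client/curator/current/apt-repository.html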

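wget -qO - https://packages.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -

Create /etc/apt/sources.list.d/curator.list (for example with vim) containing this line:

deb [arch=amd64] https://packages.elastic.co/curator/5/debian stable main

sudo apt-get update && sudo apt-get install elasticsearch-curator

I am running elasticsearch-6.5.1, so the version installed here is curator 5.

The installation provides two commands, curator and curator_cli. Only curator is used here.

Two configuration files are needed: a config file and an action file. Create the directories for them and for the log first:

mkdir -p /etc/curator /data/curator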

config:

# cat config_file.yml
client:
  hosts:
    - 127.0.0.1
  port: 9200
  url_prefix:
  use_ssl: False
  certificate:
  client_cert:
  client_key:
  ssl_no_validate: False
  http_auth:
  timeout:
  master_only: true
logging:
  loglevel: INFO
  logfile: "/data/curator/action.log"
  logformat: default

action:

# cat action_file.yml
---
actions:
  1:
    action: delete_indices
    description: >-
      Delete apm-6.5.1 transaction and span indices older than 15 days
      (based on index name). Ignore the error if the filter does not result
      in an actionable list of indices (ignore_empty_list) and exit cleanly.
    options:
      ignore_empty_list: True
      timeout_override:
      continue_if_exception: False
      disable_action: False
    filters:
    - filtertype: pattern
      kind: regex
      value: '^apm-6.5.1-transaction-|^apm-6.5.1-span-'
      exclude:
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 15
      exclude:
  2:
    action: delete_indices
    description: >-
      Delete loadbalance-api- prefixed indices older than 20 days (based on
      index name). Ignore the error if the filter does not result in an
      actionable list of indices (ignore_empty_list) and exit cleanly.
    options:
      ignore_empty_list: True
      timeout_override:
      continue_if_exception: False
      disable_action: False
    filters:
    - filtertype: pattern
      kind: prefix
      value: loadbalance-api-
      exclude:
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y-%m-%d'
      unit: days
      unit_count: 20
      exclude:

 

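To keep specific indices out of the deletion (here the indices for two particular dates), put an exclude filter in front of the others:

---
actions:
  1:
    action: delete_indices
    description: >-
      Delete fluentd-k8s- prefixed indices older than 15 days (based on index
      name), except the indices excluded by the first filter. Ignore the error
      if the filter does not result in an actionable list of indices
      (ignore_empty_list) and exit cleanly.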
    options:
      ignore_empty_list: True
      timeout_override:
      continue_if_exception: False
      disable_action: False
    filters:
    - filtertype: pattern
      kind: regex
      value: 'fluentd-k8s-(2019.02.11|2019.02.12)$'
      exclude: true
    - filtertype: pattern
      kind: prefix
      value: fluentd-k8s-
      exclude:
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 15
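      exclude:

You can define several actions, each keyed by a different number and each with its own cleanup policy. For details see https://www.elastic.co/guide/en/elasticsearch/client/curator/5.6/actions.html

Pay attention to the format of your own index names. Mine, for example, use two date formats: '%Y.%m.%d' for the apm and fluentd-k8s indices and '%Y-%m-%d' for the loadbalance-api indices.

The timestring has to match the index names; otherwise that action ends up with an empty list and nothing is deleted.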
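To illustrate the matching point above, here is a minimal sketch that models how a date pattern such as '%Y.%m.%d' selects index names. It is a simplified stand-in written for this post, not Curator's own matching code. The function name filter_by_timestring is hypothetical, and only %Y, %m and %d are handled.

```shell
#!/bin/bash
# Hypothetical helper (not part of Curator): read index names on stdin and
# print those that consist of the given prefix followed by a date in the
# given strftime-style format. Only %Y, %m and %d are supported.
filter_by_timestring() {
  local prefix=$1 timestring=$2 date_re
  # Turn the timestring into an extended regex: escape literal dots first,
  # then expand the date fields into digit groups.
  date_re=$(printf '%s' "$timestring" | sed \
    -e 's/\./\\./g' \
    -e 's/%Y/[0-9]{4}/g' \
    -e 's/%m/[0-9]{2}/g' \
    -e 's/%d/[0-9]{2}/g')
  # "|| true" keeps the exit status at 0 when nothing matches.
  grep -E "^${prefix}${date_re}\$" || true
}

# A timestring with the wrong separator selects nothing:
printf '%s\n' loadbalance-api-2019-02-06 | filter_by_timestring loadbalance-api- '%Y.%m.%d'
# The matching timestring selects the index:
printf '%s\n' loadbalance-api-2019-02-06 | filter_by_timestring loadbalance-api- '%Y-%m-%d'
```

Feeding it the real index names (for example from the _cat/indices API) is a quick way to check a prefix and timestring pair before putting it into an action file.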

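Important historical data is written to HDFS first and only then deleted. Choose the retention period according to the disk space on your servers and the importance of the logs. Important data that you do not want to delete, such as the Double 11 (Singles' Day) data, can be listed in an exclude filter,

or backed up with a snapshot. After that, all that is left is to set up a scheduled job to run the deletion.

crontab -e

0 0 */25 * * curator --config /etc/curator/config_file.yml /etc/curator/action_file.yml

Set the minute and hour fields explicitly; leaving them as * would run the job every minute on the matching days. With */25 in the day-of-month field the job runs on the 1st and 26th of each month; use 0 0 * * * to run it daily instead.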
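Before handing the job to cron, it may be worth previewing what an action file would delete. The sketch below wraps curator's --dry-run flag, which logs the actions without performing them. The wrapper name preview_cleanup and the file check are my own additions for illustration; the paths are the ones used in this post.

```shell
#!/bin/bash
# Hypothetical wrapper: run curator in dry-run mode so that the indices an
# action file would delete are only logged, not removed.
preview_cleanup() {
  local config=$1 actions=$2
  # Fail early if either file is missing or unreadable.
  [ -r "$config" ] && [ -r "$actions" ] || {
    echo "missing config or action file" >&2
    return 1
  }
  curator --dry-run --config "$config" "$actions"
}

# Usage (results go to the logfile set in the config file):
#   preview_cleanup /etc/curator/config_file.yml /etc/curator/action_file.yml
```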

 

2. Deleting with a script

 

# cat es-dele-indices.sh
#!/bin/bash
# delete elasticsearch indices
searchIndex=fluentd-k8s
elastic_url=127.0.0.1
elastic_port=9200

# Convert a date string to a Unix timestamp (UTC).
date2stamp(){
  date --utc --date "$1" +%s
}

# Print the absolute difference between two dates in the unit selected by
# the first argument (-s, -m, -h or -d; days when no unit flag is given).
dateDiff(){
  case $1 in
    -s)  sec=1;     shift;;
    -m)  sec=60;    shift;;
    -h)  sec=3600;  shift;;
    -d)  sec=86400; shift;;
    *)   sec=86400;;
  esac
  dte1=$(date2stamp "$1")
  dte2=$(date2stamp "$2")
  diffSec=$((dte2-dte1))
  if ((diffSec < 0)); then abs=-1; else abs=1; fi
  echo $((diffSec/sec*abs))
}

# The third column of _cat/indices is the index name.
for index in $(curl -s "${elastic_url}:${elastic_port}/_cat/indices?v" | grep -E " ${searchIndex}-20[0-9][0-9]\.[0-1][0-9]\.[0-3][0-9]" | awk '{ print $3 }'); do
  # The last 10 characters of the name are the date (YYYY.MM.DD).
  date=$(echo ${index: -10} | sed 's/\./-/g')
  cond=$(date +%Y-%m-%d)
  diff=$(dateDiff -d $date $cond)
  echo -n "${index} (${diff})"
  # Delete indices more than 1 day old; raise the threshold to keep more.
  if [ $diff -gt 1 ]; then
    #echo "/ DELETE"
    curl -XDELETE "${elastic_url}:${elastic_port}/${index}?pretty"
  else
    echo ""
  fi
done
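The script above hard-codes the retention threshold and deletes as it goes. As a variation, here is a minimal sketch that only selects the expired index names, so the result can be inspected before anything is deleted. The function name select_expired and its arguments are hypothetical, and it assumes index names end in a YYYY.MM.DD date as in the fluentd-k8s examples.

```shell
#!/bin/bash
# Hypothetical helper: read index names on stdin and print those whose
# trailing YYYY.MM.DD date is more than <keep_days> days before the
# reference date (second argument, default: today in UTC).
select_expired() {
  local keep_days=$1 today=${2:-$(date --utc +%Y-%m-%d)}
  local re='([0-9]{4})\.([0-9]{2})\.([0-9]{2})$'
  local now index day ts
  now=$(date --utc --date "$today" +%s)
  while read -r index; do
    # Skip names that do not end in a date.
    [[ $index =~ $re ]] || continue
    day="${BASH_REMATCH[1]}-${BASH_REMATCH[2]}-${BASH_REMATCH[3]}"
    # Skip names whose suffix is not a valid calendar date.
    ts=$(date --utc --date "$day" +%s 2>/dev/null) || continue
    if (( (now - ts) / 86400 > keep_days )); then
      echo "$index"
    fi
  done
}

# Example with a fixed reference date: only the 16-day-old index is printed.
printf '%s\n' .kibana fluentd-k8s-2019.02.11 fluentd-k8s-2019.02.12 \
  | select_expired 15 2019-02-27
```

Against a live cluster, the names could come from curl -s "127.0.0.1:9200/_cat/indices/fluentd-k8s-*?h=index", and each selected name could then be passed to the same curl -XDELETE call the script uses.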

 

