Logstash Learning Path (2): Importing JSON Data Files into Elasticsearch


 

I. Importing data from a file into Elasticsearch

1. Data preparation:

1. Data file: test.json
2. Index name: index2
3. Document type: type2
4. Bulk operation API: _bulk

Contents of test.json:
{"index":{"_index":"index2","_type":"type2","_id":0}}
{"age":10,"name":"jim"}
{"index":{"_index":"index2","_type":"type2","_id":1}}
{"age":16,"name":"tom"}

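2. A JSON file imported through the _bulk API must follow a specific format: every document line is preceded by an action line (which can carry the index name, type, and document ID), and every line, including the last one, must end with a newline (\n).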

curl -H 'Content-Type: application/x-ndjson'  -s -XPOST localhost:9200/_bulk --data-binary @test.json
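The format rule above can be sketched in Python. This is only an illustration of how a file such as test.json could be generated; the function name build_bulk_body is hypothetical, and the index name, type, and sequential IDs simply mirror the sample data.

```python
import json


def build_bulk_body(docs, index, doc_type):
    """Return an NDJSON string: one action line plus one source line per document."""
    lines = []
    for doc_id, doc in enumerate(docs):
        # Action line: tells Elasticsearch where the next document goes.
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action, separators=(",", ":")))
        # Source line: the document itself, on a single line.
        lines.append(json.dumps(doc, separators=(",", ":")))
    # Every line, including the last one, is terminated by "\n".
    return "".join(line + "\n" for line in lines)


docs = [{"age": 10, "name": "jim"}, {"age": 16, "name": "tom"}]
body = build_bulk_body(docs, "index2", "type2")
print(body, end="")
```

Writing body to test.json should give a file that the curl command above can post unchanged.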

 

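If test.json does not specify the index name, type, or ID, put the index and type in the request URL instead:

curl -H 'Content-Type: application/x-ndjson'  -s -XPOST localhost:9200/index2/type2/_bulk --data-binary @test.json

The action lines in test.json can then be left empty:

{ "index" : { } }
{"age":16,"name":"tom"}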

 

In this case Elasticsearch generates the document IDs automatically.
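As a rough Python equivalent of that curl call, the sketch below builds the same request with empty action lines and the index and type in the URL. The helper name make_bulk_request is hypothetical, and the host localhost:9200 is taken from the curl example; the request is only constructed here, not sent.

```python
import json
import urllib.request


def make_bulk_request(base_url, index, doc_type, docs):
    """Build (but do not send) a _bulk request whose action lines are empty."""
    action = json.dumps({"index": {}})
    # One empty action line before each document; every line ends with "\n".
    body = "".join(action + "\n" + json.dumps(doc) + "\n" for doc in docs)
    url = f"{base_url}/{index}/{doc_type}/_bulk"
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/x-ndjson"},
        method="POST",
    )


req = make_bulk_request(
    "http://localhost:9200", "index2", "type2", [{"age": 16, "name": "tom"}]
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"), end="")
# To actually send it (needs a running Elasticsearch): urllib.request.urlopen(req)
```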

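3. Plain JSON files (one object per line, with no action lines) can be imported with Logstash:

For the detailed Logstash installation and setup steps, see: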

https://www.cnblogs.com/yfb918/p/10763292.html

Prepare the JSON data:

[root@master mnt]# cat data.json
{"age":16,"name":"tom"}
{"age":11,"name":"tsd"}
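Because the file is read one line at a time, each line of data.json has to be a complete JSON object. A small pre-check along the following lines may help catch malformed lines before running Logstash. It is only a sketch: read_json_lines is a hypothetical helper, not part of Logstash, and it raises an error where Logstash itself would handle a bad line differently.

```python
import io
import json


def read_json_lines(stream):
    """Parse a line-delimited JSON stream into a list of dicts, skipping blank lines."""
    events = []
    for lineno, line in enumerate(stream, start=1):
        line = line.strip()
        if not line:
            continue
        try:
            events.append(json.loads(line))
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {lineno} is not valid JSON: {exc.msg}") from exc
    return events


# In-memory stand-in for /mnt/data.json so the sketch runs anywhere.
sample = io.StringIO('{"age":16,"name":"tom"}\n{"age":11,"name":"tsd"}\n')
events = read_json_lines(sample)
print(events)
```

For a real file, pass open("/mnt/data.json", encoding="utf-8") instead of the in-memory sample.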

Create the configuration file:

[root@master bin]# cat json.conf 
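input{
        file{
                path=>"/mnt/data.json"
                # Read the file from its first line instead of tailing it.
                start_position=>"beginning"
                # Do not persist the read position, so the file is re-read on every run.
                sincedb_path=>"/dev/null"
                # Decode each line as a JSON object.
                codec=>json{
                        charset=>"ISO-8859-1"
                }
        }
}
output{
        elasticsearch{
                hosts=>"http://192.168.200.100:9200"
                index=>"jsontestlogstash"
                document_type=>"doc"
        }
        # Also print every event to the console.
        stdout{}
}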

Execution output:

[root@master bin]# ./logstash -f json.conf
[2019-04-25T10:59:14,803][INFO ][logstash.agent ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[2019-04-25T10:59:16,084][INFO ][logstash.agent ] Successfully started Logstash API endpoint {:port=>9600}
{
"name" => "tom",
"age" => 16,
"path" => "/mnt/data.json",
"@timestamp" => 2019-04-25T02:59:16.009Z,
"host" => "master",
"@version" => "1"
}
{
"name" => "tsd",
"age" => 11,
"path" => "/mnt/data.json",
"@timestamp" => 2019-04-25T02:59:16.096Z,
"host" => "master",
"@version" => "1"
}

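The output shows that Logstash added several fields by default (path, @timestamp, host, and @version). If we do not want these automatically generated fields, we can drop them as follows:

Use a filter in the configuration file to remove them: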

[root@master bin]# cat json.conf 
input{
        file{
                path=>"/mnt/data.json"
                start_position=>"beginning"
                sincedb_path=>"/dev/null"
                codec=>json{
                        charset=>"ISO-8859-1"
                }
        }
}
filter{
        mutate{
                remove_field=>["@timestamp","@version","host","path"]
        }
}
output{
        elasticsearch{
                hosts=>"http://192.168.200.100:9200"
                index=>"jsontestlogstash"
                document_type=>"doc"
        }
        stdout{}
}

Result after filtering:
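The filtered output is not reproduced here. As an illustration of what the mutate filter is expected to do, and not as actual Logstash output, the sketch below applies the same field removal to the first event from the earlier run. The function remove_fields is a hypothetical stand-in for the filter.

```python
def remove_fields(event, fields):
    """Return a copy of the event without the listed fields."""
    return {key: value for key, value in event.items() if key not in fields}


# First event from the unfiltered run shown earlier.
event = {
    "name": "tom",
    "age": 16,
    "path": "/mnt/data.json",
    "@timestamp": "2019-04-25T02:59:16.009Z",
    "host": "master",
    "@version": "1",
}
REMOVED = ("@timestamp", "@version", "host", "path")
filtered = remove_fields(event, REMOVED)
print(filtered)
```

Only the fields that came from data.json itself (name and age) remain.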

 

