Flume: Principles, Installation, and Usage

This article is a repost. 2015-07-02 01:09. Category: Hadoop

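1. Flume is a distributed log collection system that delivers the data it collects to a destination.

2. The core concept in Flume is the agent. An agent is a Java process that runs on a log-collecting node.

3. An agent contains three core components: source, channel, and sink.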

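3.1 The source component is dedicated to collecting logs. It can handle log data of many types and formats, including avro, thrift, exec, jms, spooling directory, netcat, sequence generator, syslog, http, legacy, and custom sources.

Once the source has collected data, it stores it temporarily in the channel.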
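To illustrate how a different source type plugs into the same agent, here is a minimal sketch of an exec source that tails a log file. The path /var/log/app.log is a hypothetical placeholder and is not part of the original setup.

```properties
# Sketch only: an exec source that runs a command and turns each output line into an event.
# /var/log/app.log is a hypothetical path used for illustration.
agent1.sources.source1.type = exec
agent1.sources.source1.command = tail -F /var/log/app.log
agent1.sources.source1.channels = channel1
```

Unlike the spooling directory source, an exec source cannot replay output that was produced while the agent was down, so it suits cases where occasional loss is acceptable.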

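3.2 The channel component temporarily stores data inside the agent. Data can be held in memory, jdbc, file, or a custom channel.

Data in a channel is deleted only after the sink has sent it successfully.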
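For comparison with a disk-backed channel, the following is a minimal sketch of a memory channel. The capacity values are illustrative assumptions, not tuned recommendations.

```properties
# Sketch only: a memory channel keeps events in the agent's heap.
# capacity            = maximum number of events held in the channel (illustrative value)
# transactionCapacity = maximum number of events per source/sink transaction (illustrative value)
agent1.channels.channel1.type = memory
agent1.channels.channel1.capacity = 1000
agent1.channels.channel1.transactionCapacity = 100
```

A memory channel is faster but loses any buffered events if the agent process stops, whereas a file channel persists them to disk.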

3.3 The sink component sends data to its destination. Destinations include hdfs, logger, avro, thrift, irc, file, null, hbase, solr, and custom sinks.
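When checking that events actually flow through an agent, it can help to replace the real destination with a logger sink for a while. A minimal sketch, reusing the names sink1 and channel1:

```properties
# Sketch only: a logger sink writes each event to the agent's log at INFO level,
# which is handy for debugging before pointing the sink at HDFS.
agent1.sinks.sink1.type = logger
agent1.sinks.sink1.channel = channel1
```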

4. Throughout the transfer, the unit of data that flows is the event. Transactional guarantees apply at the event level.

5. Flume supports chaining agents across multiple tiers, as well as fan-in and fan-out topologies.
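A fan-out topology can be sketched as one source that replicates every event into two channels, each drained by its own sink. The agent name agent2, the host hadoop1, and the ports below are hypothetical. The avro sink also shows how one agent would hand events to a second-tier agent listening with an avro source on that host and port.

```properties
# Sketch only: fan-out with a replicating channel selector.
# agent2, hadoop1, and the port numbers are hypothetical values for illustration.
agent2.sources = source1
agent2.channels = channel1 channel2
agent2.sinks = sink1 sink2

# One source feeds both channels; "replicating" copies each event to every channel.
agent2.sources.source1.type = netcat
agent2.sources.source1.bind = localhost
agent2.sources.source1.port = 44444
agent2.sources.source1.channels = channel1 channel2
agent2.sources.source1.selector.type = replicating

agent2.channels.channel1.type = memory
agent2.channels.channel2.type = memory

# sink1 logs events locally.
agent2.sinks.sink1.type = logger
agent2.sinks.sink1.channel = channel1

# sink2 forwards events to a downstream agent's avro source (multi-tier flow).
agent2.sinks.sink2.type = avro
agent2.sinks.sink2.hostname = hadoop1
agent2.sinks.sink2.port = 4141
agent2.sinks.sink2.channel = channel2
```

Fan-in is the reverse arrangement: several upstream agents point their avro sinks at the same avro source on a single collecting agent.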

6. Write a configuration file named example

# agent1 is the agent name

agent1.sources=source1

agent1.sinks=sink1

agent1.channels=channel1

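# A Spooling Directory source watches the specified directory for new files. When a new file appears, it parses the file's contents and writes them to the channel. Once the write completes, it marks the file as completed or deletes it.

# Configure source1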

agent1.sources.source1.type=spooldir

agent1.sources.source1.spoolDir=/root/hmbbs

agent1.sources.source1.channels=channel1

agent1.sources.source1.fileHeader = false

agent1.sources.source1.interceptors = i1

agent1.sources.source1.interceptors.i1.type = timestamp

# Configure sink1

agent1.sinks.sink1.type=hdfs

agent1.sinks.sink1.hdfs.path=hdfs://hadoop0:9000/hmbbs

agent1.sinks.sink1.hdfs.fileType=DataStream

agent1.sinks.sink1.hdfs.writeFormat=Text

agent1.sinks.sink1.hdfs.rollInterval=1

agent1.sinks.sink1.channel=channel1

agent1.sinks.sink1.hdfs.filePrefix=%Y-%m-%d

# Configure channel1

agent1.channels.channel1.type=file

agent1.channels.channel1.checkpointDir=/root/hmbbs_tmp/123

agent1.channels.channel1.dataDirs=/root/hmbbs_tmp/

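7. Run the command bin/flume-ng agent -n agent1 -c conf -f conf/example -Dflume.root.logger=DEBUG,console

Here -n gives the agent name, -c the configuration directory, and -f the configuration file. The -D option sends DEBUG-level logging to the console.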
