Flume Basic Operations


1. Write content generated by Telnet to the console

bin/flume-ng agent \
-c conf \
-n a1 \
-f conf/a1.conf \
-Dflume.root.logger=DEBUG,console 


a1.conf contents:

##### define agent name ####
a1.sources = src1
a1.channels = channel1
a1.sinks = sink1
 
####  define source  ####
a1.sources.src1.type = netcat
a1.sources.src1.bind = haoguan-HP-Compaq-Pro-6380-MT
a1.sources.src1.port = 44444
 
####  define channel  ####
a1.channels.channel1.type = memory
a1.channels.channel1.capacity = 1000
a1.channels.channel1.transactionCapacity = 100
 
####  define sink  ####
a1.sinks.sink1.type = logger
a1.sinks.sink1.maxBytesToLog = 1024
 
#### bind the source and sink to the channel
a1.sources.src1.channels = channel1
a1.sinks.sink1.channel = channel1  
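Once the a1 agent is running you can test it with `telnet haoguan-HP-Compaq-Pro-6380-MT 44444` and type a few lines. The netcat source speaks plain line-oriented TCP and acknowledges each accepted event with "OK", so a small client works just as well. The sketch below is illustrative only; it assumes the agent is up and that the host and port match the config:

```python
import socket

def send_events(host, port, lines):
    """Send newline-terminated events to a Flume netcat source.

    The netcat source reads one event per line and replies "OK"
    for each event it accepts.
    """
    with socket.create_connection((host, port)) as sock:
        for line in lines:
            sock.sendall(line.encode("utf-8") + b"\n")
        # Signal that we are done sending, then collect the acks.
        sock.shutdown(socket.SHUT_WR)
        return sock.recv(1024).decode("utf-8")

# Example (requires the a1 agent to be running):
# send_events("haoguan-HP-Compaq-Pro-6380-MT", 44444, ["hello flume"])
```

Each event sent this way should appear in the agent's console output via the logger sink.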

 

2. Write logs generated by Hive to HDFS

bin/flume-ng agent \
-c conf \
-n a2 \
-f conf/flume-hive.conf \
-Dflume.root.logger=DEBUG,console 

 
flume-hive.conf contents:

##### define agent name ####
a2.sources = src2
a2.channels = channel2
a2.sinks = sink2
 
####  define source  ####
a2.sources.src2.type = exec
a2.sources.src2.command = tail -f /opt/modules/cdh/hive-0.13.1-cdh5.3.6/log/hive.log
a2.sources.src2.shell = /bin/bash -c
 
####  define channel  ####
a2.channels.channel2.type = memory
a2.channels.channel2.capacity = 1000
a2.channels.channel2.transactionCapacity = 100
 
####  define sink  ####
a2.sinks.sink2.type = hdfs
# %Y%m%d in the path creates a date-partitioned directory. Note that Flume
# properties files do not support trailing inline comments, so comments must
# sit on their own lines.
a2.sinks.sink2.hdfs.path = hdfs://haoguan-HP-Compaq-Pro-6380-MT:9000/flume_hive_log/%Y%m%d
a2.sinks.sink2.hdfs.filePrefix = events-
a2.sinks.sink2.hdfs.fileType = DataStream
a2.sinks.sink2.hdfs.writeFormat = Text
a2.sinks.sink2.hdfs.batchSize = 10
# flush/roll every 30 seconds, whether or not rollSize has been reached
a2.sinks.sink2.hdfs.rollInterval = 30
# roll once the file reaches this size in bytes, whether or not
# rollInterval has elapsed
a2.sinks.sink2.hdfs.rollSize = 10240
# rollCount must be set to 0, otherwise the event-count trigger interferes
# with the rollInterval and rollSize settings
a2.sinks.sink2.hdfs.rollCount = 0
a2.sinks.sink2.hdfs.idleTimeout = 0
# useLocalTimeStamp must be true when using a timestamp partition (%Y%m%d)
a2.sinks.sink2.hdfs.useLocalTimeStamp = true
 
#### bind the source and sink to the channel
a2.sources.src2.channels = channel2
a2.sinks.sink2.channel = channel2  
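The three roll settings interact: the HDFS sink rolls a file as soon as any non-zero trigger fires, which is why rollCount is set to 0 above; a small default event count would otherwise pre-empt the interval and size triggers. A minimal sketch of that decision logic (an illustration of the documented semantics, not Flume's actual implementation):

```python
def should_roll(elapsed_seconds, bytes_written, events_written,
                roll_interval=30, roll_size=10240, roll_count=0):
    """Mimic the HDFS sink's roll triggers.

    A value of 0 disables the corresponding trigger; the file rolls
    when any enabled trigger is reached.
    """
    if roll_interval and elapsed_seconds >= roll_interval:
        return True
    if roll_size and bytes_written >= roll_size:
        return True
    if roll_count and events_written >= roll_count:
        return True
    return False
```

With the defaults above, `should_roll(31, 0, 0)` rolls on time, `should_roll(5, 10240, 0)` rolls on size, and no number of events alone forces a roll because the count trigger is disabled.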

 
If HDFS is deployed in an HA architecture, copy the HA core-site.xml and hdfs-site.xml into /opt/modules/cdh/flume-1.5.0-cdh5.3.6/conf, and change
a2.sinks.sink2.hdfs.path = hdfs://haoguan-HP-Compaq-Pro-6380-MT:9000/flume_hive_log
to
a2.sinks.sink2.hdfs.path = hdfs://ns1/flume_hive_log
 
 
3. Ingest files into HDFS with a spooldir source

bin/flume-ng agent \
-c conf \
-n a3 \
-f conf/flume-app.conf \
-Dflume.root.logger=DEBUG,console 

 
flume-app.conf contents:

##### define agent name ####
a3.sources = src3
a3.channels = channel3
a3.sinks = sink3
 
####  define source  ####
a3.sources.src3.type = spooldir
# directory from which files are ingested
a3.sources.src3.spoolDir = /opt/modules/cdh/flume-1.5.0-cdh5.3.6/spoollogs
# skip files in the spool directory whose names match this regex
a3.sources.src3.ignorePattern = ^.*\.log$
# suffix appended to a file once it has been fully ingested
a3.sources.src3.fileSuffix = _COMP
 
####  define channel  ####
a3.channels.channel3.type = file
a3.channels.channel3.checkpointDir = /opt/modules/cdh/flume-1.5.0-cdh5.3.6/filechannel/checkpoint
a3.channels.channel3.dataDirs = /opt/modules/cdh/flume-1.5.0-cdh5.3.6/filechannel/data
a3.channels.channel3.capacity = 1000
a3.channels.channel3.transactionCapacity = 100
 
####  define sink  ####
a3.sinks.sink3.type = hdfs
a3.sinks.sink3.hdfs.path = hdfs://haoguan-HP-Compaq-Pro-6380-MT:9000/flume_app_log
a3.sinks.sink3.hdfs.filePrefix = events-
a3.sinks.sink3.hdfs.fileType = DataStream
a3.sinks.sink3.hdfs.writeFormat = Text
a3.sinks.sink3.hdfs.batchSize = 10
a3.sinks.sink3.hdfs.rollInterval = 30
a3.sinks.sink3.hdfs.rollSize = 10240
a3.sinks.sink3.hdfs.rollCount = 0
a3.sinks.sink3.hdfs.idleTimeout = 0
#a3.sinks.sink3.hdfs.useLocalTimeStamp = true
 
#### bind the source and sink to the channel
a3.sources.src3.channels = channel3
a3.sinks.sink3.channel = channel3  
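The spooldir source's contract can be summarized as: pick up each new file dropped into spoolDir, skip files matching ignorePattern, and rename a file with fileSuffix once it has been fully ingested. A rough Python imitation of that behavior, useful for reasoning about which files will and won't be picked up (the real source also tracks partial progress and rejects files modified after pickup, which is omitted here):

```python
import os
import re

def spool_once(spool_dir, ignore_pattern=r"^.*\.log$", file_suffix="_COMP"):
    """Ingest every unprocessed file in spool_dir, mimicking a spooldir source.

    Returns the list of event lines read. Files matching ignore_pattern are
    skipped, and completed files are renamed with file_suffix, as Flume does.
    """
    ignore = re.compile(ignore_pattern)
    events = []
    for name in sorted(os.listdir(spool_dir)):
        # Already-processed or ignored files are left alone.
        if name.endswith(file_suffix) or ignore.match(name):
            continue
        path = os.path.join(spool_dir, name)
        with open(path, encoding="utf-8") as fh:
            events.extend(line.rstrip("\n") for line in fh)
        os.rename(path, path + file_suffix)  # mark the file as processed
    return events
```

With the config above, `app.txt` would be ingested and renamed to `app.txt_COMP`, while `app.log` would be ignored because it matches `^.*\.log$`.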

 



