Flume Environment Setup: Five Example Cases


 



http://flume.apache.org/FlumeUserGuide.html

A simple example

Here, we give an example configuration file, describing a single-node Flume deployment. This configuration lets a user generate events and subsequently logs them to the console.

# example.conf: A single-node Flume configuration

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

This configuration defines a single agent named a1. a1 has a source that listens for data on port 44444, a channel that buffers event data in memory, and a sink that logs event data to the console. The configuration file names the various components, then describes their types and configuration parameters. A given configuration file might define several named agents; when a given Flume process is launched a flag is passed telling it which named agent to manifest.

Given this configuration file, we can start Flume as follows:

$ bin/flume-ng agent --conf conf --conf-file example.conf --name a1 -Dflume.root.logger=INFO,console

Note that in a full deployment we would typically include one more option: --conf=<conf-dir>. The <conf-dir> directory would include a shell script flume-env.sh and potentially a log4j properties file. In this example, we pass a Java option to force Flume to log to the console and we go without a custom environment script.

From a separate terminal, we can then telnet port 44444 and send Flume an event:

$ telnet localhost 44444
Trying 127.0.0.1...
Connected to localhost.localdomain (127.0.0.1).
Escape character is '^]'.
Hello world! <ENTER>
OK

The original Flume terminal will output the event in a log message.

12/06/19 15:32:19 INFO source.NetcatSource: Source starting
12/06/19 15:32:19 INFO source.NetcatSource: Created serverSocket:sun.nio.ch.ServerSocketChannelImpl[/127.0.0.1:44444]
12/06/19 15:32:34 INFO sink.LoggerSink: Event: { headers:{} body: 48 65 6C 6C 6F 20 77 6F 72 6C 64 21 0D          Hello world!. }

Congratulations - you’ve successfully configured and deployed a Flume agent! Subsequent sections cover agent configuration in much more detail.

 

 

 

The concrete setup steps follow.

Flume Setup, Case 1: A Single Flume Agent

 

 

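Install on node2.

1.   Upload the archive to /home/tools, extract it, and move the extracted directory to /home.

2.   Rename the directory, then edit flume-env.sh.

vi flume-env.sh

3.   Configure the Flume environment variables.
vi /etc/profile
source /etc/profile
Check the Flume version to confirm that the environment variables are set correctly.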
 
4.    Create test_flume under /home, then create the Flume configuration file.
cd test_flume
vi flume1

5.    Run the agent to test that Flume is installed correctly.
flume-ng agent --conf /home/test_flume --conf-file /home/test_flume/flume1 --name a1 -Dflume.root.logger=INFO,console
 
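Install telnet, then type anything into the telnet session, for example:
hi flume
Switch to the Flume terminal to see the event.

To exit telnet, press Ctrl+] and then type quit.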
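The article does not show the contents of flume1. As a minimal sketch, assuming flume1 simply mirrors the example.conf from the user guide excerpt above, steps 3 to 5 might look like this. CONF_DIR, the FLUME_HOME path /home/flume, and the file name flume-profile.snippet are assumptions introduced for illustration; the article itself works in /home/test_flume.

```shell
# Sketch only: flume1 is assumed to mirror example.conf from the Flume User Guide.
set -eu

# CONF_DIR is a hypothetical variable; the article uses /home/test_flume.
CONF_DIR="${CONF_DIR:-$PWD/test_flume}"
mkdir -p "$CONF_DIR"

# Lines of the kind step 3 appends to /etc/profile (the FLUME_HOME path is an assumption).
cat > "$CONF_DIR/flume-profile.snippet" <<'EOF'
export FLUME_HOME=/home/flume
export PATH=$PATH:$FLUME_HOME/bin
EOF

# Step 4: netcat source -> memory channel -> logger sink, all on agent a1.
cat > "$CONF_DIR/flume1" <<'EOF'
a1.sources = r1
a1.sinks = k1
a1.channels = c1

a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

a1.sinks.k1.type = logger

a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
EOF

# Step 5: print the start command instead of running it,
# so the sketch also works on a machine without Flume installed.
echo "flume-ng agent --conf $CONF_DIR --conf-file $CONF_DIR/flume1 --name a1 -Dflume.root.logger=INFO,console"
```

Once the profile lines are in effect, running flume-ng version should print the installed version, which is the check described in step 3.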
 

 

 

Flume Setup, Case 2: Two Flume Agents as a Cluster

Install on node1 and node2.
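MemoryChannel configuration:
  1. capacity: the maximum number of events the channel can hold; the default is 100.
  2. transactionCapacity: the maximum number of events taken from a source or delivered to a sink per transaction; the default is also 100.
  3. keep-alive: the time allowed for adding an event to the channel or removing one from it.
  4. byte* (byteCapacity, byteCapacityBufferPercentage): limits on the total size of events in bytes; only the event body is counted.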
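To make these parameters concrete, the sketch below writes a fragment that sets each of them explicitly on a memory channel named c1. The values are illustrative assumptions rather than recommendations, the output file name memory-channel.properties is hypothetical, and the fragment would need to be merged into a complete agent configuration before use.

```shell
# Sketch only: illustrative values for the memory channel parameters listed above.
set -eu

# OUT is a hypothetical output path introduced for this example.
OUT="${OUT:-$PWD/memory-channel.properties}"

cat > "$OUT" <<'EOF'
# Memory channel c1 on agent a1.
a1.channels = c1
a1.channels.c1.type = memory
# Maximum number of events held in the channel.
a1.channels.c1.capacity = 1000
# Maximum number of events per transaction (put by a source or taken by a sink).
a1.channels.c1.transactionCapacity = 100
# Seconds to wait when adding or removing an event.
a1.channels.c1.keep-alive = 3
# Upper bound, in bytes, on the sum of event bodies held in the channel.
a1.channels.c1.byteCapacity = 800000
# Percentage of byteCapacity held back to account for header data.
a1.channels.c1.byteCapacityBufferPercentage = 20
EOF

echo "wrote $OUT"
```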
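1.   On node1 and node2, upload the archive to /home/tools and extract it.
2.   Edit the Java environment variable in conf/flume-env.sh.
3.   Configure the Flume environment variables in /etc/profile.
4.   On node1 and node2, create the test directory test_flume, then create one configuration file on each node: flume21 on node1 and flume22 on node2.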
 
5.   Start Flume on node1 and node2. Because node2 is the downstream agent, start Flume on node2 first, then on node1.
  1. First start Flume on node2. The general form is:
     flume-ng agent -n a1 -c conf -f avro.conf -Dflume.root.logger=INFO,console
     With the configuration file created above:
     flume-ng agent -n a1 -c conf -f /home/test_flume/flume22 -Dflume.root.logger=INFO,console
  2. Then start Flume on node1. The general form is:
     flume-ng agent -n a1 -c conf -f simple.conf2 -Dflume.root.logger=INFO,console
     With the configuration file created above:
     flume-ng agent -n a1 -c conf -f /home/test_flume/flume21 -Dflume.root.logger=INFO,console
 
6.    Open telnet to test; the events appear in the output on node2.
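The contents of flume21 and flume22 are not shown in the article. A common way to chain two agents is an Avro sink on the upstream agent feeding an Avro source on the downstream one; the sketch below assumes that layout. The netcat source on node1, the ports 44444 and 60000, and the CONF_DIR variable are assumptions made for illustration. In practice each file would be written on its own host.

```shell
# Sketch only: an assumed two-agent layout, node1 (netcat -> avro) feeding node2 (avro -> logger).
set -eu

# CONF_DIR is a hypothetical variable; the article uses /home/test_flume.
CONF_DIR="${CONF_DIR:-$PWD/test_flume}"
mkdir -p "$CONF_DIR"

# flume21, for node1: accept text on a netcat port and forward events to node2 over Avro.
cat > "$CONF_DIR/flume21" <<'EOF'
a1.sources = r1
a1.sinks = k1
a1.channels = c1

a1.sources.r1.type = netcat
a1.sources.r1.bind = node1
a1.sources.r1.port = 44444

a1.sinks.k1.type = avro
a1.sinks.k1.hostname = node2
a1.sinks.k1.port = 60000

a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
EOF

# flume22, for node2: receive Avro events from node1 and log them to the console.
cat > "$CONF_DIR/flume22" <<'EOF'
a1.sources = r1
a1.sinks = k1
a1.channels = c1

a1.sources.r1.type = avro
a1.sources.r1.bind = node2
a1.sources.r1.port = 60000

a1.sinks.k1.type = logger

a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
EOF

echo "wrote $CONF_DIR/flume21 and $CONF_DIR/flume22"
```

With this layout the Avro source on node2 has to be listening before the Avro sink on node1 tries to connect, which matches the start order given in step 5. A test would then be telnet node1 44444 from any host that can reach node1.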
 

 

 

 

 

 

Flume Setup, Case 3: How Do You Monitor Changes to a File?

Install on node2.
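1.   On node2, upload the archive to /home/tools and extract it.
2.   Edit the Java environment variable in conf/flume-env.sh.
3.   Configure the Flume environment variables in /etc/profile.
4.   On node2, create the test directory test_flume and, inside it, the configuration file flume3.
mkdir test_flume
vi flume3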
 
5.    Start Flume on node2. The general form is:
     flume-ng agent -n a1 -c conf -f exec.conf -Dflume.root.logger=INFO,console
     With the configuration file created above:
     flume-ng agent -n a1 -c conf -f /home/test_flume/flume3 -Dflume.root.logger=INFO,console
6.    Test.
Create an empty file under /home/test_flume for the demonstration:
touch flume.exec.log
Append data to it in a loop:
for i in {1..50}; do echo "$i hi flume" >> flume.exec.log ; sleep 0.1; done
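The contents of flume3 are not shown. Since the general start command refers to exec.conf and the test appends to flume.exec.log, a plausible reading is an exec source that tails that file; the sketch below assumes exactly that. The logger sink and the CONF_DIR variable are assumptions for illustration.

```shell
# Sketch only: flume3 is assumed to use an exec source that tails flume.exec.log.
set -eu

# CONF_DIR is a hypothetical variable; the article uses /home/test_flume.
CONF_DIR="${CONF_DIR:-$PWD/test_flume}"
mkdir -p "$CONF_DIR"

# The here-document is unquoted on purpose so that $CONF_DIR expands to an absolute path.
cat > "$CONF_DIR/flume3" <<EOF
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Run tail -F and turn every new line of the file into an event.
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F $CONF_DIR/flume.exec.log

a1.sinks.k1.type = logger

a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
EOF

# Create the empty file that the source will follow.
touch "$CONF_DIR/flume.exec.log"
echo "wrote $CONF_DIR/flume3"
```

With the agent running, each line appended by the loop above should show up as one logged event in the Flume terminal.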
 
 

Flume Setup, Case 4: How Do You Monitor Changes to a Directory of Files?

Install on node2.
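1.   On node2, upload the archive to /home/tools and extract it.
2.   Edit the Java environment variable in conf/flume-env.sh.
3.   Configure the Flume environment variables in /etc/profile.
4.   On node2, create the test directory test_flume and, inside it, the configuration file flume4.
mkdir test_flume
vi flume4

5.    Start Flume on node2.
6.    Test.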
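The article gives no configuration or commands for this case. Flume's spooling directory source is the usual way to watch a directory for new files, so the sketch below assumes flume4 uses it. The directory spool_logs, the file sample.log, the logger sink, and the CONF_DIR and SPOOL_DIR variables are all hypothetical names chosen for illustration.

```shell
# Sketch only: flume4 is assumed to use a spooling directory source.
set -eu

# Hypothetical variables; the article uses /home/test_flume for its configuration files.
CONF_DIR="${CONF_DIR:-$PWD/test_flume}"
SPOOL_DIR="${SPOOL_DIR:-$PWD/spool_logs}"
mkdir -p "$CONF_DIR" "$SPOOL_DIR"

# The here-document is unquoted on purpose so that $SPOOL_DIR expands to an absolute path.
cat > "$CONF_DIR/flume4" <<EOF
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Ingest every file that appears in the spooling directory.
a1.sources.r1.type = spooldir
a1.sources.r1.spoolDir = $SPOOL_DIR
# Record the originating file path in a header on each event.
a1.sources.r1.fileHeader = true

a1.sinks.k1.type = logger

a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
EOF

# Step 5 would follow the pattern of the earlier cases (printed here, not executed).
echo "flume-ng agent -n a1 -c conf -f $CONF_DIR/flume4 -Dflume.root.logger=INFO,console"

# Step 6: write the file elsewhere first, then move it in, because files must not
# change once they are inside the spooling directory.
printf 'hi flume 1\nhi flume 2\n' > "$CONF_DIR/sample.log"
mv "$CONF_DIR/sample.log" "$SPOOL_DIR/sample.log"
```

With the agent running, the source reads the new file and then renames it with a .COMPLETED suffix by default, which is an easy way to confirm the file was picked up.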
 
 

 

Flume Setup, Case 5: How Do You Define an HDFS Sink?

Install on node2.

Flume Setup, Case 5: Configuration Options Explained

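1.     Date formats in Flume
    When are they used?
        When collecting data, Flume can create directories by date: data produced today goes into 20170216, and yesterday's data goes into 20170215.
Note!

2.     How does Flume find HDFS?
    When the sink type is hdfs, Flume locates HDFS through the environment variables configured on the system.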
 
3.     When does Flume roll over to a new file?
Rolling is controlled by a time interval, a file size, and an event count:
hdfs.rollInterval (default 30): Number of seconds to wait before rolling the current file (0 = never roll based on time interval)
hdfs.rollSize (default 1024): File size to trigger a roll, in bytes (0 = never roll based on file size)
hdfs.rollCount (default 10): Number of events written to a file before it is rolled (0 = never roll based on number of events)

 

4.   After how long without activity does Flume close a temporary file and turn it into a finished file?

hdfs.idleTimeout (default 0): Timeout after which inactive files get closed (0 = disable automatic closing of idle files)

 

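5.   How often is a new directory created? (For example, a new directory every 10 s.)

      The timestamp is only ever rounded down, never up.

      (For example, with a round value of 5 minutes, minute 57 is assigned to minute 55; minutes 5, 6, 7, 8 and 9 share one directory, and minutes 10, 11, 12, 13 and 14 share the next.)

hdfs.round (default false): Should the timestamp be rounded down (if true, affects all time-based escape sequences except %t)
hdfs.roundValue (default 1): Rounded down to the highest multiple of this (in the unit configured using hdfs.roundUnit), less than current time.
hdfs.roundUnit (default second): The unit of the round down value: second, minute or hour.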
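The round-down rule in item 5 is plain integer arithmetic, and the date-named directories in item 1 correspond to a format such as %Y%m%d. The sketch below reproduces both outside Flume; round_down is a hypothetical helper written for this illustration, not part of Flume.

```shell
# Sketch only: reproduces the round-down arithmetic and the date-based directory name.
set -eu

# round_down VALUE STEP: print the largest multiple of STEP that is <= VALUE.
# Integer division discards the remainder, so the result never rounds up.
round_down() {
    echo $(( $1 / $2 * $2 ))
}

round_down 57 5   # prints 55
round_down 9 5    # prints 5
round_down 10 5   # prints 10
round_down 14 5   # prints 10

# A directory name in the same style as 20170216, built from today's date.
date +%Y%m%d
```

Minutes 5 to 9 all map to 5 and minutes 10 to 14 all map to 10, which is why each group lands in a single directory.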

 

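1.   On node2, upload the archive to /home/tools and extract it.
2.   Edit the Java environment variable in conf/flume-env.sh.
3.   Configure the Flume environment variables in /etc/profile.
4.   On node2, create the test directory test_flume and, inside it, the configuration file flume5.
mkdir test_flume

vi flume5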
 

 

5.    Start Flume on node2.

6.    Test.
Inspect the files in HDFS:
hadoop fs -ls /flume/...
hadoop fs -get /flume/...
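The contents of flume5 are not shown. As a rough sketch that ties the options discussed above together, the following writes an agent with an HDFS sink. The netcat source, the directory layout under /flume, the 5-minute rounding, and the CONF_DIR variable are assumptions for illustration; the roll and idle settings simply restate the defaults quoted earlier.

```shell
# Sketch only: an assumed flume5 with a netcat source and an HDFS sink.
set -eu

# CONF_DIR is a hypothetical variable; the article uses /home/test_flume.
CONF_DIR="${CONF_DIR:-$PWD/test_flume}"
mkdir -p "$CONF_DIR"

cat > "$CONF_DIR/flume5" <<'EOF'
a1.sources = r1
a1.sinks = k1
a1.channels = c1

a1.sources.r1.type = netcat
a1.sources.r1.bind = node2
a1.sources.r1.port = 44444

a1.sinks.k1.type = hdfs
# One directory per day, with a subdirectory per rounded time slot.
a1.sinks.k1.hdfs.path = /flume/%Y%m%d/%H%M
# Write plain event bodies instead of the default SequenceFile format.
a1.sinks.k1.hdfs.fileType = DataStream
# Use the agent's clock for the escape sequences, since netcat events carry no timestamp header.
a1.sinks.k1.hdfs.useLocalTimeStamp = true
# Roll after 30 seconds, 1024 bytes, or 10 events, whichever comes first.
a1.sinks.k1.hdfs.rollInterval = 30
a1.sinks.k1.hdfs.rollSize = 1024
a1.sinks.k1.hdfs.rollCount = 10
# 0 disables automatic closing of idle files.
a1.sinks.k1.hdfs.idleTimeout = 0
# Round timestamps down to 5-minute slots, so minute 57 lands in the 55 directory.
a1.sinks.k1.hdfs.round = true
a1.sinks.k1.hdfs.roundValue = 5
a1.sinks.k1.hdfs.roundUnit = minute

a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
EOF

echo "wrote $CONF_DIR/flume5"
```

Because hdfs.path carries no hdfs:// scheme here, the sink depends on the Hadoop configuration visible to Flume to resolve the file system, in line with item 2 above.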
 

