A Step-by-Step Guide to Building a Flume Distributed Log System


  The previous post, "How should a log system collect and process logs from dozens of business lines?", already introduced Flume's many application scenarios. This post starts with how to set up a single-node log system.

Environment

  CentOS 7.0

  Java 1.8

Download

  Download from the official site: http://flume.apache.org/download.html

  The latest version at the time of writing is apache-flume-1.7.0-bin.tar.gz.

  After downloading, upload the archive to the /usr/local/ directory on the CentOS machine, extract it in place, and rename the extracted directory to flume170, so the install lives at /usr/local/flume170:

tar -zxvf apache-flume-1.7.0-bin.tar.gz
mv apache-flume-1.7.0-bin flume170

 

Install and configure

  Edit the flume-env.sh configuration file; the main change is setting the JAVA_HOME variable. The file is created by copying the template shipped in conf/ (adjust the Java path to match your system):

cp /usr/local/flume170/conf/flume-env.sh.template /usr/local/flume170/conf/flume-env.sh

export JAVA_HOME=/usr/lib/jvm/java8

  Set Flume's global environment variables.

  Open profile:

vi /etc/profile

  Add:

export FLUME=/usr/local/flume170
export PATH=$PATH:$FLUME/bin

  Then reload the file so the environment variables take effect:

source /etc/profile
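As a quick sanity check that the PATH change works, the two exports can be replayed and the result inspected (a minimal sketch using the /usr/local/flume170 install path from above):

```shell
# re-apply the exports from /etc/profile and confirm the Flume bin
# directory is now on PATH
export FLUME=/usr/local/flume170
export PATH=$PATH:$FLUME/bin
echo "$PATH" | grep -q "flume170/bin" && echo "flume on PATH"
```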

Verify that the installation succeeded:

flume-ng version

Test with a small example

  This follows the common Spool-source example found online.

    The spooldir source monitors a configured directory for new files and reads the data out of each file as it arrives. Two caveats:
    1) A file copied into the spool directory must not be opened and edited afterwards.
    2) The spool directory must not contain subdirectories.
    Create the agent configuration file:

# vi /usr/local/flume170/conf/spool.conf
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Describe/configure the source
a1.sources.r1.type = spooldir
a1.sources.r1.channels = c1
a1.sources.r1.spoolDir = /usr/local/flume170/logs
a1.sources.r1.fileHeader = true

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Describe the sink
a1.sinks.k1.type = logger
a1.sinks.k1.channel = c1

  spoolDir: the directory to monitor. When a file appears there, its contents are read and sent on through the sink; once a file has been fully consumed, the suffix .COMPLETED is appended to its name.
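One gotcha worth noting: the spooldir source fails at startup if the monitored directory does not exist, so create it before launching the agent (using the path configured above):

```shell
# create the directory watched by the spooldir source
mkdir -p /usr/local/flume170/logs
```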

  Start Flume agent a1:

/usr/local/flume170/bin/flume-ng agent -c /usr/local/flume170/conf -f /usr/local/flume170/conf/spool.conf -n a1 -Dflume.root.logger=INFO,console

  Drop a file into the /usr/local/flume170/logs directory:

# echo "spool test1" > /usr/local/flume170/logs/spool_text.log

  On the console, you should see output like the following:

14/08/10 11:37:13 INFO source.SpoolDirectorySource: Spooling Directory Source runner has shutdown.
14/08/10 11:37:13 INFO source.SpoolDirectorySource: Spooling Directory Source runner has shutdown.
14/08/10 11:37:14 INFO avro.ReliableSpoolingFileEventReader: Preparing to move file /usr/local/flume170/logs/spool_text.log to /usr/local/flume170/logs/spool_text.log.COMPLETED
14/08/10 11:37:14 INFO source.SpoolDirectorySource: Spooling Directory Source runner has shutdown.
14/08/10 11:37:14 INFO source.SpoolDirectorySource: Spooling Directory Source runner has shutdown.
14/08/10 11:37:14 INFO sink.LoggerSink: Event: { headers:{file=/usr/local/flume170/logs/spool_text.log} body: 73 70 6F 6F 6C 20 74 65 73 74 31 spool test1 }
14/08/10 11:37:15 INFO source.SpoolDirectorySource: Spooling Directory Source runner has shutdown.
14/08/10 11:37:15 INFO source.SpoolDirectorySource: Spooling Directory Source runner has shutdown.
14/08/10 11:37:16 INFO source.SpoolDirectorySource: Spooling Directory Source runner has shutdown.
14/08/10 11:37:16 INFO source.SpoolDirectorySource: Spooling Directory Source runner has shutdown.
14/08/10 11:37:17 INFO source.SpoolDirectorySource: Spooling Directory Source runner has shutdown.
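The logger sink prints each event body as hex bytes; the `73 70 6F 6F 6C 20 74 65 73 74 31` above can be decoded back to the original text to confirm the event content (a plain bash one-liner, nothing Flume-specific):

```shell
# decode the hex byte dump printed by the logger sink
for h in 73 70 6F 6F 6C 20 74 65 73 74 31; do printf "\x$h"; done; echo
# prints: spool test1
```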

   If you see the output above, the agent is up and running. The whole installation is straightforward; most of the work is in the configuration.

   For a distributed setup, you additionally need to wire up sources and sinks across agents.

 

  As the figure above shows, the Flume agent in each business line ships its logs to a second, aggregating Flume agent; the aggregated stream is then sent to Kafka for unified processing and finally persisted to HDFS or HBase. Each business line's Flume agent can be load-balanced and configured with failover, which gives the architecture strong scalability.
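The aggregation topology just described can be sketched as two Flume property files. The hostnames, ports, paths, and Kafka settings below are placeholder assumptions, not values from the original post: each business-line (leaf) agent forwards events over Avro, and a collector agent receives them and pushes to Kafka.

```properties
# leaf agent on each business host: spooldir source -> avro sink
agent1.sources = r1
agent1.channels = c1
agent1.sinks = k1
agent1.sources.r1.type = spooldir
agent1.sources.r1.spoolDir = /var/log/app
agent1.sources.r1.channels = c1
agent1.channels.c1.type = memory
agent1.sinks.k1.type = avro
agent1.sinks.k1.hostname = collector.example.com
agent1.sinks.k1.port = 4545
agent1.sinks.k1.channel = c1

# collector agent: avro source -> kafka sink
collector.sources = r1
collector.channels = c1
collector.sinks = k1
collector.sources.r1.type = avro
collector.sources.r1.bind = 0.0.0.0
collector.sources.r1.port = 4545
collector.sources.r1.channels = c1
collector.channels.c1.type = memory
collector.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
collector.sinks.k1.kafka.topic = app-logs
collector.sinks.k1.kafka.bootstrap.servers = kafka1:9092
collector.sinks.k1.channel = c1
```

For production use, the memory channels would typically be replaced with file channels so events survive an agent crash, and multiple collectors can sit behind a sink group for load balancing and failover.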

 

