Flume documentation: https://flume.apache.org/FlumeUserGuide.html
Flume downloads: https://archive.apache.org/dist/flume/ & https://flume.apache.org/download.html
JDK downloads: https://mirrors.huaweicloud.com/java/jdk/
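Flume does not run as a cluster, and there is no resident service to start in advance. When there is a collection job, run the flume-ng program and point it at that job's configuration.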
I. Installation
# Download
curl -o /opt/apache-flume-1.9.0-bin.tar.gz http://mirrors.tuna.tsinghua.edu.cn/apache/flume/1.9.0/apache-flume-1.9.0-bin.tar.gz
# Extract
tar -zxf /opt/apache-flume-1.9.0-bin.tar.gz -C /opt/
# Configure
cd /opt/apache-flume-1.9.0-bin/conf/
cp flume-env.sh.template flume-env.sh
vim flume-env.sh
In flume-env.sh, the only change needed is the JDK path (JAVA_HOME):
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# If this file is placed at FLUME_CONF_DIR/flume-env.sh, it will be sourced
# during Flume startup.

# Environment variables can be set here.

export JAVA_HOME=/opt/jdk1.8.0_202

# Give Flume more memory and pre-allocate, enable remote monitoring via JMX
# export JAVA_OPTS="-Xms100m -Xmx2000m -Dcom.sun.management.jmxremote"

# Let Flume write raw event data and configuration information to its log files for debugging
# purposes. Enabling these flags is not recommended in production,
# as it may result in logging sensitive user information or encryption secrets.
# export JAVA_OPTS="$JAVA_OPTS -Dorg.apache.flume.log.rawdata=true -Dorg.apache.flume.log.printconfig=true "

# Note that the Flume conf directory is always included in the classpath.
#FLUME_CLASSPATH=""
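Since JAVA_HOME is the only setting that has to be right, it can be worth checking that the path really contains a Java binary before starting an agent. The following is a minimal sketch; `check_java_home` is a hypothetical helper name introduced here, not something Flume ships.

```shell
# Hypothetical helper (not part of Flume): check that a candidate JAVA_HOME
# contains an executable bin/java before writing it into flume-env.sh.
check_java_home() {
    dir="$1"
    if [ -x "$dir/bin/java" ]; then
        echo "ok: $dir/bin/java"
        return 0
    fi
    echo "missing: $dir/bin/java"
    return 1
}
```

For the path used above, the call would be `check_java_home /opt/jdk1.8.0_202`.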
II. HelloWorld
https://flume.apache.org/FlumeUserGuide.html#a-simple-example
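The official port-monitoring example: use Flume to listen on a port, collect the data arriving on that port, and print it to the console.
1. Use nc for the network communication
yum install -y nc
# Server: receives messages
nc -lk 4444
# Client: sends messages
nc 127.0.0.1 4444
# Stop the test server (Ctrl+C) before starting Flume, which will bind the same port 4444
2. Write the Flume Agent configuration file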
https://flume.apache.org/FlumeUserGuide.html#netcat-tcp-source
https://flume.apache.org/FlumeUserGuide.html#netcat-udp-source
flume-netcat-logger.conf
# Name the components on this agent
# a1: the name of the agent
# r1: the name of a1's Source
a1.sources = r1
# k1: the name of a1's Sink
a1.sinks = k1
# c1: the name of a1's Channel
a1.channels = c1

# Describe/configure the source
# a1's input source is of type netcat (listens on a port)
a1.sources.r1.type = netcat
# the host a1 listens on
a1.sources.r1.bind = 127.0.0.1
# the port a1 listens on
a1.sources.r1.port = 4444

# Describe the sink
# a1's output destination is the console, via the logger type
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
# a1's channel type is memory
a1.channels.c1.type = memory
# the channel holds at most 1000 events in total
a1.channels.c1.capacity = 1000
# a single transaction (one put by the source or one take by the sink) carries at most 100 events
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
# connect r1 to c1
a1.sources.r1.channels = c1
# connect k1 to c1
a1.sinks.k1.channel = c1
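To try the same agent on a different port without editing the file by hand, the configuration can be generated from a small script. This is only a sketch under assumptions: `write_netcat_logger_conf` and `show_wiring` are hypothetical helper names, and the property values simply mirror flume-netcat-logger.conf above.

```shell
# Hypothetical helper: write the netcat -> memory -> logger config shown
# above to a file, with the listening port as an optional second argument.
write_netcat_logger_conf() {
    conf_path="$1"
    port="${2:-4444}"
    cat > "$conf_path" <<EOF
a1.sources = r1
a1.sinks = k1
a1.channels = c1

a1.sources.r1.type = netcat
a1.sources.r1.bind = 127.0.0.1
a1.sources.r1.port = $port

a1.sinks.k1.type = logger

a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
EOF
}

# Hypothetical helper: print the two binding lines so it is easy to see
# that the source and the sink name the same channel.
show_wiring() {
    grep -E '^a1\.(sources\.r1\.channels|sinks\.k1\.channel) *=' "$1"
}
```

For example, `write_netcat_logger_conf /tmp/flume-netcat-logger.conf 4444` followed by `show_wiring /tmp/flume-netcat-logger.conf`. Note the asymmetry in the property names: a source uses `channels` (it may write to several channels), while a sink uses `channel` (it reads from exactly one).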
3. Start Flume listening on the port
cd /opt/apache-flume-1.9.0-bin/
# First form
bin/flume-ng agent --conf conf/ --name a1 --conf-file /tmp/flume-netcat-logger.conf -Dflume.root.logger=INFO,console
# Second form
bin/flume-ng agent -c conf/ -n a1 -f /tmp/flume-netcat-logger.conf -Dflume.root.logger=INFO,console
# --conf/-c: Flume's configuration directory (where flume-env.sh lives) is conf/
# --name/-n: the name of the agent to run, a1, matching the a1 prefix used in the configuration file
# --conf-file/-f: the configuration file read for this run is flume-netcat-logger.conf in the /tmp directory
# -Dflume.root.logger=INFO,console: -D sets a Java system property that overrides flume.root.logger for this run, sending log output to the console at INFO level. Log levels include debug, info, warn, and error.
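Because the agent name and the configuration file have to agree, it can help to assemble the command in one place and review it before running. A minimal sketch, assuming it is run from the Flume installation directory; `flume_agent_cmd` is a hypothetical helper that only prints the command and uses no flags beyond those shown above.

```shell
# Hypothetical helper: print the flume-ng command for a given agent name and
# config file instead of running it, so the arguments can be reviewed first.
# The third argument (log level) is optional and defaults to INFO.
flume_agent_cmd() {
    agent_name="$1"
    conf_file="$2"
    log_level="${3:-INFO}"
    echo "bin/flume-ng agent --conf conf/ --name $agent_name --conf-file $conf_file -Dflume.root.logger=$log_level,console"
}
```

Calling `flume_agent_cmd a1 /tmp/flume-netcat-logger.conf` prints the first form of the command; passing `DEBUG` as a third argument raises the console log verbosity. If the name given to `--name` does not match the prefix in the configuration file (here `a1`), the agent should be expected to start without any source, channel, or sink.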
4. Send data to the port being listened on
nc 127.0.0.1 4444
Then check the log output in the console where the Flume agent is running.
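Typing lines by hand works for a first check; for a slightly larger test, numbered lines can be generated and piped into nc. This is a sketch only: `gen_events` is a hypothetical helper, and it assumes the netcat source is left at its defaults, where each newline-terminated line becomes one event.

```shell
# Hypothetical helper: print N numbered test lines, one line per event.
gen_events() {
    count="$1"
    i=1
    while [ "$i" -le "$count" ]; do
        echo "event-$i"
        i=$((i + 1))
    done
}
```

With the agent running, `gen_events 3 | nc 127.0.0.1 4444` should produce three events in the agent's console log. Depending on the nc variant installed, the client may stay connected after its input ends; Ctrl+C closes it.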