Resolving an exception in a running Flume agent

Reposted article. 2017-02-20 17:47. Tag: kafka

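Today I was testing locally how Flume's exec source handles a monitored file being split, and ran into all sorts of "141" exceptions.

I suspected that splitting the file took longer than the wait time allowed while monitoring the text, which caused Flume to exit abnormally. So I increased the keep-alive duration: its default is 3 seconds and I set it to 30 seconds. After that, it ran without the exception.

Fix: set agent1.channels.<channel_name>.keep-alive = 30

Reference: Problem 2 in the article below. The agent setup there may differ from mine, but the key timeout is the same.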

------------------------------------------------- The original article follows. Original URL: http://www.tuicool.com/articles/mmm2AvF

Flume problem analysis and handling

Date: 2014-03-30 11:50:18, CSDN blog

Original: http://blog.csdn.net/wangqiaoshi/article/details/22577975

Topic: Flume

Problem 1:

org.apache.flume.EventDeliveryException: Failed to send events
    at org.apache.flume.sink.AbstractRpcSink.process(AbstractRpcSink.java:382)
    at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
    at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
    at java.lang.Thread.run(Thread.java:722)
Caused by: org.apache.flume.EventDeliveryException: NettyAvroRpcClient { host: 10.95.198.123, port: 44444 }: Failed to send batch
    at org.apache.flume.api.NettyAvroRpcClient.appendBatch(NettyAvroRpcClient.java:294)
    at org.apache.flume.sink.AbstractRpcSink.process(AbstractRpcSink.java:366)
    ... 3 more
Caused by: org.apache.flume.EventDeliveryException: NettyAvroRpcClient { host: 10.95.198.123, port: 44444 }: Avro RPC call returned Status: FAILED
    at org.apache.flume.api.NettyAvroRpcClient.waitForStatusOK(NettyAvroRpcClient.java:370)

Analysis:

Code analysis

try {
  appendBatch(events, requestTimeout, TimeUnit.MILLISECONDS);
} catch (Throwable t) {
  // we mark as no longer active without trying to clean up resources
  // client is required to call close() to clean up resources
  setState(ConnState.DEAD);
  if (t instanceof Error) {
    throw (Error) t;
  }
  if (t instanceof TimeoutException) {
    throw new EventDeliveryException(this + ": Failed to send event. " +
        "RPC request timed out after " + requestTimeout + " ms", t);
  }
  throw new EventDeliveryException(this + ": Failed to send batch", t);
}

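The request timed out, so sending the events failed.

Fix:

Set request-timeout to a larger value. The default is 20 seconds (the property is specified in milliseconds, so 20000).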
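As a rough illustration of that fix, an Avro sink definition with a longer timeout might look like the fragment below. The agent name agent1, the sink name avroSink and the channel name c1 are placeholders invented for this sketch, and 60000 is an arbitrary larger value, not a recommendation. The host and port are taken from the stack trace above.

```properties
# Placeholder names: agent1, avroSink, c1 (not from the original article)
agent1.sinks.avroSink.type = avro
agent1.sinks.avroSink.channel = c1
agent1.sinks.avroSink.hostname = 10.95.198.123
agent1.sinks.avroSink.port = 44444
# request-timeout is in milliseconds; the default is 20000
agent1.sinks.avroSink.request-timeout = 60000
```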

Problem 2:

org.apache.flume.ChannelException: Unable to put batch on required channel: org.apache.flume.channel.MemoryChannel{name: woStoreSoftWDownloadC2}
    at org.apache.flume.channel.ChannelProcessor.processEventBatch(ChannelProcessor.java:200)
    at org.apache.flume.source.ExecSource$ExecRunnable.flushEventBatch(ExecSource.java:376)
    at org.apache.flume.source.ExecSource$ExecRunnable.run(ExecSource.java:336)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:722)
Caused by: org.apache.flume.ChannelException: Space for commit to queue couldn't be acquired Sinks are likely not keeping up with sources, or the buffer size is too tight
    at org.apache.flume.channel.MemoryChannel$MemoryTransaction.doCommit(MemoryChannel.java:128)
    at org.apache.flume.channel.BasicTransactionSemantics.commit(BasicTransactionSemantics.java:151)
    at org.apache.flume.channel.ChannelProcessor.processEventBatch(ChannelProcessor.java:192)
    ... 8 more

30 Mar 2014 10:16:00,960 ERROR [timedFlushExecService18-0] (org.apache.flume.source.ExecSource$ExecRunnable$1.run:322) - Exception occured when processing event batch
org.apache.flume.ChannelException: Unable to put batch on required channel: org.apache.flume.channel.MemoryChannel{name: woStoreSoftWDownloadC2}
    at org.apache.flume.channel.ChannelProcessor.processEventBatch(ChannelProcessor.java:200)
    at org.apache.flume.source.ExecSource$ExecRunnable.flushEventBatch(ExecSource.java:376)
    at org.apache.flume.source.ExecSource$ExecRunnable.access$100(ExecSource.java:249)
    at org.apache.flume.source.ExecSource$ExecRunnable$1.run(ExecSource.java:318)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)

Code analysis:

protected void doCommit() throws InterruptedException {
  int remainingChange = takeList.size() - putList.size();
  if (remainingChange < 0) {
    if (!queueRemaining.tryAcquire(-remainingChange, keepAlive, TimeUnit.SECONDS)) {
      throw new ChannelException("Space for commit to queue couldn't be acquired" +
          " Sinks are likely not keeping up with sources, or the buffer size is too tight");
    }
  }
  int puts = putList.size();
  int takes = takeList.size();
  synchronized (queueLock) {
    if (puts > 0) {
      while (!putList.isEmpty()) {
        if (!queue.offer(putList.removeFirst())) {
          throw new RuntimeException("Queue add failed, this shouldn't be able to happen");
        }
      }
    }
    putList.clear();
    takeList.clear();
  }
  queueStored.release(puts);
  if (remainingChange > 0) {
    queueRemaining.release(remainingChange);
  }
  if (puts > 0) {
    channelCounter.addToEventPutSuccessCount(puts);
  }
  if (takes > 0) {
    channelCounter.addToEventTakeSuccessCount(takes);
  }
  channelCounter.setChannelSize(queue.size());
}

Even after waiting for the keep-alive period, the events still cannot be put into the channel.
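The timed wait in doCommit can be modeled with a plain java.util.concurrent.Semaphore. The following is only a simplified sketch, not Flume code: the class KeepAliveSketch, the method tryCommit, the variable freeSlots and all timings are made up for illustration. It shows that a commit fails when no space is freed within the wait, and succeeds when the wait is long enough for a slow consumer to free space.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class KeepAliveSketch {
    // One permit per free slot in the channel, similar in spirit to queueRemaining.
    // Returns false if 'puts' slots could not be acquired within the wait.
    static boolean tryCommit(Semaphore freeSlots, int puts, long keepAliveMillis)
            throws InterruptedException {
        return freeSlots.tryAcquire(puts, keepAliveMillis, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws Exception {
        Semaphore freeSlots = new Semaphore(0); // the channel starts out full

        // Short wait, nothing is draining the channel: the commit gives up.
        boolean shortWait = tryCommit(freeSlots, 10, 50);

        // A hypothetical slow sink frees 10 slots after 300 ms.
        Thread sink = new Thread(() -> {
            try {
                Thread.sleep(300);
            } catch (InterruptedException e) {
                return;
            }
            freeSlots.release(10);
        });
        sink.start();

        // Longer wait: the commit blocks until the sink has freed the slots.
        boolean longWait = tryCommit(freeSlots, 10, 2000);
        sink.join();

        System.out.println("short keep-alive committed: " + shortWait);
        System.out.println("long keep-alive committed: " + longWait);
    }
}
```

In this model a longer wait only helps if the consumer eventually frees space. If the sink is permanently slower than the source, the channel fills up again regardless of the timeout.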

Solution:

Increase keep-alive (default 3 seconds), capacity (default 100), and transactionCapacity (default 100).

-----------------------------------------------------------------------------------------------

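After that, the earlier exception became less frequent. The remaining exception occurs because messages are produced quickly but consumed slowly, which fills the channel.

Presumably this calls for speeding up consumption from the channel on the receiving end, although I am not sure about that. I then increased the channel capacity and the transaction capacity.

----------- The changed settings are as follows: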

stage_nginx.channels.M1.capacity = 1000
stage_nginx.channels.M1.transactionCapacity = 1000
stage_nginx.channels.M1.keep-alive = 30
