1 Retrying connect to server
Flink on YARN depends on a Hadoop cluster. Here the Flink start command was run before Hadoop had been started:
./bin/yarn-session.sh -n 1 -jm 1024 -tm 4096
As a result, Flink cannot reach the ResourceManager, and the script hangs while it keeps retrying:
2018-05-19 14:36:08,062 INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /0.0.0.0:8032
2018-05-19 14:36:09,231 INFO org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-05-19 14:36:10,234 INFO org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-05-19 14:36:11,235 INFO org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-05-19 14:36:12,238 INFO org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-05-19 14:36:13,240 INFO org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-05-19 14:36:14,247 INFO org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
So don't rush: bring up the Hadoop environment first, then start Flink.
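A quick way to avoid this wait is to check that the ResourceManager port accepts connections before launching the session. Below is a minimal Python sketch of such a check. The function name can_connect is made up for illustration, and the host and port have to come from your own yarn.resourcemanager.address setting (the log above shows the 0.0.0.0:8032 default).

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration only; it checks that something is
    listening, not that the listener is actually a YARN ResourceManager.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, unreachable host, timeout, DNS failure, ...
        return False
```

For example, calling can_connect("hadoop100", 8032) before running yarn-session.sh tells you whether it is worth starting Flink yet (hadoop100 is the host name that appears in the logs of the next section; substitute your own).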
2 Unable to get ClusterClient status from Application Client
With Hadoop now running, run the Flink start command again:
./bin/yarn-session.sh -n 1 -jm 1024 -tm 4096
Flink still fails to start:
2018-05-19 15:30:10,456 WARN akka.remote.ReliableDeliverySupervisor - Association with remote system [akka.tcp://flink@hadoop100:55053] has failed, address is now gated for [5000] ms. Reason: [Disassociated] 2018-05-19 15:30:21,680 WARN org.apache.flink.yarn.cli.FlinkYarnSessionCli - Could not retrieve the current cluster status. Skipping current retrieval attempt ... java.lang.RuntimeException: Unable to get ClusterClient status from Application Client at org.apache.flink.yarn.YarnClusterClient.getClusterStatus(YarnClusterClient.java:253) at org.apache.flink.yarn.cli.FlinkYarnSessionCli.runInteractiveCli(FlinkYarnSessionCli.java:443) at org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:720) at org.apache.flink.yarn.cli.FlinkYarnSessionCli$1.call(FlinkYarnSessionCli.java:514) at org.apache.flink.yarn.cli.FlinkYarnSessionCli$1.call(FlinkYarnSessionCli.java:511) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698) at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41) at org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:511) Caused by: org.apache.flink.util.FlinkException: Could not connect to the leading JobManager. Please check that the JobManager is running. at org.apache.flink.client.program.ClusterClient.getJobManagerGateway(ClusterClient.java:862) at org.apache.flink.yarn.YarnClusterClient.getClusterStatus(YarnClusterClient.java:248) ... 9 more Caused by: org.apache.flink.runtime.leaderretrieval.LeaderRetrievalException: Could not retrieve the leader gateway. at org.apache.flink.runtime.util.LeaderRetrievalUtils.retrieveLeaderGateway(LeaderRetrievalUtils.java:79) at org.apache.flink.client.program.ClusterClient.getJobManagerGateway(ClusterClient.java:857) ... 
10 more Caused by: java.util.concurrent.TimeoutException: Futures timed out after [10000 milliseconds] at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:223) at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:227) at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190) at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53) at scala.concurrent.Await$.result(package.scala:190) at scala.concurrent.Await.result(package.scala) at org.apache.flink.runtime.util.LeaderRetrievalUtils.retrieveLeaderGateway(LeaderRetrievalUtils.java:77) ... 11 more 2018-05-19 15:30:21,691 WARN org.apache.flink.yarn.YarnClusterClient - YARN reported application state FAILED 2018-05-19 15:30:21,692 WARN org.apache.flink.yarn.YarnClusterClient - Diagnostics: Application application_1521277661809_0006 failed 1 times due to AM Container for appattempt_1521277661809_0006_000001 exited with exitCode: -103 For more detailed output, check application tracking page:http://hadoop100:8088/cluster/app/application_1521277661809_0006Then, click on links to logs of each attempt. Diagnostics: Container [pid=6386,containerID=container_1521277661809_0006_01_000001] is running beyond virtual memory limits. Current usage: 250.5 MB of 1 GB physical memory used; 2.2 GB of 2.1 GB virtual memory used. Killing container. 
Dump of the process-tree for container_1521277661809_0006_01_000001 : |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE |- 6386 6384 6386 6386 (bash) 0 0 108625920 331 /bin/bash -c /usr/local/jdk/bin/java -Xmx424m -Dlog.file=/usr/local/hadoop/logs/userlogs/application_1521277661809_0006/container_1521277661809_0006_01_000001/jobmanager.log -Dlog4j.configuration=file:log4j.properties org.apache.flink.yarn.YarnApplicationMasterRunner 1> /usr/local/hadoop/logs/userlogs/application_1521277661809_0006/container_1521277661809_0006_01_000001/jobmanager.out 2> /usr/local/hadoop/logs/userlogs/application_1521277661809_0006/container_1521277661809_0006_01_000001/jobmanager.err |- 6401 6386 6386 6386 (java) 388 72 2287009792 63800 /usr/local/jdk/bin/java -Xmx424m -Dlog.file=/usr/local/hadoop/logs/userlogs/application_1521277661809_0006/container_1521277661809_0006_01_000001/jobmanager.log -Dlog4j.configuration=file:log4j.properties org.apache.flink.yarn.YarnApplicationMasterRunner Container killed on request. Exit code is 143 Container exited with a non-zero exit code 143 Failing this attempt. Failing the application. The YARN cluster has failed 2018-05-19 15:30:21,693 INFO org.apache.flink.yarn.YarnClusterClient - Sending shutdown request to the Application Master 2018-05-19 15:30:21,695 WARN org.apache.flink.yarn.YarnClusterClient - YARN reported application state FAILED 2018-05-19 15:30:21,695 WARN org.apache.flink.yarn.YarnClusterClient - Diagnostics: Application application_1521277661809_0006 failed 1 times due to AM Container for appattempt_1521277661809_0006_000001 exited with exitCode: -103 For more detailed output, check application tracking page:http://hadoop100:8088/cluster/app/application_1521277661809_0006Then, click on links to logs of each attempt. Diagnostics: Container [pid=6386,containerID=container_1521277661809_0006_01_000001] is running beyond virtual memory limits. 
Current usage: 250.5 MB of 1 GB physical memory used; 2.2 GB of 2.1 GB virtual memory used. Killing container. Dump of the process-tree for container_1521277661809_0006_01_000001 : |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE |- 6386 6384 6386 6386 (bash) 0 0 108625920 331 /bin/bash -c /usr/local/jdk/bin/java -Xmx424m -Dlog.file=/usr/local/hadoop/logs/userlogs/application_1521277661809_0006/container_1521277661809_0006_01_000001/jobmanager.log -Dlog4j.configuration=file:log4j.properties org.apache.flink.yarn.YarnApplicationMasterRunner 1> /usr/local/hadoop/logs/userlogs/application_1521277661809_0006/container_1521277661809_0006_01_000001/jobmanager.out 2> /usr/local/hadoop/logs/userlogs/application_1521277661809_0006/container_1521277661809_0006_01_000001/jobmanager.err |- 6401 6386 6386 6386 (java) 388 72 2287009792 63800 /usr/local/jdk/bin/java -Xmx424m -Dlog.file=/usr/local/hadoop/logs/userlogs/application_1521277661809_0006/container_1521277661809_0006_01_000001/jobmanager.log -Dlog4j.configuration=file:log4j.properties org.apache.flink.yarn.YarnApplicationMasterRunner Container killed on request. Exit code is 143 Container exited with a non-zero exit code 143 Failing this attempt. Failing the application. 2018-05-19 15:30:21,697 INFO org.apache.flink.yarn.ApplicationClient - Sending StopCluster request to JobManager. 2018-05-19 15:30:21,726 WARN akka.remote.transport.netty.NettyTransport - Remote connection to [null] failed with java.net.ConnectException: Connection refused: hadoop100/192.168.99.100:55053 2018-05-19 15:30:21,733 WARN akka.remote.ReliableDeliverySupervisor - Association with remote system [akka.tcp://flink@hadoop100:55053] has failed, address is now gated for [5000] ms. 
Reason: [Association failed with [akka.tcp://flink@hadoop100:55053]] Caused by: [Connection refused: hadoop100/192.168.99.100:55053] 2018-05-19 15:30:31,707 WARN org.apache.flink.yarn.YarnClusterClient - Error while stopping YARN cluster. java.util.concurrent.TimeoutException: Futures timed out after [10000 milliseconds] at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:223) at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:157) at scala.concurrent.Await$$anonfun$ready$1.apply(package.scala:169) at scala.concurrent.Await$$anonfun$ready$1.apply(package.scala:169) at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53) at scala.concurrent.Await$.ready(package.scala:169) at scala.concurrent.Await.ready(package.scala) at org.apache.flink.yarn.YarnClusterClient.shutdownCluster(YarnClusterClient.java:377) at org.apache.flink.yarn.YarnClusterClient.finalizeCluster(YarnClusterClient.java:347) at org.apache.flink.client.program.ClusterClient.shutdown(ClusterClient.java:263) at org.apache.flink.yarn.cli.FlinkYarnSessionCli.runInteractiveCli(FlinkYarnSessionCli.java:466) at org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:720) at org.apache.flink.yarn.cli.FlinkYarnSessionCli$1.call(FlinkYarnSessionCli.java:514) at org.apache.flink.yarn.cli.FlinkYarnSessionCli$1.call(FlinkYarnSessionCli.java:511) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698) at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41) at org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:511) 2018-05-19 15:30:31,711 INFO org.apache.flink.yarn.YarnClusterClient - Deleted Yarn properties file at /tmp/.yarn-properties-root 2018-05-19 15:30:31,881 INFO org.apache.flink.yarn.YarnClusterClient - Application 
application_1521277661809_0006 finished with state FAILED and final state FAILED at 1521294610146 2018-05-19 15:30:31,882 WARN org.apache.flink.yarn.YarnClusterClient - Application failed. Diagnostics Application application_1521277661809_0006 failed 1 times due to AM Container for appattempt_1521277661809_0006_000001 exited with exitCode: -103 For more detailed output, check application tracking page:http://hadoop100:8088/cluster/app/application_1521277661809_0006Then, click on links to logs of each attempt. Diagnostics: Container [pid=6386,containerID=container_1521277661809_0006_01_000001] is running beyond virtual memory limits. Current usage: 250.5 MB of 1 GB physical memory used; 2.2 GB of 2.1 GB virtual memory used. Killing container. Dump of the process-tree for container_1521277661809_0006_01_000001 : |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE |- 6386 6384 6386 6386 (bash) 0 0 108625920 331 /bin/bash -c /usr/local/jdk/bin/java -Xmx424m -Dlog.file=/usr/local/hadoop/logs/userlogs/application_1521277661809_0006/container_1521277661809_0006_01_000001/jobmanager.log -Dlog4j.configuration=file:log4j.properties org.apache.flink.yarn.YarnApplicationMasterRunner 1> /usr/local/hadoop/logs/userlogs/application_1521277661809_0006/container_1521277661809_0006_01_000001/jobmanager.out 2> /usr/local/hadoop/logs/userlogs/application_1521277661809_0006/container_1521277661809_0006_01_000001/jobmanager.err |- 6401 6386 6386 6386 (java) 388 72 2287009792 63800 /usr/local/jdk/bin/java -Xmx424m -Dlog.file=/usr/local/hadoop/logs/userlogs/application_1521277661809_0006/container_1521277661809_0006_01_000001/jobmanager.log -Dlog4j.configuration=file:log4j.properties org.apache.flink.yarn.YarnApplicationMasterRunner Container killed on request. Exit code is 143 Container exited with a non-zero exit code 143 Failing this attempt. Failing the application. 
2018-05-19 15:30:31,884 WARN org.apache.flink.yarn.YarnClusterClient - If log aggregation is activated in the Hadoop cluster, we recommend to retrieve the full application log using this command: yarn logs -applicationId application_1521277661809_0006 (It sometimes takes a few seconds until the logs are aggregated) 2018-05-19 15:30:31,885 INFO org.apache.flink.yarn.YarnClusterClient - YARN Client is shutting down 2018-05-19 15:30:31,909 INFO org.apache.flink.yarn.ApplicationClient - Stopped Application client. 2018-05-19 15:30:31,911 INFO org.apache.flink.yarn.ApplicationClient - Disconnect from JobManager Actor[akka.tcp://flink@hadoop100:55053/user/jobmanager#119148826]. 2018-05-19 15:30:31,916 INFO akka.remote.RemoteActorRefProvider$RemotingTerminator - Shutting down remote daemon. 2018-05-19 15:30:31,926 WARN akka.remote.transport.netty.NettyTransport - Remote connection to [null] failed with java.net.ConnectException: Connection refused: hadoop100/192.168.99.100:55053 2018-05-19 15:30:31,935 WARN akka.remote.ReliableDeliverySupervisor - Association with remote system [akka.tcp://flink@hadoop100:55053] has failed, address is now gated for [5000] ms. Reason: [Association failed with [akka.tcp://flink@hadoop100:55053]] Caused by: [Connection refused: hadoop100/192.168.99.100:55053] 2018-05-19 15:30:31,935 INFO akka.remote.RemoteActorRefProvider$RemotingTerminator - Remote daemon shut down; proceeding with flushing remote transports. 2018-05-19 15:30:34,979 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - Stopping interactive command line interface, YARN cluster has been stopped.
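This kind of error is usually caused by insufficient resources (memory, disk, virtual memory, and so on) on the Hadoop cluster.
In most cases, as in the diagnostics above ("2.2 GB of 2.1 GB virtual memory used"), the container exceeded its virtual memory limit. There are two ways to deal with it: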
(1) Turn off Hadoop's virtual memory check by adding the following to yarn-site.xml:
<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
</property>
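(2) Reduce the allocated memory; changing it to 800 was enough for Flink to start normally. This is not a lasting fix, though: after running for a while, the container was still killed in the end: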
AM Container for appattempt_1526107053244_0016_000001 exited with exitCode: -103 For more detailed output, check application tracking page:http://xxx:8099/cluster/app/application_1526107053244_0016Then, click on links to logs of each attempt. Diagnostics: Container [pid=28987,containerID=container_1526107053244_0016_01_000001] is running beyond virtual memory limits. Current usage: 366.0 MB of 1 GB physical memory used; 2.1 GB of 2.1 GB virtual memory used. Killing container. Dump of the process-tree for container_1526107053244_0016_01_000001 : |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE |- 28987 28985 28987 28987 (bash) 0 0 108650496 299 /bin/bash -c /opt/jdk/jdk1.8.0_25/bin/java -Xmx200m -Dlog.file=/opt/xxx/hadoop/hadoop-2.7.3/logs/userlogs/application_1526107053244_0016/container_1526107053244_0016_01_000001/jobmanager.log -Dlogback.configurationFile=file:logback.xml -Dlog4j.configuration=file:log4j.properties org.apache.flink.yarn.YarnApplicationMasterRunner 1> /opt/xxx/hadoop/hadoop-2.7.3/logs/userlogs/application_1526107053244_0016/container_1526107053244_0016_01_000001/jobmanager.out 2> /opt/bl07637/hadoop/hadoop-2.7.3/logs/userlogs/application_1526107053244_0016/container_1526107053244_0016_01_000001/jobmanager.err |- 29009 28987 28987 28987 (java) 5094 780 2186571776 93395 /opt/jdk/jdk1.8.0_25/bin/java -Xmx200m -Dlog.file=/opt/xxx/hadoop/hadoop-2.7.3/logs/userlogs/application_1526107053244_0016/container_1526107053244_0016_01_000001/jobmanager.log -Dlogback.configurationFile=file:logback.xml -Dlog4j.configuration=file:log4j.properties org.apache.flink.yarn.YarnApplicationMasterRunner Container killed on request. Exit code is 143 Container exited with a non-zero exit code 143 Failing this attempt
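The "2.1 GB" limit in these diagnostics is consistent with YARN's default yarn.nodemanager.vmem-pmem-ratio of 2.1 applied to a 1 GB container. The Python sketch below redoes that arithmetic and pulls the usage numbers out of a diagnostics line. The function names and the regular expression are my own, written only against the message format shown in the logs above.

```python
import re

# Default value of yarn.nodemanager.vmem-pmem-ratio in YARN.
DEFAULT_VMEM_PMEM_RATIO = 2.1

# Matches e.g. "250.5 MB of 1 GB physical memory used; 2.2 GB of 2.1 GB virtual memory used"
USAGE = re.compile(
    r"([\d.]+) (MB|GB) of ([\d.]+) (MB|GB) physical memory used; "
    r"([\d.]+) (MB|GB) of ([\d.]+) (MB|GB) virtual memory used"
)


def vmem_limit_gb(physical_gb, ratio=DEFAULT_VMEM_PMEM_RATIO):
    """Virtual memory limit for a container with the given physical memory."""
    return physical_gb * ratio


def _to_gb(value, unit):
    return float(value) / 1024 if unit == "MB" else float(value)


def parse_usage(diagnostics):
    """Extract memory figures (in GB) from a YARN diagnostics message, or None."""
    m = USAGE.search(diagnostics)
    if m is None:
        return None
    g = m.groups()
    return {
        "pmem_used_gb": _to_gb(g[0], g[1]),
        "pmem_limit_gb": _to_gb(g[2], g[3]),
        "vmem_used_gb": _to_gb(g[4], g[5]),
        "vmem_limit_gb": _to_gb(g[6], g[7]),
    }
```

Read this way, the JobManager container was far below its physical limit and was killed purely on the virtual memory check, which is why method (1) helps. Raising yarn.nodemanager.vmem-pmem-ratio is a less drastic alternative to disabling the check altogether.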
3 Cannot instantiate user function
After submitting the jar in the web UI:
org.apache.flink.streaming.runtime.tasks.StreamTaskException: Cannot instantiate user function.
    at org.apache.flink.streaming.api.graph.StreamConfig.getStreamOperator(StreamConfig.java:235)
    at org.apache.flink.streaming.runtime.tasks.OperatorChain.<init>(OperatorChain.java:95)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:231)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:718)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassCastException: cannot assign instance of org.apache.commons.collections.map.LinkedMap to field org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase.pendingOffsetsToCommit of type org.apache.commons.collections.map.LinkedMap in instance of org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer010
    at java.io.ObjectStreamClass$FieldReflector.setObjFieldValues(ObjectStreamClass.java:2133)
    at java.io.ObjectStreamClass.setObjFieldValues(ObjectStreamClass.java:1305)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2006)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
    at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:437)
    at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:424)
    at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:412)
    at org.apache.flink.util.InstantiationUtil.readObjectFromConfig(InstantiationUtil.java:373)
    at org.apache.flink.streaming.api.graph.StreamConfig.getStreamOperator(StreamConfig.java:220)
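A search suggested that most people attribute this to a commons-collections conflict: Flink uses a fairly old version (3.2.2), and the user jar may pull in a newer one. But my jar did not reference commons-collections at all, so for a while I had no lead.
Later I noticed that Flink kept dying for lack of resources, and wondered whether insufficient resources had kept Flink from finishing its startup and so caused this error.
After increasing Flink's startup resource parameters and resubmitting the jar, it ran without problems.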
4 Could not resolve substitution to a value: ${akka.stream.materializer}
After submitting the jar in the web UI, it reported:
Exception in thread "main" com.typesafe.config.ConfigException$UnresolvedSubstitution: reference.conf @ jar:file:/D:/Workspace/Work/middleware/kafka2es/target/kafka2es-0.1.0-SNAPSHOT.jar!/reference.conf: 804: Could not resolve substitution to a value: ${akka.stream.materializer} at com.typesafe.config.impl.ConfigReference.resolveSubstitutions(ConfigReference.java:108) at com.typesafe.config.impl.ResolveContext.realResolve(ResolveContext.java:179) at com.typesafe.config.impl.ResolveContext.resolve(ResolveContext.java:142) at com.typesafe.config.impl.SimpleConfigObject$ResolveModifier.modifyChildMayThrow(SimpleConfigObject.java:379) at com.typesafe.config.impl.SimpleConfigObject.modifyMayThrow(SimpleConfigObject.java:312) at com.typesafe.config.impl.SimpleConfigObject.resolveSubstitutions(SimpleConfigObject.java:398) at com.typesafe.config.impl.ResolveContext.realResolve(ResolveContext.java:179) at com.typesafe.config.impl.ResolveContext.resolve(ResolveContext.java:142) at com.typesafe.config.impl.SimpleConfigObject$ResolveModifier.modifyChildMayThrow(SimpleConfigObject.java:379) at com.typesafe.config.impl.SimpleConfigObject.modifyMayThrow(SimpleConfigObject.java:312) at com.typesafe.config.impl.SimpleConfigObject.resolveSubstitutions(SimpleConfigObject.java:398) at com.typesafe.config.impl.ResolveContext.realResolve(ResolveContext.java:179) at com.typesafe.config.impl.ResolveContext.resolve(ResolveContext.java:142) at com.typesafe.config.impl.SimpleConfigObject$ResolveModifier.modifyChildMayThrow(SimpleConfigObject.java:379) at com.typesafe.config.impl.SimpleConfigObject.modifyMayThrow(SimpleConfigObject.java:312) at com.typesafe.config.impl.SimpleConfigObject.resolveSubstitutions(SimpleConfigObject.java:398) at com.typesafe.config.impl.ResolveContext.realResolve(ResolveContext.java:179) at com.typesafe.config.impl.ResolveContext.resolve(ResolveContext.java:142) at 
com.typesafe.config.impl.SimpleConfigObject$ResolveModifier.modifyChildMayThrow(SimpleConfigObject.java:379) at com.typesafe.config.impl.SimpleConfigObject.modifyMayThrow(SimpleConfigObject.java:312) at com.typesafe.config.impl.SimpleConfigObject.resolveSubstitutions(SimpleConfigObject.java:398) at com.typesafe.config.impl.ResolveContext.realResolve(ResolveContext.java:179) at com.typesafe.config.impl.ResolveContext.resolve(ResolveContext.java:142) at com.typesafe.config.impl.SimpleConfigObject$ResolveModifier.modifyChildMayThrow(SimpleConfigObject.java:379) at com.typesafe.config.impl.SimpleConfigObject.modifyMayThrow(SimpleConfigObject.java:312) at com.typesafe.config.impl.SimpleConfigObject.resolveSubstitutions(SimpleConfigObject.java:398) at com.typesafe.config.impl.ResolveContext.realResolve(ResolveContext.java:179) at com.typesafe.config.impl.ResolveContext.resolve(ResolveContext.java:142) at com.typesafe.config.impl.ResolveContext.resolve(ResolveContext.java:231) at com.typesafe.config.impl.SimpleConfig.resolveWith(SimpleConfig.java:74) at com.typesafe.config.impl.SimpleConfig.resolve(SimpleConfig.java:64) at com.typesafe.config.impl.SimpleConfig.resolve(SimpleConfig.java:59) at com.typesafe.config.impl.SimpleConfig.resolve(SimpleConfig.java:37) at com.typesafe.config.impl.ConfigImpl$1.call(ConfigImpl.java:374) at com.typesafe.config.impl.ConfigImpl$1.call(ConfigImpl.java:367) at com.typesafe.config.impl.ConfigImpl$LoaderCache.getOrElseUpdate(ConfigImpl.java:65) at com.typesafe.config.impl.ConfigImpl.computeCachedConfig(ConfigImpl.java:92) at com.typesafe.config.impl.ConfigImpl.defaultReference(ConfigImpl.java:367) at com.typesafe.config.ConfigFactory.defaultReference(ConfigFactory.java:413) at akka.actor.ActorSystem$Settings.<init>(ActorSystem.scala:307) at akka.actor.ActorSystemImpl.<init>(ActorSystem.scala:683) at akka.actor.ActorSystem$.apply(ActorSystem.scala:245) at akka.actor.ActorSystem$.apply(ActorSystem.scala:288) at 
akka.actor.ActorSystem$.apply(ActorSystem.scala:263) at akka.actor.ActorSystem$.create(ActorSystem.scala:191) at org.apache.flink.runtime.akka.AkkaUtils$.createActorSystem(AkkaUtils.scala:106) at org.apache.flink.runtime.minicluster.FlinkMiniCluster.startJobManagerActorSystem(FlinkMiniCluster.scala:300) at org.apache.flink.runtime.minicluster.FlinkMiniCluster.singleActorSystem$lzycompute$1(FlinkMiniCluster.scala:329) at org.apache.flink.runtime.minicluster.FlinkMiniCluster.org$apache$flink$runtime$minicluster$FlinkMiniCluster$$singleActorSystem$1(FlinkMiniCluster.scala:329) at org.apache.flink.runtime.minicluster.FlinkMiniCluster$$anonfun$1.apply(FlinkMiniCluster.scala:343) at org.apache.flink.runtime.minicluster.FlinkMiniCluster$$anonfun$1.apply(FlinkMiniCluster.scala:341) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) at scala.collection.immutable.Range.foreach(Range.scala:160) at scala.collection.TraversableLike$class.map(TraversableLike.scala:234) at scala.collection.AbstractTraversable.map(Traversable.scala:104) at org.apache.flink.runtime.minicluster.FlinkMiniCluster.start(FlinkMiniCluster.scala:341) at org.apache.flink.runtime.minicluster.FlinkMiniCluster.start(FlinkMiniCluster.scala:323) at org.apache.flink.streaming.api.environment.LocalStreamEnvironment.execute(LocalStreamEnvironment.java:107) at org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1501)
In the jar project's pom, add the following to the maven-shade-plugin -> configuration section:
<transformers>
  <transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
    <resource>reference.conf</resource>
  </transformer>
</transformers>
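Why appending helps: several dependency jars can each ship their own reference.conf, and when they are packed into one fat jar under the same path, only one copy survives unless the files are concatenated. The Python sketch below is a toy model of that effect, not a HOCON parser. The two file contents are invented for illustration and are not the real Akka files, and keeping the first copy is my assumption about what happens without a transformer.

```python
import re

# Invented stand-ins for two dependencies' reference.conf files.
LIB_A_CONF = "akka.http.client.materializer = ${akka.stream.materializer}"
LIB_B_CONF = "akka.stream.materializer { max-input-buffer-size = 16 }"


def keep_one(files):
    """Model a fat jar where only a single reference.conf survives (assumed: the first)."""
    return files[0]


def append_all(files):
    """Model AppendingTransformer: concatenate every reference.conf."""
    return "\n".join(files)


def unresolved_substitutions(text):
    """List ${...} references whose path is not defined on any line of text."""
    defined = {
        line.split("=")[0].split("{")[0].strip()
        for line in text.splitlines()
        if line.strip()
    }
    referenced = re.findall(r"\$\{([^}]+)\}", text)
    return sorted({ref for ref in referenced if ref not in defined})
```

In this model, unresolved_substitutions(keep_one([LIB_A_CONF, LIB_B_CONF])) reports akka.stream.materializer as missing, while the appended version resolves everything, which mirrors the error above and its fix.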
5 java.lang.NoClassDefFoundError: org/apache/kafka/common/serialization/ByteArrayDeserializer
After submitting the jar in the web UI, it reported:
java.util.concurrent.CompletionException: org.apache.flink.util.FlinkException: Could not run the jar.
    at org.apache.flink.runtime.webmonitor.handlers.JarRunHandler.lambda$handleJsonRequest$0(JarRunHandler.java:90)
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.flink.util.FlinkException: Could not run the jar.
    ... 9 more
Caused by: org.apache.flink.client.program.ProgramInvocationException: The program caused an error:
    at org.apache.flink.client.program.OptimizerPlanEnvironment.getOptimizedPlan(OptimizerPlanEnvironment.java:93)
    at org.apache.flink.client.program.ClusterClient.getOptimizedPlan(ClusterClient.java:334)
    at org.apache.flink.runtime.webmonitor.handlers.JarActionHandler.getJobGraphAndClassLoader(JarActionHandler.java:87)
    at org.apache.flink.runtime.webmonitor.handlers.JarRunHandler.lambda$handleJsonRequest$0(JarRunHandler.java:69)
    ... 8 more
Caused by: java.lang.NoClassDefFoundError: org/apache/kafka/common/serialization/ByteArrayDeserializer
    at org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer09.setDeserializer(FlinkKafkaConsumer09.java:286)
    at org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer09.<init>(FlinkKafkaConsumer09.java:213)
    at org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer09.<init>(FlinkKafkaConsumer09.java:152)
    at org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer010.<init>(FlinkKafkaConsumer010.java:128)
    at org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer010.<init>(FlinkKafkaConsumer010.java:112)
    at com.best.middleware.search.kafka2es.xngmonitor.flink.Main.main(Main.java:77)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:525)
    at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:417)
    at org.apache.flink.client.program.OptimizerPlanEnvironment.getOptimizedPlan(OptimizerPlanEnvironment.java:83)
    ... 11 more
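NoClassDefFoundError is a familiar situation: the code compiles, but the class cannot be found at runtime. The general causes are covered in detail online.
In this case, a colleague had put a Kafka jar of a mismatched version into YARN's lib directory, which caused the conflict.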
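To track down this kind of conflict, it helps to list which jars in a lib directory actually contain the class named in the error. Here is a small Python sketch that does that by reading each jar as a zip archive. The function name jars_containing is made up, and the directory you pass in is whatever lib directory your own installation uses.

```python
import zipfile
from pathlib import Path


def jars_containing(lib_dir, class_name):
    """Return the names of the *.jar files in lib_dir that contain class_name.

    class_name may be given with dots or slashes, e.g.
    "org.apache.kafka.common.serialization.ByteArrayDeserializer".
    """
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in sorted(Path(lib_dir).glob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if entry in zf.namelist():
                    hits.append(jar.name)
        except zipfile.BadZipFile:
            # Skip files that are not valid archives.
            continue
    return hits
```

If the result lists more than one Kafka jar, or a jar whose version differs from the one your job was built against, that jar is the likely culprit. An empty result means the class is not on that path at all.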