spark-submit command-line with --files


Submitting the Spark job

bin/spark-submit --name Test --class com.test.batch.modeltrainer.ModelTrainerMain \
  --master local --files /tmp/myobject.ser --verbose /opt/test/lib/spark-test.jar

Reference in the program

val serFile = SparkFiles.get("myobject.ser")

Error message

Exception: 
Exception in thread "main" java.lang.NullPointerException 
  at org.apache.spark.SparkFiles$.getRootDirectory(SparkFiles.scala:37) 
  at org.apache.spark.SparkFiles$.get(SparkFiles.scala:31) 
  at com.test.batch.modeltrainer.ModelTrainerMain$.main(ModelTrainerMain.scala:37) 
  at com.test.batch.modeltrainer.ModelTrainerMain.main(ModelTrainerMain.scala) 
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
  at java.lang.reflect.Method.invoke(Method.java:606) 
  at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:303) 
  at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:55) 
  at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) 

Solution:

SparkEnv is an internal class that is only meant to be used within Spark. Outside of Spark, it will be null because there are no executors or driver to start an environment for. Similarly, SparkFiles is meant to be used internally (though its privacy settings should be modified to reflect that).

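In other words, SparkFiles can only be referenced inside a running Spark application, on the executors or the driver, for example inside an operator: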

sc.parallelize(1 to 100).map { i => SparkFiles.get("my.file") }.collect()
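
For context, here is a minimal sketch of how the one-liner above might sit in a complete program. It assumes the NullPointerException occurs because SparkFiles.get is called before any SparkContext exists, so the sketch creates the context first and resolves the file inside a map operator. The object name SparkFilesSketch and the helper resolveOnExecutors are hypothetical, and the master URL and the --files list are assumed to be supplied by spark-submit as in the command shown earlier.

```scala
import org.apache.spark.{SparkConf, SparkContext, SparkFiles}

// Hypothetical example object; not part of the original application.
object SparkFilesSketch {

  // Resolve the local path of a shipped file from inside tasks.
  // SparkFiles.get runs within the map closure, i.e. inside Spark.
  def resolveOnExecutors(sc: SparkContext, fileName: String): Array[String] =
    sc.parallelize(1 to 4, 2)
      .map(_ => SparkFiles.get(fileName))
      .distinct()
      .collect()

  def main(args: Array[String]): Unit = {
    // Assumption: master and --files are passed on the spark-submit command line.
    val conf = new SparkConf().setAppName("Test")
    val sc = new SparkContext(conf) // the Spark environment exists from here on
    try {
      resolveOnExecutors(sc, "myobject.ser").foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

The important point is the ordering: every call to SparkFiles.get happens after the SparkContext has been constructed, so the internal environment it relies on is no longer null.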

