spark-submit command-line with --files


Submitting the Spark job

bin/spark-submit --name Test --class com.test.batch.modeltrainer.ModelTrainerMain \
  --master local --files /tmp/myobject.ser --verbose /opt/test/lib/spark-test.jar

Referencing the file in the program

val serFile = SparkFiles.get("myobject.ser")

Error message

Exception: 
Exception in thread "main" java.lang.NullPointerException 
  at org.apache.spark.SparkFiles$.getRootDirectory(SparkFiles.scala:37) 
  at org.apache.spark.SparkFiles$.get(SparkFiles.scala:31) 
  at com.test.batch.modeltrainer.ModelTrainerMain$.main(ModelTrainerMain.scala:37) 
  at com.test.batch.modeltrainer.ModelTrainerMain.main(ModelTrainerMain.scala) 
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
  at java.lang.reflect.Method.invoke(Method.java:606) 
  at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:303) 
  at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:55) 
  at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) 

Solution:

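SparkEnv is an internal class that is only meant to be used within Spark. Outside of a running Spark application it is null, because no driver or executor has started an environment yet. SparkFiles is likewise meant to be used internally (though its visibility should be changed to reflect that). In this Spark version SparkFiles.get resolves its root directory through SparkEnv, which is why the call in main fails with a NullPointerException in SparkFiles.getRootDirectory.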

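SparkFiles can only be used inside a running Spark application, that is, on the driver or the executors once the Spark environment exists, for example inside an operator: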

sc.parallelize(1 to 100).map { i => SparkFiles.get("my.file") }.collect()
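
Putting the pieces together, the following is a minimal sketch of how the main class could be reordered so that SparkFiles.get is only called once a SparkContext exists. It assumes the job is launched with the spark-submit command shown earlier (local master, --files /tmp/myobject.ser). The object name ModelTrainerSketch and the helper resolvePaths are hypothetical and are not part of the original program.

```scala
import org.apache.spark.{SparkConf, SparkContext, SparkFiles}

// Hypothetical stand-in for ModelTrainerMain; names are illustrative only.
object ModelTrainerSketch {

  // Resolves a file distributed with --files (or sc.addFile) on the driver
  // and inside an operator. Must be called after the SparkContext is created.
  def resolvePaths(sc: SparkContext, fileName: String): (String, Array[String]) = {
    // Driver side: works here because creating the SparkContext set up SparkEnv.
    val driverPath = SparkFiles.get(fileName)

    // Operator side: the closure runs in tasks, where SparkEnv is also set.
    val taskPaths = sc.parallelize(1 to 4)
      .map(_ => SparkFiles.get(fileName))
      .distinct()
      .collect()

    (driverPath, taskPaths)
  }

  def main(args: Array[String]): Unit = {
    // Master, application name and the file list are assumed to come from spark-submit.
    val sc = new SparkContext(new SparkConf())
    try {
      val (driverPath, taskPaths) = resolvePaths(sc, "myobject.ser")
      println(s"driver path: $driverPath")
      taskPaths.foreach(p => println(s"task path: $p"))
    } finally {
      sc.stop()
    }
  }
}
```

The important point is the ordering: calling SparkFiles.get before new SparkContext(...) reproduces the NullPointerException above. Behaviour in other deploy modes (for example a cluster manager that ships --files itself) may differ and is not covered by this sketch.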

