PySpark: Creating a SparkSession


1. Import SparkSession:

from pyspark.sql import SparkSession

2. Build the session. Here it connects to a standalone cluster at spark://master:7077, sets 2 GB of executor memory, and enables Hive support:

spark = SparkSession.builder.master("spark://master:7077") \
    .appName('compute_customer_age') \
    .config('spark.executor.memory', '2g') \
    .enableHiveSupport() \
    .getOrCreate()

3. The session is now created.

4. It can be used to build DataFrames or to access Hive.

4.1 Building a DataFrame

documentDF = spark.createDataFrame([
    ("Hi I heard about Spark".split(" "),),
    ("I wish Java could use case classes".split(" "),),
    ("Logistic regression models are neat".split(" "),),
], ["text"])

4.2 Accessing Hive

sql = """
"""
df = spark.sql(sql)
df.show()

 




