Save modes for writing data in Spark SQL


Descriptions of the save modes, copied from the official documentation:

SaveMode.ErrorIfExists (Scala/Java, default) / "error" (Python, default): When saving a DataFrame to a data source, if data already exists, an exception is expected to be thrown.

SaveMode.Append / "append": When saving a DataFrame to a data source, if data/table already exists, contents of the DataFrame are expected to be appended to existing data.

SaveMode.Overwrite / "overwrite": Overwrite mode means that when saving a DataFrame to a data source, if data/table already exists, existing data is expected to be overwritten by the contents of the DataFrame.

SaveMode.Ignore / "ignore": Ignore mode means that when saving a DataFrame to a data source, if data already exists, the save operation is expected to not save the contents of the DataFrame and to not change the existing data. This is similar to a `CREATE TABLE IF NOT EXISTS` in SQL.

 

ErrorIfExists throws an exception if the output already exists.

Append, as the name suggests, appends the DataFrame's contents to the existing data.

Overwrite overwrites the existing data.

Ignore skips the save if the data already exists.

 

Also, if no save mode is specified, the default is SaveMode.ErrorIfExists: when I repeated the same save, it failed with an "already exists" error.
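The default can be confirmed with a quick experiment (a sketch only; it assumes a running shell with `sc` defined, and the output path is a placeholder):

import org.apache.spark.sql._
val sqlContext = new SQLContext(sc)
val df = sqlContext.load("/opt/modules/spark1.3.1/examples/src/main/resources/people.json")

// The first save succeeds; repeating it with no mode specified
// throws an "already exists" error, i.e. the default is ErrorIfExists.
df.save("/opt/test/default-mode")
df.save("/opt/test/default-mode")  // throws: path already exists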

 

Usage:

import org.apache.spark.sql._
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
val df = sqlContext.load("/opt/modules/spark1.3.1/examples/src/main/resources/people.json")
df.save("/opt/test/1", "json", SaveMode.Overwrite)  // SaveMode.Overwrite can be swapped for any of the other modes
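For reference, from Spark 1.4 onward the `load`/`save` calls above were superseded by the DataFrameReader/DataFrameWriter API. An equivalent sketch (the paths are the same placeholders as above):

import org.apache.spark.sql.SaveMode

val df2 = sqlContext.read.json("/opt/modules/spark1.3.1/examples/src/main/resources/people.json")

// mode() accepts either the SaveMode enum or its string form ("overwrite", "append", "ignore", "error")
df2.write.mode(SaveMode.Overwrite).json("/opt/test/1")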

  

 

