Several ways to save data in Spark SQL


Descriptions of the save modes, copied from the official documentation:

Scala/Java / Python / Meaning

SaveMode.ErrorIfExists (default) / "error" (default)
    When saving a DataFrame to a data source, if data already exists, an exception is expected to be thrown.

SaveMode.Append / "append"
    When saving a DataFrame to a data source, if data/table already exists, contents of the DataFrame are expected to be appended to existing data.

SaveMode.Overwrite / "overwrite"
    Overwrite mode means that when saving a DataFrame to a data source, if data/table already exists, existing data is expected to be overwritten by the contents of the DataFrame.

SaveMode.Ignore / "ignore"
    Ignore mode means that when saving a DataFrame to a data source, if data already exists, the save operation is expected to not save the contents of the DataFrame and to not change the existing data. This is similar to a `CREATE TABLE IF NOT EXISTS` in SQL.

 

ErrorIfExists throws an exception when the target already exists.

Append, as the name suggests, appends to the existing data.

Overwrite overwrites the existing data.

Ignore does nothing if the target already exists.

 

Also, if no save mode is specified, the default should be SaveMode.ErrorIfExists: when I saved to the same path a second time, I got an "already exists" error.
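A quick way to confirm the default behavior is simply to save to the same path twice without passing a mode. This is a sketch against the Spark 1.3-era API used in the example below (it assumes a running `spark-shell`, where `sc` already exists; the paths are illustrative):

```scala
import org.apache.spark.sql._

// sc is the SparkContext provided by spark-shell
val sqlContext = new SQLContext(sc)
val df = sqlContext.load("/opt/modules/spark1.3.1/examples/src/main/resources/people.json", "json")

// No SaveMode given, so the default (ErrorIfExists) applies.
df.save("/opt/test/default_mode", "json")  // first save succeeds
df.save("/opt/test/default_mode", "json")  // second save throws: path already exists
```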

 

How to use it:

import org.apache.spark.sql._  // also brings SaveMode into scope
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
// Pass "json" explicitly; otherwise load() assumes the default data source (parquet)
val df = sqlContext.load("/opt/modules/spark1.3.1/examples/src/main/resources/people.json", "json")
df.save("/opt/test/1", "json", SaveMode.Overwrite)  // swap SaveMode.Overwrite for any of the other modes
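For reference, from Spark 1.4 onward `load`/`save` were deprecated in favor of the `DataFrameReader`/`DataFrameWriter` API. A rough equivalent of the call above in that style (same paths, assuming an existing `sqlContext`):

```scala
import org.apache.spark.sql.SaveMode

// Read the JSON file via the DataFrameReader API (Spark 1.4+)
val df = sqlContext.read.json("/opt/modules/spark1.3.1/examples/src/main/resources/people.json")

// Write it back out, overwriting any existing data at the target path
df.write.mode(SaveMode.Overwrite).json("/opt/test/1")

// The string forms from the table work too, e.g.:
// df.write.mode("append").json("/opt/test/1")
```

The writer form reads more fluently and avoids the positional `(path, source, mode)` arguments, which is why the older `save` was retired.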

  

 

