Spark: the DataFrame pivot function


Using the pivot function: pivot turns the distinct values of one column into separate output columns. It is called after groupBy and followed by an aggregation such as sum.

// (year, month, num) records; assumes an existing SparkSession named `spark`
val list = List(
  (2017, 1, 100),
  (2017, 1, 50),
  (2017, 2, 100),
  (2017, 3, 50),
  (2018, 2, 200),
  (2018, 2, 100))

import spark.implicits._
val ds = spark.createDataset(list)
val df = ds.toDF("year", "month", "num")

// Pivot: one row per year, one column per distinct month, cell = sum of num
val res: org.apache.spark.sql.DataFrame =
  df.groupBy("year")
    .pivot("month")
    .sum("num")

df.show
+----+-----+---+
|year|month|num|
+----+-----+---+
|2017|    1|100|
|2017|    1| 50|
|2017|    2|100|
|2017|    3| 50|
|2018|    2|200|
|2018|    2|100|
+----+-----+---+

res.show
+----+----+---+----+
|year|   1|  2|   3|
+----+----+---+----+
|2018|null|300|null|
|2017| 150|100|  50|
+----+----+---+----+
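
Each null in the result marks a (year, month) combination that has no rows in the input. If the set of pivot values is known in advance, it can be passed to pivot explicitly, which lets Spark skip the extra job it otherwise runs to collect the distinct values of the pivot column; the nulls can then be replaced with na.fill. A minimal sketch, reusing the df built above (column order and names are the same, only the month list and null handling are added):

// Pivot with an explicit list of month values and nulls replaced by 0
val resExplicit = df
  .groupBy("year")
  .pivot("month", Seq(1, 2, 3))   // only these months become output columns
  .sum("num")
  .na.fill(0)                      // null (no data for that month) becomes 0

resExplicit.show
// Expected output (row order may vary):
// +----+---+---+---+
// |year|  1|  2|  3|
// +----+---+---+---+
// |2018|  0|300|  0|
// |2017|150|100| 50|
// +----+---+---+---+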

