Computing student grade statistics in Scala
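The student score sheet has the format shown below. The first row is the header, whose fields are the student ID, gender, course name 1, course name 2, and so on. Each following row holds one student's record, with fields separated by whitespace.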
Id gender Math English Physics
301610 male 80 64 78
301611 female 65 87 58
...
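The row format above can be illustrated with a small parsing sketch. The names `ParseSketch`, `Student`, and `parseLine` are hypothetical and do not appear in the program further down; the sketch assumes every field after the gender column is a numeric score.

```scala
object ParseSketch {
  // Hypothetical record type for one data row (not part of the original program).
  final case class Student(id: String, gender: String, scores: Vector[Double])

  // Split on runs of whitespace: field 0 is the Id, field 1 the gender,
  // and every remaining field is treated as a score.
  def parseLine(line: String): Student = {
    val fields = line.trim.split("\\s+")
    Student(fields(0), fields(1), fields.drop(2).map(_.toDouble).toVector)
  }
}
```

Because the scores are taken with `drop(2)`, the same function works for sheets with any number of courses.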
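Given any sheet in this format (the number of courses may differ from sheet to sheet), compute the average, lowest, and highest score for each course, using functional programming as far as possible. In addition, split the students by gender and compute the average, lowest, and highest score for each course separately for male and female students.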
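One way to approach the task functionally is sketched below. It is only an illustration under stated assumptions, not the solution given later: the names `StatsSketch`, `summarize`, `perCourse`, and `perGender` are hypothetical, rows are assumed to be already parsed into numbers, and every group is assumed to be non-empty (`min` and `max` throw on an empty collection).

```scala
object StatsSketch {
  // (average, min, max) for one non-empty column of scores.
  def summarize(column: Seq[Double]): (Double, Double, Double) =
    (column.sum / column.size, column.min, column.max)

  // rows holds one Seq of scores per student; transpose turns the
  // per-student rows into per-course columns.
  def perCourse(rows: Seq[Seq[Double]]): Seq[(Double, Double, Double)] =
    rows.transpose.map(summarize)

  // Group (gender, scores) pairs by gender, then reuse perCourse on each group.
  def perGender(students: Seq[(String, Seq[Double])]): Map[String, Seq[(Double, Double, Double)]] =
    students.groupBy(_._1).map { case (gender, group) => gender -> perCourse(group.map(_._2)) }
}
```

Using `groupBy` rather than a two-way split means any value found in the gender column gets its own group, which may or may not be what a given sheet needs.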
Test sample 1:
Id gender Math English Physics
301610 male 80 64 78
301611 female 65 87 58
301612 female 44 71 77
301613 female 66 71 91
301614 female 70 71 100
301615 male 72 77 72
301616 female 73 81 75
301617 female 69 77 75
301618 male 73 61 65
301619 male 74 69 68
301620 male 76 62 76
301621 male 73 69 91
301622 male 55 69 61
301623 male 50 58 75
301624 female 63 83 93
301625 male 72 54 100
301626 male 76 66 73
301627 male 82 87 79
301628 female 62 80 54
301629 male 89 77 72
The statistics output for sample 1 is:
course average min max
Math: 69.20 44.00 89.00
English: 71.70 54.00 87.00
Physics: 76.65 54.00 100.00
course average min max (males)
Math: 72.67 50.00 89.00
English: 67.75 54.00 87.00
Physics: 75.83 61.00 100.00
course average min max (females)
Math: 64.00 44.00 73.00
English: 77.63 71.00 87.00
Physics: 77.88 54.00 100.00
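As a quick arithmetic check of the first row of this output, the Math column of sample 1 can be summarized by hand. The object name `Sample1MathCheck` is hypothetical; the numbers are copied from the Math column of the sample in row order.

```scala
object Sample1MathCheck {
  // Math scores of sample 1, students 301610 through 301629.
  val math: List[Double] =
    List(80, 65, 44, 66, 70, 72, 73, 69, 73, 74,
         76, 73, 55, 50, 63, 72, 76, 82, 62, 89).map(_.toDouble)

  val total: Double = math.sum               // 1384.0
  val average: Double = total / math.size    // 1384 / 20 = 69.2
  val lowest: Double = math.min              // 44.0
  val highest: Double = math.max             // 89.0
}
```

The total is 1384 over 20 students, which gives the 69.20 average listed for Math, along with the minimum of 44 and the maximum of 89.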
Test sample 2:
Id gender Math English Physics Science
301610 male 72 39 74 93
301611 male 75 85 93 26
301612 female 85 79 91 57
301613 female 63 89 61 62
301614 male 72 63 58 64
301615 male 99 82 70 31
301616 female 100 81 63 72
301617 male 74 100 81 59
301618 female 68 72 63 100
301619 male 63 39 59 87
301620 female 84 88 48 48
301621 male 71 88 92 46
301622 male 82 49 66 78
301623 male 63 80 83 88
301624 female 86 80 56 69
301625 male 76 69 86 49
301626 male 91 59 93 51
301627 female 92 76 79 100
301628 male 79 89 78 57
301629 male 85 74 78 80
The statistics output for sample 2 is:
course average min max
Math: 79.00 63.00 100.00
English: 74.05 39.00 100.00
Physics: 73.60 48.00 93.00
Science: 65.85 26.00 100.00
course average min max
Math: 77.08 63.00 99.00
English: 70.46 39.00 100.00
Physics: 77.77 58.00 93.00
Science: 62.23 26.00 93.00
course average min max
Math: 82.57 63.00 100.00
English: 80.71 72.00 89.00
Physics: 65.86 48.00 91.00
Science: 72.57 48.00 100.00
package com

object test {
  def main(args: Array[String]): Unit = {
    // Path to the data file; adjust it to wherever the score sheet is stored
    val inputFile = scala.io.Source.fromFile("C:\\Users\\hasee\\Desktop\\spark2-3-2.txt")
    // "\\s+" is a regular expression that splits each line on whitespace (spaces or tabs)
    // The data is traversed several times, so toList turns the Iterator into a List
    // Blank lines are skipped; originalData has type List[Array[String]]
    val originalData =
      try inputFile.getLines().map(_.trim).filter(_.nonEmpty).map(_.split("\\s+")).toList
      finally inputFile.close()
    val courseNames = originalData.head.drop(2) // course names from the header row
    val allStudents = originalData.tail         // every row after the header
    val courseNum = courseNames.length

    // Statistics function; its parameter is the list of rows to summarize
    // It uses the outer variable courseNum, so it is a closure
    def statistic(lines: List[Array[String]]) = {
      // for comprehension: one (total, min, max) triple per course;
      // the final map then turns each total into an average
      (for (i <- 2 to courseNum + 1) yield {
        val temp = lines map { elem => elem(i).toDouble }
        (temp.sum, temp.min, temp.max)
      }) map { case (total, min, max) => (total / lines.length, min, max) }
    }

    // Output function
    def printResult(theresult: Seq[(Double, Double, Double)]): Unit = {
      // zip pairs each course name with its result triple before iterating
      (courseNames zip theresult) foreach {
        case (course, result) =>
          val label = course + ":"
          println(f"$label%-10s${result._1}%5.2f${result._2}%8.2f${result._3}%8.2f")
      }
    }

    // Compute and print the statistics for all students
    val allResult = statistic(allStudents)
    println("course average min max")
    printResult(allResult)

    // Split the rows into two lists by gender
    val (maleLines, femaleLines) = allStudents partition { _(1) == "male" }

    // Compute and print the statistics for male students
    val maleResult = statistic(maleLines)
    println("course average min max")
    printResult(maleResult)

    // Compute and print the statistics for female students
    val femaleResult = statistic(femaleLines)
    println("course average min max")
    printResult(femaleResult)
  }
}
Run result: