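I had heard about Apache Flink for a long time but never actually tried it. After reading the documentation over the past couple of days, I found that Flink offers more high-level utility functions for real-time streaming.
Implementing WordCount with Apache Flink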
- Download Apache Flink 0.10.1
- Start Flink in local mode
bin/start-local.sh
- Run the Scala shell
bin/start-scala-shell.sh remote localhost 6123
The JobManager in Flink listens on port 6123 by default, which is the port the shell connects to here.
- WordCount
val text = env.fromElements("Whether The slings and arrows of outrageous fortune")
val counts = text.flatMap { _.toLowerCase.split("\\W+") }.map { (_, 1) }.groupBy(0).sum(1)
counts.print()
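
To check what the pipeline above should produce without a running cluster, the same steps can be sketched with plain Scala collections. This is only an illustration that uses the Scala standard library, not the Flink API: `LocalWordCount` and `wordCount` are hypothetical names, and `groupBy(_._1)` followed by a per-group sum stands in for Flink's `groupBy(0).sum(1)`.

```scala
// Plain Scala collections, no Flink dependency.
// Mirrors the shell pipeline: tokenize, pair each word with 1, group by word, sum the counts.
object LocalWordCount {
  def wordCount(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(_.toLowerCase.split("\\W+").toSeq) // lower-case, then split on non-word characters
      .map((_, 1))                                // (word, 1) pairs, as in the Flink job
      .groupBy(_._1)                              // stands in for groupBy(0)
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) } // stands in for sum(1)

  def main(args: Array[String]): Unit = {
    val counts = wordCount(Seq("Whether The slings and arrows of outrageous fortune"))
    // Sort only to make the printed order stable; a Map has no defined order.
    counts.toSeq.sortBy(_._1).foreach(println)
  }
}
```

For the sample sentence every word occurs once, so each of the eight words is paired with a count of 1. The Flink job may print the same pairs in a different order.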