Preface
- My local machine doesn't have Flume, Nginx and the like installed, so I used to package every Streaming program and run it on the cluster. Even with logger calls scattered all over the code, debugging was still quite inconvenient.
- So I wrote two small programs to debug locally in IntelliJ.
- There are two programs:
- GenerateChar.scala, which sends messages over a socket to a specified port;
- the Streaming program to be tested.
GenerateChar
package com.wttttt.spark
import java.io.PrintWriter
import java.net.ServerSocket
/**
* Created with IntelliJ IDEA.
* Description:
* Author: wttttt
* Github: https://github.com/wttttt-wang/hadoop_inaction
* Date: 2017-05-19
* Time: 10:19
*/
object GenerateChar {
  def main(args: Array[String]) {
    // Listen on port 9998 and push simulated request lines to every client that connects.
    val listener = new ServerSocket(9998)
    while (true) {
      val socket = listener.accept()
      // Handle each client in its own thread so accept() can keep serving new connections.
      new Thread() {
        override def run() = {
          println("Got client connected from: " + socket.getInetAddress)
          val out = new PrintWriter(socket.getOutputStream, true)
          while (true) {
            Thread.sleep(3000)
            val context1 = "GET /result.html?Input=test1 HTTP/1.1"
            println(context1)
            val context2 = "GET /result.html?Input=test2 HTTP/1.1"
            println(context2)
            val context3 = "GET /result.html?Input=test3 HTTP/1.1"
            println(context3)
            // Write the lines with different frequencies (test1 once, test2 twice, test3 four
            // times) so the counts on the Streaming side are easy to tell apart.
            out.write(context1 + "\n" + context2 + "\n" + context2 + "\n" +
              context3 + "\n" + context3 + "\n" + context3 + "\n" + context3 + "\n")
            out.flush()
          }
          socket.close()  // never reached while the loop above runs forever
        }
      }.start()
    }
  }
}
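Before wiring up the Streaming job, you can sanity-check the generator on its own: run GenerateChar and connect with a plain TCP client such as nc localhost 9998. You should see the simulated request lines arrive every three seconds, with test1, test2 and test3 in a 1:2:4 ratio.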
Streaming
- This is the Streaming program to be debugged.
- First, set the master to local[x] with x > 1, because the receiver needs its own thread to receive data, leaving at least one more to process it.
- Second, set a local checkpoint directory.
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Milliseconds, StreamingContext}

val conf = new SparkConf()
  .setMaster("local[2]")
  .setAppName("LocalTest")
// WARN StreamingContext: spark.master should be set as local[n], n > 1 in local mode if you have receivers to get data,
// otherwise Spark jobs will not get resources to process the received data.
val ssc = new StreamingContext(conf, Milliseconds(5000))
ssc.checkpoint("flumeCheckpoint/")
val messages = ssc.socketTextStream("localhost", 9998)
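The snippet stops at the input DStream. For completeness, here is a minimal sketch of how the rest of the job could look, assuming a simple per-batch count over the Input= query parameter; the regex, the counts name and the print() output are my own illustration, not necessarily the original program:

// Extract the value of the Input= parameter from each simulated request line,
// then count occurrences within every 5-second batch.
val inputPattern = "Input=(\\w+)".r
val counts = messages
  .flatMap(line => inputPattern.findFirstMatchIn(line).map(_.group(1)).toList)
  .map(word => (word, 1))
  .reduceByKey(_ + _)
counts.print()  // with the generator above, expect test1 : test2 : test3 roughly in a 1:2:4 ratio

ssc.start()
ssc.awaitTermination()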