1. Model Level
### 1. DataStream API
Use a Data Source:
    environment.fromSource(
        Source<OUT, ?, ?> source,
        WatermarkStrategy<OUT> timestampsAndWatermarks,
        String sourceName)
or the legacy StreamExecutionEnvironment.addSource(sourceFunction).
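A minimal sketch of the fromSource path, assuming Flink 1.12+ (NumberSequenceSource is a built-in bounded source in flink-core; the job and source names are illustrative):

```java
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.connector.source.lib.NumberSequenceSource;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class FromSourceExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // New unified Source API: source + watermark strategy + source name
        DataStream<Long> numbers = env.fromSource(
                new NumberSequenceSource(1, 100),    // bounded built-in source
                WatermarkStrategy.noWatermarks(),    // no event-time watermarks needed here
                "number-sequence-source");

        numbers.print();
        env.execute("fromSource example");
    }
}
```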
### 2. DataSet API
DataSet Transformations
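A minimal sketch of a few DataSet transformations (the classic word count, assuming the flink-java DataSet API of this Flink generation):

```java
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.util.Collector;

public class DataSetWordCount {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        DataSet<String> text = env.fromElements("to be", "or not to be");

        DataSet<Tuple2<String, Integer>> counts = text
                // split each line into (word, 1) pairs
                .flatMap((String line, Collector<Tuple2<String, Integer>> out) -> {
                    for (String word : line.split("\\s+")) {
                        out.collect(Tuple2.of(word, 1));
                    }
                })
                // lambdas lose generic type info, so declare the result type explicitly
                .returns(Types.TUPLE(Types.STRING, Types.INT))
                .groupBy(0)   // group by the word field
                .sum(1);      // sum the 1s per word

        counts.print();
    }
}
```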
### 3. Table API & SQL
Dependencies for developing with Java:
    flink-table-common
    flink-table-api-java-bridge
    flink-table-planner-blink
    flink-table-runtime-blink
Imports:
    org.apache.flink.table.api.Table
    org.apache.flink.table.api.bridge.java.StreamTableEnvironment
    org.apache.flink.table.api.bridge.java.BatchTableEnvironment
Flink 1.11 introduced new table source and sink interfaces (DynamicTableSource and DynamicTableSink) in
    org.apache.flink.table.connector.source
    org.apache.flink.table.connector.sink
These interfaces unify batch jobs and streaming jobs.
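A minimal sketch of bootstrapping the Table API on these dependencies with the Blink planner (the table contents and field names are illustrative):

```java
import static org.apache.flink.table.api.Expressions.$;

import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
import org.apache.flink.types.Row;

public class TableApiBootstrap {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Blink planner in streaming mode (the default from Flink 1.11 on)
        EnvironmentSettings settings = EnvironmentSettings.newInstance()
                .useBlinkPlanner()
                .inStreamingMode()
                .build();
        StreamTableEnvironment tableEnv = StreamTableEnvironment.create(env, settings);

        DataStream<Tuple2<String, Integer>> orders =
                env.fromElements(Tuple2.of("beer", 12), Tuple2.of("diaper", 5));

        // Bridge a DataStream into the Table API and query it
        Table table = tableEnv.fromDataStream(orders, $("product"), $("amount"));
        Table result = table.filter($("amount").isGreater(10));

        tableEnv.toAppendStream(result, Row.class).print();
        env.execute("table-api example");
    }
}
```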
2. Data Types
Supported Data Types
Type handling
Creating a TypeInformation or TypeSerializer
Data Types in the Table API:
    org.apache.flink.table.types.DataType is used within the Table API
    and when defining connectors, catalogs, or user-defined functions.
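A brief sketch contrasting the two type systems, assuming Flink 1.11+ (the field names are illustrative):

```java
import org.apache.flink.api.common.typeinfo.TypeHint;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.table.api.DataTypes;
import org.apache.flink.table.types.DataType;

public class TypeExamples {
    public static void main(String[] args) {
        // DataStream/DataSet side: TypeInformation, created via a TypeHint
        TypeInformation<Tuple2<String, Integer>> typeInfo =
                TypeInformation.of(new TypeHint<Tuple2<String, Integer>>() {});

        // Table API side: DataType, used by connectors, catalogs, and UDFs
        DataType rowType = DataTypes.ROW(
                DataTypes.FIELD("name", DataTypes.STRING()),
                DataTypes.FIELD("count", DataTypes.INT()));

        System.out.println(typeInfo);
        System.out.println(rowType);
    }
}
```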
3. Connectors
From the data perspective, there are three kinds of connectors:
    DataStream Connectors
    DataSet Connectors
    Table & SQL Connectors
Roles:
01. DataStream Connectors
    Predefined Sources and Sinks
    Bundled Connectors
    Connectors in Apache Bahir
    Other Ways to Connect to Flink:
        Data Enrichment via Async I/O
        Queryable State
02. DataSet Connectors
    file systems
    other systems, via Input/OutputFormat wrappers for Hadoop
03. Table & SQL Connectors: register table sources and table sinks
    Flink's table connectors
    User-defined Sources & Sinks: develop a custom, user-defined connector
    A connector passes through three stages: Metadata, Planning, Runtime
Implementation:
    Dynamic Table Source
    Dynamic Table Sink
    Dynamic Table Factories
    Encoding / Decoding Formats
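A sketch of the factory piece, assuming Flink 1.11+; the class name and connector identifier are hypothetical, and the actual source construction is omitted:

```java
import java.util.Collections;
import java.util.Set;
import org.apache.flink.configuration.ConfigOption;
import org.apache.flink.table.connector.source.DynamicTableSource;
import org.apache.flink.table.factories.DynamicTableSourceFactory;

// Hypothetical factory skeleton; discovered via Java SPI
// (META-INF/services/org.apache.flink.table.factories.Factory).
public class MySourceFactory implements DynamicTableSourceFactory {

    @Override
    public String factoryIdentifier() {
        return "my-connector"; // matches WITH ('connector' = 'my-connector') in DDL
    }

    @Override
    public Set<ConfigOption<?>> requiredOptions() {
        return Collections.emptySet(); // declare mandatory options here
    }

    @Override
    public Set<ConfigOption<?>> optionalOptions() {
        return Collections.emptySet(); // declare optional options here
    }

    @Override
    public DynamicTableSource createDynamicTableSource(Context context) {
        // A real implementation would validate options and build a ScanTableSource here
        throw new UnsupportedOperationException("source construction omitted in this sketch");
    }
}
```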
Predefined Sources and Sinks
1. Pre-defined source connectors
For custom Sources and SourceOperators, flink-core provides:
org.apache.flink.api.connector.source.SourceSplit
org.apache.flink.api.connector.source.SourceReader
org.apache.flink.api.connector.source.SplitEnumerator
org.apache.flink.api.connector.source.event.NoMoreSplitsEvent
These help when implementing a new custom data source or when studying how Flink's data sources work.
Sources and sinks are often summarized under the term connector.
4. Refactor Source Interface (FLIP-27)
1. Data Source API (the new Source abstraction provided by Flink)
01. A Data Source has three core components:
    Splits, the SplitEnumerator, and the SourceReader.
In the bounded/batch case,
    the enumerator generates a fixed set of splits, and each split is necessarily finite.
    Once all splits have been read, the enumerator responds with NoMoreSplits: a finite set of splits, each of which is bounded.
In the unbounded streaming case,
    one of the two does not hold (splits are not finite, or the enumerator keeps generating new splits).
For example:
    Bounded File Source
    Unbounded Streaming File Source:
        the SplitEnumerator never responds with NoMoreSplits, and it periodically scans for new content.
02. The Source API is a factory-style interface for creating the following components:
    Split Serializer
    Split Enumerator
    Enumerator Checkpoint Serializer
    Source Reader, which consumes records from the splits
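A compilable skeleton of this factory role, assuming the Flink 1.12+ Source interface; MySource, MySplit, and MyEnumState are hypothetical names, and the component bodies are omitted:

```java
import org.apache.flink.api.connector.source.Boundedness;
import org.apache.flink.api.connector.source.Source;
import org.apache.flink.api.connector.source.SourceReader;
import org.apache.flink.api.connector.source.SourceReaderContext;
import org.apache.flink.api.connector.source.SourceSplit;
import org.apache.flink.api.connector.source.SplitEnumerator;
import org.apache.flink.api.connector.source.SplitEnumeratorContext;
import org.apache.flink.core.io.SimpleVersionedSerializer;

// Hypothetical source skeleton showing the factory methods the Source interface demands.
public class MySource implements Source<String, MySource.MySplit, MySource.MyEnumState> {

    public static class MySplit implements SourceSplit {
        @Override
        public String splitId() { return "split-0"; }
    }

    public static class MyEnumState { /* enumerator checkpoint state */ }

    @Override
    public Boundedness getBoundedness() {
        return Boundedness.BOUNDED; // or CONTINUOUS_UNBOUNDED for a streaming source
    }

    @Override
    public SourceReader<String, MySplit> createReader(SourceReaderContext context) {
        throw new UnsupportedOperationException("reader creation omitted in this sketch");
    }

    @Override
    public SplitEnumerator<MySplit, MyEnumState> createEnumerator(
            SplitEnumeratorContext<MySplit> context) {
        throw new UnsupportedOperationException("enumerator creation omitted in this sketch");
    }

    @Override
    public SplitEnumerator<MySplit, MyEnumState> restoreEnumerator(
            SplitEnumeratorContext<MySplit> context, MyEnumState checkpoint) {
        throw new UnsupportedOperationException("restore omitted in this sketch");
    }

    @Override
    public SimpleVersionedSerializer<MySplit> getSplitSerializer() {
        throw new UnsupportedOperationException("split serializer omitted in this sketch");
    }

    @Override
    public SimpleVersionedSerializer<MyEnumState> getEnumeratorCheckpointSerializer() {
        throw new UnsupportedOperationException("checkpoint serializer omitted in this sketch");
    }
}
```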
03. The SplitReader is the high-level API for simple synchronous reading/polling-based source implementations;
    SourceReaderBase and SplitFetcherManager are the support classes for building such readers.
Event Time and Watermarks at the source: do not use the legacy assigner functions anymore,
since the new data sources already assign timestamps and watermarks via the WatermarkStrategy passed to fromSource.
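A minimal sketch of such a WatermarkStrategy (MyEvent and its timestamp field are hypothetical):

```java
import java.time.Duration;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;

public class WatermarkExample {
    // Hypothetical event type carrying an event-time field
    public static class MyEvent {
        public long timestampMillis;
    }

    public static void main(String[] args) {
        // Tolerate up to 5 seconds of out-of-orderness and extract the event timestamp;
        // this strategy is handed to env.fromSource(...) instead of a legacy assigner.
        WatermarkStrategy<MyEvent> strategy = WatermarkStrategy
                .<MyEvent>forBoundedOutOfOrderness(Duration.ofSeconds(5))
                .withTimestampAssigner((event, recordTimestamp) -> event.timestampMillis);

        System.out.println(strategy);
    }
}
```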
2. Data Source Function
01. Predefined Sources and Sinks
    (built into Flink and directly usable, typically for debugging and validation; no external dependencies needed)
    Pre-implemented source functions:
    File-based
    Socket-based
    Collection-based
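A short sketch of these built-in sources (host, port, and file path are placeholders):

```java
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class PredefinedSources {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Collection-based: handy for tests and debugging
        DataStream<String> fromCollection = env.fromElements("a", "b", "c");

        // Socket-based: reads newline-delimited text from a socket (placeholder host/port)
        DataStream<String> fromSocket = env.socketTextStream("localhost", 9999);

        // File-based: reads a text file line by line (placeholder path)
        DataStream<String> fromFile = env.readTextFile("file:///tmp/input.txt");

        fromCollection.print();
        env.execute("predefined sources");
    }
}
```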
02. Connectors provide code for interfacing with various third-party systems.
001. Flink already ships with some bundled connectors (the relevant connector classes must be packaged into the job JAR), e.g.:
    public abstract class KafkaDynamicSinkBase implements DynamicTableSink
    public interface ScanTableSource extends DynamicTableSource
    org.apache.flink.table.connector.sink.DynamicTableSink
    org.apache.flink.table.connector.source.DynamicTableSource
002. Connectors in Apache Bahir
03. Flink provides an Async I/O API as another way to connect external systems, typically used to access external databases.
    Async I/O handles multiple requests concurrently, increasing throughput and reducing latency.
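A minimal sketch of async enrichment, assuming Flink's AsyncDataStream API; the lookup is a stand-in for a real asynchronous database client:

```java
import java.util.Collections;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.AsyncDataStream;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.async.ResultFuture;
import org.apache.flink.streaming.api.functions.async.RichAsyncFunction;

public class AsyncEnrichmentExample {

    // Enriches each key with a value fetched asynchronously (pretend remote lookup)
    public static class AsyncEnrich extends RichAsyncFunction<String, Tuple2<String, String>> {
        @Override
        public void asyncInvoke(String key, ResultFuture<Tuple2<String, String>> resultFuture) {
            CompletableFuture
                    .supplyAsync(() -> "value-for-" + key)  // stand-in for a DB client call
                    .thenAccept(value -> resultFuture.complete(
                            Collections.singleton(Tuple2.of(key, value))));
        }
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        DataStream<String> keys = env.fromElements("k1", "k2", "k3");

        // unorderedWait: results may be emitted out of order; 1s timeout, at most 100 in flight
        DataStream<Tuple2<String, String>> enriched = AsyncDataStream.unorderedWait(
                keys, new AsyncEnrich(), 1000, TimeUnit.MILLISECONDS, 100);

        enriched.print();
        env.execute("async-io example");
    }
}
```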
04. Queryable State