[TensorFlow Source Code Illustrated] TF System Overview

Reposted article, 2016-08-15 15:40. Category: TensorFlow source code analysis

Rendezvous

1. Defined in core/framework/rendezvous.h.

2. A Rendezvous is an abstraction for passing a Tensor from a producer to a consumer, where the consumer may safely request the Tensor before or after it has been produced. A producer never blocks when using a Rendezvous. A consumer has the choice of making a blocking call or providing a callback: in either case, the consumer receives the Tensor as soon as it is available.

(In short: send never blocks; receive is either a blocking call or a callback.)

3. A Rendezvous key encodes a single <producer, consumer> pair. It is an error to call Send() or Recv*() more than once with the same key.

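4. In message-passing mechanisms, delivering a message raises the question of mailbox capacity. At one extreme the mailbox capacity is 0: if send executes before receive, the sending process blocks until the receive has completed. When the receive executes, the message can be copied directly from the sender to the receiver, without any intermediate buffer. Likewise, if receive executes first, the receiver blocks until a send occurs. This strategy is called the rendezvous principle.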

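5. TensorFlow's message passing follows the "non-blocking send, blocking receive" model. It has two implementations:

> LocalRendezvous (local message passing)

> RpcRemoteRendezvous (distributed message passing)

> A further, special form of communication is IntraProcessRendezvous (rendezvous_mgr.h), used for communication between different devices in the local process.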

Buffering of Tensor values is delegated to a "local" Rendezvous obtained from NewLocalRendezvous().

This class just adds functionality to coordinate multiple process-local devices.

6. Among the Op kernels there are two classes, SendOp and RecvOp (kernels/sendrecv_ops.h), which are used together with Rendezvous.

7. "Each node:port specified in inputs is replaced with a feed node, which will pick up the provided input tensor from specially-initialized entries in a Rendezvous object used for the Run call" (from the TensorFlow white paper)

See also: the rendezvous concept in MINIX message passing
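The send/receive contract described in items 2 and 3 above can be illustrated with a small Python sketch. It is not TensorFlow's implementation: the class name `LocalRendezvousSketch`, the method names, and the use of plain strings as keys are all assumptions made for illustration. It only shows the three properties stated in the notes: send never blocks, receive either blocks or takes a callback, and each key may be used once per side.

```python
import threading


class LocalRendezvousSketch:
    """Toy rendezvous (hypothetical, not the TensorFlow class)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._values = {}       # key -> value sent but not yet received
        self._waiters = {}      # key -> callback registered before the send
        self._sent = set()      # keys already used by send()
        self._received = set()  # keys already used by recv()/recv_async()

    def send(self, key, value):
        """Never blocks: hand the value to a waiting consumer or buffer it."""
        with self._lock:
            if key in self._sent:
                raise KeyError(f"duplicate send for key {key!r}")
            self._sent.add(key)
            callback = self._waiters.pop(key, None)
            if callback is None:
                self._values[key] = value
                return
        callback(value)  # run the consumer's callback outside the lock

    def recv_async(self, key, callback):
        """Callback form: fires now if the value exists, else on send()."""
        with self._lock:
            if key in self._received:
                raise KeyError(f"duplicate recv for key {key!r}")
            self._received.add(key)
            if key not in self._values:
                self._waiters[key] = callback
                return
            value = self._values.pop(key)
        callback(value)

    def recv(self, key, timeout=None):
        """Blocking form, built on top of the callback form."""
        done = threading.Event()
        box = []

        def on_value(value):
            box.append(value)
            done.set()

        self.recv_async(key, on_value)
        if not done.wait(timeout):
            raise TimeoutError(f"no value arrived for key {key!r}")
        return box[0]


rv = LocalRendezvousSketch()
rv.send("edge_1", [1.0, 2.0])          # producer first: value is buffered
first = rv.recv("edge_1", timeout=1)   # consumer picks it up immediately

seen = []
rv.recv_async("edge_2", seen.append)   # consumer first: callback is parked
rv.send("edge_2", "hello")             # send delivers to the parked callback
```

Unlike the zero-capacity mailbox of item 4, this sketch buffers the value on the producer side, which is what lets send return without waiting.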

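Symbolic programming

See: MXNet design notes, a comparison of programming models for deep learning

Imperative (flexible)

Symbolic (efficient)

Forward computation graph (explicit) + backward computation graph (implicit)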
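To make the imperative/symbolic contrast concrete, here is a minimal Python sketch. The classes `Node`, `Placeholder` and `Op` are hypothetical and are not taken from TensorFlow or MXNet. The imperative version computes a value on every line, while the symbolic version first builds a graph and only computes when `run` is called with concrete inputs.

```python
class Node:
    """Base class: arithmetic on nodes builds new graph nodes."""

    def __add__(self, other):
        return Op(lambda x, y: x + y, self, other)

    def __mul__(self, other):
        return Op(lambda x, y: x * y, self, other)


class Placeholder(Node):
    """A named input whose value is supplied at run time."""

    def __init__(self, name):
        self.name = name

    def run(self, feed):
        return feed[self.name]


class Op(Node):
    """A binary operation over two upstream nodes."""

    def __init__(self, fn, left, right):
        self.fn, self.left, self.right = fn, left, right

    def run(self, feed):
        return self.fn(self.left.run(feed), self.right.run(feed))


# Imperative style: each statement is executed as soon as it is reached.
a, b = 2, 3
imperative_result = (a + b) * b

# Symbolic style: describe the computation first, execute it later.
x, y = Placeholder("x"), Placeholder("y")
graph = (x + y) * y                      # nothing is computed here
symbolic_result = graph.run({"x": 2, "y": 3})
```

Because the whole graph exists before anything runs, a symbolic system can inspect and optimize it, which is the efficiency advantage the note refers to.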

Session

## Description: A Session instance lets a caller drive a TensorFlow graph computation.
## Related files: /public/session.h, /comm_rt/[[session.cc session_factory.cc session_factory.h session_options.cc session_state.cc]]
## Client programs interact with the TensorFlow system through a Session. When a Session is created, its graph is initially empty. To build the graph, the Session interface provides the Extend function, which adds further nodes and edges to the current graph.

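Run() is the other important function in the Session interface. Its arguments include the names of the outputs to be computed and an optional set of tensors to feed into the graph.

To produce the requested outputs, TensorFlow computes the transitive closure of all nodes that must be executed, and then runs those nodes in an order that respects their dependencies (described in more detail in Section 3.1 of the white paper).

In most TensorFlow applications, a Session is set up once, and Run() is then called many times to execute either the whole graph or a few distinct subgraphs.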

Table 1: Some of the operations in the TensorFlow core library

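Note on transitive closure: in mathematics, the transitive closure of a binary relation R on a set X is the smallest transitive relation on X that contains R.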
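The way Run() narrows the graph down to what the requested outputs need can be sketched as a reachability walk over the "depends on" relation, followed by a dependency-respecting order. This is a simplified illustration, not TensorFlow's code: the function name `nodes_to_run` and the dictionary representation of the graph are assumptions, and the sketch assumes the graph is acyclic.

```python
def nodes_to_run(deps, outputs):
    """Return the nodes needed for `outputs`, dependencies first.

    deps maps a node name to the list of nodes it depends on.
    Assumes the graph is a DAG (no cycle detection is done).
    """
    order = []
    visited = set()

    def visit(node):
        if node in visited:
            return
        visited.add(node)
        for dep in deps.get(node, []):
            visit(dep)
        order.append(node)  # post-order: a node comes after its inputs

    for out in outputs:
        visit(out)
    return order


# Hypothetical graph: d needs c, c needs a and b, e needs only b.
deps = {
    "c": ["a", "b"],
    "d": ["c"],
    "e": ["b"],
}
plan = nodes_to_run(deps, ["d"])
```

Requesting only `d` leaves `e` out of the plan, which is why repeated Run() calls on different outputs can execute different subgraphs of the same graph.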

Devices and memory allocation

See: an analysis of TensorFlow's device memory allocation algorithm

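1. TensorFlow's device memory management module implements a best-fit with coalescing (BFC) algorithm.

> BFC picks a memory block by this rule: for a request of size x, find the smallest free chunk whose size is greater than or equal to x.

2. Each worker is responsible for one or more devices, and each device has a device type and a name. A device name is composed of parts that identify the device's type, the device's index within the worker, and, in a distributed setting, the job and task of the worker (or localhost when the device is on the same machine as the process). Examples are /job:localhost/device:cpu:0 and /job:worker/task:17/device:gpu:3. Each device object is responsible for managing allocation and deallocation of device memory, and for arranging the execution of any kernels requested by higher levels of the TensorFlow implementation.

3. In TensorFlow, the subclasses of the base class Device are: GPUDevice, CPUDevice (that is, ThreadPoolDevice), and GPUCompatibleCPUDevice.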
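The best-fit rule from item 1, together with coalescing on free, can be sketched in a few lines of Python. This is a deliberately simplified model and not TensorFlow's allocator: the class name `BFCSketch` and the flat list of chunks are assumptions, and details of a real allocator such as size bins and alignment are left out.

```python
class BFCSketch:
    """Toy best-fit-with-coalescing allocator over one contiguous region."""

    def __init__(self, total):
        # Chunks are kept sorted by offset; each is [offset, size, is_free].
        self.chunks = [[0, total, True]]

    def allocate(self, x):
        """Return the offset of the smallest free chunk with size >= x."""
        best = None
        for i, (_, size, is_free) in enumerate(self.chunks):
            if is_free and size >= x:
                if best is None or size < self.chunks[best][1]:
                    best = i
        if best is None:
            return None  # no free chunk is large enough
        offset, size, _ = self.chunks[best]
        self.chunks[best] = [offset, x, False]
        if size > x:
            # Split: the unused tail stays free.
            self.chunks.insert(best + 1, [offset + x, size - x, True])
        return offset

    def free(self, offset):
        """Mark a chunk free and merge it with free neighbours."""
        idx = next(i for i, c in enumerate(self.chunks)
                   if c[0] == offset and not c[2])
        self.chunks[idx][2] = True
        if idx + 1 < len(self.chunks) and self.chunks[idx + 1][2]:
            merged = self.chunks.pop(idx + 1)
            self.chunks[idx][1] += merged[1]
        if idx > 0 and self.chunks[idx - 1][2]:
            merged = self.chunks.pop(idx)
            self.chunks[idx - 1][1] += merged[1]


pool = BFCSketch(100)
a = pool.allocate(30)   # takes [0, 30)
b = pool.allocate(20)   # takes [30, 50)
c = pool.allocate(40)   # takes [50, 90), leaving a free tail of 10
pool.free(b)            # free chunks are now size 20 and size 10
d = pool.allocate(10)   # best fit picks the size-10 chunk, not the size-20 one
pool.free(a)            # [0, 30) merges with the free [30, 50)
```

A first-fit allocator would have placed the last request in the size-20 hole at offset 30 and left two small fragments behind; best fit keeps the larger hole intact.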

Graph

Graph describes a set of computations that are to be performed, as well as the dependencies between those computations. The basic model is a DAG (directed acyclic graph) with

* internal nodes representing computational operations to be performed;

* edges representing dependencies, indicating the target may only be executed once the source has completed;

> Normal edges: data can flow along a normal edge, that is, a normal edge carries a tensor.

> Special edges, also called control dependencies.

* predefined "source" (start) and "sink" (finish) nodes -- the source should be the only node that doesn't depend on anything, and the sink should be the only node that nothing depends on.

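* Graph optimization:

Common Subexpression Elimination (CSE)

If an expression E has already been computed, and none of the variables in E has changed between that earlier computation and now, then this occurrence of E is a common subexpression.
Example: x = (a+c)*12 + (c+a)*2; with E = a+c, this can be optimized to x = E*14.

References: common compiler optimization methods; Introduction to Compilers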
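The sharing step of CSE can be sketched on the example above. The code below is an illustration under stated assumptions, not TensorFlow's optimizer: expressions are written as nested `(op, left, right)` tuples, the function name `eliminate` is hypothetical, and operands of `+` and `*` are sorted so that `a+c` and `c+a` map to the same table entry.

```python
COMMUTATIVE = {"+", "*"}


def eliminate(expr, table):
    """Return a temp name for expr, reusing identical subexpressions.

    expr is a variable name, a constant, or a tuple (op, left, right).
    table maps each distinct operation to its temp name, in creation order.
    """
    if not isinstance(expr, tuple):
        return expr  # variable or constant: nothing to compute
    op, left, right = expr
    left = eliminate(left, table)
    right = eliminate(right, table)
    if op in COMMUTATIVE:
        # Canonical operand order, so a+c and c+a share one entry.
        left, right = sorted((left, right), key=repr)
    key = (op, left, right)
    if key not in table:
        table[key] = f"t{len(table)}"
    return table[key]


# x = (a+c)*12 + (c+a)*2
x = ("+", ("*", ("+", "a", "c"), 12), ("*", ("+", "c", "a"), 2))
table = {}
result = eliminate(x, table)
```

The source expression contains five operations, but the table ends up with four because `a+c` is computed once and reused. Folding the result further into `E*14` additionally needs the distributive law, which this sketch does not apply.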

* Control flow graph (CFG)

A control flow graph (CFG) in computer science is a representation, using graph notation, of all paths that might be traversed through a program during its execution.
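As a small illustration of "all paths that might be traversed", the sketch below represents a CFG as a mapping from each basic block to its successors and enumerates the paths from entry to exit. The block names and the function `all_paths` are hypothetical, and back edges are skipped so that a loop does not produce infinitely many paths.

```python
def all_paths(cfg, entry, exit_block):
    """List every simple path from entry to exit_block in the CFG."""
    paths = []

    def walk(block, path):
        path = path + [block]
        if block == exit_block:
            paths.append(path)
            return
        for succ in cfg.get(block, []):
            if succ not in path:  # skip back edges so loops terminate
                walk(succ, path)

    walk(entry, [])
    return paths


# CFG of a simple if/else: the condition block has two successors.
cfg = {
    "entry": ["cond"],
    "cond": ["then", "else"],
    "then": ["exit"],
    "else": ["exit"],
}
paths = all_paths(cfg, "entry", "exit")
```

For this if/else shape the result holds two paths, one through each branch.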

Gradients