TensorFlow [An analysis of Bazel's behavior when installing from source]

This article is a repost, dated 2018-08-06 10:45. Tag: TensorFlow

0. Introduction

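Installing from source, and reading the build scripts along the way, helps in understanding the TensorFlow source code. This article is based mainly on the TensorFlow v1.8 source and draws on the article "How to read the TensorFlow source code" (如何閱讀TensorFlow源碼).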

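First, visit the Bazel website for the prerequisites: (1) what Bazel is; (2) how Bazel builds C++ projects; (3) the Bazel Build Encyclopedia, the reference for build functions and rules. After that come the more detailed parts of the Bazel site. Hyperlinks are given below.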

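PS: After a long search, I am fairly sure that no books or similar material exist for Bazel apart from the official website, so the official site and other people's blogs are the only two ways to learn it.
Many pages on the Bazel site are not linked from the left-hand navigation, so I recommend mirroring the whole site: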

wget -m -c -x -np -k -E -p https://docs.bazel.build/versions/master/bazel-overview.html

1. Building TensorFlow from source

As shown in the figure below:

Figure 1.1: The TensorFlow v1.8 source directory on GitHub

1.1 Configure first

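The root of the source tree contains a bash script named configure. The script asks you for the paths of all relevant TensorFlow dependencies and for other build configuration options, such as compiler flags. You must run it before you can create the pip package and install TensorFlow.
Then run configure: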

./configure

$ cd tensorflow  # cd to the top-level directory created
$ ./configure
Please specify the location of python. [Default is /usr/bin/python]: /usr/bin/python2.7    # path to the Python interpreter
Found possible Python library paths:
 /usr/local/lib/python2.7/dist-packages
 /usr/lib/python2.7/dist-packages
Please input the desired Python library path to use.  Default is [/usr/lib/python2.7/dist-packages]       # Python library path

Using python library path: /usr/local/lib/python2.7/dist-packages
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]:    # optimization flags applied when building with --config=opt
Do you wish to use jemalloc as the malloc implementation? [Y/n]        # whether to use jemalloc as the malloc implementation
jemalloc enabled
Do you wish to build TensorFlow with Google Cloud Platform support? [y/N]    # whether to enable Google Cloud Platform support
No Google Cloud Platform support will be enabled for TensorFlow
Do you wish to build TensorFlow with Hadoop File System support? [y/N]    # whether to enable HDFS support
No Hadoop File System support will be enabled for TensorFlow
Do you wish to build TensorFlow with the XLA just-in-time compiler (experimental)? [y/N]    # whether to enable the still-experimental XLA JIT compiler
No XLA support will be enabled for TensorFlow
Do you wish to build TensorFlow with VERBS support? [y/N]    # whether to enable VERBS support
No VERBS support will be enabled for TensorFlow
Do you wish to build TensorFlow with OpenCL support? [y/N]    # whether to enable OpenCL support
No OpenCL support will be enabled for TensorFlow
Do you wish to build TensorFlow with CUDA support? [y/N] Y    # whether to enable CUDA support
CUDA support will be enabled for TensorFlow
Do you want to use clang as CUDA compiler? [y/N]    # whether to use clang as the CUDA compiler
nvcc will be used as CUDA compiler
Please specify the CUDA SDK version you want to use, e.g. 7.0. [Leave empty to default to CUDA 9.0]: 9.0    # CUDA version
Please specify the location where CUDA 9.0 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:    # CUDA install path
Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]:    # host-side compiler
Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 7.0]: 7    # cuDNN version
Please specify the location where cuDNN 7 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:    # cuDNN install path
Please specify a list of comma-separated CUDA compute capabilities you want to build with.     # compute capability of the GPUs on this machine
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size.

Do you wish to build TensorFlow with MPI support? [y/N]    # whether to enable MPI support
MPI support will not be enabled for TensorFlow
Configuration finished

Let us first look at what configure actually does:

#!/usr/bin/env bash

set -e
set -o pipefail

if [ -z "$PYTHON_BIN_PATH" ]; then
  PYTHON_BIN_PATH=$(which python || which python3 || true)
fi

# Set all env variables
CONFIGURE_DIR=$(dirname "$0")
"$PYTHON_BIN_PATH" "${CONFIGURE_DIR}/configure.py" "$@"    # the configure script delegates the whole configuration process to configure.py

echo "Configuration finished"
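The interpreter lookup at the top of the script can be mirrored in Python. This is only a sketch of the same fallback order (an explicit PYTHON_BIN_PATH, then python, then python3); the helper name find_python is an assumption made for illustration and does not appear in the TensorFlow sources.

```python
import shutil


def find_python(env):
    """Mirror the shell fallback: $PYTHON_BIN_PATH, else python, else python3.

    `env` is a plain dict standing in for the process environment.
    Returns None when nothing is found (the shell version yields an
    empty string in that case).
    """
    explicit = env.get("PYTHON_BIN_PATH")
    if explicit:
        return explicit
    # shutil.which returns None when the program is not on PATH.
    return shutil.which("python") or shutil.which("python3")


print(find_python({"PYTHON_BIN_PATH": "/usr/bin/python2.7"}))
print(find_python({}))
```

Passing the environment as a dict keeps the function easy to try without touching the real environment variables.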

Starting at line 1491 of configure.py, we find the configuration steps that appeared in the run shown above:

  set_build_var(environ_cp, 'TF_NEED_JEMALLOC', 'jemalloc as malloc',
                'with_jemalloc', True)
  set_build_var(environ_cp, 'TF_NEED_GCP', 'Google Cloud Platform',
                'with_gcp_support', True, 'gcp')
  set_build_var(environ_cp, 'TF_NEED_HDFS', 'Hadoop File System',
                'with_hdfs_support', True, 'hdfs')
  set_build_var(environ_cp, 'TF_NEED_AWS', 'Amazon AWS Platform',
                'with_aws_support', True, 'aws')
  set_build_var(environ_cp, 'TF_NEED_KAFKA', 'Apache Kafka Platform',
                'with_kafka_support', True, 'kafka')
  set_build_var(environ_cp, 'TF_ENABLE_XLA', 'XLA JIT', 'with_xla_support',
                False, 'xla')
  set_build_var(environ_cp, 'TF_NEED_GDR', 'GDR', 'with_gdr_support',
                False, 'gdr')
  set_build_var(environ_cp, 'TF_NEED_VERBS', 'VERBS', 'with_verbs_support',
                False, 'verbs')

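So the configuration step is simply the collection of parameters. At the end, three files have updated timestamps (that is, they were generated or modified):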

Of these, .bazelrc contains:

import /mnt/d/tensorflow/tensorflow-master/.tf_configure.bazelrc

That is, it imports the newly generated file .tf_configure.bazelrc in the current directory, and that file records the configuration:

build --action_env PYTHON_BIN_PATH="/home/shouhuxianjian/anaconda3/bin/python"
build --action_env PYTHON_LIB_PATH="/home/shouhuxianjian/anaconda3/lib/python3.6/site-packages"
build --python_path="/home/shouhuxianjian/anaconda3/bin/python"
build --define with_jemalloc=true
build:gcp --define with_gcp_support=true
build:hdfs --define with_hdfs_support=true
build:aws --define with_aws_support=true
build:kafka --define with_kafka_support=true
build:xla --define with_xla_support=true
build:gdr --define with_gdr_support=true
build:verbs --define with_verbs_support=true
build --action_env TF_NEED_OPENCL_SYCL="0"
build --action_env TF_NEED_CUDA="0"
build --action_env TF_DOWNLOAD_CLANG="0"
build --define grpc_no_ares=true
build:opt --copt=-march=native
build:opt --host_copt=-march=native
build:opt --define with_default_optimizations=true
build --strip=always
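Comparing the set_build_var calls with the generated lines suggests a pattern: an option that was answered with yes (jemalloc here) is written as an unconditional build --define line, while an option answered with no that has a config name (gcp, hdfs, ...) is written under build:<name>. The sketch below models only that observed pattern; bazelrc_line is a hypothetical helper and not the code in configure.py.

```python
def bazelrc_line(option_name, enabled, config_name=None):
    """Return the bazelrc line for one yes/no answer, or None if nothing is written.

    Assumption: this reproduces the pattern visible in the generated
    .tf_configure.bazelrc, not the real configure.py implementation.
    """
    define = "--define %s=true" % option_name
    if enabled:
        # Answered yes: the define applies to every `bazel build`.
        return "build " + define
    if config_name is not None:
        # Answered no: the define is parked in a named config group.
        return "build:%s %s" % (config_name, define)
    return None


print(bazelrc_line("with_jemalloc", True))
print(bazelrc_line("with_hdfs_support", False, "hdfs"))
print(bazelrc_line("with_hdfs_support", True, "hdfs"))
```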

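Lines of the form build:hdfs take effect only when bazel build is run with --config=hdfs; see the --config option in the Bazel documentation.
In the run above, N was chosen for hdfs, gcp, aws, and kafka. Choosing Y instead changes the lines to: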

build --define with_gcp_support=true
build --define with_hdfs_support=true
build --define with_aws_support=true
build --define with_kafka_support=true

These now have the same form as

build --define with_jemalloc=true

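that is, they apply unconditionally. In Bazel, a line of the form build:name puts its options into a config group called name, and the group is ignored unless --config=name is given on the command line. The name is a config label, not a package, even though hdfs also happens to be the name of a library target, whose BUILD file is:

# File: tensorflow-master/third_party/hadoop/BUILD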
package(default_visibility = ["//visibility:public"])

licenses(["notice"])  # Apache 2.0

exports_files(["LICENSE.txt"])

cc_library(
    name = "hdfs",
    hdrs = ["hdfs.h"],
)

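So when bazel is actually invoked below, --config=opt must be passed explicitly; otherwise Bazel ignores the options in the build:opt group. (This relies on the command:name grouping feature of bazelrc files.)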
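To make the grouping behavior concrete, here is a small Python model of how build and build:name lines combine. It is a simplified sketch: real Bazel also handles import lines, other commands, and option ordering, none of which is modelled here, and effective_build_options is a name invented for this example.

```python
import shlex


def effective_build_options(bazelrc_text, configs=()):
    """Collect the options that would apply to `bazel build`.

    Plain `build` lines always apply; `build:name` lines apply only
    when `name` is listed in `configs` (i.e. --config=name was given).
    """
    options = []
    for raw in bazelrc_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        words = shlex.split(line)
        command, _, group = words[0].partition(":")
        if command != "build":
            continue  # other commands and import lines are ignored in this sketch
        if group == "" or group in configs:
            options.extend(words[1:])
    return options


SAMPLE = """\
build --define with_jemalloc=true
build:hdfs --define with_hdfs_support=true
build:opt --copt=-march=native
build --strip=always
"""

print(effective_build_options(SAMPLE))
print(effective_build_options(SAMPLE, configs=("opt",)))
```

Without any config, the -march=native flag is absent; it only shows up once "opt" is requested, which is why the build commands in the next section pass --config=opt.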

1.2 Then build with bazel

For a CPU-only build, run:

$ bazel build --config=opt //tensorflow/tools/pip_package:build_pip_package

For a build with GPU support, run:

$ bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package

1.2.1 Recommended BUILD file structure

Before reading tensorflow-master/tensorflow/tools/pip_package/BUILD, it helps to review the Bazel Build Encyclopedia and the officially recommended BUILD file layout ("File structure"), which is as follows:

Package description (a comment)

All load() statements

The package() function.

Calls to rules and macros

1.2.2 Reading tensorflow/tools/pip_package/BUILD

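In the tensorflow/tools/pip_package/BUILD file below you can see the package description, the load statements, transitive_hdrs, a py_binary that produces a Python binary, the internal variable COMMON_PIP_DEPS, a filegroup, an sh_binary that produces a shell binary, a genrule, and so on.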

# Description:
#  Tools for building the TensorFlow pip package.
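#  Prototype: package(default_deprecation, default_testonly, default_visibility, features)
#  This function declares metadata that applies to every subsequent rule in the package. It may be used at most once per package (BUILD file).
#  package() should be called near the top of the file, after all load() statements and before any rule.
#  [package](https://docs.bazel.build/versions/master/be/functions.html#package)
# private means that subsequent rules are, by default, visible only inside the current package https://docs.bazel.build/versions/master/be/common-definitions.html#common-attributes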
package(default_visibility = ["//visibility:private"])    

# Bazel extensions are files ending in .bzl. A load statement imports symbols from an extension file into the current BUILD file.
# [load](https://docs.bazel.build/versions/master/skylark/concepts.html)
load(
    "//tensorflow:tensorflow.bzl",
    "if_not_windows",
    "if_windows",
    "transitive_hdrs",
)
load("//third_party/mkl:build_defs.bzl", "if_mkl")
load("//tensorflow:tensorflow.bzl", "if_cuda")
load("@local_config_tensorrt//:build_defs.bzl", "if_tensorrt")
load("//tensorflow/core:platform/default/build_config_root.bzl", "tf_additional_license_deps")

# This returns a list of headers of all public header libraries (e.g.,
# framework, lib), and all of the transitive dependencies of those
# public headers.  Not all of the headers returned by the filegroup
# are public (e.g., internal headers that are included by public
# headers), but the internal headers need to be packaged in the
# pip_package for the public headers to be properly included.
#
# Public headers are therefore defined by those that are both:
#
# 1) "publicly visible" as defined by bazel
# 2) Have documentation.
#
# This matches the policy of "public" for our python API.
transitive_hdrs(
    name = "included_headers",
    deps = [
        "//tensorflow/core:core_cpu",
        "//tensorflow/core:framework",
        "//tensorflow/core:lib",
        "//tensorflow/core:protos_all_cc",
        "//tensorflow/core:stream_executor",
        "//third_party/eigen3",
    ] + if_cuda([
        "@local_config_cuda//cuda:cuda_headers",
    ]),
)

py_binary(
    name = "simple_console",
    srcs = ["simple_console.py"],
    srcs_version = "PY2AND3",
    deps = ["//tensorflow:tensorflow_py"],
)

COMMON_PIP_DEPS = [
    ":licenses",
    "MANIFEST.in",
    "README",
    "setup.py",
    ":included_headers",
    "//tensorflow:tensorflow_py",
    "//tensorflow/contrib/autograph:autograph",
    "//tensorflow/contrib/autograph/converters:converters",
    "//tensorflow/contrib/autograph/converters:test_lib",
    "//tensorflow/contrib/autograph/impl:impl",
    "//tensorflow/contrib/autograph/pyct:pyct",
    "//tensorflow/contrib/autograph/pyct/static_analysis:static_analysis",
    "//tensorflow/contrib/boosted_trees:boosted_trees_pip",
    "//tensorflow/contrib/cluster_resolver:cluster_resolver_pip",
    "//tensorflow/contrib/data/python/kernel_tests:dataset_serialization_test",
    "//tensorflow/contrib/data/python/ops:contrib_op_loader",
    "//tensorflow/contrib/eager/python/examples:examples_pip",
    "//tensorflow/contrib/eager/python:checkpointable_utils",
    "//tensorflow/contrib/eager/python:evaluator",
    "//tensorflow/contrib/gan:gan",
    "//tensorflow/contrib/graph_editor:graph_editor_pip",
    "//tensorflow/contrib/keras:keras",
    "//tensorflow/contrib/labeled_tensor:labeled_tensor_pip",
    "//tensorflow/contrib/nn:nn_py",
    "//tensorflow/contrib/predictor:predictor_pip",
    "//tensorflow/contrib/proto:proto_pip",
    "//tensorflow/contrib/receptive_field:receptive_field_pip",
    "//tensorflow/contrib/rpc:rpc_pip",
    "//tensorflow/contrib/session_bundle:session_bundle_pip",
    "//tensorflow/contrib/signal:signal_py",
    "//tensorflow/contrib/signal:test_util",
    "//tensorflow/contrib/slim:slim",
    "//tensorflow/contrib/slim/python/slim/data:data_pip",
    "//tensorflow/contrib/slim/python/slim/nets:nets_pip",
    "//tensorflow/contrib/specs:specs",
    "//tensorflow/contrib/summary:summary_test_util",
    "//tensorflow/contrib/tensor_forest:init_py",
    "//tensorflow/contrib/tensor_forest/hybrid:hybrid_pip",
    "//tensorflow/contrib/timeseries:timeseries_pip",
    "//tensorflow/contrib/tpu",
    "//tensorflow/examples/tutorials/mnist:package",
    "//tensorflow/python:distributed_framework_test_lib",
    "//tensorflow/python:meta_graph_testdata",
    "//tensorflow/python:spectral_ops_test_util",
    "//tensorflow/python:util_example_parser_configuration",
    "//tensorflow/python/debug:debug_pip",
    "//tensorflow/python/eager:eager_pip",
    "//tensorflow/python/kernel_tests/testdata:self_adjoint_eig_op_test_files",
    "//tensorflow/python/saved_model:saved_model",
    "//tensorflow/python/tools:tools_pip",
    "//tensorflow/python:test_ops",
    "//tensorflow/tools/dist_test/server:grpc_tensorflow_server",
]

# On Windows, python binary is a zip file of runfiles tree.
# Add everything to its data dependency for generating a runfiles tree
# for building the pip package on Windows.
py_binary(
    name = "simple_console_for_windows",
    srcs = ["simple_console_for_windows.py"],
    data = COMMON_PIP_DEPS,
    srcs_version = "PY2AND3",
    deps = ["//tensorflow:tensorflow_py"],
)

filegroup(
    name = "licenses",
    data = [
        "//third_party/eigen3:LICENSE",
        "//third_party/fft2d:LICENSE",
        "//third_party/hadoop:LICENSE.txt",
        "@absl_py//absl/flags:LICENSE",
        "@arm_neon_2_x86_sse//:LICENSE",
        "@astor_archive//:LICENSE",
        "@aws//:LICENSE",
        "@boringssl//:LICENSE",
        "@com_google_absl//:LICENSE",
        "@com_googlesource_code_re2//:LICENSE",
        "@cub_archive//:LICENSE.TXT",
        "@curl//:COPYING",
        "@eigen_archive//:COPYING.MPL2",
        "@farmhash_archive//:COPYING",
        "@fft2d//:fft/readme.txt",
        "@flatbuffers//:LICENSE.txt",
        "@gast_archive//:PKG-INFO",
        "@gemmlowp//:LICENSE",
        "@gif_archive//:COPYING",
        "@grpc//:LICENSE",
        "@highwayhash//:LICENSE",
        "@jemalloc//:COPYING",
        "@jpeg//:LICENSE.md",
        "@kafka//:LICENSE",
        "@libxsmm_archive//:LICENSE",
        "@lmdb//:LICENSE",
        "@local_config_nccl//:LICENSE",
        "@local_config_sycl//sycl:LICENSE.text",
        "@grpc//third_party/nanopb:LICENSE.txt",
        "@grpc//third_party/address_sorting:LICENSE",
        "@nasm//:LICENSE",
        "@nsync//:LICENSE",
        "@pcre//:LICENCE",
        "@png_archive//:LICENSE",
        "@protobuf_archive//:LICENSE",
        "@six_archive//:LICENSE",
        "@snappy//:COPYING",
        "@swig//:LICENSE",
        "@termcolor_archive//:COPYING.txt",
        "@zlib_archive//:zlib.h",
        "@org_python_pypi_backports_weakref//:LICENSE",
    ] + if_mkl([
        "//third_party/mkl:LICENSE",
    ]) + tf_additional_license_deps(),
)

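# The sh_binary rule. It uses select (mainly for platform-dependent choices). The bazel command line does not explicitly select any condition for build_pip_package, so the default branch is used.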
# [select](https://docs.bazel.build/versions/master/skylark/lib/globals.html#select)
# [select](https://docs.bazel.build/versions/master/be/functions.html#select)
sh_binary(
    name = "build_pip_package",
    srcs = ["build_pip_package.sh"],
    data = select({
        "//tensorflow:windows": [":simple_console_for_windows"],
        "//tensorflow:windows_msvc": [":simple_console_for_windows"],
        "//conditions:default": COMMON_PIP_DEPS + [
            ":simple_console",
            "//tensorflow/contrib/lite/python:interpreter_test_data",
            "//tensorflow/contrib/lite/python:tf_lite_py_pip",
            "//tensorflow/contrib/lite/toco:toco",
            "//tensorflow/contrib/lite/toco/python:toco_wrapper",
            "//tensorflow/contrib/lite/toco/python:toco_from_protos",
        ],
    }) + if_mkl(["//third_party/mkl:intel_binary_blob"]) + if_tensorrt([
        "//tensorflow/contrib/tensorrt:init_py",
    ]),
)

# A genrule for generating a marker file for the pip package on Windows
#
# This only works on Windows, because :simple_console_for_windows is a
# python zip file containing everything we need for building the pip package.
# However, on other platforms, due to https://github.com/bazelbuild/bazel/issues/4223,
# when C++ extensions change, this generule doesn't rebuild.
genrule(
    name = "win_pip_package_marker",
    srcs = if_windows([
        ":build_pip_package",
        ":simple_console_for_windows",
    ]),
    outs = ["win_pip_package_marker_file"],
    cmd = select({
        "//conditions:default": "touch $@",
        "//tensorflow:windows": "md5sum $(locations :build_pip_package) $(locations :simple_console_for_windows) > $@",
    }),
    visibility = ["//visibility:public"],
)

1.2.3 The process of building build_pip_package

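Because the build command explicitly names build_pip_package, the rule being built is the sh_binary in the file above. The sh_binary is mainly responsible for producing the data it depends on. The platform-dependent part uses select, and since nothing on the bazel command line selects a Windows condition, the //conditions:default branch is used: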

 COMMON_PIP_DEPS + [
            ":simple_console",
            "//tensorflow/contrib/lite/python:interpreter_test_data",
            "//tensorflow/contrib/lite/python:tf_lite_py_pip",
            "//tensorflow/contrib/lite/toco:toco",
            "//tensorflow/contrib/lite/toco/python:toco_wrapper",
            "//tensorflow/contrib/lite/toco/python:toco_from_protos",
        ]
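How select() falls back to //conditions:default can be modelled in a few lines of Python. This is a toy model under stated assumptions: conditions are represented as plain strings in a set, and resolve_select is a hypothetical helper; Bazel's real matching of config_setting targets is richer than this.

```python
DEFAULT = "//conditions:default"


def resolve_select(branches, active_conditions):
    """Pick the branch of a select()-like dict for the given active conditions."""
    matches = [key for key in branches
               if key != DEFAULT and key in active_conditions]
    values = [branches[key] for key in matches]
    # Simplification of this sketch: several matches are tolerated only
    # when they all resolve to the same value.
    if any(value != values[0] for value in values[1:]):
        raise ValueError("ambiguous select: %s" % matches)
    if values:
        return values[0]
    if DEFAULT in branches:
        return branches[DEFAULT]
    raise ValueError("no matching condition and no default branch")


data = {
    "//tensorflow:windows": [":simple_console_for_windows"],
    "//tensorflow:windows_msvc": [":simple_console_for_windows"],
    DEFAULT: ["<COMMON_PIP_DEPS>", ":simple_console"],  # abbreviated placeholder list
}

print(resolve_select(data, set()))
print(resolve_select(data, {"//tensorflow:windows"}))
```

With no Windows condition active, the default list is returned, which matches the reading in the text above.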

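The focus now shifts to COMMON_PIP_DEPS. In that variable, the three files MANIFEST.in, README, and setup.py near the top already exist as source files, so nothing has to be done for them. Next, look at the entry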

:included_headers

The leading colon denotes a target in the current package, so it corresponds to

# This matches the policy of "public" for our python API.
transitive_hdrs(
    name = "included_headers",
    deps = [
        "//tensorflow/core:core_cpu",
        "//tensorflow/core:framework",
        "//tensorflow/core:lib",
        "//tensorflow/core:protos_all_cc",
        "//tensorflow/core:stream_executor",
        "//third_party/eigen3",
    ] + if_cuda([
        "@local_config_cuda//cuda:cuda_headers",
    ]),
)

transitive_hdrs is not a built-in function; it is imported by the load statement above:

load(
    "//tensorflow:tensorflow.bzl",
    "if_not_windows",
    "if_windows",
    "transitive_hdrs",
)

The implementation of transitive_hdrs in //tensorflow:tensorflow.bzl is:

# Bazel rule for collecting the header files that a target depends on.
def _transitive_hdrs_impl(ctx):
  outputs = depset()
  for dep in ctx.attr.deps:
    outputs += dep.cc.transitive_headers
  return struct(files=outputs)

# The rule function is called here
# [rule](https://docs.bazel.build/versions/master/skylark/lib/globals.html#rule)
_transitive_hdrs = rule(
    attrs = {
        "deps": attr.label_list(
            allow_files = True,
            providers = ["cc"],
        ),
    },
    implementation = _transitive_hdrs_impl,
)
# This is transitive_hdrs itself: a macro that wraps the internal _transitive_hdrs rule, which in turn is implemented by _transitive_hdrs_impl
def transitive_hdrs(name, deps=[], **kwargs):
  _transitive_hdrs(name=name + "_gather", deps=deps)
  native.filegroup(name=name, srcs=[":" + name + "_gather"])
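The idea behind _transitive_hdrs_impl, gathering the headers of every dependency including indirect ones, can be illustrated with a plain Python traversal. The graph below is a made-up toy: the header names and the dependency edges are assumptions for illustration, not the real contents of those targets.

```python
def transitive_headers(target, graph, seen=None):
    """Return the headers of `target` and of everything it depends on.

    `graph` maps a target label to (own_headers, direct_deps).
    """
    if seen is None:
        seen = set()
    if target in seen:
        return set()  # already visited; avoids repeated work on shared deps
    seen.add(target)
    own_headers, deps = graph[target]
    result = set(own_headers)
    for dep in deps:
        result |= transitive_headers(dep, graph, seen)
    return result


# Hypothetical toy graph (labels borrowed from the BUILD file, contents invented).
TOY_GRAPH = {
    "//tensorflow/core:framework": (["framework/op.h"], ["//tensorflow/core:lib"]),
    "//tensorflow/core:lib": (["lib/core/status.h"], ["//third_party/eigen3"]),
    "//third_party/eigen3": (["Eigen/Core"], []),
}

print(sorted(transitive_headers("//tensorflow/core:framework", TOY_GRAPH)))
```

In Bazel itself the accumulation is done with depsets rather than Python sets, so that shared sub-graphs are not copied repeatedly.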

Let us set this part aside and keep looking for the part that interacts with C++. Consider the next entry:

"//tensorflow:tensorflow_py",

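The directory that contains WORKSPACE is the root, written //. The tensorflow that follows names the tensorflow directory, and the target tensorflow_py after the colon is found in the file tensorflow/BUILD: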

# File: tensorflow/BUILD, lines 539-548
py_library(
    name = "tensorflow_py",
    srcs = ["__init__.py"],
    srcs_version = "PY2AND3",
    visibility = ["//visibility:public"],
    deps = [
        "//tensorflow/python",
        "//tensorflow/tools/api/generator:python_api",
    ],
)

This depends on //tensorflow/python, which is shorthand for the target //tensorflow/python:python defined in tensorflow/python/BUILD:

# File: tensorflow/python/BUILD
py_library(
    name = "python",
    srcs = ["__init__.py"],
    srcs_version = "PY2AND3",
    visibility = [
        "//tensorflow:__pkg__",
        "//tensorflow/compiler/aot/tests:__pkg__",  # TODO(b/34059704): remove when fixed
        "//tensorflow/contrib/learn:__pkg__",  # TODO(b/34059704): remove when fixed
        "//tensorflow/contrib/learn/python/learn/datasets:__pkg__",  # TODO(b/34059704): remove when fixed
        "//tensorflow/contrib/lite/toco/python:__pkg__",  # TODO(b/34059704): remove when fixed
        "//tensorflow/python/debug:__pkg__",  # TODO(b/34059704): remove when fixed
        "//tensorflow/python/tools:__pkg__",  # TODO(b/34059704): remove when fixed
        "//tensorflow/tools/api/generator:__pkg__",
        "//tensorflow/tools/quantization:__pkg__",  # TODO(b/34059704): remove when fixed
    ],
    deps = [
        ":no_contrib",
        "//tensorflow/contrib:contrib_py",
    ],
)
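The label forms seen so far (//tensorflow:tensorflow_py, //tensorflow/python, :no_contrib, @swig//:swig) can be decomposed mechanically. The function below is a rough sketch covering just these forms; parse_label is a hypothetical helper, and it does not validate labels the way Bazel does.

```python
def parse_label(label, current_package=""):
    """Split a Bazel-style label into (repository, package, target)."""
    repo = ""
    if label.startswith("@"):
        # "@repo//pkg:target" -> repository name plus a "//..." remainder.
        repo, _, rest = label[1:].partition("//")
        label = "//" + rest
    if label.startswith("//"):
        package, sep, target = label[2:].partition(":")
        if not sep:
            # "//a/b" is shorthand for "//a/b:b".
            target = package.rsplit("/", 1)[-1]
        return repo, package, target
    # Relative label such as ":no_contrib" or "setup.py".
    return repo, current_package, label.lstrip(":")


print(parse_label("//tensorflow:tensorflow_py"))
print(parse_label("//tensorflow/python"))
print(parse_label(":no_contrib", current_package="tensorflow/python"))
print(parse_label("@swig//:swig"))
```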

This in turn depends on the :no_contrib target, so let us look at it:

# File: tensorflow/python/BUILD
py_library(
    name = "no_contrib",
    srcs = ["__init__.py"],
    srcs_version = "PY2AND3",
    visibility = [
        "//tensorflow:__pkg__",
    ],
    deps = [
        ":array_ops",
        ":bitwise_ops",
        ":boosted_trees_ops",
        ":check_ops",
        ":client",
        ":client_testlib",
        ":confusion_matrix",
        ":control_flow_ops",
        ":cudnn_rnn_ops_gen",
        ":errors",
        ":framework",
        ":framework_for_generated_wrappers",
        ":functional_ops",
        ":gradient_checker",
        ":graph_util",
        ":histogram_ops",
        ":image_ops",
        ":initializers_ns",
        ":io_ops",
        ":layers",
        ":lib",
        ":list_ops",
        ":manip_ops",
        ":math_ops",
        ":metrics",
        ":nn",
        ":ops",
        ":platform",
        ":pywrap_tensorflow",
        ":saver_test_utils",
        ":script_ops",
        ":session_ops",
        ":sets",
        ":sparse_ops",
        ":spectral_ops",
        ":spectral_ops_test_util",
        ":standard_ops",
        ":state_ops",
        ":string_ops",
        ":subscribe",
        ":summary",
        ":tensor_array_ops",
        ":test_ops",  # TODO: Break testing code out into separate rule.
        ":tf_cluster",
        ":tf_item",
        ":tf_optimizer",
        ":training",
        ":util",
        ":weights_broadcast_ops",
        "//tensorflow/core:protos_all_py",
        "//tensorflow/python/data",
        "//tensorflow/python/estimator:estimator_py",
        "//tensorflow/python/feature_column:feature_column_py",
        "//tensorflow/python/keras",
        "//tensorflow/python/ops/distributions",
        "//tensorflow/python/ops/linalg",
        "//tensorflow/python/ops/losses",
        "//tensorflow/python/profiler",
        "//tensorflow/python/saved_model",
        "//third_party/py/numpy",
    ],
)

Following "How to read the TensorFlow source code", we also look for pywrap_tensorflow. The pywrap_tensorflow target depends on the pywrap_tensorflow_internal target, and pywrap_tensorflow_internal is where SWIG generates the Python interface files for the C++ code:

# File: tensorflow/python/BUILD, line 3421
py_library(
    name = "pywrap_tensorflow",
    srcs = [
        "pywrap_tensorflow.py",
    ] + if_static(
        ["pywrap_dlopen_global_flags.py"],
        # Import will fail, indicating no global dlopen flags
        otherwise = [],
    ),
    srcs_version = "PY2AND3",
    deps = [":pywrap_tensorflow_internal"],
)
tf_py_wrap_cc(
    name = "pywrap_tensorflow_internal",
    srcs = ["tensorflow.i"],
    swig_includes = [
        "client/device_lib.i",
        "client/events_writer.i",
        "client/tf_session.i",
        "client/tf_sessionrun_wrapper.i",
        "framework/cpp_shape_inference.i",
        "framework/python_op_gen.i",
        "grappler/cluster.i",
        "grappler/cost_analyzer.i",
        "grappler/item.i",
        "grappler/model_analyzer.i",
        "grappler/tf_optimizer.i",
        "lib/core/bfloat16.i",
        "lib/core/py_exception_registry.i",
        "lib/core/py_func.i",
        "lib/core/strings.i",
        "lib/io/file_io.i",
        "lib/io/py_record_reader.i",
        "lib/io/py_record_writer.i",
        "platform/base.i",
        "platform/stacktrace_handler.i",
        "pywrap_tfe.i",
        "training/quantize_training.i",
        "training/server_lib.i",
        "util/kernel_registry.i",
        "util/port.i",
        "util/py_checkpoint_reader.i",
        "util/stat_summarizer.i",
        "util/tfprof.i",
        "util/transform_graph.i",
        "util/util.i",
    ],
    win_def_file = select({
        "//tensorflow:windows": ":pywrap_tensorflow_filtered_def_file",
        "//conditions:default": None,
    }),
    deps = [
        ":bfloat16_lib",
        ":cost_analyzer_lib",
        ":model_analyzer_lib",
        ":cpp_python_util",
        ":cpp_shape_inference",
        ":kernel_registry",
        ":numpy_lib",
        ":safe_ptr",
        ":py_exception_registry",
        ":py_func_lib",
        ":py_record_reader_lib",
        ":py_record_writer_lib",
        ":python_op_gen",
        ":tf_session_helper",
        "//tensorflow/c:c_api",
        "//tensorflow/c:checkpoint_reader",
        "//tensorflow/c:python_api",
        "//tensorflow/c:tf_status_helper",
        "//tensorflow/c/eager:c_api",
        "//tensorflow/core/distributed_runtime/rpc:grpc_rpc_factory_registration",
        "//tensorflow/core/distributed_runtime/rpc:grpc_server_lib",
        "//tensorflow/core/distributed_runtime/rpc:grpc_session",
        "//tensorflow/core/grappler:grappler_item",
        "//tensorflow/core/grappler:grappler_item_builder",
        "//tensorflow/core/grappler/clusters:cluster",
        "//tensorflow/core/grappler/clusters:single_machine",
        "//tensorflow/core/grappler/clusters:virtual_cluster",
        "//tensorflow/core/grappler/costs:graph_memory",
        "//tensorflow/core/grappler/optimizers:meta_optimizer",
        "//tensorflow/core:lib",
        "//tensorflow/core:reader_base",
        "//tensorflow/core/debug",
        "//tensorflow/core/distributed_runtime:server_lib",
        "//tensorflow/core/profiler/internal:print_model_analysis",
        "//tensorflow/tools/graph_transforms:transform_graph_lib",
        "//tensorflow/python/eager:pywrap_tfe_lib",
        "//tensorflow/python/eager:python_eager_op_gen",
        "//util/python:python_headers",
    ] + (tf_additional_lib_deps() +
         tf_additional_plugin_deps() +
         tf_additional_verbs_deps() +
         tf_additional_mpi_deps() +
         tf_additional_gdr_deps()),
)

tf_py_wrap_cc is not one of Bazel's built-in rules, so it must be defined by TensorFlow itself. Through

load("//tensorflow:tensorflow.bzl", "tf_py_wrap_cc")

we find its implementation:

# File: tensorflow/tensorflow.bzl, line 1404
def tf_py_wrap_cc(name,
                             srcs,
                             swig_includes=[],
                             deps=[],
                             copts=[],
                             **kwargs):
  module_name = name.split("/")[-1]
  # Convert a rule name such as foo/bar/baz to foo/bar/_baz.so
  # and use that as the name for the rule producing the .so file.
  cc_library_name = "/".join(name.split("/")[:-1] + ["_" + module_name + ".so"])
  cc_library_pyd_name = "/".join(
      name.split("/")[:-1] + ["_" + module_name + ".pyd"])
  extra_deps = []
  _py_wrap_cc(
      name=name + "_py_wrap",
      srcs=srcs,
      swig_includes=swig_includes,
      deps=deps + extra_deps,
      toolchain_deps=["//tools/defaults:crosstool"],
      module_name=module_name,
      py_module_name=name)
  vscriptname=name+"_versionscript"
  _append_init_to_versionscript(
      name=vscriptname,
      module_name=module_name,
      is_version_script=select({
          "@local_config_cuda//cuda:darwin":False,
          "//conditions:default":True,
          }),
      template_file=select({
          "@local_config_cuda//cuda:darwin":clean_dep("//tensorflow:tf_exported_symbols.lds"),
          "//conditions:default":clean_dep("//tensorflow:tf_version_script.lds")
      })
  )
  extra_linkopts = select({
      "@local_config_cuda//cuda:darwin": [
          "-Wl,-exported_symbols_list",
          "%s.lds"%vscriptname,
      ],
      clean_dep("//tensorflow:windows"): [],
      clean_dep("//tensorflow:windows_msvc"): [],
      "//conditions:default": [
          "-Wl,--version-script",
          "%s.lds"%vscriptname,
      ]
  })
  extra_deps += select({
      "@local_config_cuda//cuda:darwin": [
          "%s.lds"%vscriptname,
      ],
      clean_dep("//tensorflow:windows"): [],
      clean_dep("//tensorflow:windows_msvc"): [],
      "//conditions:default": [
          "%s.lds"%vscriptname,
      ]
  })

  tf_cc_shared_object(
      name=cc_library_name,
      srcs=[module_name + ".cc"],
      copts=(copts + if_not_windows([
          "-Wno-self-assign", "-Wno-sign-compare", "-Wno-write-strings"
      ]) + tf_extension_copts()),
      linkopts=tf_extension_linkopts() + extra_linkopts,
      linkstatic=1,
      deps=deps + extra_deps,
      **kwargs)
  native.genrule(
      name="gen_" + cc_library_pyd_name,
      srcs=[":" + cc_library_name],
      outs=[cc_library_pyd_name],
      cmd="cp $< $@",)
  native.py_library(
      name=name,
      srcs=[":" + name + ".py"],
      srcs_version="PY2AND3",
      data=select({
          clean_dep("//tensorflow:windows"): [":" + cc_library_pyd_name],
          "//conditions:default": [":" + cc_library_name],
      }))
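The name manipulation at the top of tf_py_wrap_cc is ordinary string handling, so it can be tried directly in Python. The sketch below only recomputes the names; the function wrap_cc_names and the dictionary keys are invented for this example.

```python
def wrap_cc_names(name):
    """Recompute the file and rule names that tf_py_wrap_cc derives from `name`."""
    parts = name.split("/")
    module_name = parts[-1]
    prefix = parts[:-1]
    return {
        "module_name": module_name,
        # foo/bar/baz -> foo/bar/_baz.so (and .pyd for Windows)
        "shared_object": "/".join(prefix + ["_" + module_name + ".so"]),
        "pyd": "/".join(prefix + ["_" + module_name + ".pyd"]),
        # Outputs declared by _py_wrap_cc: %{module_name}.cc and %{py_module_name}.py
        "swig_cc": module_name + ".cc",
        "swig_py": name + ".py",
        "version_script": name + "_versionscript.lds",
    }


for key, value in wrap_cc_names("pywrap_tensorflow_internal").items():
    print(key, "=", value)
```

For pywrap_tensorflow_internal this yields _pywrap_tensorflow_internal.so next to pywrap_tensorflow_internal.py, the pair that the Python side imports.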

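SWIG works like this: you write example.c and an interface file example.i, run the swig command to generate example_wrap.c, compile the two C files with gcc into object files, and link them into a shared object (.so), which Python can then import.
In the custom rule above:

tf_cc_shared_object produces the .so file;

native.py_library wraps the SWIG-generated Python file (name + ".py") as a Python library and attaches the shared object to it as a data dependency (the .pyd copy on Windows);

_py_wrap_cc runs the swig command; this custom rule is at line 1122 of the same file.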

# File: tensorflow/tensorflow.bzl, line 1090; _py_wrap_cc itself is at line 1122 below
# Bazel rules for building swig files.
def _py_wrap_cc_impl(ctx):
  srcs = ctx.files.srcs
  if len(srcs) != 1:
    fail("Exactly one SWIG source file label must be specified.", "srcs")
  module_name = ctx.attr.module_name
  src = ctx.files.srcs[0]
  inputs = depset([src])
  inputs += ctx.files.swig_includes
  for dep in ctx.attr.deps:
    inputs += dep.cc.transitive_headers
  inputs += ctx.files._swiglib
  inputs += ctx.files.toolchain_deps
  swig_include_dirs = depset(_get_repository_roots(ctx, inputs))
  swig_include_dirs += sorted([f.dirname for f in ctx.files._swiglib])
  args = [
      "-c++", "-python", "-module", module_name, "-o", ctx.outputs.cc_out.path,
      "-outdir", ctx.outputs.py_out.dirname
  ]
  args += ["-l" + f.path for f in ctx.files.swig_includes]
  args += ["-I" + i for i in swig_include_dirs]
  args += [src.path]
  outputs = [ctx.outputs.cc_out, ctx.outputs.py_out]
  ctx.action(
      executable=ctx.executable._swig,
      arguments=args,
      inputs=list(inputs),
      outputs=outputs,
      mnemonic="PythonSwig",
      progress_message="SWIGing " + src.path)
  return struct(files=depset(outputs))

_py_wrap_cc = rule(
    attrs = {
        "srcs": attr.label_list(
            mandatory = True,
            allow_files = True,
        ),
        "swig_includes": attr.label_list(
            cfg = "data",
            allow_files = True,
        ),
        "deps": attr.label_list(
            allow_files = True,
            providers = ["cc"],
        ),
        "toolchain_deps": attr.label_list(
            allow_files = True,
        ),
        "module_name": attr.string(mandatory = True),
        "py_module_name": attr.string(mandatory = True),
        "_swig": attr.label(
            default = Label("@swig//:swig"),
            executable = True,
            cfg = "host",
        ),
        "_swiglib": attr.label(
            default = Label("@swig//:templates"),
            allow_files = True,
        ),
    },
    outputs = {
        "cc_out": "%{module_name}.cc",
        "py_out": "%{py_module_name}.py",
    },
    implementation = _py_wrap_cc_impl,
)
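The argument list that _py_wrap_cc_impl assembles can be reproduced as a plain Python function, which makes the resulting swig command line easier to read. This is a sketch: swig_command is a hypothetical helper and all paths in the usage example are made up.

```python
def swig_command(module_name, src, cc_out, py_out_dir,
                 swig_includes=(), include_dirs=()):
    """Build a swig command line in the same order as _py_wrap_cc_impl."""
    args = [
        "swig", "-c++", "-python", "-module", module_name,
        "-o", cc_out, "-outdir", py_out_dir,
    ]
    args += ["-l" + path for path in swig_includes]  # extra .i files to include
    args += ["-I" + d for d in include_dirs]         # include search directories
    args.append(src)                                 # the single top-level .i file
    return args


cmd = swig_command(
    module_name="pywrap_tensorflow_internal",
    src="tensorflow/python/tensorflow.i",
    cc_out="out/pywrap_tensorflow_internal.cc",  # hypothetical output path
    py_out_dir="out",                            # hypothetical output directory
    swig_includes=["tensorflow/python/util/port.i"],
    include_dirs=["."],
)
print(" ".join(cmd))
```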

In the code above, ctx.executable._swig is the program that gets executed. It corresponds to

        "_swig": attr.label(
            default = Label("@swig//:swig"),
            executable = True,
            cfg = "host",
        ),

and swig itself is defined in third_party/swig.BUILD:

licenses(["restricted"])  # GPLv3

exports_files(["LICENSE"])

cc_binary(
    name = "swig",
    srcs = [
        "Source/CParse/cparse.h",
        "Source/CParse/cscanner.c",
        "Source/CParse/parser.c",
        "Source/CParse/parser.h",
        "Source/CParse/templ.c",
        "Source/CParse/util.c",
        "Source/DOH/base.c",
        "Source/DOH/doh.h",
        "Source/DOH/dohint.h",
        "Source/DOH/file.c",
        "Source/DOH/fio.c",
        "Source/DOH/hash.c",
        "Source/DOH/list.c",
        "Source/DOH/memory.c",
        "Source/DOH/string.c",
        "Source/DOH/void.c",
        "Source/Include/swigconfig.h",
        "Source/Include/swigwarn.h",
        "Source/Modules/allocate.cxx",
        "Source/Modules/browser.cxx",
        "Source/Modules/contract.cxx",
        "Source/Modules/directors.cxx",
......

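So swig is itself a target that has to be built first. With that, the whole flow fits together:

first build the swig executable;

then generate the wrapper files from the .i files and compile them into the corresponding .so and .py files;

after that the module can be imported normally.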

1.2.4 How to find the corresponding C source file from the Python side

Suppose we have the following Python file:

import tensorflow as tf
import numpy as np
 
x_data = np.random.rand(100).astype(np.float32)
y_data = x_data * 0.1 + 0.3
 
W = tf.Variable(tf.random_uniform([1], -1.0, 1.0))
b = tf.Variable(tf.zeros([1]))
y = W * x_data + b
 
loss = tf.reduce_mean(tf.square(y - y_data))
optimizer = tf.train.GradientDescentOptimizer(0.5)
train = optimizer.minimize(loss)
 
init = tf.global_variables_initializer()  # replaces the deprecated tf.initialize_all_variables()
 
sess = tf.Session()
sess.run(init)
 
for step in range(0, 201):
    sess.run(train)
    if step % 20 == 0:
        print(step, sess.run(W), sess.run(b))

Suppose we want to find where Session is defined. The corresponding class turns out to be in /anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py.
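One way to locate such a file without searching by hand is the standard inspect module. The sketch below uses a standard-library class so that it runs without TensorFlow installed; with TensorFlow available, calling the same helper on tf.Session should point at the session.py mentioned above. The helper name defining_file is an assumption of this example.

```python
import inspect
import json


def defining_file(obj):
    """Return the path of the Python source file that defines `obj`."""
    return inspect.getsourcefile(obj)


# Stand-in for tf.Session so the example runs without TensorFlow.
print(defining_file(json.JSONDecoder))
```

This only finds Python-level definitions; symbols that come from the SWIG-generated extension module have no Python source file to report.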

1.2.5 Mapping between Python and C++ function names

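In the underlying C++ code, Google writes names in camel case, such as AbcDefGh, while the Python front end follows the lowercase-with-underscores convention, such as abc_def_gh. To bridge the two, TensorFlow has a built-in conversion function, which is called in
/tensorflow/python/framework/python_op_gen.cc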

string function_name;
    python_op_gen_internal::GenerateLowerCaseOpName(op_def.name(),
                                                    &function_name);

In later versions, the definition of this function moved to /tensorflow/python/framework/python_op_gen_internal.cc

void GenerateLowerCaseOpName(const string& str, string* result) {
  const char joiner = '_';
  const int last_index = str.size() - 1;
  for (int i = 0; i <= last_index; ++i) {
    const char c = str[i];
    // Emit a joiner only if a previous-lower-to-now-upper or a
    // now-upper-to-next-lower transition happens.
    if (isupper(c) && (i > 0)) {
      if (islower(str[i - 1]) || ((i < last_index) && islower(str[i + 1]))) {
        result->push_back(joiner);
      }
    }
    result->push_back(tolower(c));
  }
}

For example, to find the C++ implementation behind tf.nn.conv2d, search for the op name Conv2D.
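To experiment with the naming rule, the C++ function can be ported to Python almost line by line. This is a sketch of the same logic for trying out names, not code that ships with TensorFlow; the Python function name is chosen here for illustration.

```python
def generate_lower_case_op_name(name):
    """Python port of GenerateLowerCaseOpName: CamelCase -> snake_case."""
    result = []
    last_index = len(name) - 1
    for i, c in enumerate(name):
        # Emit a joiner only on a lower-to-upper transition, or when an
        # upper-case letter is followed by a lower-case one.
        if c.isupper() and i > 0:
            prev_is_lower = name[i - 1].islower()
            next_is_lower = i < last_index and name[i + 1].islower()
            if prev_is_lower or next_is_lower:
                result.append("_")
        result.append(c.lower())
    return "".join(result)


for op in ["AbcDefGh", "Conv2D", "MatMul"]:
    print(op, "->", generate_lower_case_op_name(op))
```

Note that Conv2D maps to conv2d without an underscore, because the final D follows a digit and is not followed by a lower-case letter.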
