Reading training-set files from Hadoop (HDFS) with TensorFlow


1. Code configuration

To read from HDFS, point tf.train.string_input_producer at hdfs:// URIs instead of local paths. For CSV files:

filename_queue = tf.train.string_input_producer([
    "hdfs://namenode:8020/path/to/file1.csv",
    "hdfs://namenode:8020/path/to/file2.csv",
])

The same works for TFRecord files:

filename_queue = tf.train.string_input_producer([
    "hdfs://namenode:9000/path/to/file1.tfrecord",
    "hdfs://namenode:9000/path/to/file2.tfrecord",
])
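These strings follow the generic URI form hdfs://&lt;namenode-host&gt;:&lt;port&gt;/&lt;path&gt; (8020 and 9000 are both common namenode ports, depending on the Hadoop distribution and configuration). A small stdlib-only sketch of how such a URI decomposes — the helper name is made up for illustration; TensorFlow parses these URIs itself through its HDFS filesystem support, and nothing in the pipeline calls this:

```python
from urllib.parse import urlparse

def split_hdfs_uri(uri):
    """Split an hdfs:// URI into its namenode authority and file path.

    Illustrative helper only; not part of TensorFlow or Hadoop.
    """
    parsed = urlparse(uri)
    if parsed.scheme != "hdfs":
        raise ValueError("expected an hdfs:// URI, got: " + uri)
    return parsed.netloc, parsed.path

# e.g. ("namenode:8020", "/path/to/file1.csv")
print(split_hdfs_uri("hdfs://namenode:8020/path/to/file1.csv"))
```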

def read_tfrecords(filename_queue):
    # label_dims, data_type, steps, width, height and channels are
    # assumed to be defined elsewhere in the script.
    reader = tf.TFRecordReader()
    key, serialized_example = reader.read(filename_queue)
    features = tf.parse_single_example(
        serialized_example,
        features={
            'label': tf.FixedLenFeature(shape=[label_dims], dtype=data_type),
            'image': tf.FixedLenFeature(shape=[steps * width * height * channels],
                                        dtype=tf.float32)
        })
    label = features['label']
    image = features['image']
    return image, label
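Each reader.read() call returns the next serialized record from the current file. On disk, a TFRecord file is simply a stream of length-prefixed records: a little-endian uint64 payload length, a uint32 CRC of that length, the payload bytes, then a uint32 CRC of the payload. A minimal stdlib sketch of this framing — the CRC fields are written as zero here purely for illustration; real TensorFlow readers compute and verify masked CRC32C checksums, so files produced this way would be rejected:

```python
import io
import struct

def write_record(stream, payload):
    """Append one TFRecord-framed payload (CRC fields zeroed; illustration only)."""
    stream.write(struct.pack("<Q", len(payload)))  # uint64 payload length
    stream.write(struct.pack("<I", 0))             # length CRC (zeroed here)
    stream.write(payload)
    stream.write(struct.pack("<I", 0))             # payload CRC (zeroed here)

def read_records(stream):
    """Read framed payloads back until end of stream."""
    records = []
    while True:
        header = stream.read(8)
        if len(header) < 8:
            break
        (length,) = struct.unpack("<Q", header)
        stream.read(4)                   # skip length CRC
        records.append(stream.read(length))
        stream.read(4)                   # skip payload CRC
    return records

buf = io.BytesIO()
write_record(buf, b"example-1")
write_record(buf, b"example-2")
buf.seek(0)
print(read_records(buf))  # [b'example-1', b'example-2']
```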

 

2. Environment configuration

Set the following environment variables:

  JAVA_HOME

  HADOOP_HDFS_HOME

  LD_LIBRARY_PATH

  CLASSPATH

 

For example:

  vi  ~/.bashrc

export JAVA_HOME=/home/user/java/jdk1.8.0_05
export HADOOP_HDFS_HOME=/home/user/software/hadoop-2.7.6/
source $HADOOP_HDFS_HOME/libexec/hadoop-config.sh
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$JAVA_HOME/jre/lib/amd64/server
export PATH=$PATH:$HADOOP_HDFS_HOME/bin:$HADOOP_HDFS_HOME/sbin
export CLASSPATH="$(hadoop classpath --glob)"

  source ~/.bashrc
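TensorFlow's HDFS support tends to fail at runtime with fairly opaque errors when any of these variables is missing, so it can be worth checking them before launching training. A small pre-flight sketch — the helper name and the idea of checking up front are this article's suggestion, not part of TensorFlow or Hadoop:

```python
import os

# Variables the HDFS integration needs, per the setup above.
REQUIRED_VARS = ("JAVA_HOME", "HADOOP_HDFS_HOME", "LD_LIBRARY_PATH", "CLASSPATH")

def missing_hdfs_env(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]

problems = missing_hdfs_env()
if problems:
    print("HDFS environment incomplete, missing: " + ", ".join(problems))
```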

 

3. Usage

  Files on HDFS can now be accessed from TensorFlow directly, e.g. file = "hdfs://namenode:8020/path/to/file1.tfrecords". Launch the training script as usual:

  python your_script.py

 

 

References

https://medium.com/@matthewyeung/hadoop-file-system-with-tensorflow-dataset-api-13ce9aeaa107

https://github.com/tensorflow/examples/blob/master/community/en/docs/deploy/hadoop.md

