1. Code configuration
# Queue of CSV input files on HDFS:
filename_queue = tf.train.string_input_producer([
    "hdfs://namenode:8020/path/to/file1.csv",
    "hdfs://namenode:8020/path/to/file2.csv",
])

# Queue of TFRecord input files (note the different NameNode port):
filename_queue = tf.train.string_input_producer([
    "hdfs://namenode:9000/path/to/file1.tfrecord",
    "hdfs://namenode:9000/path/to/file2.tfrecord",
])
def read_tfrecords(filename_queue):
    reader = tf.TFRecordReader()  # the reader must be created before use
    key, serialized_example = reader.read(filename_queue)
    features = tf.parse_single_example(
        serialized_example,
        features={
            # label_dims, data_type, steps, width, height and channels
            # must be defined to match how the records were written
            'label': tf.FixedLenFeature(shape=[label_dims], dtype=data_type),
            'image': tf.FixedLenFeature(shape=[steps * width * height * channels],
                                        dtype=tf.float32)
        })
    label = features['label']
    image = features['image']
    return image, label
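As a sketch of how `read_tfrecords` might be wired into a TF 1.x input pipeline, assuming a reachable HDFS NameNode; the file path, `num_epochs`, and `batch_size` here are placeholder values, not part of the original setup:

```python
filename_queue = tf.train.string_input_producer(
    ["hdfs://namenode:9000/path/to/file1.tfrecord"], num_epochs=1)
image, label = read_tfrecords(filename_queue)
image_batch, label_batch = tf.train.batch([image, label], batch_size=32)

with tf.Session() as sess:
    # local_variables_initializer is needed for the num_epochs counter
    sess.run([tf.global_variables_initializer(),
              tf.local_variables_initializer()])
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    try:
        while not coord.should_stop():
            images, labels = sess.run([image_batch, label_batch])
    except tf.errors.OutOfRangeError:
        pass  # input queue exhausted after one epoch
    finally:
        coord.request_stop()
        coord.join(threads)
```

The `Coordinator`/`start_queue_runners` pair is required with queue-based input in TF 1.x; without it the `string_input_producer` queue is never filled and `sess.run` blocks.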
2. Environment configuration
The following environment variables must be set so that TensorFlow can locate the JVM and the HDFS client libraries:
JAVA_HOME
HADOOP_HDFS_HOME
LD_LIBRARY_PATH
CLASSPATH
e.g.:
vi ~/.bashrc
export JAVA_HOME=/home/user/java/jdk1.8.0_05
export HADOOP_HDFS_HOME=/home/user/software/hadoop-2.7.6/
export PATH=$PATH:$HADOOP_HDFS_HOME/libexec/hadoop-config.sh
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$JAVA_HOME/jre/lib/amd64/server
export PATH=$PATH:$HADOOP_HDFS_HOME/bin:$HADOOP_HDFS_HOME/sbin
export CLASSPATH="$(hadoop classpath --glob)"
source ~/.bashrc
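When one of the variables above is missing, the failure usually appears later as an opaque JNI error. A minimal preflight check can fail fast instead; the helper name `missing_hdfs_env` is my own, not from the original post:

```python
import os

# The four variables the section above exports in ~/.bashrc
REQUIRED_VARS = ["JAVA_HOME", "HADOOP_HDFS_HOME", "LD_LIBRARY_PATH", "CLASSPATH"]

def missing_hdfs_env(environ=None):
    """Return the names of required variables that are unset or empty."""
    if environ is None:
        environ = os.environ
    return [name for name in REQUIRED_VARS if not environ.get(name)]
```

Calling `missing_hdfs_env()` at the top of the training script and aborting when the list is non-empty gives a clearer error message than the runtime failure that otherwise appears when the graph first touches HDFS.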
3. Usage
Files on the Hadoop file system can now be accessed directly, for example:
file = "hdfs://namenode:8020/path/to/file1.tfrecords"
python your_script.py
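A malformed path (a missing `hdfs://` scheme or NameNode address) only surfaces once the graph runs, so it can be worth validating paths up front. A small hypothetical helper using only the standard library:

```python
from urllib.parse import urlparse

def is_hdfs_uri(uri):
    """True for URIs of the form hdfs://host[:port]/absolute/path."""
    parsed = urlparse(uri)
    return (parsed.scheme == "hdfs"
            and bool(parsed.netloc)
            and parsed.path.startswith("/"))
```

For instance, `is_hdfs_uri("hdfs://namenode:8020/path/to/file1.tfrecords")` is true, while a bare local path or a URI with no file path is rejected.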
References
https://medium.com/@matthewyeung/hadoop-file-system-with-tensorflow-dataset-api-13ce9aeaa107
https://github.com/tensorflow/examples/blob/master/community/en/docs/deploy/hadoop.md