Visualizing data with TensorBoard


Note: the code was downloaded from the web, but I can no longer find the original source. I will remove it if it infringes.

First, write the visualizer class:

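import os

import numpy as np
import tensorflow as tf
# Assumes TensorFlow 1.x, where the projector plugin lives under tf.contrib
from tensorflow.contrib.tensorboard.plugins import projector


class TF_visualizer(object):
    def __init__(self, dimension, vecs_file, metadata_file, output_path):
        self.dimension = dimension          # length of each vector
        self.vecs_file = vecs_file          # text file, one comma-separated vector per line
        self.metadata_file = metadata_file  # metadata file, one row per vector
        self.output_path = output_path      # TensorBoard log directory

        # Read the raw lines, skipping blank ones
        self.vecs = []
        with open(self.vecs_file, 'r') as vecs:
        #with open(self.vecs_file, 'rb') as vecs:
            for line in vecs:
                # lines keep their trailing newline, so strip before testing
                if line.strip(): self.vecs.append(line)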

    def visualize(self):
        # adding into projector
        config = projector.ProjectorConfig()

        placeholder = np.zeros((len(self.vecs), self.dimension))
        
        for i, line in enumerate( self.vecs ):   
            placeholder[i] = np.fromstring(line, sep=',')
        #for i,line in enumerate(self.vecs):
        #    placeholder[i] = np.fromstring(line)

        embedding_var = tf.Variable(placeholder, trainable=False, name='amazon')

        embed = config.embeddings.add()
        embed.tensor_name = embedding_var.name
        embed.metadata_path = self.metadata_file

        # define the model without training
        sess = tf.InteractiveSession()
        
        tf.global_variables_initializer().run()
        saver = tf.train.Saver()
        
        saver.save(sess, os.path.join(self.output_path, 'w2x_metadata.ckpt'))

        writer = tf.summary.FileWriter(self.output_path, sess.graph)
        projector.visualize_embeddings(writer, config)
        sess.close()
        print('Run `tensorboard --logdir={0}` to view the result in TensorBoard'.format(self.output_path))
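
The loading loop assumes that every line holds exactly `dimension` comma-separated numbers. A line with a different count typically fails later with a NumPy shape error when it is assigned into `placeholder[i]`. A minimal sketch of an up-front check, using only the standard library. The helper name `parse_vector_line` is hypothetical and not part of the original code:

```python
def parse_vector_line(line, dimension):
    """Parse one comma-separated line into a list of floats.

    Hypothetical helper (not in the original class): raises ValueError
    when the line does not contain exactly `dimension` values.
    """
    # Split on commas and ignore empty fields such as a trailing comma
    values = [float(field) for field in line.strip().split(',') if field.strip()]
    if len(values) != dimension:
        raise ValueError(
            'expected {0} values, got {1}'.format(dimension, len(values)))
    return values
```

Running this over the vector file before building the array would point at the offending line instead of failing in the middle of `visualize()`.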

Then call the class:

output = '/home/xx'

# create a new tensor board visualizer
visualizer = TF_visualizer(dimension = 768,
                           vecs_file = os.path.join(output, 'amazon_vec.tsv'),
                           #vecs_file = os.path.join(output, 'mnist_10k_784d_tensors.bytes'),
                           metadata_file = os.path.join(output, 'amazon.tsv'),
                           output_path = output)
visualizer.visualize()

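Here, amazon_vec.tsv stores the vectors (word vectors, sentence vectors, ...), one per line; the class parses each line as comma-separated values. amazon.tsv stores the original data in the format id, label, title. The id and title can be defined freely, while label identifies the corresponding vector. The two files correspond line by line: the first line of amazon_vec.tsv matches the first line of amazon.tsv, and so on.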
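
To make the file layout concrete, here is a minimal sketch that writes both files so that their lines stay aligned. The helper name `write_projector_inputs` is hypothetical. It assumes the metadata is tab-separated (suggested by the .tsv extension). As far as I know, TensorBoard's projector also expects a header row when the metadata has more than one column, which the description above does not mention, so the sketch leaves the header as an optional argument:

```python
import os


def write_projector_inputs(output_dir, vectors, rows, header=None):
    """Write vectors and metadata so that line i of each file matches.

    vectors: list of equal-length lists of numbers, written comma-separated
             to match the sep=',' parsing in TF_visualizer.
    rows:    list of (id, label, title) tuples, written tab-separated
             (assumption based on the .tsv extension).
    header:  optional tuple of column names for the metadata file.
    Hypothetical helper; the file names follow the example above.
    """
    if len(vectors) != len(rows):
        raise ValueError('vectors and metadata rows must have the same length')

    vec_path = os.path.join(output_dir, 'amazon_vec.tsv')
    meta_path = os.path.join(output_dir, 'amazon.tsv')

    with open(vec_path, 'w', encoding='utf-8') as vec_file:
        for vector in vectors:
            vec_file.write(','.join(repr(float(v)) for v in vector) + '\n')

    with open(meta_path, 'w', encoding='utf-8') as meta_file:
        if header is not None:
            meta_file.write('\t'.join(header) + '\n')
        for row in rows:
            meta_file.write('\t'.join(str(field) for field in row) + '\n')

    return vec_path, meta_path
```

For example, `write_projector_inputs('/home/xx', vectors, rows)` would produce the two files that the call above reads.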

Finally, run the following on the command line:

tensorboard --logdir=/home/xx

Then open http://xx-desktop:6006 in a browser to see the visualized data (6006 is the default port).
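
If the Projector tab shows no data, one thing worth checking is whether the log directory contains the files that `visualize()` should have written. The sketch below assumes the TensorFlow 1.x layout, where `Saver.save` writes `checkpoint` plus `.index` and `.meta` files, and `visualize_embeddings` writes `projector_config.pbtxt`. The helper name `missing_projector_files` is hypothetical:

```python
import os


def missing_projector_files(logdir, checkpoint_name='w2x_metadata.ckpt'):
    """Return the expected files that are absent from `logdir`.

    Assumes the TF 1.x file layout described above; hypothetical
    troubleshooting helper, not part of the original code.
    """
    expected = [
        'checkpoint',
        checkpoint_name + '.index',
        checkpoint_name + '.meta',
        'projector_config.pbtxt',
    ]
    return [name for name in expected
            if not os.path.isfile(os.path.join(logdir, name))]
```

An empty list from `missing_projector_files('/home/xx')` suggests the checkpoint and projector config were written, so the problem is more likely in the vector or metadata files.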

