TensorFlow feature_column

This article is a repost. 2018-11-26 20:30, category: Machine Learning

Official guide: https://tensorflow.google.cn/guide/feature_columns

Reference: https://blog.csdn.net/cjopengler/article/details/78161748

numeric_column

numeric_column(
    key,
    shape=(1,),
    default_value=None,
    dtype=tf.float32,
    normalizer_fn=None
)

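key: The name of the feature, i.e. the name of the corresponding column.
shape: The shape of the feature stored under key. The default is (1,), but for a feature that is already a one-hot vector, for example, the shape is the actual dimension of that vector rather than 1.
default_value: The value to use when the feature is missing.
normalizer_fn: A function applied to all values of this feature. It is typically used for normalization, but it can be any transformation, such as taking the logarithm or the exponential.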
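
To make the role of normalizer_fn concrete, here is a plain-Python sketch of the per-value processing described above. It is not TensorFlow code: numeric_column_sketch is a hypothetical helper invented for illustration, and it only mimics the idea of applying an arbitrary transformation and then casting each value to float.

```python
import math


def numeric_column_sketch(values, normalizer_fn=None):
    """Hypothetical stand-in for numeric_column: transform, then cast to float."""
    out = []
    for row in values:
        new_row = []
        for v in row:
            if normalizer_fn is not None:
                v = normalizer_fn(v)  # any transformation, not only normalization
            new_row.append(float(v))
        out.append(new_row)
    return out


# Same sample rows as the 'sales' feature used in this section.
sales = [[5], [10], [8], [9]]

raw = numeric_column_sketch(sales)
logged = numeric_column_sketch(sales, normalizer_fn=math.log)
scaled = numeric_column_sketch(sales, normalizer_fn=lambda x: (x - 5) / 5)

print(raw)     # [[5.0], [10.0], [8.0], [9.0]]
print(scaled)  # [[0.0], [1.0], [0.6], [0.8]]
```

The log and min-max transformations here are only examples; any function that maps a value to a number could be passed in the same way.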

import tensorflow as tf

tf.enable_eager_execution()  # TensorFlow 1.x API; eager mode lets print() show the tensor values

def Test_numeric_column():
    features = {'sales' : [[5], [10], [8], [9]]}
    columns = [tf.feature_column.numeric_column('sales')]
    inputs = tf.feature_column.input_layer(features, columns)
    print(inputs)
Test_numeric_column()

bucketized_column

bucketized_column(
    source_column,
    boundaries
)

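source_column: Must be a numeric_column.
boundaries: The bucket boundaries, given as a sorted list of floats. For example, boundaries=[0., 1., 2.] produces the buckets (-inf, 0.), [0., 1.), [1., 2.), and [2., +inf), which are numbered 0, 1, 2, and 3 respectively, so three boundaries yield four buckets.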
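
The bucket numbering can be reproduced in plain Python. The sketch below is an illustration of the interval rule only, not TensorFlow's implementation; bucket_index and one_hot are hypothetical helper names. It also shows the one-hot rows that correspond to the price values 5, 15, 25, and 35 with the boundaries [0, 10, 20, 30, 40] (five boundaries, hence six buckets).

```python
from bisect import bisect_right


def bucket_index(value, boundaries):
    """Return the bucket number for value; boundaries must be sorted ascending."""
    # bisect_right counts the boundaries <= value, which gives
    # left-closed intervals such as [0., 1.) and [1., 2.).
    return bisect_right(boundaries, value)


def one_hot(index, depth):
    """Return a list of length depth with 1.0 at position index."""
    return [1.0 if i == index else 0.0 for i in range(depth)]


boundaries = [0., 1., 2.]
indices = [bucket_index(v, boundaries) for v in [-0.5, 0., 0.5, 1., 2., 7.]]
print(indices)  # [0, 1, 1, 2, 3, 3]

price_boundaries = [0, 10, 20, 30, 40]
depth = len(price_boundaries) + 1  # number of buckets
rows = [one_hot(bucket_index(p, price_boundaries), depth)
        for p in [5., 15., 25., 35.]]
for row in rows:
    print(row)
# [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
# [0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
# [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
# [0.0, 0.0, 0.0, 0.0, 1.0, 0.0]
```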

import tensorflow as tf
from tensorflow import feature_column as fc

tf.enable_eager_execution()

def test_bucketized_column():
    price = {'price': [[5.], [15.], [25.], [35.]]}  # 4 sample rows
    price_column = fc.numeric_column('price')
    bucket_price = fc.bucketized_column(price_column, [0, 10, 20, 30, 40])
    price_bucket_tensor = fc.input_layer(price, [bucket_price])
    print(price_bucket_tensor)
test_bucketized_column()

categorical_column_with_vocabulary_list

categorical_column_with_vocabulary_list(
    key,
    vocabulary_list,
    dtype=None,
    default_value=-1,
    num_oov_buckets=0
)

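key: The feature name.
vocabulary_list: The list of categories. Each value is mapped to its index in this list.
dtype: Only string and integer types are supported.
default_value: The ID assigned to values that are not in vocabulary_list (default -1). It can only be specified when num_oov_buckets is 0.
num_oov_buckets: Controls how values outside vocabulary_list are handled. If it is 0, they receive default_value; if it is greater than 0, each such value is hashed to an ID in the range [len(vocabulary_list), len(vocabulary_list)+num_oov_buckets).
Unlike numeric_column above, this column produces a sparse tensor, which is why the example wraps it in indicator_column before passing it to input_layer.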
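
The interplay between default_value and num_oov_buckets can be sketched in plain Python. This is an illustration of the lookup rule only: lookup_id is a hypothetical helper, and zlib.crc32 is an assumed stand-in for TensorFlow's internal hash function, so the exact out-of-vocabulary IDs will differ from what TensorFlow produces.

```python
import zlib


def lookup_id(value, vocabulary_list, default_value=-1, num_oov_buckets=0):
    """Map value to an integer ID following the rules described above."""
    if value in vocabulary_list:
        return vocabulary_list.index(value)
    if num_oov_buckets == 0:
        # No OOV buckets: every unknown value gets default_value.
        return default_value
    # Assumption: CRC32 replaces TensorFlow's own hash for this sketch.
    bucket = zlib.crc32(str(value).encode("utf-8")) % num_oov_buckets
    return len(vocabulary_list) + bucket


vocab = ['R', 'G', 'B']

ids = [lookup_id(c, vocab) for c in ['R', 'G', 'B', 'A']]
print(ids)  # [0, 1, 2, -1]

# With two OOV buckets, unknown values land in IDs 3 or 4.
oov_id = lookup_id('A', vocab, num_oov_buckets=2)
print(3 <= oov_id < 5)  # True
```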

import tensorflow as tf
from tensorflow import feature_column as fc

tf.enable_eager_execution()

def categorical_column_with_vocabulary_list():
    color_data = {'color': [['R', 'R'], ['G', 'R'], ['B', 'G'], ['A', 'A']]}  # 4 sample rows
    color_column = fc.categorical_column_with_vocabulary_list('color', ['R', 'G', 'B'], dtype=tf.string, default_value=-1)
    color_column_identy = fc.indicator_column(color_column)
    color_dense_tensor = fc.input_layer(color_data, [color_column_identy])
    print(color_dense_tensor)
categorical_column_with_vocabulary_list()
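
The example above wraps the categorical column in indicator_column, which turns the sparse IDs of each sample into one dense multi-hot row by summing the one-hot vectors of its values. The plain-Python sketch below mimics that step for the same color data; indicator_row is a hypothetical helper, not a TensorFlow function. Values outside the vocabulary map to -1 and contribute nothing.

```python
def indicator_row(ids, depth):
    """Sum the one-hot vectors of ids; IDs outside [0, depth) add nothing."""
    row = [0.0] * depth
    for i in ids:
        if 0 <= i < depth:
            row[i] += 1.0
    return row


vocab = ['R', 'G', 'B']
color_rows = [['R', 'R'], ['G', 'R'], ['B', 'G'], ['A', 'A']]

dense = []
for sample in color_rows:
    # -1 plays the role of default_value for out-of-vocabulary colors.
    sample_ids = [vocab.index(c) if c in vocab else -1 for c in sample]
    dense.append(indicator_row(sample_ids, len(vocab)))

for row in dense:
    print(row)
# [2.0, 0.0, 0.0]
# [1.0, 1.0, 0.0]
# [0.0, 1.0, 1.0]
# [0.0, 0.0, 0.0]
```

Note that a repeated value such as ['R', 'R'] yields a count of 2 rather than 1, and the out-of-vocabulary sample ['A', 'A'] becomes an all-zero row.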
