Iris Recognition with TensorFlow

Task goals

Analyze the Iris dataset
Build a model for the Iris data
Use the model to predict the Iris species

Environment setup

PyCharm editor configured with Python 3.*
Third-party libraries

tensorflow 1.*
numpy
pandas
sklearn
keras

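Processing the Iris dataset

Understanding the dataset

The Iris dataset is a classic machine learning dataset and a good one to start with.
Iris dataset link: download the Iris dataset
The Iris dataset contains four features and one label. The four features describe the following botanical measurements of a single iris flower:

Sepal length
Sepal width
Petal length
Petal width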

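The label identifies the iris species, which must be one of the following:

Setosa iris, Iris-setosa (0)
Versicolor iris, Iris-versicolor (1)
Virginica iris, Iris-virginica (2)

The dataset contains 50 samples of each of the three species, 150 samples in total.

The following shows samples from the dataset: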
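As a rough sketch of what such samples look like, assuming the headerless five-column CSV layout that the loading code in this article expects (four measurements followed by the species name), a few rows can be parsed with the standard library. The three rows and the helper name parse_iris_rows are illustrative assumptions, not taken from the original sample display.

```python
import csv
import io

# Illustrative rows in the assumed layout:
# sepal length, sepal width, petal length, petal width, species name.
SAMPLE_ROWS = """5.1,3.5,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
"""


def parse_iris_rows(text):
    """Return (features, labels): four floats and one species string per row."""
    features, labels = [], []
    for row in csv.reader(io.StringIO(text)):
        if not row:  # skip blank lines
            continue
        features.append([float(value) for value in row[:4]])
        labels.append(row[4])
    return features, labels


features, labels = parse_iris_rows(SAMPLE_ROWS)
for feature_row, label in zip(features, labels):
    print(feature_row, label)
```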

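In machine learning, to keep the test results reliable, part of the dataset is usually held out for testing only and the rest is used for training. Here the dataset is split 7:3 (training set : test set).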
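To make the 7:3 split concrete, here is a minimal NumPy sketch of a shuffled index split. The function name split_indices and the fixed seed are assumptions for illustration; the article itself relies on sklearn's train_test_split for this step.

```python
import numpy as np


def split_indices(n_samples, test_ratio=0.3, seed=0):
    """Shuffle the sample indices and split them into train and test parts."""
    rng = np.random.RandomState(seed)
    order = rng.permutation(n_samples)           # random order of 0..n_samples-1
    n_test = int(round(n_samples * test_ratio))  # size of the held-out part
    return order[n_test:], order[:n_test]


train_idx, test_idx = split_indices(150)
print(len(train_idx), len(test_idx))  # 105 45
```

With 150 samples this yields 105 training rows and 45 test rows, and no index appears in both parts.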

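Dataset processing code

import numpy as np
import pandas as pd
from keras.utils import np_utils
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

def dealIrisData(IrisDatapath):
    """
    :param IrisDatapath: path to the dataset file
    :return: training features, test features, training labels, test labels
    """
    # Read the dataset (the file has no header row)
    iris = pd.read_csv(IrisDatapath, header=None)

    # Convert the DataFrame to an array
    iris = np.array(iris)
    # Extract the features (first four columns) as floats
    X = iris[:, 0:4].astype(np.float32)
    # Extract the labels (fifth column)
    Y = iris[:, 4]

    # One-hot encoding: map species names to integers, then to one-hot vectors
    encoder = LabelEncoder()
    Y = encoder.fit_transform(Y)
    Y = np_utils.to_categorical(Y)

    # Split 7:3 into training and test sets
    x_train, x_test, y_train, y_test = train_test_split(X, Y, test_size=0.3)
    return x_train, x_test, y_train, y_test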

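What is one-hot encoding?

One-hot encoding uses an N-bit state register to encode N states: each state has its own register bit, and exactly one bit is active at any time.
One-hot encoding represents a categorical variable as binary vectors. The categorical values are first mapped to integers; each integer is then represented as a binary vector that is all zeros except at the index of that integer, which is set to 1.
One-hot encoding converts categorical variables into a form that machine learning algorithms can use easily.
For example: ["Setosa iris", "Versicolor iris", "Virginica iris"] ----> [[1,0,0],[0,1,0],[0,0,1]]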
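The two steps described above (category to integer, integer to binary vector) can be sketched in plain NumPy. This is only an illustration of the idea, assuming the species strings used as labels in iris.data; the article's own code uses LabelEncoder and to_categorical instead.

```python
import numpy as np

labels = np.array(["Iris-virginica", "Iris-setosa", "Iris-versicolor", "Iris-setosa"])

# Step 1: map each category to an integer (np.unique sorts the class names).
classes, indices = np.unique(labels, return_inverse=True)

# Step 2: pick row i of the identity matrix for integer i.
one_hot = np.eye(len(classes), dtype=np.float32)[indices]

print(classes)
print(indices)  # [2 0 1 0]
print(one_hot)
```

Because the class names are sorted alphabetically, Iris-setosa maps to 0, Iris-versicolor to 1 and Iris-virginica to 2, matching the numbering used in this article.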

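Building the model

Because the model structure is simple, no hidden layer is used: the four inputs connect directly to a three-unit output layer.
Model code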

def getIrisModel(saveModelPath, step):
    """
    :param saveModelPath: path prefix for saving the model
    :param step: number of training steps
    :return: None
    """
    x_train, x_test, y_train, y_test = dealIrisData("iris.data")
    # Input layer
    with tf.variable_scope("data"):
        # The name 'x_pred' lets predictIris look this placeholder up after loading
        x = tf.placeholder(tf.float32, [None, 4], name='x_pred')
        y_true = tf.placeholder(tf.int32, [None, 3])
        # placeholder() reserves an input slot while the graph is being built; no data is passed in yet.
        # The data is supplied later through the feed_dict argument of sess.run(), once a session exists.

    # No hidden layer

    # Output layer
    with tf.variable_scope("fc_model"):
        # Variable() creates a trainable variable
        weight = tf.Variable(tf.random_normal([4, 3], mean=0.0, stddev=1.0))  # shape [4,3], normal with mean 0 and standard deviation 1
        bias = tf.Variable(tf.constant(0.0, shape=[3]))  # shape [3], initialized to 0
        y_predict = tf.matmul(x, weight) + bias  # matrix multiplication plus bias (the logits)
        tf.add_to_collection('pred_network', y_predict)  # lets predictIris fetch the output tensor after loading
    # Loss
    with tf.variable_scope("loss"):
        loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_true, logits=y_predict))
    # Optimizer
    with tf.variable_scope("optimizer"):
        train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)
    # Accuracy
    with tf.variable_scope("acc"):
        equal_list = tf.equal(tf.argmax(y_true, 1), tf.argmax(y_predict, 1))
        accuracy = tf.reduce_mean(tf.cast(equal_list, tf.float32))
    # Start training
    with tf.Session() as sess:
        saver = tf.train.Saver()
        sess.run(tf.global_variables_initializer())
        for i in range(step):
            _train = sess.run(train_op, feed_dict={x: x_train, y_true: y_train})
            _acc = sess.run(accuracy, feed_dict={x: x_train, y_true: y_train})
            print("Step %d, training accuracy %.2f" % (i + 1, _acc))
        print("Test set accuracy %.2f" % sess.run(accuracy, feed_dict={x: x_test, y_true: y_test}))
        # Make sure the target directory exists before saving (os is imported in the full code)
        os.makedirs(os.path.dirname(saveModelPath) or ".", exist_ok=True)
        saver.save(sess, saveModelPath)
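For readers who want to see what this graph computes without TensorFlow, the following NumPy sketch mirrors the same pieces: a single linear layer, softmax cross-entropy, one gradient-descent update, and argmax accuracy. It is an illustration under assumptions, not the article's implementation: the three toy samples are made up for the demo, and the learning rate is 0.01 rather than the article's 0.1 so that the steps stay small on a tiny unscaled batch.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def cross_entropy(probs, y_onehot):
    """Mean cross-entropy between predicted probabilities and one-hot labels."""
    return float(-np.mean(np.sum(y_onehot * np.log(probs + 1e-12), axis=1)))


def train_step(x, y_onehot, weight, bias, lr):
    """One gradient-descent update of the linear layer (no hidden layer)."""
    probs = softmax(x @ weight + bias)
    grad_logits = (probs - y_onehot) / x.shape[0]  # gradient of the mean loss w.r.t. the logits
    new_weight = weight - lr * (x.T @ grad_logits)
    new_bias = bias - lr * grad_logits.sum(axis=0)
    return new_weight, new_bias


def accuracy(x, y_onehot, weight, bias):
    """Fraction of rows whose largest logit matches the true class."""
    predicted = np.argmax(x @ weight + bias, axis=1)
    return float(np.mean(predicted == np.argmax(y_onehot, axis=1)))


# Toy batch: one made-up sample per class, with one-hot labels.
x = np.array([[5.0, 3.4, 1.5, 0.2],
              [7.0, 3.2, 4.7, 1.4],
              [6.3, 3.3, 6.0, 2.5]])
y = np.eye(3)

rng = np.random.RandomState(0)
weight = rng.normal(0.0, 1.0, size=(4, 3))  # same shape and distribution as in the graph
bias = np.zeros(3)

loss_before = cross_entropy(softmax(x @ weight + bias), y)
for _ in range(200):
    weight, bias = train_step(x, y, weight, bias, lr=0.01)
loss_after = cross_entropy(softmax(x @ weight + bias), y)

print("loss before: %.4f, loss after: %.4f" % (loss_before, loss_after))
print("accuracy on the toy batch: %.2f" % accuracy(x, y, weight, bias))
```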

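Loading the model and predicting the Iris species

About the filename passed to saver.restore(): each time saver.save runs, the checkpoint is written as three files, for example modelIris.meta, modelIris.index and modelIris.data-00000-of-00001.
import_meta_graph takes the .meta filename. The weights are stored in modelIris.data-00000-of-00001, but passing that filename to restore raises an error; restore expects the prefix (here modelIris), which can be obtained with tf.train.latest_checkpoint(checkpoint_dir).
The model output y depends on a placeholder, so the corresponding data must be fed in sess.run(). The placeholder therefore has to be retrieved from the graph by name with get_operation_by_name.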
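The relationship between the three checkpoint files and the prefix can be illustrated with a small string helper. checkpoint_prefix is a hypothetical function written for this sketch, not part of TensorFlow; it only assumes the .meta, .index and .data-xxxxx-of-xxxxx suffixes listed above.

```python
def checkpoint_prefix(path):
    """Strip a checkpoint file suffix, leaving the prefix that restore expects.

    Hypothetical helper for illustration only; TensorFlow does not provide it.
    """
    for suffix in (".meta", ".index"):
        if path.endswith(suffix):
            return path[: -len(suffix)]
    marker = ".data-"
    position = path.rfind(marker)
    if position != -1:
        return path[:position]
    return path  # already a prefix


for name in ("model/modelIris.meta",
             "model/modelIris.index",
             "model/modelIris.data-00000-of-00001",
             "model/modelIris"):
    print(name, "->", checkpoint_prefix(name))
```

All four inputs map to the same prefix, model/modelIris, which is the form of path that restore is given in the prediction code.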
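Code implementation

def predictIris(modelPath, data):
    """
    :param modelPath: path prefix of the saved model
    :param data: samples to predict, one row of four features per sample
    :return: None
    """
    with tf.Session() as sess:
        # Rebuild the graph from the .meta file, then restore the weights using the checkpoint prefix
        new_saver = tf.train.import_meta_graph(modelPath + ".meta")
        new_saver.restore(sess, modelPath)
        graph = tf.get_default_graph()
        # Look up the input placeholder by its scoped name
        x = graph.get_operation_by_name('data/x_pred').outputs[0]
        # Fetch the output tensor stored in the collection during training
        y = tf.get_collection("pred_network")[0]
        # For each row, take the class index with the largest output
        for predict in np.argmax(sess.run(y, feed_dict={x: data}), axis=1):
            if predict == 0:
                print("Setosa iris (Iris-setosa)")
            elif predict == 1:
                print("Versicolor iris (Iris-versicolor)")
            else:
                print("Virginica iris (Iris-virginica)")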

Full code

import tensorflow as tf
import numpy as np
import pandas as pd
from keras.utils import np_utils
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"  # do not use the GPU

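def dealIrisData(IrisDatapath):
    """
    :param IrisDatapath: path to the dataset file
    :return: training features, test features, training labels, test labels
    """
    # Read the dataset (the file has no header row)
    iris = pd.read_csv(IrisDatapath, header=None)

    # Convert the DataFrame to an array
    iris = np.array(iris)
    # Extract the features (first four columns) as floats
    X = iris[:, 0:4].astype(np.float32)
    # Extract the labels (fifth column)
    Y = iris[:, 4]

    # One-hot encoding: map species names to integers, then to one-hot vectors
    encoder = LabelEncoder()
    Y = encoder.fit_transform(Y)
    Y = np_utils.to_categorical(Y)

    # Split 7:3 into training and test sets
    x_train, x_test, y_train, y_test = train_test_split(X, Y, test_size=0.3)
    return x_train, x_test, y_train, y_test

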
def getIrisModel(saveModelPath, step):
    """
    :param saveModelPath: path prefix for saving the model
    :param step: number of training steps
    :return: None
    """
    x_train, x_test, y_train, y_test = dealIrisData("iris.data")
    # Input layer
    with tf.variable_scope("data"):
        # The name 'x_pred' lets predictIris look this placeholder up after loading
        x = tf.placeholder(tf.float32, [None, 4], name='x_pred')
        y_true = tf.placeholder(tf.int32, [None, 3])
        # placeholder() reserves an input slot while the graph is being built; no data is passed in yet.
        # The data is supplied later through the feed_dict argument of sess.run(), once a session exists.

    # No hidden layer

    # Output layer
    with tf.variable_scope("fc_model"):
        # Variable() creates a trainable variable
        weight = tf.Variable(tf.random_normal([4, 3], mean=0.0, stddev=1.0))  # shape [4,3], normal with mean 0 and standard deviation 1
        bias = tf.Variable(tf.constant(0.0, shape=[3]))  # shape [3], initialized to 0
        y_predict = tf.matmul(x, weight) + bias  # matrix multiplication plus bias (the logits)
        tf.add_to_collection('pred_network', y_predict)  # lets predictIris fetch the output tensor after loading
    # Loss
    with tf.variable_scope("loss"):
        loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_true, logits=y_predict))
    # Optimizer
    with tf.variable_scope("optimizer"):
        train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)
    # Accuracy
    with tf.variable_scope("acc"):
        equal_list = tf.equal(tf.argmax(y_true, 1), tf.argmax(y_predict, 1))
        accuracy = tf.reduce_mean(tf.cast(equal_list, tf.float32))
    # Start training
    with tf.Session() as sess:
        saver = tf.train.Saver()
        sess.run(tf.global_variables_initializer())
        for i in range(step):
            _train = sess.run(train_op, feed_dict={x: x_train, y_true: y_train})
            _acc = sess.run(accuracy, feed_dict={x: x_train, y_true: y_train})
            print("Step %d, training accuracy %.2f" % (i + 1, _acc))
        print("Test set accuracy %.2f" % sess.run(accuracy, feed_dict={x: x_test, y_true: y_test}))
        # Make sure the target directory exists before saving
        os.makedirs(os.path.dirname(saveModelPath) or ".", exist_ok=True)
        saver.save(sess, saveModelPath)


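def predictIris(modelPath, data):
    """
    :param modelPath: path prefix of the saved model
    :param data: samples to predict, one row of four features per sample
    :return: None
    """
    with tf.Session() as sess:
        # Rebuild the graph from the .meta file, then restore the weights using the checkpoint prefix
        new_saver = tf.train.import_meta_graph(modelPath + ".meta")
        new_saver.restore(sess, modelPath)
        graph = tf.get_default_graph()
        # Look up the input placeholder by its scoped name
        x = graph.get_operation_by_name('data/x_pred').outputs[0]
        # Fetch the output tensor stored in the collection during training
        y = tf.get_collection("pred_network")[0]
        # For each row, take the class index with the largest output
        for predict in np.argmax(sess.run(y, feed_dict={x: data}), axis=1):
            if predict == 0:
                print("Setosa iris (Iris-setosa)")
            elif predict == 1:
                print("Versicolor iris (Iris-versicolor)")
            else:
                print("Virginica iris (Iris-virginica)")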


if __name__ == '__main__':
    model_path = "model/iris_model"
    # Train the model (uncomment to run)
    # getIrisModel(model_path, 1000)
    # Predict with the model (uncomment to run)
    # predictData = [[5.0, 3.4, 1.5, 0.2]]  # sample(s) to classify
    # predictIris(model_path, predictData)
