TensorFlow2.0教程12：回歸問題

本文轉載自查看原文 2019-08-31 14:42 707 tensorflow2.0

　　在回歸問題中，我們的目標是預測連續值的輸出，如價格或概率。

　　我們采用了經典的Auto MPG數據集，並建立了一個模型來預測20世紀70年代末和80年代初汽車的燃油效率。為此，我們將為該模型提供該時段內許多汽車的描述。此描述包括以下屬性：氣缸，排量，馬力和重量。

　　1.Auto MPG數據集

　　獲取數據

　　dataset_path = keras.utils.get_file('auto-mpg.data',

　　'https://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/auto-mpg.data')

　　print(dataset_path)

　　Downloading data from https://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/auto-mpg.data

　　32768/30286 [================================] - 1s 25us/step

　　/home/czy/.keras/datasets/auto-mpg.data

　　使用pandas讀取數據

　　column_names = ['MPG','Cylinders','Displacement','Horsepower','Weight',

　　'Acceleration', 'Model Year', 'Origin']

　　raw_dataset = pd.read_csv(dataset_path, names=column_names,

　　na_values='?', comment='\t',

　　sep=' ', skipinitialspace=True)

　　dataset = raw_dataset.copy()

　　dataset.tail()

　　MPG　　Cylinders　　Displacement　　Horsepower　　Weight　　Acceleration　　Model Year　　Origin

　　393　　27.0　　4　　140.0　　86.0　　2790.0　　15.6　　82　　1

　　394　　44.0　　4　　97.0　　52.0　　2130.0　　24.6　　82　　2

　　395　　32.0　　4　　135.0　　84.0　　2295.0　　11.6　　82　　1

　　396　　28.0　　4　　120.0　　79.0　　2625.0　　18.6　　82　　1

　　397　　31.0　　4　　119.0　　82.0　　2720.0　　19.4　　82　　1

　　2.數據預處理

　　清洗數據

　　print(dataset.isna().sum())

　　dataset = dataset.dropna()

　　origin = dataset.pop('Origin')

　　dataset['USA'] = (origin == 1)*1.0

　　dataset['Europe'] = (origin == 2)*1.0

　　dataset['Japan'] = (origin == 3)*1.0

　　dataset.tail()

　　MPG 0

　　Cylinders 0

　　Displacement 0

　　Horsepower 6

　　Weight 0

　　Acceleration 0

　　Model Year 0

　　Origin 0

　　dtype: int64

　　MPG　　Cylinders　　Displacement　　Horsepower　　Weight　　Acceleration　　Model Year　　USA　　Europe　　Japan

　　393　　27.0　　4　　140.0　　86.0　　2790.0　　15.6　　82　　1.0　　0.0　　0.0

　　394　　44.0　　4　　97.0　　52.0　　2130.0　　24.6　　82　　0.0　　1.0　　0.0

　　395　　32.0　　4　　135.0　　84.0　　2295.0　　11.6　　82　　1.0　　0.0　　0.0

　　396　　28.0　　4　　120.0　　79.0　　2625.0　　18.6　　82　　1.0　　0.0　　0.0

　　397　　31.0　　4　　119.0　　82.0　　2720.0　　19.4　　82　　1.0　　0.0　　0.0

　　划分訓練集和測試集

　　train_dataset = dataset.sample(frac=0.8,random_state=0)

　　test_dataset = dataset.drop(train_dataset.index)

　　檢測數據

　　觀察訓練集中幾對列的聯合分布。

　　sns.pairplot(train_dataset[["MPG", "Cylinders", "Displacement", "Weight"]], diag_kind="kde")

　　整體統計數據：

　　train_stats = train_dataset.describe()

　　train_stats.pop("MPG")

　　train_stats = train_stats.transpose()

　　train_stats

　　count　　mean　　std　　min　　25%　　50%　　75%　　max

　　Cylinders　　314.0　　5.477707　　1.699788　　3.0　　4.00　　4.0　　8.00　　8.0

　　Displacement　　314.0　　195.318471　　104.331589　　68.0　　105.50　　151.0　　265.75　　455.0

　　Horsepower　　314.0　　104.869427　　38.096214　　46.0　　76.25　　94.5　　128.00　　225.0

　　Weight　　314.0　　2990.251592　　843.898596　　1649.0　　2256.50　　2822.5　　3608.00　　5140.0

　　Acceleration　　314.0　　15.559236　　2.789230　　8.0　　13.80　　15.5　　17.20　　24.8

　　Model Year　　314.0　　75.898089　　3.675642　　70.0　　73.00　　76.0　　79.00　　82.0

　　USA　　314.0　　0.624204　　0.485101　　0.0　　0.00　　1.0　　1.00　　1.0

　　Europe　　314.0　　0.178344　　0.383413　　0.0　　0.00　　0.0　　0.00　　1.0

　　Japan　　314.0　　0.197452　　0.398712　　0.0　　0.00　　0.0　　0.00　　1.0

　　取出標簽

　　train_labels = train_dataset.pop('MPG')

　　test_labels = test_dataset.pop('MPG')

　　標准化數據

　　最好使用不同比例和范圍的特征進行標准化。雖然模型可能在沒有特征歸一化的情況下收斂，但它使訓練更加困難，並且它使得結果模型依賴於輸入中使用的單位的選擇。

　　def norm(x):

　　return (x - train_stats['mean']) / train_stats['std']

　　normed_train_data = norm(train_dataset)

　　normed_test_data = norm(test_dataset)

　　3.構建模型

　　def build_model():

　　model = keras.Sequential([

　　layers.Dense(64, activation='relu', input_shape=[len(train_dataset.keys())]),

　　layers.Dense(64, activation='relu'),

　　layers.Dense(1)

　　])

　　optimizer = tf.keras.optimizers.RMSprop(0.001)

　　model.compile(loss='mse',

　　optimizer=optimizer,

　　metrics=['mae', 'mse'])

　　return model

　　model = build_model()

　　model.summary()

　　Model: "sequential"

　　_________________________________________________________________

　　Layer (type) Output Shape Param #

　　=================================================================

　　dense (Dense) (None, 64) 640

　　_________________________________________________________________

　　dense_1 (Dense) (None, 64) 4160

　　_________________________________________________________________

　　dense_2 (Dense) (None, 1) 65

　　=================================================================

　　Total params: 4,865

　　Trainable params: 4,865

　　Non-trainable params: 0

　　_________________________________________________________________

　　example_batch = normed_train_data[:10]

　　example_result = model.predict(example_batch)

　　example_result

　　array([[0.18062565],

　　[0.1714489 ],

　　[0.22555563],

　　[0.29366603],

　　[0.69764495],

　　[0.08851457],

　　[0.6851174 ],

　　[0.32245407],

　　[0.02959149],

　　[0.38945067]], dtype=float32)

　　4.訓練模型

　　class PrintDot(keras.callbacks.Callback):

　　def on_epoch_end(self, epoch, logs):

　　if epoch % 100 == 0: print('')

　　print('.', end='')

　　EPOCHS = 1000

　　history = model.fit(

　　normed_train_data, train_labels,

　　epochs=EPOCHS, validation_split = 0.2, verbose=0,

　　callbacks=[PrintDot()])

　　....................................................................................................

　　查看訓練記錄無錫婦科醫院哪家好 http://www.ytsgfk120.com/

　　hist = pd.DataFrame(history.history)

　　hist['epoch'] = history.epoch

　　hist.tail()

　　loss　　mae　　mse　　val_loss　　val_mae　　val_mse　　epoch

　　995　　2.191127　　0.940755　　2.191127　　10.422818　　2.594117　　10.422818　　995

　　996　　2.113679　　0.903680　　2.113679　　10.723925　　2.631320　　10.723926　　996

　　997　　2.517261　　0.989557　　2.517261　　9.497868　　2.379198　　9.497869　　997

　　998　　2.250272　　0.931618　　2.250272　　11.017041　　2.658538　　11.017041　　998

　　999　　1.976393　　0.853547　　1.976393　　9.890977　　2.491739　　9.890977　　999

　　def plot_history(history):

　　hist = pd.DataFrame(history.history)

　　hist['epoch'] = history.epoch

　　plt.figure()

　　plt.xlabel('Epoch')

　　plt.ylabel('Mean Abs Error [MPG]')

　　plt.plot(hist['epoch'], hist['mae'],

　　label='Train Error')

　　plt.plot(hist['epoch'], hist['val_mae'],

　　label = 'Val Error')

　　plt.ylim([0,5])

　　plt.legend()

　　plt.figure()

　　plt.xlabel('Epoch')

　　plt.ylabel('Mean Square Error [$MPG^2$]')

　　plt.plot(hist['epoch'], hist['mse'],

　　label='Train Error')

　　plt.plot(hist['epoch'], hist['val_mse'],

　　label = 'Val Error')

　　plt.ylim([0,20])

　　plt.legend()

　　plt.show()

　　plot_history(history)

　　使用early stop

　　model = build_model()

　　early_stop = keras.callbacks.EarlyStopping(monitor='val_loss', patience=10)

　　history = model.fit(normed_train_data, train_labels, epochs=EPOCHS,

　　validation_split = 0.2, verbose=0, callbacks=[early_stop, PrintDot()])

　　plot_history(history)

　　.........................................................

　　測試

　　loss, mae, mse = model.evaluate(normed_test_data, test_labels, verbose=0)

　　print("Testing set Mean Abs Error: {:5.2f} MPG".format(mae))

　　Testing set Mean Abs Error: 1.85 MPG

　　5.預測

　　test_predictions = model.predict(normed_test_data).flatten()

　　plt.scatter(test_labels, test_predictions)

　　plt.xlabel('True Values [MPG]')

　　plt.ylabel('Predictions [MPG]')

　　plt.axis('equal')

　　plt.axis('square')

　　plt.xlim([0,plt.xlim()[1]])

　　plt.ylim([0,plt.ylim()[1]])

　　_ = plt.plot([-100, 100], [-100, 100])

　　error = test_predictions - test_labels

　　plt.hist(error, bins = 25)

　　plt.xlabel("Prediction Error [MPG]")

　　_ = plt.ylabel("Count")

免責聲明！

本站轉載的文章為個人學習借鑒使用，本站對版權不負任何法律責任。如果侵犯了您的隱私權益，請聯系本站郵箱yoyou2525@163.com刪除。

猜您在找 tensorflow2.0 安裝教程 Tensorflow2.0學習（2）---線性回歸和分類 TensorFlow2.0教程1：keras 函數api TensorFlow2.0教程26：張量及其操作 TensorFlow2.0教程5：eager模式 TensorFlow2.0 教程8：圖像分類 TensorFlow2.0教程9：文本分類 TensorFlow2.0教程28：自動求導 tensorflow2.0 在pycharm下提示問題 TensorFlow2.0（12）：模型保存與序列化