TensorFlow Summer Practice: Boston Housing Price Prediction (full code)


# coding: utf-8

get_ipython().run_line_magic('matplotlib', 'notebook')

import matplotlib.pyplot as plt
import tensorflow as tf
from sklearn.utils import shuffle
import numpy as np
import pandas as pd
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"  # hide GPUs, force CPU execution
print(tf.__version__)
print(tf.test.is_gpu_available())


# **Dataset description**

# This dataset contains several factors related to Boston housing prices:<br>
# **CRIM**: per-capita crime rate by town<br>
# **ZN**: proportion of residential land zoned for lots over 25,000 sq.ft.<br>
# **INDUS**: proportion of non-retail business acres per town<br>
# **CHAS**: Charles River dummy variable (1 if the tract bounds the river; 0 otherwise)<br>
# **NOX**: nitric oxide concentration<br>
# **RM**: average number of rooms per dwelling<br>
# **AGE**: proportion of owner-occupied units built before 1940<br>
# **DIS**: weighted distances to five Boston employment centers<br>
# **RAD**: index of accessibility to radial highways<br>
# **TAX**: full-value property tax rate per $10,000<br>
# **PTRATIO**: pupil-teacher ratio by town<br>
# **LSTAT**: percentage of lower-status population<br>
# **MEDV**: median value of owner-occupied homes, in thousands of dollars<br>

# **The dataset is stored in CSV format and can be read and converted with the Pandas library**

# The **Pandas** library lets us quickly read data files of ordinary size<br>
# It can read CSV files, text files, MS Excel, SQL databases, and HDF5 files used in scientific computing<br>
# and convert them automatically to NumPy multi-dimensional arrays

# **Importing the data with Pandas**



df = pd.read_csv("data/boston.csv", header=0)
print(df.describe())


df = np.array(df)  # convert the DataFrame to a NumPy array

# Min-max normalize each of the 12 feature columns to [0, 1]
for i in range(12):
    df[:, i] = (df[:, i] - df[:, i].min()) / (df[:, i].max() - df[:, i].min())
x_data = df[:, :12]  # features
y_data = df[:, 12]   # label: MEDV
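The per-column normalization loop above can also be written as one vectorized step. A minimal sketch, assuming the same column layout (12 feature columns followed by MEDV); the helper name is mine, not from the notebook:

```python
import numpy as np

def minmax_normalize(data, n_features=12):
    """Vectorized equivalent of the per-column loop:
    scale each of the first n_features columns to [0, 1],
    leaving the label column(s) untouched."""
    data = np.asarray(data, dtype=np.float64).copy()
    cols = data[:, :n_features]
    col_min = cols.min(axis=0)
    col_max = cols.max(axis=0)
    data[:, :n_features] = (cols - col_min) / (col_max - col_min)
    return data
```

Broadcasting does the per-column arithmetic in one pass, which is both faster and harder to get wrong than indexing column by column.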




x = tf.placeholder(tf.float32, [None, 12], name="x")  # 12 input features
y = tf.placeholder(tf.float32, [None, 1], name="y")   # label: MEDV


with tf.name_scope("Model"):
    w = tf.Variable(tf.random_normal([12,1], stddev=0.01), name="w0")
    b = tf.Variable(1., name="b0")
    def model(x, w, b):
        return tf.matmul(x, w) + b

    pred= model(x, w, b)


train_epochs = 500   # number of training epochs
learning_rate = 0.01  # learning rate

with tf.name_scope("LossFunction"):
    loss_function = tf.reduce_mean(tf.pow(y - pred, 2))  # mean squared error (MSE)
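The loss defined above is the mean squared error over the fed batch (in the training loop below each batch is a single sample, so n = 1 per step):

```latex
\text{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - (\mathbf{x}_i \mathbf{w} + b) \right)^2
```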

optimizer = tf.train.AdamOptimizer(learning_rate).minimize(loss_function)

sess = tf.Session()
init = tf.global_variables_initializer()

tf.train.write_graph(sess.graph, 'log/boston', 'graph.pbtxt')  # save the graph definition

loss_op = tf.summary.scalar("loss", loss_function)
merged = tf.summary.merge_all()

sess.run(init)

writer = tf.summary.FileWriter('log/boston', sess.graph) 

loss_list = []
for epoch in range(train_epochs):
    loss_sum = 0.0
    for xs, ys in zip(x_data, y_data):
        z1 = xs.reshape(1, 12)
        z2 = ys.reshape(1, 1)
        _, loss = sess.run([optimizer, loss_function], feed_dict={x: z1, y: z2})
        summary_str = sess.run(loss_op, feed_dict={x: z1, y: z2})
        loss_sum = loss_sum + loss
        writer.add_summary(summary_str, epoch)
    # Reshuffle the samples between epochs
    x_data, y_data = shuffle(x_data, y_data)
    b0temp = b.eval(session=sess)
    w0temp = w.eval(session=sess)
    loss_average = loss_sum / len(y_data)
    loss_list.append(loss_average)
    print("epoch=", epoch + 1, "loss=", loss_average, "b=", b0temp, "w=", w0temp)
    

# Print the learned linear model: all 12 weights plus the bias
feature_names = ["CRIM", "ZN", "INDUS", "CHAS", "NOX", "RM", "AGE",
                 "DIS", "RAD", "TAX", "PTRATIO", "LSTAT"]
terms = " + ".join("({:.4f})*{}".format(w0temp[i][0], name)
                   for i, name in enumerate(feature_names))
print("MEDV =", terms, "+", b0temp)

plt.plot(loss_list)
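Once training finishes, the learned parameters live in `w0temp` (shape 12×1) and `b0temp`, so a prediction outside the TensorFlow graph is just a dot product plus the bias. A minimal NumPy sketch; the function name is mine, and the input row must be min-max normalized the same way as the training data:

```python
import numpy as np

def predict_medv(x_row, w, b):
    """Linear model prediction for one sample:
    x_row has shape (12,), w has shape (12, 1), b is a scalar."""
    return float(np.dot(x_row, w.reshape(-1)) + b)
```

This mirrors what `model(x, w, b)` computes in the graph, which is convenient for sanity-checking individual predictions without a session.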

 

