Simple Data Visualization with Python

Reposted from another site · 2020-02-29 17:18 · Python

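Python, an interpreted language, is winning over more and more people, and its extensive library support makes programming simpler.

I have traditionally been a C programmer, but I plan to gradually learn what Python can do.

Today I will learn how to visualize data with Python.

Reference: https://mp.weixin.qq.com/s/Nb2ci6d5MhoRoepu6G3YdQ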

1 Installing the dependencies
——

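◈ NumPy, which simplifies working with arrays and matrices

◈ SciPy, for scientific computing and data science

◈ Matplotlib, for plotting

On Windows I use PyCharm as my IDE, which makes installing libraries easy: just add them in its package management tool. If downloads time out, you can switch to a mirror hosted in China, as described in an earlier post of mine.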
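Outside PyCharm, the same three packages are usually installed with `pip install numpy scipy matplotlib`. As a small sketch (not part of the original article), the following check reports whether each library can be found by the current interpreter; the `required` list and the `status` dictionary are names introduced here for illustration only.

```python
import importlib.util

# Top-level import names of the three libraries used in this article.
required = ["numpy", "scipy", "matplotlib"]

# find_spec() returns None when a package cannot be found.
status = {name: importlib.util.find_spec(name) is not None for name in required}

for name, installed in status.items():
    print(f"{name}: {'installed' if installed else 'missing'}")
```

If any entry is reported as missing, install it before running the rest of the code.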

2 Importing the packages
——

import numpy as np               # "as" gives the module a shorter alias
from scipy import stats          # you can import just one part of a package
import matplotlib.pyplot as plt  # equivalent to: from matplotlib import pyplot as plt

3 Defining variables
——

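In Python, a variable comes into existence the first time it is assigned, and its type is inferred from the assigned value. By convention, variable names are written in lowercase.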

input_file_name = "anscombe.csv"
delimiter = "\t"                 # separator between values (a tab)
skip_header = 3                  # number of lines to skip at the top of the file
column_x = 0
column_y = 1
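To make the point about inferred types concrete, here is a minimal sketch that prints the type Python assigns to some of the variables above. The final reassignment to `3.0` is only an illustration and is not something the original script does.

```python
input_file_name = "anscombe.csv"   # a str
delimiter = "\t"                   # a str holding a single tab character
skip_header = 3                    # an int

for value in (input_file_name, delimiter, skip_header):
    print(repr(value), "->", type(value).__name__)

# Rebinding a name to a value of another type is allowed.
skip_header = 3.0
print(type(skip_header).__name__)
```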

4 Reading the data
——

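Naturally, we first need the data we want to visualize:

(Here we only process the first of the four data sets.)

Reading a CSV file is easy with NumPy's genfromtxt() function, which returns a NumPy array: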

data = np.genfromtxt(input_file_name, delimiter = delimiter, skip_header = skip_header)

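In Python, a function can accept a variable number of arguments, and you can pass just a subset of them by naming the ones you need, as the keyword arguments in the call above do. Arrays are powerful matrix-like objects that can easily be sliced into smaller arrays:

Here a bare colon (:) means "select everything" along that axis.

x = data[:, column_x]           # x takes column column_x from every row
y = data[:, column_y]           # y takes column column_y from every row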
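The reading and slicing steps can be tried without the real file. The sketch below feeds genfromtxt() an in-memory text buffer instead of anscombe.csv; the `sample_text` layout (three header lines, tab-separated columns) is an assumption made for illustration and may not match the actual file.

```python
import io
import numpy as np

# Hypothetical stand-in for anscombe.csv: three header lines, then
# tab-separated columns. The values are the first rows of Anscombe's set I.
sample_text = (
    "Anscombe's quartet\n"
    "set I\n"
    "x\ty\n"
    "10.0\t8.04\n"
    "8.0\t6.95\n"
    "13.0\t7.58\n"
)

# genfromtxt() accepts a file name or a file-like object.
data = np.genfromtxt(io.StringIO(sample_text), delimiter="\t", skip_header=3)
print(data.shape)   # rows x columns of the parsed array

x = data[:, 0]      # every row, column 0
y = data[:, 1]      # every row, column 1
print(x)
print(data[0, :])   # first row, every column
```

If skip_header is smaller than the real number of header lines, the leftover header rows are parsed as nan, which then propagates into the fit.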

5 Fitting the data
——

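SciPy provides convenient data-fitting functions. For example, linregress() returns several important fit-related values, such as the slope, the intercept, and the correlation coefficient of the two data sets: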

slope, intercept, r_value, p_value, std_err = stats.linregress(x, y)
print("Slope: {:f}".format(slope))
print("Intercept: {:f}".format(intercept))
print("Correlation coefficient: {:f}".format(r_value))

Because linregress() returns several values, the result can be unpacked into several variables at once.
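For readers curious about what those numbers mean, the following sketch computes the slope, intercept, and correlation coefficient of a simple least-squares line by hand with NumPy. It is an illustration of the textbook formulas, not SciPy's actual implementation; `simple_linear_fit` and the demo data are names invented here.

```python
import numpy as np

def simple_linear_fit(x, y):
    """Return (slope, intercept, r) of the least-squares line through x, y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()                 # deviations from the mean of x
    dy = y - y.mean()                 # deviations from the mean of y
    slope = np.sum(dx * dy) / np.sum(dx ** 2)
    intercept = y.mean() - slope * x.mean()
    r = np.sum(dx * dy) / np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2))
    return slope, intercept, r

# Demo points lying exactly on y = 2x + 1.
demo_x = [0, 1, 2, 3]
demo_y = [1, 3, 5, 7]
demo_slope, demo_intercept, demo_r = simple_linear_fit(demo_x, demo_y)
print(demo_slope, demo_intercept, demo_r)
```

On the same input, stats.linregress() should report matching slope, intercept, and correlation values up to floating-point rounding.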

6 Plotting
——

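Matplotlib only plots data points, so you have to define the coordinates of the points to plot. The x and y arrays are already defined, so you can plot them directly, but you need more points to draw the straight line.

The linspace() function conveniently generates a set of equally spaced values between two endpoints. The y coordinates can then be computed easily with NumPy arrays, which can be used in formulas just like ordinary numeric variables:

fit_x = np.linspace(x.min() - 1, x.max() + 1, 100)  # 100 evenly spaced x values for the fit line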
fit_y = slope * fit_x + intercept
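A smaller example makes the behaviour of linspace() easier to see. This sketch uses five points and an arbitrary line y = 2x + 1, both chosen here for illustration.

```python
import numpy as np

# Five evenly spaced values from 0 to 1; both endpoints are included.
points = np.linspace(0.0, 1.0, 5)
print(points)        # values 0, 0.25, 0.5, 0.75, 1

# Arithmetic applies element by element, so a whole line is one expression.
line = 2.0 * points + 1.0
print(line)          # values 1, 1.5, 2, 2.5, 3
```

Note that the values are evenly spaced, not random: calling linspace() twice with the same arguments gives the same array.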

To plot, first define a figure object that will contain all the graphics:

fig_width = 7 #inch
fig_height = fig_width / 16 * 9 #inch
fig_dpi = 100
fig = plt.figure(figsize = (fig_width, fig_height), dpi = fig_dpi)

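The parameters are easy to understand: figsize is the width and height in inches, and dpi is the resolution in dots per inch. The call to figure() then creates the figure.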
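As a quick check of how figsize and dpi combine, this sketch multiplies the figure size in inches by the resolution to get the canvas size in pixels. The "Agg" backend is selected here only as an assumption so that the snippet runs without a display; the original script does not set a backend.

```python
import matplotlib
matplotlib.use("Agg")            # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt

fig_width = 7                    # inches
fig_height = fig_width / 16 * 9  # inches, giving a 16:9 aspect ratio
fig_dpi = 100
fig = plt.figure(figsize=(fig_width, fig_height), dpi=fig_dpi)

# Size in inches times dots per inch gives the canvas size in pixels.
width_px, height_px = fig.get_size_inches() * fig.dpi
print(width_px, height_px)
plt.close(fig)
```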

A figure can contain several plots; in Matplotlib these are called axes. This example defines a single axes object to plot the data points:

ax = fig.add_subplot(111)
ax.plot(fit_x, fit_y, label = "Fit", linestyle = '-')
ax.plot(x, y, label = "Data", marker = '.', linestyle = '')
ax.legend()
ax.set_xlim(min(x) - 1, max(x) + 1)
ax.set_ylim(min(y) - 1, max(y) + 1)
ax.set_xlabel('x')
ax.set_ylabel('y')
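The call add_subplot(111) asks for a 1-by-1 grid and selects its first cell, which is why only one axes appears. To illustrate a figure holding more than one axes, here is a sketch with two side-by-side plots; the demo curves and variable names are invented for this example, and the "Agg" backend is again an assumption so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")            # assumption: non-interactive backend
import matplotlib.pyplot as plt
import numpy as np

demo_x = np.linspace(0, 10, 50)

# One figure, a 1 x 2 grid of axes.
fig, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(7, 3))

ax_left.plot(demo_x, demo_x, label="linear")
ax_left.legend()

ax_right.plot(demo_x, demo_x ** 2, label="quadratic")
ax_right.legend()

print(len(fig.axes))             # number of axes held by the figure
plt.close(fig)
```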

To save the figure to a file, use:

fig.savefig('fit_python.png')

To display the plot (instead of saving it), call:

plt.show()
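savefig() also accepts optional arguments such as dpi and bbox_inches. The sketch below writes a PNG into an in-memory buffer rather than a file, purely so the result can be inspected without touching the disk; the small demo plot and the chosen dpi of 150 are assumptions for illustration.

```python
import io
import matplotlib
matplotlib.use("Agg")            # assumption: non-interactive backend
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(4, 3), dpi=100)
ax = fig.add_subplot(111)
ax.plot([0, 1, 2], [0, 1, 4], marker=".")

# Save at a higher resolution than the on-screen figure and trim empty margins.
buffer = io.BytesIO()
fig.savefig(buffer, format="png", dpi=150, bbox_inches="tight")
png_bytes = buffer.getvalue()

# Every PNG file starts with the same 8-byte signature.
print(png_bytes[:8] == b"\x89PNG\r\n\x1a\n")
plt.close(fig)
```

Passing a file name such as 'fit_python.png' instead of the buffer writes the same image to disk.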

7 Results
——

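Terminal output: [screenshot of the printed slope, intercept, and correlation coefficient]

Generated image: [figure showing the data points and the fitted line]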

Not bad, right? Python really is a handy tool, and I will post more practical examples in the future.

Complete code:

import numpy as np               # "as" gives the module a shorter alias
from scipy import stats          # you can import just one part of a package
import matplotlib.pyplot as plt  # equivalent to: from matplotlib import pyplot as plt

input_file_name = "anscombe.csv"
delimiter = "\t"                 # separator between values (a tab)
skip_header = 3                  # number of lines to skip at the top of the file (must match your copy of the file)
column_x = 0
column_y = 1

print("#### Anscombe's first set with Python ####")

data = np.genfromtxt(input_file_name, delimiter = delimiter, skip_header = skip_header)
x = data[:, column_x]           # x takes column column_x from every row
y = data[:, column_y]           # y takes column column_y from every row

slope, intercept, r_value, p_value, std_err = stats.linregress(x, y)
print("Slope: {:f}".format(slope))
print("Intercept: {:f}".format(intercept))
print("Correlation coefficient: {:f}".format(r_value))

fit_x = np.linspace(x.min() - 1, x.max() + 1, 100)  # 100 evenly spaced x values for the fit line
fit_y = slope * fit_x + intercept

fig_width = 7 #inch
fig_height = fig_width / 16 * 9 #inch
fig_dpi = 100
fig = plt.figure(figsize = (fig_width, fig_height), dpi = fig_dpi)

ax = fig.add_subplot(111)
ax.plot(fit_x, fit_y, label = "Fit", linestyle = '-')
ax.plot(x, y, label = "Data", marker = '.', linestyle = '')
ax.legend()
ax.set_xlim(min(x) - 1, max(x) + 1)
ax.set_ylim(min(y) - 1, max(y) + 1)
ax.set_xlabel('x')
ax.set_ylabel('y')

plt.show()
