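While working through section 2.9 of 《Python機器學習經典實例》, I wanted to try evaluating car quality from the car's features myself, which requires preprocessing the data. Part of that preprocessing encodes each string label as a corresponding number, using the following code: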
input_data = ['vhigh', 'vhigh', '2', '2', 'small', 'low']
input_data_encoded = [-1] * len(input_data)
for i, item in enumerate(input_data):
    input_data_encoded[i] = int(label_encoder[i].transform(input_data[i]))
Running it raises an error:
Traceback (most recent call last):
  File "E:/17770426925/PythonLeaning/Machine-Learning/classifier/classifier.py", line 255, in <module>
    input_data_encoded[i] = int(label_encoder[i].transform(input_data[i]))
  File "D:\ProgramData\Anaconda3\lib\site-packages\sklearn\preprocessing\label.py", line 147, in transform
    y = column_or_1d(y, warn=True)
  File "D:\ProgramData\Anaconda3\lib\site-packages\sklearn\utils\validation.py", line 562, in column_or_1d
    raise ValueError("bad input shape {0}".format(shape))
ValueError: bad input shape ()
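The traceback shows that the problem is the form of the argument in label_encoder[i].transform(input_data[i]): input_data[i] is a single string, which has shape (), whereas transform expects a 1-D array-like. Wrapping the value in a list fixes this, so the code can be changed as follows: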
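for i, item in enumerate(input_data):
    # transform() expects a 1-D array-like, so wrap the single label in a list
    labels = [item]
    # transform() returns a one-element array; take element 0 before converting to int
    input_data_encoded[i] = int(label_encoder[i].transform(labels)[0])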
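To try the fix in isolation, here is a minimal self-contained sketch. The snippets above do not show how label_encoder was built, so this assumes it is a list holding one fitted sklearn LabelEncoder per column. The three training rows are made-up stand-ins for the car dataset, not the book's actual data.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Hypothetical training rows standing in for the car dataset (illustrative only)
rows = [
    ['vhigh', 'vhigh', '2', '2', 'small', 'low'],
    ['high', 'med', '4', '4', 'big', 'high'],
    ['low', 'low', '3', 'more', 'med', 'med'],
]
X = np.array(rows)

# Assumption: one LabelEncoder is fitted per feature column
label_encoder = []
for col in range(X.shape[1]):
    encoder = LabelEncoder()
    encoder.fit(X[:, col])
    label_encoder.append(encoder)

input_data = ['vhigh', 'vhigh', '2', '2', 'small', 'low']

# Wrap each label in a list so transform() receives a 1-D array-like,
# then take the single encoded value out of the returned array
input_data_encoded = [
    int(label_encoder[i].transform([item])[0])
    for i, item in enumerate(input_data)
]
print(input_data_encoded)
```

Each encoded number is the position of the label among that column's sorted unique values, so the exact numbers depend on which data the encoders were fitted on.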