Unalignable boolean Series provided as indexer (index of the boolean Series and of the indexed object do not match)


I have been doing some data mining in Python lately, and ran into a really nasty problem while clustering. Without further ado, here is the code:

import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
import matplotlib.pyplot as plt

# k-means clustering
df1 = df23
kmeans = KMeans(n_clusters=5, random_state=10).fit(df1)
# attach each sample's cluster label
df1['level'] = kmeans.labels_
# df1.to_csv('new_df.csv')

# count the samples in each cluster
# (size() replaces agg({'num': np.size}), which recent pandas versions reject)
df2 = df1.groupby('level').size().reset_index(name='num')
print(df2.head())

# reduce the features used for clustering to 2 dimensions
pca = PCA(n_components=2)
new_pca = pd.DataFrame(pca.fit_transform(df1))
print(new_pca.head())

# visualization: one marker style per cluster
d = new_pca[df1['level'] == 0]
plt.plot(d[0], d[1], 'gv')
d = new_pca[df1['level'] == 1]
plt.plot(d[0], d[1], 'ko')
d = new_pca[df1['level'] == 2]
plt.plot(d[0], d[1], 'b*')
d = new_pca[df1['level'] == 3]
plt.plot(d[0], d[1], 'y+')
d = new_pca[df1['level'] == 4]
plt.plot(d[0], d[1], 'c.')

plt.title('the result of polymerization')
plt.show()

Running it raised the error quoted in the title.

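I searched online for a long time without finding a solution, even though the same code had clearly worked before. So I took a look at the df23 data.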
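A quick way to make this kind of check is sketched below on a toy frame rather than the real df23 (the index values here are made up for illustration). It compares the index of the source frame with the index of a frame rebuilt from a bare array, which is the situation `pd.DataFrame(pca.fit_transform(...))` creates.

```python
import pandas as pd

# Toy stand-in for df23: the index is left over from earlier filtering (hypothetical values).
df23_like = pd.DataFrame({"forks_count": [3, 8, 1, 5]}, index=[2, 5, 7, 11])

# A frame built from a plain NumPy array gets a fresh 0..n-1 RangeIndex.
rebuilt = pd.DataFrame(df23_like.to_numpy())

print(df23_like.index.tolist())               # labels carried by the source frame
print(rebuilt.index.tolist())                 # labels of the rebuilt frame
same_index = df23_like.index.equals(rebuilt.index)
print(same_index)                             # False means boolean masks will not align
```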

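The only difference from the DataFrame that had worked before was the index!!! Important things get said three times!!! The index!!! The index!!! new_pca is built from a bare array, so it gets a fresh 0..n-1 index, while df1 still carries df23's original index, and the boolean mask df1['level'] == 0 therefore cannot be aligned with new_pca. So I went looking for a way to reset the index; see the code: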

cols = ["forks_count", "has_issues", "has_wiki", "open_issues_count",
        "stargazers_count", "watchers_count",
        "created_pushed_time", "created_updated_time"]
# keep only the feature columns and replace the old index with 0..n-1
# (drop=True discards the old index instead of adding it as a column)
df24 = df23[cols].reset_index(drop=True)

After that, the clustering worked... what an exhausting bug...
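For reference, here is a minimal reproduction on made-up data, assuming a reasonably recent pandas. It shows the mismatched mask raising the error, plus two alternatives to resetting the index: giving the new frame the same index as the source, or converting the mask to a plain NumPy array so that selection happens by position. The names `features`, `labels`, `projected`, and `arr` are illustrative only, and `arr` stands in for the PCA output.

```python
import numpy as np
import pandas as pd

# Source frame with a non-default index, and one cluster label per row.
features = pd.DataFrame({"a": [1.0, 2.0, 3.0, 4.0]}, index=[10, 20, 30, 40])
labels = pd.Series([0, 1, 0, 1], index=features.index)

# Stand-in for pca.fit_transform(...): a bare array, so the frame gets index 0..3.
arr = np.arange(8.0).reshape(4, 2)
projected = pd.DataFrame(arr)

# The mask is indexed 10..40 and the frame 0..3, so pandas cannot align them.
# pandas raises IndexingError here; it is caught broadly so the sketch does not
# depend on where that class lives in a given pandas version.
try:
    projected[labels == 0]
    raised = False
except Exception as exc:
    raised = True
    print(type(exc).__name__, exc)

# Alternative 1: build the projected frame with the source frame's index.
aligned = pd.DataFrame(arr, index=features.index)
by_label = aligned[labels == 0]

# Alternative 2: drop the mask's index entirely and select by position.
by_position = projected[(labels == 0).to_numpy()]

print(by_label)
print(by_position)
```

Both alternatives leave the original index untouched, which can matter if the rows need to be joined back to other data later.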

