Example code
import pandas as pd
import re
import csv

data = pd.read_csv('nuojia.csv', encoding='utf-8')
# print(data)
data = data.values  # convert the DataFrame to a NumPy array
# Extract the numeric data from the second column and save it to num.csv
with open('num.csv', 'a+', newline='') as csvfile:
    writer = csv.writer(csvfile)
    data1 = data[:, 1]
    print(data1)
    # print(len(data[:, 1]))
    for i in range(len(data1)):
        # print(data1[i])
        # str() guards against non-string cells such as NaN
        num = re.findall('[0-9]+', str(data1[i]))
        # print(num)
        writer.writerow(num)  # one row per cell, holding the digit runs found in it
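To see what the regular expression actually extracts, here is a minimal sketch that runs the same pattern on a few made-up cells. The sample values and the helper name extract_numbers are assumptions for illustration, and an in-memory buffer stands in for num.csv so nothing is written to disk.

```python
import csv
import io
import re


def extract_numbers(cell):
    """Return every run of digits found in a cell as a list of strings."""
    # str() lets the pattern run on non-string cells such as floats
    return re.findall(r'[0-9]+', str(cell))


# Hypothetical sample values standing in for the second column of nuojia.csv
sample_cells = ['1299 yuan', 'sold 35 of 120', 'n/a', 4.5]

buffer = io.StringIO()  # in-memory stand-in for num.csv
writer = csv.writer(buffer)
for cell in sample_cells:
    # Each cell becomes one CSV row; a cell without digits gives an empty row
    writer.writerow(extract_numbers(cell))

rows = buffer.getvalue().splitlines()
print(rows)
```

Note that the pattern '[0-9]+' splits a decimal such as 4.5 into two separate matches, so it is probably only suitable when the column holds whole numbers.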
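Notes:
1. Most examples of writing to CSV pass each row in dictionary format.
2. When the rows are lists, the list elements are generally expected to be strings.
3. Writing a column of non-string values one value at a time fails with "iterable expected, not float", because writer.writerow expects a whole row (an iterable), not a single number.
One workaround is to collect the values in a list, write the whole list as a single row, and then transpose the data in Excel (select the data, copy, Paste Special, Transpose). The code is as follows: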
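import csv
from snownlp import SnowNLP

# columns is the list of comments read from nuojia.csv with csv.reader
comment_score = []
with open('comment_score.csv', 'a+', newline='') as csvfile:
    writer = csv.writer(csvfile)
    for i in range(1, len(columns)):  # index 0 is presumably the header, so start at 1
        comment = columns[i]
        try:
            s = SnowNLP(comment)
            print(s.sentiments)
            comment_score.append(s.sentiments)
        except ZeroDivisionError as e:
            print('ZeroDivisionError')
    print(comment_score)
    # Write all scores as one row; transpose in Excel to get a column
    writer.writerow(comment_score)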
4. Reference link: https://blog.csdn.net/taotiezhengfeng/article/details/75876717
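As an alternative to transposing in Excel (note 3), each value can be wrapped in a one-element list so that it becomes its own row. The following is only a sketch: the scores are made-up numbers rather than real SnowNLP output, and an in-memory buffer stands in for comment_score.csv.

```python
import csv
import io

# Hypothetical sentiment scores; in the notes above these come from SnowNLP
comment_score = [0.91, 0.13, 0.56]

buffer = io.StringIO()  # in-memory stand-in for comment_score.csv
writer = csv.writer(buffer)

# writer.writerow(0.91) would fail with "iterable expected, not float",
# because a row has to be an iterable of fields.
# Wrapping each score in a one-element list produces one row per score.
writer.writerows([score] for score in comment_score)

lines = buffer.getvalue().splitlines()
print(lines)
```

With a real file, the same writerows call works on a writer created from open('comment_score.csv', 'w', newline='').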
Reading a CSV
with open('nuojia.csv', 'rt', encoding='utf-8') as csvfile:
    reader = csv.reader(csvfile)
    columns = [row[6] for row in reader]  # read the seventh column
Use this method if you do not want a DataFrame and only need the raw data.
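The list comprehension above also picks up the header cell and raises IndexError on any row that is too short. Below is a rough sketch of one way to handle both cases; the helper name read_column and the sample contents are assumptions for illustration, with an in-memory string standing in for nuojia.csv.

```python
import csv
import io

# Hypothetical file contents standing in for nuojia.csv
sample = "id,comment\n1,good phone\n2\n3,battery drains fast\n"


def read_column(csvfile, index, skip_header=True):
    """Collect one column from an open CSV file as a list of strings."""
    reader = csv.reader(csvfile)
    if skip_header:
        next(reader, None)  # drop the header row if there is one
    # Skip rows that are too short to contain the requested column
    return [row[index] for row in reader if len(row) > index]


comments = read_column(io.StringIO(sample), 1)
print(comments)
```

For a file on disk, pass open('nuojia.csv', newline='', encoding='utf-8') in place of the StringIO object and use index 6 for the seventh column.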
If you are not sure how to save your data, you can use a DataFrame:
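from pandas import DataFrame
from sklearn.cluster import KMeans

# k (the number of clusters) and data (the feature matrix) are defined elsewhere
kmodel = KMeans(n_clusters=k)  # n_jobs was dropped: KMeans no longer accepts it as of scikit-learn 1.0
kmodel.fit(data)
print(kmodel.cluster_centers_)  # view the cluster centers
center = DataFrame(kmodel.cluster_centers_)
center.to_csv('center.csv')
print(kmodel.labels_)  # view the cluster label assigned to each sample
label = DataFrame(kmodel.labels_)
label.to_csv('label.csv')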