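While implementing the ID3 algorithm for decision trees, I wrote a function to compute information entropy. When I ran it, I hit the following problem:
The code in question was: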
# assumes `from math import log` at the top of the module
@staticmethod
def calc_ent(datasets):
    data_length = len(datasets)
    label_count = {}
    for i in range(data_length):
        label = datasets[i][-1]  # the class label is the last column
        if label not in label_count:
            label_count[label] = 0
        else:
            label_count[label] += 1
    ent = -sum([(p / data_length) * log(p / data_length, 2)
                for p in label_count.values()])
    return ent
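After a lot of trial and error and searching on Baidu, I finally solved the problem. To my surprise, the fix was simply to remove the else: line: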
# assumes `from math import log` at the top of the module
@staticmethod
def calc_ent(datasets):
    data_length = len(datasets)
    label_count = {}
    for i in range(data_length):
        label = datasets[i][-1]  # the class label is the last column
        if label not in label_count:
            label_count[label] = 0
        label_count[label] += 1  # now runs for every row, including the first
    ent = -sum([(p / data_length) * log(p / data_length, 2)
                for p in label_count.values()])
    return ent
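It ran successfully:
Judging from the error message, the cause was a math domain error, but I did not understand why deleting else: made it work...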
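Looking back, the likely explanation is in the counting loop. With the else: in place, the first time a label appears its count is set to 0 and the increment is skipped, so every count ends up one too low. Any label that occurs exactly once keeps a count of 0, and the entropy line then calls log(0, 2), which is outside the domain of the logarithm. Here is a minimal sketch of that behaviour, assuming log comes from Python's math module; the helper names and the sample labels are made up for illustration and are not part of the original class:

```python
from collections import Counter
from math import log


def count_with_else(labels):
    """Counting logic of the first version: the increment sits in the else branch."""
    counts = {}
    for label in labels:
        if label not in counts:
            counts[label] = 0   # first occurrence is stored as 0 ...
        else:
            counts[label] += 1  # ... and only later occurrences are counted
    return counts


def count_without_else(labels):
    """Counting logic of the fixed version: every occurrence is counted."""
    counts = {}
    for label in labels:
        if label not in counts:
            counts[label] = 0
        counts[label] += 1
    return counts


def entropy(counts, total):
    """Same entropy formula as calc_ent, applied to a dict of label counts."""
    return -sum((c / total) * log(c / total, 2) for c in counts.values())


labels = ["yes", "yes", "no"]       # hypothetical class labels
buggy = count_with_else(labels)     # "no" appears once, so its count stays 0
fixed = count_without_else(labels)  # matches the true frequencies

try:
    entropy(buggy, len(labels))     # log(0.0, 2) raises ValueError
    error_raised = False
except ValueError:
    error_raised = True

print(buggy, fixed, error_raised)
```

Note that when every label occurs at least twice, the first version would not raise at all; it would quietly return a wrong entropy. The standard library's collections.Counter(labels) produces the same counts as the fixed loop and avoids this kind of off-by-one.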