Keras: handling classification on imbalanced (highly skewed) data


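When working with an imbalanced dataset, you can weight the data so that the under-represented class counts for more in the loss during training. The relevant arguments of fit are shown here: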

 

fit(self, x, y, batch_size=32, nb_epoch=10, verbose=1, callbacks=[], validation_split=0.0, validation_data=None, shuffle=True, class_weight=None, sample_weight=None)

(This is the Keras 1.x signature. From Keras 2 onward the nb_epoch argument is named epochs.)

class_weight: a dictionary mapping each class index to a weight. It is used to scale the loss function during training (training only).
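To make "scales the loss" concrete, here is a minimal pure-Python sketch. It assumes binary cross-entropy and a plain mean over the batch; the function name weighted_binary_crossentropy is hypothetical, and the exact normalization Keras applies internally may differ between versions.

```python
import math

def weighted_binary_crossentropy(y_true, y_prob, class_weight):
    """Mean binary cross-entropy where each sample's loss is multiplied
    by the weight of its true class (illustrative, not Keras internals)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        # Standard per-sample binary cross-entropy.
        loss = -(y * math.log(p) + (1 - y) * math.log(1 - p))
        # Scale by the weight assigned to this sample's class.
        total += class_weight[y] * loss
    return total / len(y_true)

# One sample per class, both predicted with probability 0.5.
y_true = [0, 1]
y_prob = [0.5, 0.5]

plain = weighted_binary_crossentropy(y_true, y_prob, {0: 1, 1: 1})
weighted = weighted_binary_crossentropy(y_true, y_prob, {0: 1, 1: 50})
```

Both samples are equally uncertain, but with the weights {0: 1, 1: 50} the class 1 sample contributes 50 times as much to the loss as the class 0 sample.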

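sample_weight: a NumPy array of weights, used to scale the loss function during training (training only). You can pass a 1D vector with one entry per sample to weight each sample individually or, for temporal data, a 2D array of shape (samples, sequence_length) to give a different weight to each timestep of each sample. In that case, make sure to set sample_weight_mode='temporal' when compiling the model.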
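As an illustration of the (samples, sequence_length) form, the sketch below builds a per-timestep weight matrix that ignores padded positions. The scenario is an assumption, not something the original post describes: sequences are taken to be padded with the value 0, and the helper name temporal_weights is hypothetical.

```python
def temporal_weights(sequences, pad_value=0):
    """Return one weight per timestep: 0.0 for padding, 1.0 otherwise.

    `sequences` is a list of equal-length lists, so the result has
    shape (samples, sequence_length).
    """
    return [
        [0.0 if token == pad_value else 1.0 for token in seq]
        for seq in sequences
    ]

# Two padded sequences of length 4 (0 is assumed to be the padding value).
padded = [
    [5, 3, 0, 0],
    [7, 0, 0, 0],
]
weights = temporal_weights(padded)

# With NumPy and Keras available, this could be passed along the lines of:
#   model.compile(..., sample_weight_mode='temporal')
#   model.fit(x, y, sample_weight=np.asarray(weights))
```

Timesteps with weight 0.0 then contribute nothing to the loss, while real timesteps are weighted normally.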

 

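In practice it can be used as follows.

Set a different weight for each class, for example weight 1 for class 0 and weight 50 for class 1: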

cw = {0: 1, 1: 50}

Train the model:

model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs, verbose=1, callbacks=cbks, validation_data=(x_test, y_test), shuffle=True, class_weight=cw)
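The weights 1 and 50 above are chosen by hand. One common alternative, sketched here as an assumption rather than the original author's method, is to derive them from the label frequencies with the "balanced" heuristic n_samples / (n_classes * count). The helper name balanced_class_weights and the toy labels are hypothetical.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Return {class: n_samples / (n_classes * count_of_class)}.

    Rare classes get large weights and frequent classes get small ones.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {
        cls: n_samples / (n_classes * count)
        for cls, count in counts.items()
    }

# Toy labels: 98 samples of class 0 and 2 samples of class 1.
y_train = [0] * 98 + [1] * 2
cw = balanced_class_weights(y_train)

# The resulting dict has the same form as the hand-written cw and could be
# passed in the same way:
#   model.fit(x_train, y_train, epochs=epochs, class_weight=cw)
```

With this heuristic each class contributes the same total weight, so the two minority samples together count as much as the 98 majority samples.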

 

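If the only problem is imbalance between classes, use class_weight. Use sample_weight when samples within the same class also need to be weighted differently.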

class_weight affects the relative weight of each class in the calculation of the objective function.

sample_weight, as the name suggests, allows further control of the relative weight of samples that belong to the same class.

Class weights are useful when training on highly skewed data sets; for example, a classifier to detect fraudulent transactions.

Sample weights are useful when you don't have equal confidence in the samples in your batch. A common example is performing regression on measurements with variable uncertainty.
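The two ideas can also be combined. The sketch below folds a class weight and a per-sample confidence into a single weight vector; the helper name combine_weights and the confidence values are hypothetical. How Keras treats class_weight and sample_weight when both are passed may depend on the version, so building one explicit vector avoids relying on that behavior.

```python
def combine_weights(labels, confidences, class_weight):
    """Per-sample weight = weight of the sample's class * its confidence."""
    if len(labels) != len(confidences):
        raise ValueError("labels and confidences must have the same length")
    return [class_weight[y] * c for y, c in zip(labels, confidences)]

cw = {0: 1, 1: 50}
labels = [0, 1, 1]
# Hypothetical confidence in each sample, between 0 and 1.
confidences = [1.0, 0.5, 1.0]

sample_weight = combine_weights(labels, confidences, cw)

# With NumPy and Keras available, the vector could be passed as:
#   model.fit(x_train, y_train, sample_weight=np.asarray(sample_weight))
```

Here the second sample belongs to the rare class but is trusted only half as much, so it ends up with half the weight of the third sample.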

 

 

 https://datascience.stackexchange.com/questions/13490/how-to-set-class-weights-for-imbalanced-classes-in-keras

http://blog.csdn.net/lk7688535/article/details/52875046

https://stackoverflow.com/questions/38891390/keras-lstm-with-class-weights

https://stackoverflow.com/questions/43459317/keras-class-weight-vs-sample-weights-in-the-fit-generator

https://stackoverflow.com/questions/41648129/balancing-an-imbalanced-dataset-with-keras-image-generator

https://stackoverflow.com/questions/41815354/keras-flow-from-directory-over-or-undersample-a-class

http://www.ijetae.com/files/Volume2Issue4/IJETAE_0412_07.pdf

http://machinelearningmastery.com/tactics-to-combat-imbalanced-classes-in-your-machine-learning-dataset/

https://stackoverflow.com/questions/44666910/keras-image-preprocessing-unbalanced-data

http://blog.csdn.net/u011401509/article/details/52625014

https://www.analyticsvidhya.com/blog/2016/09/this-machine-learning-project-on-imbalanced-data-can-add-value-to-your-resume/

