https://github.com/thomas-haslwanter/statsintro_python/tree/master/ISP/Code_Quantlets/08_TestsMeanValues/kruskalWallis
# -*- coding: utf-8 -*-
import numpy as np

# additional packages
from scipy.stats.mstats import kruskalwallis

'''
.. currentmodule:: scipy.stats.mstats
This module contains a large number of statistical functions that can be used with masked arrays.
Most of these functions are similar to those in scipy.stats but might have small differences
in the API or in the algorithm used. Since this is a relatively new package, some API changes
are still possible.
'''

# Get the data
'''
#These data could be a comparison of the smog levels in four different cities.
city1 = np.array([68, 93, 123, 83, 108, 122])
city2 = np.array([119, 116, 101, 103, 113, 84])
city3 = np.array([70, 68, 54, 73, 81, 68])
city4 = np.array([61, 54, 59, 67, 59, 70])
'''

group1 = [27, 2, 4, 18, 7, 9]
group2 = [20, 8, 14, 36, 21, 22]
group3 = [34, 31, 3, 23, 30, 6]
list_groups = [group1, group2, group3]


def Kruskawallis_test(list_groups):
    # Perform the Kruskal-Wallis test.
    # Returns True if there is a significant difference, False otherwise.
    print("Use kruskawallis test:")
    # Unpack the list so each group is passed as its own sample
    # (this also works when the groups have different sizes).
    h, p = kruskalwallis(*list_groups)
    print("H value:", h)
    print("p", p)

    # Print the results
    if p < 0.05:
        print('There is a significant difference between the groups.')
        return True
    else:
        print('No significant difference between the groups.')
        return False


Kruskawallis_test(list_groups)
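When the sample data are not normally distributed, use the Mann-Whitney test to compare two independent groups and the Kruskal-Wallis test for three or more.
Kruskal-Wallis is the nonparametric counterpart of the one-way ANOVA for independent groups.
Kruskal-Wallis works on ranks, so it can be applied to ordinal (rank-ordered) data.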
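The rule of thumb above can be sketched as a small helper that picks the test from the number of groups. This is only an illustration: the function name `compare_groups` is hypothetical, it uses `scipy.stats.mannwhitneyu` and `scipy.stats.kruskal` rather than the `mstats` variant used in the script, and it assumes you have already decided that the data are not normal.

```python
from scipy import stats


def compare_groups(groups, alpha=0.05):
    """Pick a rank-based test from the number of independent groups.

    Hypothetical helper for illustration; normality is assumed to have
    been checked (and rejected) beforehand.
    """
    if len(groups) < 2:
        raise ValueError("need at least two groups")
    if len(groups) == 2:
        name = "Mann-Whitney U"
        statistic, p = stats.mannwhitneyu(
            groups[0], groups[1], alternative="two-sided"
        )
    else:
        name = "Kruskal-Wallis"
        statistic, p = stats.kruskal(*groups)
    # The last element is True when the difference is significant at alpha.
    return name, statistic, p, bool(p < alpha)


group1 = [27, 2, 4, 18, 7, 9]
group2 = [20, 8, 14, 36, 21, 22]
group3 = [34, 31, 3, 23, 30, 6]

print(compare_groups([group1, group2]))          # two groups
print(compare_groups([group1, group2, group3]))  # three groups
```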
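Sample data
Procedure
H0 and H1 hypotheses: H0 states that all groups come from the same distribution; H1 states that at least one group differs.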
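Degrees of freedom: number of groups - 1. There are three groups here, so df = 3 - 1 = 2.
With df = 2 and α = 0.05, the critical value is 5.99. If the computed H is greater than 5.99, reject the null hypothesis.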
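The critical value need not come from a printed table. A minimal sketch, assuming the usual chi-square approximation for H, looks it up with `scipy.stats.chi2.ppf`; the variable names are illustrative only.

```python
from scipy.stats import chi2

alpha = 0.05   # significance level
k = 3          # number of groups
df = k - 1     # degrees of freedom

# Upper-tail critical value of the chi-square distribution
critical_value = chi2.ppf(1 - alpha, df)
print(round(critical_value, 2))  # 5.99
```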
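Rank all the data together (pooled across the groups, smallest value = rank 1), then enter each value's rank in the table.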
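The ranking step can be reproduced with `scipy.stats.rankdata`. The sketch below pools the three groups from the script, ranks them, and splits the ranks back per group; the variable names are my own, and ties (none in this data) would receive average ranks.

```python
import numpy as np
from scipy.stats import rankdata

group1 = [27, 2, 4, 18, 7, 9]
group2 = [20, 8, 14, 36, 21, 22]
group3 = [34, 31, 3, 23, 30, 6]
groups = [group1, group2, group3]

# Rank the pooled data (1 = smallest value)
pooled = np.concatenate(groups)
ranks = rankdata(pooled)

# Split the ranks back into the original groups
split_points = np.cumsum([len(g) for g in groups])[:-1]
ranks_per_group = np.split(ranks, split_points)

# Rank sum of each group
rank_sums = [float(r.sum()) for r in ranks_per_group]
print(rank_sums)  # [39.0, 65.0, 67.0]
```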
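Formula: H = 12 / (N(N+1)) × Σ(T² / n) - 3(N+1), summed over the groups
T is the sum of the ranks in one group
n is the number of observations in one group
N is the total number of observations across all groups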
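As a worked check, the standard Kruskal-Wallis statistic H = 12 / (N(N+1)) × Σ(T² / n) - 3(N+1) can be evaluated directly. The rank sums 39, 65 and 67 below are what ranking the pooled `group1`, `group2` and `group3` data gives. This sketch applies no tie correction, which is fine here because the data contain no ties.

```python
# Rank sum (T) and size (n) of each group
rank_sums = [39, 65, 67]
group_sizes = [6, 6, 6]

N = sum(group_sizes)  # total number of observations

# H = 12 / (N (N + 1)) * sum(T^2 / n) - 3 (N + 1)
H = 12 / (N * (N + 1)) * sum(
    T ** 2 / n for T, n in zip(rank_sums, group_sizes)
) - 3 * (N + 1)

print(round(H, 3))  # 2.854
```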
The computed H of 2.854 is less than 5.99, so the null hypothesis is not rejected.
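The same decision can be reached through a p-value instead of the critical value. A minimal sketch, again assuming the chi-square approximation with df = 2, shows that the two rules agree for H = 2.854:

```python
from scipy.stats import chi2

H = 2.854     # statistic computed from the ranked data
df = 2        # number of groups - 1
alpha = 0.05

# Rule 1: compare H with the critical value
critical_value = chi2.ppf(1 - alpha, df)
reject_by_critical = bool(H > critical_value)

# Rule 2: compare the upper-tail p-value with alpha
p_value = chi2.sf(H, df)
reject_by_p = bool(p_value < alpha)

print(reject_by_critical, reject_by_p)  # False False
```

The p-value comes out at roughly 0.24, well above 0.05, so neither rule rejects the null hypothesis.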