In statistics, the Kolmogorov–Smirnov test (K–S test, Russian: Колмогоров–Смирнов) is based on the cumulative distribution function. It is used to test whether two empirical distributions differ, or whether an empirical distribution differs from a reference distribution.
When you compute cumulative probability curves (as in the figure below), how do you know whether the groups differ significantly? Many people first think of one-way ANOVA or a two-tailed test. Neither is appropriate here; the Kolmogorov–Smirnov test is better suited to checking whether a variable follows a given distribution, or whether two groups differ significantly.
There are two variants:
1. Single-sample Kolmogorov–Smirnov goodness-of-fit hypothesis test
The Kolmogorov–Smirnov test can check whether a variable follows a particular distribution; the distributions that can be tested include the normal, uniform, Poisson, and exponential distributions. The command is:
>> H = KSTEST(X,CDF,ALPHA,TAIL) % X is the sample to test; CDF is optional: if omitted, the test defaults to a standard normal distribution;
if CDF is a two-column matrix, the first column holds the possible values of x and the second column the corresponding values G(x) of the hypothesized cumulative distribution function (a sketch of this form appears after the example below). ALPHA is the significance level (default 0.05). TAIL specifies the type of test (default 'unequal', i.e. two-sided); 'larger' and 'smaller' are also available.
If H = 1, the null hypothesis is rejected; if H = 0, the null hypothesis is not rejected (at the alpha level).
For example,
x = -2:1:4
x =
-2 -1 0 1 2 3 4
[h,p,k,c] = kstest(x,[],0.05,0)   % [] -> standard normal reference; 0 -> two-sided test
h =
0
p =
0.13632
k =
0.41277
c =
0.48342
The test fails to reject the null hypothesis that the values come from a standard normal distribution.
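The two-column CDF form of the second argument was described above but not demonstrated. The following is a minimal sketch, not from the original post: the sample data and the mean parameter 2 are illustrative assumptions, and exprnd and expcdf are Statistics Toolbox functions. It tests the sample against a hypothesized exponential distribution:
x = exprnd(2, 100, 1);                % illustrative sample, exponential with mu = 2
pts = sort(x);                        % points at which to tabulate the hypothesized CDF
G = expcdf(pts, 2);                   % hypothesized CDF values G(x)
[h, p] = kstest(x, [pts G], 0.05, 0)  % two-column matrix [x, G(x)]; 0 -> two-sided
Here h should usually be 0, since the sample really was drawn from the hypothesized distribution.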
2. Two-sample Kolmogorov–Smirnov test
Tests whether two data vectors come from the same distribution.
>> [h,p,ks2stat] = kstest2(x1,x2,alpha,tail)
% x1 and x2 are vectors; ALPHA is the significance level (default 0.05); TAIL specifies the type of test (default 'unequal', i.e. two-sided).
For example,
x = -1:1:5
y = randn(20,1);   % y is random, so the output below will vary from run to run
[h,p,k] = kstest2(x,y)
h =
0
p =
0.0774
k =
0.5214
Translating the Wikipedia entry is tedious and risks distorting the meaning, so it is best to read the original explanation:
In statistics, the Kolmogorov–Smirnov test (K–S test) is a form of minimum distance estimation used as a nonparametric test of equality of one-dimensional probability distributions, to compare a sample with a reference probability distribution (one-sample K–S test), or to compare two samples (two-sample K–S test). The Kolmogorov–Smirnov statistic quantifies a distance between the empirical distribution function of the sample and the cumulative distribution function of the reference distribution, or between the empirical distribution functions of two samples. The null distribution of this statistic is calculated under the null hypothesis that the samples are drawn from the same distribution (in the two-sample case) or that the sample is drawn from the reference distribution (in the one-sample case). In each case, the distributions considered under the null hypothesis are continuous distributions but are otherwise unrestricted.
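To make the statistic concrete, the following sketch (with assumed example data) computes the one-sample K–S distance D = max|Fn(x) - F(x)| directly; this is the same quantity kstest returns as its KSSTAT output:
x = sort(randn(50,1));   % assumed example data
n = numel(x);
F = normcdf(x);          % reference CDF evaluated at the sorted sample points
% The empirical CDF jumps from (i-1)/n to i/n at each x(i), so the
% supremum distance is the larger of the two one-sided gaps:
D = max(max((1:n)'/n - F), max(F - (0:n-1)'/n))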
The two-sample KS test is one of the most useful and general nonparametric methods for comparing two samples, as it is sensitive to differences in both location and shape of the empirical cumulative distribution functions of the two samples.
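As a sketch of that sensitivity (illustrative data, not from the original post): two samples with the same mean but different spread, which a comparison of means would not separate, are distinguished by kstest2:
a = randn(200,1);        % standard normal
b = 3*randn(200,1);      % same mean, three times the spread
[h, p] = kstest2(a, b)   % expected to reject (h = 1) on most runs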
The Kolmogorov–Smirnov test can be modified to serve as a goodness of fit test. In the special case of testing for normality of the distribution, samples are standardized and compared with a standard normal distribution. This is equivalent to setting the mean and variance of the reference distribution equal to the sample estimates, and it is known that using the sample to modify the null hypothesis reduces the power of a test. Correcting for this bias leads to the Lilliefors test. However, even Lilliefors' modification is less powerful than the Shapiro–Wilk test or Anderson–Darling test for testing normality.[1]
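For illustration, a minimal sketch of the point above, assuming the Statistics Toolbox's lillietest is available: standardizing by the sample mean and standard deviation and then calling kstest reuses the data in the null hypothesis, whereas lillietest corrects the null distribution for that reuse:
x = 5 + 2*randn(30,1);          % data with unknown mean and variance (assumed example)
z = (x - mean(x)) / std(x);     % standardized by the sample estimates
h_ks  = kstest(z)               % naive K-S on standardized data
h_lil = lillietest(x)           % Lilliefors-corrected test of normality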
