[Original] Statistical Modeling and R Software, Chapter 5: Hypothesis Testing

Reposted article · 2015-12-25 08:26 · Category: R language

Abstract: This article was published by digging4 at: http://www.cnblogs.com/digging4/p/5054603.html


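5.1 The mean platelet count of healthy men is \(225 \times 10^9/L\). The platelet counts of 20 male painters were measured (unit: \(10^9/L\)): 220, 188, 162, 230, 145, 160, 238, 188, 247, 113, 126, 245, 164, 231, 256, 183, 190, 158, 224, 175. Do the platelet counts of painters differ from those of normal adult men?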

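## t.test(x, y = NULL, ...) performs t tests and the corresponding interval estimates.
## x and y are data vectors: if y is NULL, a one-sample test of a normal population mean is performed; otherwise a two-sample test of means is performed.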
x <- c(220, 188, 162, 230, 145, 160, 238, 188, 247, 113, 126, 245, 164, 231, 
    256, 183, 190, 158, 224, 175)
t.test(x, alternative = "two.sided", mu = 225)

## 
## 	One Sample t-test
## 
## data:  x 
## t = -3.478, df = 19, p-value = 0.002516
## alternative hypothesis: true mean is not equal to 225 
## 95 percent confidence interval:
##  172.4 211.9 
## sample estimates:
## mean of x 
##     192.2

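# p-value = 0.002516 < 0.05, so H0 is rejected in favour of the alternative hypothesis:
# the true mean is not equal to 225, i.e. painters' platelet counts differ from the normal value.
# The 95% confidence interval is [172.4, 211.9] and the estimated mean is 192.2.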
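The statistic reported by t.test can be reproduced by hand, which makes the mechanics of the test explicit. The following is a minimal sketch; the names `mu0`, `t_stat` and `p_val` are illustrative and do not appear in the original solution.

```r
x <- c(220, 188, 162, 230, 145, 160, 238, 188, 247, 113, 126, 245, 164, 231,
    256, 183, 190, 158, 224, 175)
mu0 <- 225                                     # hypothesized mean under H0
n <- length(x)
t_stat <- (mean(x) - mu0) / (sd(x) / sqrt(n))  # one-sample t statistic
p_val <- 2 * pt(-abs(t_stat), df = n - 1)      # two-sided p-value
c(t = t_stat, df = n - 1, p = p_val)
```

The three values should agree with the `t`, `df` and `p-value` fields printed by t.test above.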

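5.2 The lifetime of a certain type of light bulb is known to follow a normal distribution. Ten bulbs were drawn at random from those produced in one week, and their lifetimes (unit: hours) were: 1067, 919, 1196, 785, 1126, 936, 918, 1156, 920, 948. Find the probability that a bulb produced in this week lasts more than 1000 hours.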

## alternative = 'greater' specifies the alternative hypothesis H1: mu > mu0
x <- c(1067, 919, 1196, 785, 1126, 936, 918, 1156, 920, 948)
t.test(x, alternative = "greater", mu = 1000)

## 
## 	One Sample t-test
## 
## data:  x 
## t = -0.0697, df = 9, p-value = 0.527
## alternative hypothesis: true mean is greater than 1000 
## 95 percent confidence interval:
##  920.8   Inf 
## sample estimates:
## mean of x 
##     997.1

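## The 95% confidence interval is [920.8, Inf) and the estimated mean is 997.1.
## p-value = 0.527 > 0.05, so we cannot conclude that the mean lifetime exceeds 1000 hours.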

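# Use pnorm to compute the probability that a bulb lasts more than 1000 hours.
# P(X < 1000) = pnorm(1000, mean(x), sd(x)) is the integral of the normal density from
# -Inf to 1000; the last two arguments are the mean and standard deviation of the
# normal distribution, here estimated from the sample.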
1 - pnorm(1000, mean(x), sd(x))

## [1] 0.4912
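An equivalent way to obtain the upper-tail probability is the `lower.tail` argument of pnorm, which avoids the explicit subtraction from 1. This is a small sketch that, like the calculation above, plugs in the sample mean and standard deviation as estimates; `p_upper` is an illustrative name.

```r
x <- c(1067, 919, 1196, 785, 1126, 936, 918, 1156, 920, 948)
# P(X > 1000) under a normal distribution with the estimated mean and sd
p_upper <- pnorm(1000, mean = mean(x), sd = sd(x), lower.tail = FALSE)
p_upper
```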

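5.3 To study the effect of an iron supplement versus dietary therapy on nutritional iron-deficiency anemia, 16 patients were matched into 8 pairs by similar age, weight, disease duration and severity. Within each pair one patient received dietary therapy and the other the iron supplement. After 3 months the hemoglobin of the two groups was measured, as shown in Table 5.19. Do the hemoglobin levels after the two treatments differ?

Table 5.19: Hemoglobin values (\(g/L\)) after iron-supplement and dietary treatment
Iron supplement group  113 120 138 120 100 118 138 123
Dietary therapy group  138 116 125 136 110 132 130 110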

# Paired design: test whether the mean of the pairwise differences is 0.
# paired = TRUE marks x and y as matched pairs; var.equal only applies to the
# unpaired two-sample test, so it is not passed here.
x <- c(113, 120, 138, 120, 100, 118, 138, 123)
y <- c(138, 116, 125, 136, 110, 132, 130, 110)
t.test(x, y, alternative = "two.sided", paired = TRUE)

## 
## 	Paired t-test
## 
## data:  x and y 
## t = -0.6513, df = 7, p-value = 0.5357
## alternative hypothesis: true difference in means is not equal to 0 
## 95 percent confidence interval:
##  -15.629   8.879 
## sample estimates:
## mean of the differences 
##                  -3.375

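# The 95% confidence interval for the mean difference is [-15.629, 8.879]. It contains 0
# (p-value = 0.5357 > 0.05), so we cannot conclude that the two population means differ.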
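A paired t test is the same thing as a one-sample t test on the within-pair differences. The sketch below checks this on the data of exercise 5.3; the names `d`, `res_paired` and `res_diff` are illustrative.

```r
x <- c(113, 120, 138, 120, 100, 118, 138, 123)
y <- c(138, 116, 125, 136, 110, 132, 130, 110)
d <- x - y                                   # within-pair differences
res_paired <- t.test(x, y, paired = TRUE)    # paired t test
res_diff <- t.test(d, mu = 0)                # one-sample t test on the differences
c(paired = unname(res_paired$statistic), one_sample = unname(res_diff$statistic))
```

Both entries should be identical, which is why the equal-variance setting plays no role in the paired case.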

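5.4 To study the efficacy of a domestically produced Class 4 new drug, acarbose capsules, a hospital ran a concurrent randomized controlled trial on 40 patients with type 2 diabetes. The patients were randomly split into two equal groups, a trial group (acarbose capsules) and a control group (Glucobay capsules). Fasting blood glucose was measured for each patient at the start of the trial and after 8 weeks, and the decrease was computed, as shown in Table 5.20. Can we conclude that the domestic acarbose capsules and the Glucobay capsules differ in their effect on fasting blood glucose?

Table 5.20: Decrease in fasting blood glucose for the trial and control groups (\(mmol/L\))
Trial group (n1 = 20): -0.70, -5.60, 2.00, 2.80, 0.70, 3.50, 4.00, 5.80, 7.10, -0.50, 2.50, -1.60, 1.70, 3.00, 0.40, 4.50, 4.60, 2.50, 6.00, -1.40
Control group (n2 = 20): 6.50, 5.00, 5.20, 0.80, 0.20, 0.60, 3.40, 6.60, -1.10, 6.00, 3.80, 2.00, 1.60, 2.00, 2.20, 1.20, 3.10, 1.70, -2.00, -1.00
(1) Test whether the trial-group and control-group data come from a normal distribution, using the Shapiro-Wilk W test (see Chapter 3), the Kolmogorov-Smirnov test and Pearson's \(\chi^2\) goodness-of-fit test;
(2) Use t tests to check whether the two group means differ, under the equal-variance model, the unequal-variance model and the paired t-test model in turn;
(3) Test whether the variances of the trial group and the control group are equal.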

x <- c(-0.7, -5.6, 2, 2.8, 0.7, 3.5, 4, 5.8, 7.1, -0.5, 2.5, -1.6, 1.7, 3, 0.4, 
    4.5, 4.6, 2.5, 6, -1.4)
y <- c(6.5, 5, 5.2, 0.8, 0.2, 0.6, 3.4, 6.6, -1.1, 6, 3.8, 2, 1.6, 2, 2.2, 1.2, 
    3.1, 1.7, -2, -1)
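# (1) a: Shapiro-Wilk W test. The p-values are 0.7527 and 0.5816, both greater than
# 0.05, so both samples can be regarded as coming from a normal distribution.
# This test is recommended by the national standard GB/T 4882-2001,
# "Statistical interpretation of data: normality tests".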
shapiro.test(x)

## 
## 	Shapiro-Wilk normality test
## 
## data:  x 
## W = 0.9699, p-value = 0.7527

shapiro.test(y)

## 
## 	Shapiro-Wilk normality test
## 
## data:  y 
## W = 0.9619, p-value = 0.5816

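# (1) b: Kolmogorov-Smirnov test. Called with "pnorm" and no further arguments, ks.test
# compares the data with the standard normal N(0, 1). Both p-values are below 0.05, so
# neither x nor y follows N(0, 1); this alone does not rule out a normal distribution
# with a different mean and standard deviation.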
ks.test(x, "pnorm")

## 
## 	One-sample Kolmogorov-Smirnov test
## 
## data:  x 
## D = 0.6054, p-value = 8.578e-07
## alternative hypothesis: two-sided

ks.test(y, "pnorm")

## 
## 	One-sample Kolmogorov-Smirnov test
## 
## data:  y 
## D = 0.5952, p-value = 1.402e-06
## alternative hypothesis: two-sided
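To compare the data with a normal distribution whose mean and standard deviation match the sample, the extra parameters can be passed on to pnorm through ks.test. The sketch below is one way to do that; note as an assumption of this approach that estimating the parameters from the same data makes the Kolmogorov-Smirnov p-value conservative, so it should be read as a rough check only.

```r
x <- c(-0.7, -5.6, 2, 2.8, 0.7, 3.5, 4, 5.8, 7.1, -0.5, 2.5, -1.6, 1.7, 3, 0.4,
    4.5, 4.6, 2.5, 6, -1.4)
y <- c(6.5, 5, 5.2, 0.8, 0.2, 0.6, 3.4, 6.6, -1.1, 6, 3.8, 2, 1.6, 2, 2.2, 1.2,
    3.1, 1.7, -2, -1)
# mean and sd are forwarded to pnorm; suppressWarnings() hides the warning
# that ks.test raises because both samples contain tied values
ks_x <- suppressWarnings(ks.test(x, "pnorm", mean = mean(x), sd = sd(x)))
ks_y <- suppressWarnings(ks.test(y, "pnorm", mean = mean(y), sd = sd(y)))
c(x = ks_x$p.value, y = ks_y$p.value)
```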

# (1) c: Pearson goodness-of-fit test
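No code accompanies part (1) c. A possible sketch for the trial group is given below: the data are binned, the bin probabilities come from a normal distribution with the sample mean and standard deviation, and chisq.test compares the two. The bin edges in `breaks` are an arbitrary choice made for illustration, the expected counts are only about 5 per bin with n = 20, and chisq.test does not reduce the degrees of freedom for the two estimated parameters, so the result is approximate.

```r
x <- c(-0.7, -5.6, 2, 2.8, 0.7, 3.5, 4, 5.8, 7.1, -0.5, 2.5, -1.6, 1.7, 3, 0.4,
    4.5, 4.6, 2.5, 6, -1.4)
breaks <- c(-Inf, 0, 2, 4, Inf)                    # hypothetical bin edges
observed <- as.vector(table(cut(x, breaks)))       # observed count per bin
expected_p <- diff(pnorm(breaks, mean(x), sd(x)))  # normal probability of each bin
# suppressWarnings() hides the small-expected-count warning
gof <- suppressWarnings(chisq.test(observed, p = expected_p))
gof
```

The same steps apply to the control group by replacing `x` with `y`.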

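# (2) t.test() for the two-sample test of means. All three models below give p-values
# above 0.05 (0.7166, 0.7167, 0.7526), so no difference in means is detected.
t.test(x, y, var.equal = TRUE)  # equal-variance model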

## 
## 	Two Sample t-test
## 
## data:  x and y 
## t = -0.3657, df = 38, p-value = 0.7166
## alternative hypothesis: true difference in means is not equal to 0 
## 95 percent confidence interval:
##  -2.124  1.474 
## sample estimates:
## mean of x mean of y 
##     2.065     2.390

t.test(x, y, var.equal = FALSE)  # unequal-variance model

## 
## 	Welch Two Sample t-test
## 
## data:  x and y 
## t = -0.3657, df = 36.73, p-value = 0.7167
## alternative hypothesis: true difference in means is not equal to 0 
## 95 percent confidence interval:
##  -2.126  1.476 
## sample estimates:
## mean of x mean of y 
##     2.065     2.390

t.test(x, y, paired = TRUE)  # paired t-test model

## 
## 	Paired t-test
## 
## data:  x and y 
## t = -0.3199, df = 19, p-value = 0.7526
## alternative hypothesis: true difference in means is not equal to 0 
## 95 percent confidence interval:
##  -2.452  1.802 
## sample estimates:
## mean of the differences 
##                  -0.325


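# (3) var.test() tests the ratio of variances and gives the corresponding interval estimate.
# p-value = 0.4203 > 0.05, so the two variances can be regarded as equal.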
var.test(x, y)

## 
## 	F test to compare two variances
## 
## data:  x and y 
## F = 1.456, num df = 19, denom df = 19, p-value = 0.4203
## alternative hypothesis: true ratio of variances is not equal to 1 
## 95 percent confidence interval:
##  0.5763 3.6786 
## sample estimates:
## ratio of variances 
##              1.456
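The F test above can be reproduced from the two sample variances, which shows where the statistic and the two-sided p-value come from. This is a minimal sketch with illustrative names (`f_stat`, `p_one`, `p_val`).

```r
x <- c(-0.7, -5.6, 2, 2.8, 0.7, 3.5, 4, 5.8, 7.1, -0.5, 2.5, -1.6, 1.7, 3, 0.4,
    4.5, 4.6, 2.5, 6, -1.4)
y <- c(6.5, 5, 5.2, 0.8, 0.2, 0.6, 3.4, 6.6, -1.1, 6, 3.8, 2, 1.6, 2, 2.2, 1.2,
    3.1, 1.7, -2, -1)
f_stat <- var(x) / var(y)                    # ratio of sample variances
df1 <- length(x) - 1
df2 <- length(y) - 1
p_one <- pf(f_stat, df1, df2, lower.tail = FALSE)
p_val <- 2 * min(p_one, 1 - p_one)           # two-sided p-value
c(F = f_stat, p = p_val)
```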

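5.5 To study the effect of a new drug on antithrombin activity, 12 patients were randomly assigned to the new-drug group and 10 to the control group, and the antithrombin activity of each patient was measured (unit: \(mm^3\)). The results were: new-drug group: 126, 125, 136, 128, 123, 138, 142, 116, 110, 108, 115, 140; control group: 162, 172, 177, 170, 175, 152, 157, 159, 160, 162. Analyze whether antithrombin activity differs between the new-drug and control groups (\(\alpha=0.05\)).

(1) Test whether the two groups of data follow a normal distribution.
(2) Test whether the two sample variances are equal.
(3) Choose the most appropriate test to check whether antithrombin activity differs between the new-drug and control groups.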

# (1) The p-values are 0.4934 and 0.5313, both greater than 0.05, so both samples can be regarded as normally distributed
x <- c(126, 125, 136, 128, 123, 138, 142, 116, 110, 108, 115, 140)
y <- c(162, 172, 177, 170, 175, 152, 157, 159, 160, 162)
shapiro.test(x)

## 
## 	Shapiro-Wilk normality test
## 
## data:  x 
## W = 0.9396, p-value = 0.4934

shapiro.test(y)

## 
## 	Shapiro-Wilk normality test
## 
## data:  y 
## W = 0.938, p-value = 0.5313


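# (2) The 95% confidence interval for the variance ratio is [0.5022, 7.0489], which contains 1
# (p-value = 0.32), so the two variances can be regarded as equal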
var.test(x, y)

## 
## 	F test to compare two variances
## 
## data:  x and y 
## F = 1.965, num df = 11, denom df = 9, p-value = 0.32
## alternative hypothesis: true ratio of variances is not equal to 1 
## 95 percent confidence interval:
##  0.5022 7.0489 
## sample estimates:
## ratio of variances 
##              1.965


# (3) The 95% confidence interval for the difference in means is [-48.25, -29.78],
# which excludes 0, so the two group means differ
t.test(x, y, var.equal = TRUE)  # equal-variance model

## 
## 	Two Sample t-test
## 
## data:  x and y 
## t = -8.815, df = 20, p-value = 2.524e-08
## alternative hypothesis: true difference in means is not equal to 0 
## 95 percent confidence interval:
##  -48.25 -29.78 
## sample estimates:
## mean of x mean of y 
##     125.6     164.6

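5.6 A survey reported that the elderly make up \(14.7\%\) of a city's population. To check whether the survey is reliable, the city's gerontology research association randomly sampled 400 residents and found that 57 of them were elderly. Do the sample results support the claim that the proportion of elderly people in the city is \(14.7\%\) (\(\alpha=0.05\))?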

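# In the survey, the ratio of elderly to non-elderly is 0.147 : (1 - 0.147). In the
# random sample of 400 residents the counts are 57 : (400 - 57). Use a chi-squared
# goodness-of-fit test.
# p-value = 0.7994 > 0.05, so H0 is not rejected: the sample is consistent with an
# elderly proportion of 14.7%
chisq.test(c(57, 400 - 57), p = c(0.147, 1 - 0.147))

## 
## 	Chi-squared test for given probabilities
## 
## data:  c(57, 400 - 57) 
## X-squared = 0.0646, df = 1, p-value = 0.7994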
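Taking the sample size as the 400 residents stated in the exercise, an exact binomial test is another way to check a single proportion and does not rely on the chi-squared approximation. The sketch below is offered as an alternative, not as the method of the original solution.

```r
# 57 elderly residents out of 400 sampled, tested against a proportion of 0.147
res <- binom.test(57, 400, p = 0.147)
res$estimate   # sample proportion, 57/400
res$p.value    # two-sided exact p-value
res$conf.int   # exact 95% confidence interval for the proportion
```

If the confidence interval covers 0.147, the sample does not contradict the surveyed proportion.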

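5.7 In a sex-control experiment, a certain treatment yielded 328 chicks in total, of which 150 were male and 178 female. Does this treatment increase the proportion of female chicks? (The sex ratio should be 1:1.)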

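# p-value = 0.1221 > 0.05, so H0 is not rejected: the sex ratio is consistent with 1:1,
# i.e. there is no evidence that the treatment increases the proportion of female chicks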
chisq.test(c(150, 178), p = c(0.5, 0.5))

## 
## 	Chi-squared test for given probabilities
## 
## data:  c(150, 178) 
## X-squared = 2.39, df = 1, p-value = 0.1221

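5.8 Mendel crossed peas that differed in two pairs of contrasting traits. After crossing plants with yellow round seeds and plants with green wrinkled seeds, the law of independent assortment predicts that the second generation segregates as yellow round : yellow wrinkled : green round : green wrinkled = 9/16 : 3/16 : 3/16 : 1/16. The observed counts were 315 yellow round, 101 yellow wrinkled, 108 green round and 32 green wrinkled, 556 seeds in total. Are these results consistent with the law of independent assortment?

# The four counts add up to the stated total of 556 (315 + 101 + 108 + 32).
# p-value = 0.9254 > 0.05, so H0 is not rejected: the results are consistent with
# the law of independent assortment
chisq.test(c(315, 101, 108, 32), p = c(9, 3, 3, 1)/16)

## 
## 	Chi-squared test for given probabilities
## 
## data:  c(315, 101, 108, 32) 
## X-squared = 0.47, df = 3, p-value = 0.9254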
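The goodness-of-fit statistic is the sum of (observed - expected)^2 / expected over the categories. The sketch below computes it by hand, assuming the yellow-round count is 315, which is the value that makes the four counts add up to the stated total of 556; the variable names are illustrative.

```r
observed <- c(315, 101, 108, 32)                 # sums to 556
expected <- sum(observed) * c(9, 3, 3, 1) / 16   # counts expected under 9:3:3:1
x2 <- sum((observed - expected)^2 / expected)    # Pearson chi-squared statistic
p_val <- pchisq(x2, df = length(observed) - 1, lower.tail = FALSE)
c(X2 = x2, p = p_val)
```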

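5.9 The number of customers X entering a shop per minute was observed over 200 randomly chosen minutes, with the following data: number of customers 0, 1, 2, 3, 4, 5; corresponding frequencies 92, 68, 28, 11, 1, 0. Can X, the number of customers per minute, be regarded as following a Poisson distribution (\(\alpha=0.1\))?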

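x <- c(0, 1, 2, 3, 4, 5)
y <- c(92, 68, 28, 11, 1, 0)
# Theoretical Poisson CDF at 0..5, with lambda estimated by the sample mean mean(rep(x, y))
q <- ppois(x, mean(rep(x, y)))
# The last groups have frequencies below 5, which makes the chi-squared approximation
# unreliable, so the groups with 3, 4 and 5 customers are merged into one group (X >= 3)
y <- c(92, 68, 28, 12)
# p-value = 0.8227 > 0.1, so X can be regarded as Poisson distributed.
# chisq.test reports df = 3; it does not account for lambda being estimated from the data.
chisq.test(c(92, 68, 28, 12), p = c(q[1], q[2] - q[1], q[3] - q[2], 1 - q[3]))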

## 
## 	Chi-squared test for given probabilities
## 
## data:  c(92, 68, 28, 12) 
## X-squared = 0.9113, df = 3, p-value = 0.8227
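Because the Poisson mean is estimated from the same data, the usual convention is to subtract one further degree of freedom, giving df = 4 - 1 - 1 = 2 for the four merged groups. chisq.test does not do this, so the sketch below recomputes the p-value from the statistic; `lambda`, `p` and `p_adj` are illustrative names.

```r
x <- 0:5
y <- c(92, 68, 28, 11, 1, 0)
lambda <- sum(x * y) / sum(y)            # sample mean used as the Poisson rate
q <- ppois(0:2, lambda)
p <- c(q[1], diff(q), 1 - q[3])          # P(X=0), P(X=1), P(X=2), P(X>=3)
res <- chisq.test(c(92, 68, 28, 12), p = p)
# p-value with one degree of freedom removed for the estimated parameter
p_adj <- pchisq(unname(res$statistic), df = 4 - 1 - 1, lower.tail = FALSE)
p_adj
```

The adjusted p-value is smaller than the one printed by chisq.test but still well above 0.1, so the conclusion is unchanged.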

5.10 Two samples were observed: 2.36, 3.14, 7.52, 3.48, 2.76, 5.43, 6.54, 7.41; and 4.38, 4.25, 6.53, 3.28, 7.21, 6.55. Test whether the two samples come from the same population (\(\alpha=0.05\)).

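# H0 of the two-sample ks.test: both samples come from the same distribution.
# p-value = 0.6374 > 0.05, so H0 is not rejected: the two samples can be regarded as
# coming from the same population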
x <- c(2.36, 3.14, 7.52, 3.48, 2.76, 5.43, 6.54, 7.41)
y <- c(4.38, 4.25, 6.53, 3.28, 7.21, 6.55)
ks.test(x, y)

## 
## 	Two-sample Kolmogorov-Smirnov test
## 
## data:  x and y 
## D = 0.375, p-value = 0.6374
## alternative hypothesis: two-sided
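The two-sample Kolmogorov-Smirnov statistic D is the largest vertical gap between the two empirical distribution functions. The following sketch computes it directly with ecdf; `grid` and `d_stat` are illustrative names.

```r
x <- c(2.36, 3.14, 7.52, 3.48, 2.76, 5.43, 6.54, 7.41)
y <- c(4.38, 4.25, 6.53, 3.28, 7.21, 6.55)
grid <- sort(c(x, y))                              # all observed values
d_stat <- max(abs(ecdf(x)(grid) - ecdf(y)(grid)))  # largest gap between the ECDFs
d_stat
```

The gap is largest at 3.14, where three of the eight x values but none of the y values have been reached.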

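5.11 To study whether using an electronic fetal monitor during delivery affects the cesarean section rate, a retrospective survey was carried out on 5824 deliveries by multiparous women. The results are shown in Table 5.21. Analyze the data.

Cesarean section    Electronic fetal monitor      Total
                    Used          Not used
Yes                 358           229             587
No                  2492          2745            5237
Total               2850          2974            5824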

# H0 of chisq.test: the two variables are independent.
# p-value = 7.263e-10 < 0.05, so H0 is rejected: use of the electronic fetal monitor
# is associated with the cesarean section rate
x <- c(358, 2492, 229, 2745)
dim(x) <- c(2, 2)
chisq.test(x, correct = FALSE)

## 
## 	Pearson's Chi-squared test
## 
## data:  x 
## X-squared = 37.95, df = 1, p-value = 7.263e-10
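For a 2×2 table the chi-squared test of independence is equivalent to comparing two proportions, which also shows the direction of the effect. The sketch below uses prop.test on the cesarean rates with and without the monitor; the vector names are illustrative.

```r
cesarean <- c(used = 358, not_used = 229)       # cesarean deliveries per group
deliveries <- c(used = 2850, not_used = 2974)   # all deliveries per group
cesarean / deliveries                           # cesarean rate in each group
res <- prop.test(cesarean, deliveries, correct = FALSE)
res$statistic                                   # same X-squared as chisq.test above
```

The cesarean rate is higher in the monitored group, which is the direction behind the significant test result.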

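5.12 A sample of 300 first-year senior high school boys was examined on two attributes: B, the 1500 m run time, and C, the average daily exercise time. This gives the 4×3 contingency table shown in Table 5.22. Test at \(\alpha=0.05\) whether B and C are independent.

Table 5.22: Physical exercise survey of 300 senior high school students
1500 m run time    Exercise time                                Total
                   More than 2 h    1-2 h    Less than 1 h
5'01"-5'30"        45               12       10                 67
5'31"-6'00"        46               20       28                 94
6'00"-6'30"        28               23       30                 81
6'31"-7'00"        11               12       35                 58
Total              130              67       103                300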

# H0 of chisq.test: the two variables are independent.
# Only the 4 x 3 cell counts enter the test; the marginal totals are left out.
# p-value = 3.799e-07 < 0.05, so H0 is rejected: B and C are not independent
x <- c(45, 46, 28, 11, 12, 20, 23, 12, 10, 28, 30, 35)
dim(x) <- c(4, 3)
chisq.test(x, correct = FALSE)

## 
## 	Pearson's Chi-squared test
## 
## data:  x 
## X-squared = 40.4, df = 6, p-value = 3.799e-07

5.13 To compare whether two processes affect product quality, the products were sampled and inspected, with the results shown in Table 5.23. Analyze the data.

Table 5.23: Inspection results for product quality under the two processes
             Conforming    Nonconforming    Total
Process 1    3             4                7
Process 2    6             4                10
Total        9             8                17

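# Some cell counts are below 5, so Fisher's exact test (fisher.test) is used.
# H0: the two variables are independent.
# p-value = 0.6372 > 0.05, so H0 is not rejected: process and product conformity can be
# regarded as independent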
x <- c(3, 6, 4, 4)
dim(x) <- c(2, 2)
fisher.test(x)

## 
## 	Fisher's Exact Test for Count Data
## 
## data:  x 
## p-value = 0.6372
## alternative hypothesis: true odds ratio is not equal to 1 
## 95 percent confidence interval:
##  0.04624 5.13272 
## sample estimates:
## odds ratio 
##     0.5213
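The usual reason for preferring Fisher's exact test is that the expected counts under independence are too small for the chi-squared approximation. The sketch below prints those expected counts for Table 5.23; the row and column labels are added for readability and are not part of the original code.

```r
x <- matrix(c(3, 6, 4, 4), nrow = 2,
    dimnames = list(c("Process 1", "Process 2"), c("Conforming", "Nonconforming")))
# Expected counts under independence: row total * column total / grand total.
# suppressWarnings() hides the warning chisq.test gives for small expected counts.
expected <- suppressWarnings(chisq.test(x))$expected
expected
```

Several expected counts fall below 5, which supports using the exact test here.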

5.14 The radionuclide method and the contrast method were both used to assess cardiac wall contraction in 147 patients with coronary heart disease. The agreement between the results is shown in Table 5.24. Analyze whether the two methods give the same results.

Table 5.24: Agreement of the two methods in assessing ventricular wall contraction
Contrast method    Radionuclide method                 Total
                   Normal    Weakened    Abnormal
Normal             58        2           3             63
Weakened           1         42          7             50
Abnormal           8         9           17            34
Total              67        53          27            147

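# Some cell counts are below 5, so fisher.test is used. H0: the two variables are independent.
# p-value < 2.2e-16 < 0.05, so H0 is rejected: the classifications by the two methods
# are strongly associated.
# Note that this tests association only. Whether the two methods classify patients in
# the same way is a question of symmetry of the table, which McNemar's test
# (mcnemar.test) addresses.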
x <- c(58, 1, 8, 2, 42, 9, 3, 7, 17)
dim(x) <- c(3, 3)
fisher.test(x)

## 
## 	Fisher's Exact Test for Count Data
## 
## data:  x 
## p-value < 2.2e-16
## alternative hypothesis: two.sided
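Since the same 147 patients are classified by both methods, a test of symmetry is a natural complement to the test of independence. The sketch below applies mcnemar.test to the same 3×3 matrix; this is offered as an alternative analysis and is not part of the original solution.

```r
x <- matrix(c(58, 1, 8, 2, 42, 9, 3, 7, 17), nrow = 3)
# H0: the table is symmetric, i.e. disagreements in one direction are as
# frequent as disagreements in the opposite direction
res <- mcnemar.test(x)
res
```

A large p-value here would mean there is no evidence that the two methods classify patients differently.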

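5.15 From past experience, the median length of fish in a certain fish pond is 14.6 cm. To re-assess the lengths, 10 fish were drawn at random from the pond, with lengths: 13.32, 13.06, 14.02, 11.86, 13.58, 13.77, 13.51, 14.42, 14.44, 15.43. Treating these as a sample, test whether the fish lengths in the pond lie above or below that median. (1) Use the sign test; (2) use the Wilcoxon signed-rank test.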

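# binom.test gives a sign test for the median: sum(x > 14.6) is the number of
# observations above 14.6, and alternative = "less" tests H0: M >= 14.6 against
# H1: M < 14.6, where M is the population median.
# p-value = 0.01074 < 0.05, so H0 is rejected: the median is less than 14.6
x <- c(13.32, 13.06, 14.02, 11.86, 13.58, 13.77, 13.51, 14.42, 14.44, 15.43)
binom.test(sum(x > 14.6), length(x), alternative = "less")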

## 
## 	Exact binomial test
## 
## data:  sum(x > 14.6) and length(x) 
## number of successes = 1, number of trials = 10, p-value = 0.01074
## alternative hypothesis: true probability of success is less than 0.5 
## 95 percent confidence interval:
##  0.0000 0.3942 
## sample estimates:
## probability of success 
##                    0.1
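The one-sided sign-test p-value is simply a binomial tail probability: the chance of seeing at most k values above the hypothesized median when each observation has probability 0.5 of exceeding it. A minimal sketch, with illustrative names `k`, `n` and `p_val`:

```r
x <- c(13.32, 13.06, 14.02, 11.86, 13.58, 13.77, 13.51, 14.42, 14.44, 15.43)
k <- sum(x > 14.6)          # observations above the hypothesized median
n <- length(x)
p_val <- pbinom(k, n, 0.5)  # P(at most k successes in n trials with p = 0.5)
p_val
```

With k = 1 and n = 10 this is (1 + 10) / 2^10, which matches the p-value printed by binom.test above.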


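# The sign test uses only the signs of the differences and ignores their magnitudes;
# the Wilcoxon signed-rank test makes up for this shortcoming.
# Wilcoxon signed-rank test: H0: M >= mu against H1: M < mu, where M is the population median.
# exact controls whether an exact p-value is computed; it matters for small samples.
# p-value = 0.01087 < 0.05, so H0 is rejected: the median is less than 14.6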
wilcox.test(x, mu = 14.6, alternative = "less", exact = FALSE)

## 
## 	Wilcoxon signed rank test with continuity correction
## 
## data:  x 
## V = 4.5, p-value = 0.01087
## alternative hypothesis: true location is less than 14.6

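5.16 Two different assay methods were used to measure the active ingredient of the same Chinese herbal medicine, with 20 replicates. The results are shown in Table 5.25.

Table 5.25: Results from the two assay methods
Method A: 48.0, 33.0, 37.5, 48.0, 42.5, 40.0, 42.0, 36.0, 11.3, 22.0, 36.0, 27.3, 14.2, 32.1, 52.0, 38.0, 17.3, 20.0, 21.0, 46.1
Method B: 37.0, 41.0, 23.4, 17.0, 31.5, 40.0, 31.0, 36.0, 5.7, 11.5, 21.0, 6.1, 26.5, 21.3, 44.5, 28.0, 22.6, 20.0, 11.0, 22.3
(1) Use the sign test to check whether the two methods differ significantly.
(2) Use the Wilcoxon signed-rank test to check whether the two methods differ significantly.
(3) Use the Wilcoxon rank-sum test to check whether the two methods differ significantly.
(4) Test the data for normality and homogeneity of variance. Is a t test appropriate for these data? If so, carry it out.
(5) Compare the tests above and explain which one performs best.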

x <- c(48, 33, 37.5, 48, 42.5, 40, 42, 36, 11.3, 22, 36, 27.3, 14.2, 32.1, 52, 
    38, 17.3, 20, 21, 46.1)
y <- c(37, 41, 23.4, 17, 31.5, 40, 31, 36, 5.7, 11.5, 21, 6.1, 26.5, 21.3, 44.5, 
    28, 22.6, 20, 11, 22.3)
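# Sign test: p-value = 0.1153 > 0.05, so H0 cannot be rejected, i.e. no significant
# difference between the two methods is detected. The 95% confidence interval
# [0.4572, 0.8811] contains 0.5, which is consistent with P(x > y) = 0.5.
# Note that the three tied pairs (x == y) are counted here among the 20 trials.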
binom.test(sum(x > y), length(x))

## 
## 	Exact binomial test
## 
## data:  sum(x > y) and length(x) 
## number of successes = 14, number of trials = 20, p-value = 0.1153
## alternative hypothesis: true probability of success is not equal to 0.5 
## 95 percent confidence interval:
##  0.4572 0.8811 
## sample estimates:
## probability of success 
##                    0.7
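A common convention for the sign test is to drop tied pairs before counting, since a tie carries no information about the direction of the difference. The sketch below applies that convention to the same data; it is a variation on the call above, not the original author's code, and `d` is an illustrative name.

```r
x <- c(48, 33, 37.5, 48, 42.5, 40, 42, 36, 11.3, 22, 36, 27.3, 14.2, 32.1, 52,
    38, 17.3, 20, 21, 46.1)
y <- c(37, 41, 23.4, 17, 31.5, 40, 31, 36, 5.7, 11.5, 21, 6.1, 26.5, 21.3, 44.5,
    28, 22.6, 20, 11, 22.3)
d <- x - y
d <- d[d != 0]                              # drop the tied pairs
res <- binom.test(sum(d > 0), length(d))    # sign test on the untied pairs
res$p.value
```

With ties removed there are 14 positive differences out of 17, and the p-value drops below 0.05, so the outcome of the sign test on these data depends on how ties are handled.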


# Wilcoxon signed-rank test: p-value = 0.005191 < 0.05, so H0 is rejected: the two methods differ significantly
wilcox.test(x, y, paired = TRUE, exact = FALSE)

## 
## 	Wilcoxon signed rank test with continuity correction
## 
## data:  x and y 
## V = 136, p-value = 0.005191
## alternative hypothesis: true location shift is not equal to 0


# Wilcoxon rank-sum test: p-value = 0.04524 < 0.05, so H0 is rejected: the two methods differ significantly
wilcox.test(x, y, exact = FALSE)

## 
## 	Wilcoxon rank sum test with continuity correction
## 
## data:  x and y 
## W = 274.5, p-value = 0.04524
## alternative hypothesis: true location shift is not equal to 0


# p-value = 0.3773 > 0.05, so x can be regarded as a sample from a normal population
shapiro.test(x)

## 
## 	Shapiro-Wilk normality test
## 
## data:  x 
## W = 0.9507, p-value = 0.3773

# p-value = 0.6848 > 0.05, so y can be regarded as a sample from a normal population
shapiro.test(y)

## 
## 	Shapiro-Wilk normality test
## 
## data:  y 
## W = 0.9667, p-value = 0.6848

# p-value = 0.7772 > 0.05, and the 95% confidence interval [0.4515, 2.8818]
# contains 1, so the variances can be regarded as equal
var.test(x, y)

## 
## 	F test to compare two variances
## 
## data:  x and y 
## F = 1.141, num df = 19, denom df = 19, p-value = 0.7772
## alternative hypothesis: true ratio of variances is not equal to 1 
## 95 percent confidence interval:
##  0.4515 2.8818 
## sample estimates:
## ratio of variances 
##              1.141

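# Both x and y come from normal populations with homogeneous variances, so a t test
# is appropriate. Called without var.equal, t.test runs the Welch form.
# p-value = 0.03085 < 0.05, so H0 is rejected: the two methods differ significantly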
t.test(x, y)

## 
## 	Welch Two Sample t-test
## 
## data:  x and y 
## t = 2.243, df = 37.84, p-value = 0.03085
## alternative hypothesis: true difference in means is not equal to 0 
## 95 percent confidence interval:
##   0.8115 15.8785 
## sample estimates:
## mean of x mean of y 
##     33.22     24.87
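Since the F test did not reject equal variances, the pooled-variance t test is also a reasonable choice here. The sketch below runs it next to the Welch form; with equal group sizes the two t statistics coincide and only the degrees of freedom differ. The names `pooled` and `welch` are illustrative.

```r
x <- c(48, 33, 37.5, 48, 42.5, 40, 42, 36, 11.3, 22, 36, 27.3, 14.2, 32.1, 52,
    38, 17.3, 20, 21, 46.1)
y <- c(37, 41, 23.4, 17, 31.5, 40, 31, 36, 5.7, 11.5, 21, 6.1, 26.5, 21.3, 44.5,
    28, 22.6, 20, 11, 22.3)
pooled <- t.test(x, y, var.equal = TRUE)   # pooled-variance two-sample t test
welch <- t.test(x, y)                      # Welch form, the default
rbind(pooled = c(t = unname(pooled$statistic), df = unname(pooled$parameter)),
    welch = c(t = unname(welch$statistic), df = unname(welch$parameter)))
```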


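# Based on the analysis above, the sign test performs worst: it is the only one of these tests that does not detect the difference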

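5.17 To investigate the relationship between weekly study time and average grade rank among students at a university, 10 students were sampled, with the following data. Study time: 24, 17, 20, 41, 52, 23, 46, 18, 15, 29. Grade rank: 8, 1, 4, 7, 9, 5, 10, 3, 2, 6. Rank 10 is the best and rank 1 the worst. Use rank correlation tests (Spearman's test and Kendall's test) to analyze whether study time and grade rank are related.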

x <- c(24, 17, 20, 41, 52, 23, 46, 18, 15, 29)
y <- c(8, 1, 4, 7, 9, 5, 10, 3, 2, 6)
# p-value < 2.2e-16 < 0.05, so H0 is rejected: the two variables are correlated.
# rho = 0.9394 indicates a positive correlation
cor.test(x, y, method = "spearman")

## 
## 	Spearman's rank correlation rho
## 
## data:  x and y 
## S = 10, p-value < 2.2e-16
## alternative hypothesis: true rho is not equal to 0 
## sample estimates:
##    rho 
## 0.9394
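Spearman's rho is the Pearson correlation of the ranks, and without ties it can also be written as 1 - 6 * sum(d^2) / (n * (n^2 - 1)), where d is the difference in ranks. The sketch below checks both forms on the data of exercise 5.17; the variable names are illustrative.

```r
x <- c(24, 17, 20, 41, 52, 23, 46, 18, 15, 29)
y <- c(8, 1, 4, 7, 9, 5, 10, 3, 2, 6)
rho_rank <- cor(rank(x), rank(y))               # Pearson correlation of the ranks
d <- rank(x) - rank(y)                          # rank differences
n <- length(x)
rho_formula <- 1 - 6 * sum(d^2) / (n * (n^2 - 1))
c(rank = rho_rank, formula = rho_formula, S = sum(d^2))
```

The value of `S` corresponds to the `S` statistic printed by cor.test above.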


# p-value = 0.0003577 < 0.05, so H0 is rejected: the two variables are correlated.
# tau = 0.8222 indicates a positive correlation
cor.test(x, y, method = "kendall")

## 
## 	Kendall's rank correlation tau
## 
## data:  x and y 
## T = 41, p-value = 0.0003577
## alternative hypothesis: true tau is not equal to 0 
## sample estimates:
##    tau 
## 0.8222


# The two methods give consistent results

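5.18 To compare a new therapy for a certain disease with the existing one, 40 patients were randomly divided into two groups of 20. One group received the new therapy and the other the original standard therapy. After a period of treatment, each patient's outcome was carefully assessed and graded on five levels: poor, fairly poor, fair, fairly good and good. The number of patients at each level in the two groups is shown in Table 5.26. Based on these results, can the new therapy be considered significantly better than the original one (\(\alpha=0.05\))?

Table 5.26: Outcomes after treatment with the two therapies
Grade                     Poor    Fairly poor    Fair    Fairly good    Good
New therapy group         0       1              9       7              3
Original therapy group    2       2              11      4              1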

# Code the grades from poor to good as 1:5
x <- rep(1:5, c(0, 1, 9, 7, 3))
y <- rep(1:5, c(2, 2, 11, 4, 1))
# Two-sided test: p-value = 0.05509 > 0.05, so H0 cannot be rejected, i.e. no difference between the two therapies is detected
wilcox.test(x, y, exact = FALSE)

## 
## 	Wilcoxon rank sum test with continuity correction
## 
## data:  x and y 
## W = 266, p-value = 0.05509
## alternative hypothesis: true location shift is not equal to 0
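The exercise asks whether the new therapy is better, which is a one-sided question. The sketch below reruns the rank-sum test with alternative = "greater" next to the two-sided call; this is a variation for comparison and not the original author's code, and the names `two_sided` and `one_sided` are illustrative.

```r
x <- rep(1:5, c(0, 1, 9, 7, 3))    # new therapy, grades coded 1 (poor) to 5 (good)
y <- rep(1:5, c(2, 2, 11, 4, 1))   # original therapy
two_sided <- wilcox.test(x, y, exact = FALSE)
# H1: outcomes under the new therapy tend to be higher than under the original
one_sided <- wilcox.test(x, y, alternative = "greater", exact = FALSE)
c(two_sided = two_sided$p.value, one_sided = one_sided$p.value)
```

The one-sided p-value is half the two-sided one here and falls below 0.05, so the conclusion depends on which alternative is taken to match the question.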
