Greenhouse-Geisser; Reporting Statistical Results; An Introduction to Effect Sizes

This article is a repost. 2018-04-15 21:56. Category: Odds and Ends

Reposted from the blog: http://blog.sina.com.cn/s/blog_4d25466d0101p47z.html

===============================================================================
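The Greenhouse-Geisser correction is commonly used in ANOVA, and I have long been confused about how to report its df in the results. Today I set out to settle the question:
Whether to report the corrected degrees of freedom really puzzled me. Some people online say that even when the correction is applied you do not need to report the corrected df, only the original uncorrected ones, or that you should check how published papers report them. Another person, who works at a research institute in China and has published EEG papers, says the corrected df must be reported, and that rounding them to whole numbers is generally enough.
The following is taken from: https://statistics.laerd.com/statistical-guides/sphericity-statistical-guide-2.php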

This is to counteract the fact that when the assumption of sphericity is violated, there is an increase in Type I errors due to the critical values in the F-table being too small. These corrections attempt to correct this bias. (This is the purpose of the correction.)
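Epsilon (referred to as ε) is the value reported next to the red box in the original screenshot of the sphericity test output (there p is greater than .05, so sphericity is not violated).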

An epsilon of 1 (i.e., ε = 1) indicates that the condition of sphericity is exactly met. The further epsilon decreases below 1 (i.e., ε < 1), the greater the violation of sphericity. Therefore, you can think of epsilon as a statistic that describes the degree to which sphericity has been violated.

When ε = 1, sphericity is exactly met; the further the value falls below 1, the more severely sphericity is violated.

Greenhouse-Geisser Correction

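The Greenhouse-Geisser correction estimates epsilon in order to correct the degrees of freedom of the F-distribution; it can be used when sphericity is violated. The degrees of freedom change accordingly: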

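The Greenhouse-Geisser procedure estimates epsilon (referred to as ε̂) in order to correct the degrees of freedom of the F-distribution as has been mentioned previously, and shown below:

df(effect) = ε̂(k − 1), df(error) = ε̂(k − 1)(n − 1), where k is the number of repeated-measures levels and n is the number of subjects.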

Using our prior example, and if sphericity had been violated, we would have:

So our F-test result is corrected from F(2, 10) = 12.534, p = .002 to F(1.277, 6.384) = 12.534, p = .009 (degrees of freedom are slightly different due to rounding). The correction has elicited a more accurate significance value. It has increased the p-value to compensate for the fact that the test is too liberal when sphericity is violated. Note that the df change accordingly.

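Huynh-Feldt Correction

When sphericity is violated, the Huynh-Feldt correction can be used as an alternative to the Greenhouse-Geisser correction described above.

As with the Greenhouse-Geisser correction, the Huynh-Feldt correction estimates epsilon (represented as ε̃) in order to correct the degrees of freedom of the F-distribution as shown below:

df(effect) = ε̃(k − 1), df(error) = ε̃(k − 1)(n − 1), for k repeated-measures levels and n subjects.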

Using our prior example, and if sphericity had been violated, we would have:

So our F-test result is corrected from F(2, 10) = 12.534, p = .002 to F(1.520, 7.602) = 12.534, p = .005 (degrees of freedom are slightly different due to rounding). As with the Greenhouse-Geisser correction, this correction has elicited a more accurate significance value; it has increased the p-value to compensate for the fact that the test is too liberal when sphericity is violated.

The Greenhouse-Geisser correction tends to underestimate epsilon (ε) when epsilon (ε) is close to 1 (i.e., it is a conservative correction), whilst the Huynh-Feldt correction tends to overestimate epsilon (ε) (i.e., it is a more liberal correction). Generally, the recommendation is to use the Greenhouse-Geisser correction, especially if estimated epsilon (ε) is less than 0.75. However, some statisticians recommend using the Huynh-Feldt correction if estimated epsilon (ε) is greater than 0.75. In practice, both corrections produce very similar corrections, so if estimated epsilon (ε) is greater than 0.75, you can equally justify using either. (In short: Greenhouse-Geisser is the more conservative correction and Huynh-Feldt the more liberal one. Greenhouse-Geisser is generally recommended. When the estimated epsilon (ε) is greater than 0.75, however, some statisticians recommend Huynh-Feldt instead. In practice the two corrections give similar results, so when the estimated epsilon (ε) is greater than 0.75 either can be used.)
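
As a rough illustration of what both corrections do to the degrees of freedom, here is a minimal Python sketch. It assumes a one-way repeated-measures design, where the uncorrected df are (k − 1) and (k − 1)(n − 1). The function name corrected_df, the values k = 3 and n = 6 (inferred from the uncorrected F(2, 10)), and the epsilon values 0.638 and 0.760 (back-computed from the corrected df quoted above, not read from the original output) are all assumptions made for illustration.

```python
def corrected_df(epsilon, k, n):
    """Scale both df of a one-way repeated-measures F-test by epsilon.

    k is the number of repeated-measures levels, n the number of subjects.
    """
    df_effect = k - 1
    df_error = (k - 1) * (n - 1)
    return epsilon * df_effect, epsilon * df_error


# Assumed values: k = 3 and n = 6 reproduce the uncorrected F(2, 10);
# the epsilons are back-computed from the corrected df quoted in the text.
K_LEVELS, N_SUBJECTS = 3, 6
estimates = {"Greenhouse-Geisser": 0.638, "Huynh-Feldt": 0.760}

for name, eps in estimates.items():
    df1, df2 = corrected_df(eps, K_LEVELS, N_SUBJECTS)
    print(f"{name}: F({df1:.3f}, {df2:.3f})")
```

With these rounded epsilons the results come out close to, but not exactly equal to, the df quoted above, which matches the guide's remark about rounding. The F value itself does not change; only the p-value is re-evaluated against the corrected df.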

In addition, one paper puts it this way:
http://www.uccs.edu/Documents/humanneurophysiologylab/07 kisley et al 2005 with erratum.pdf
All significance tests were two-tailed at the 0.05 level. To protect against Type I errors, the degrees of freedom for all repeated measures ANOVAs were adjusted by the method of Greenhouse and Geisser [53]. All waveform amplitudes, whether from positive- or negative-going waves, are reported here as absolute value.

All statistically significant effects were corrected using the Greenhouse–Geisser method (Greenhouse and Geisser, 1959) {S.W. Greenhouse, S. Geisser, On methods in the analysis of profile data, Psychometrika, 24 (1959), pp. 95–112}. In ERP (EEG) analyses, the Greenhouse-Geisser correction is generally applied regardless of whether the sphericity test is significant.

========================================================================

How to report results, reposted from a NetEase blog post:

http://bcaoyuan.blog.163.com/blog/static/210343052201342913053893/ which in turn reposts:

http://facelab.org/debruine/Teaching/Meth_A/files/Reporting_Statistics.pdf

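I. General principles:

Requirements differ between disciplines and journals, but in general the following format can be used:

1. Decimal places:

a. Greater than 100: report whole numbers (e.g. 1034.963 is reported as 1035)

b. 10-100: 1 decimal place (e.g. 11.4378 is reported as 11.4)

c. 0.10-10: 2 decimal places

d. 0.001-0.10: 3 decimal places

e. Less than 0.001: report as many decimal places as needed to reach the first non-zero digit (rounding as usual)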

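Notes:

a. Do not add decimals to whole numbers such as counts of people: write N = 5, not N = 5.0;

b. When the output shows p = .000, report p < .001; otherwise report the exact p-value. Tests are assumed to be two-tailed by default; a one-tailed test must be stated explicitly;

c. Omit the leading zero for p-values, r-values and partial eta-squared (ηp2).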

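2. Statistical abbreviations (set the abbreviations in italics; terms that are written out in full are not italicized; put a space on each side of the equals sign "="):

a. Mean and standard deviation: (M = 3.45, SD = 1.21);

b. Mann-Whitney U test: (U = 67.5, p = .034, r = .38);

c. Wilcoxon signed-ranks test: (Z = 4.21, p < .001);

d. Standard Z test: (Z = 3.47, p = .001);

e. t-test: (t(19) = 2.45, p = .031, d = 0.54);

f. ANOVA: (F(2, 1279) = 6.15, p = .002, ηp2 = .010);

g. Pearson correlation: (r(1282) = .13, p < .001).

Note: inferential statistics are generally reported in the following form:

“test statistic(degrees of freedom) = , p = , effect size (statistic) = ”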
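
The rounding rules and the reporting template above can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names format_number, format_p and report are made up for this example, values that fall exactly on a boundary (10, 100 and so on) are assigned to the higher range, and italics cannot be represented in a plain string.

```python
from decimal import Decimal, ROUND_HALF_UP


def format_number(value, leading_zero=True):
    """Round by magnitude, following the decimal-place rules listed above."""
    number = Decimal(str(value))
    size = abs(number)
    if size >= 100:
        places = 0
    elif size >= 10:
        places = 1
    elif size >= Decimal("0.10") or size == 0:
        places = 2
    elif size >= Decimal("0.001"):
        places = 3
    else:
        # Below 0.001: keep digits up to the first non-zero one.
        places = -size.adjusted()
    rounded = number.quantize(Decimal(1).scaleb(-places), rounding=ROUND_HALF_UP)
    text = format(rounded, "f")
    if not leading_zero:
        # Drop the leading zero for statistics bounded by 1 (p, r, ηp2).
        if text.startswith("0."):
            text = text[1:]
        elif text.startswith("-0."):
            text = "-" + text[2:]
    return text


def format_p(p):
    """Report p < .001 when p would print as .000, otherwise the rounded p."""
    rounded = Decimal(str(p)).quantize(Decimal("0.001"), rounding=ROUND_HALF_UP)
    if rounded == 0:
        return "p < .001"
    return "p = " + format_number(rounded, leading_zero=False)


def report(stat, dfs, value, p, effect=None):
    """Build 'stat(df) = value, p = value, effect = value'.

    effect is an optional (name, size, keep_leading_zero) tuple.
    """
    df_text = ", ".join(str(df) for df in dfs)
    parts = [f"{stat}({df_text}) = {format_number(value)}", format_p(p)]
    if effect is not None:
        name, size, keep_zero = effect
        parts.append(f"{name} = {format_number(size, leading_zero=keep_zero)}")
    return ", ".join(parts)


print(report("t", (19,), 2.45, 0.031, effect=("d", 0.54, True)))
print(report("F", (2, 1279), 6.15, 0.002, effect=("ηp2", 0.010, False)))
```

The two calls rebuild the t-test and ANOVA strings from items e and f. Decimal is used instead of round() so that halves are rounded up rather than to the nearest even digit.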

II. Reporting results:

1. Descriptive statistics: report basic data such as age either in a table or in the text, but not in both.

Examples: The average age of participants was 25.5 years (SD = 7.94).

The age of participants ranged from 18 to 70 years (M = 25.5, SD = 7.94). Age was non-normally distributed, with skewness of 1.87 (SE = 0.05) and kurtosis of 3.93 (SE = 0.10).

Participants were 98 men and 132 women aged 17 to 25 years (men: M = 19.2, SD = 2.32; women: M = 19.6, SD = 2.54).

2. Non-parametric tests: do not report means and standard deviations; report medians and ranges in a table or in the text; italicize U or Z; the effect size measure is r (r = Z / √N).

U test (independent samples) example: A Mann-Whitney test indicated that self-rated attractiveness was greater for women who were not using oral contraceptives (Mdn = 5) than for women who were using oral contraceptives (Mdn = 4), U = 67.5, p = .034, r = .38.

Z test (related samples) example: A Wilcoxon Signed-ranks test indicated that femininity was preferred more in female faces (Mdn = 0.85) than in male faces (Mdn = 0.65), Z = 4.21, p < .001, r = .76.

Z test on frequencies (related samples) example: A sign test indicated that femininity was preferred more in female faces than in male faces, Z = 3.47, p = .001.
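
The effect size formula r = Z / √N from item 2 is a one-liner in Python. In the sketch below the function name effect_size_r is made up, and N = 31 is an assumed sample size chosen only for illustration, since the Wilcoxon example does not state N.

```python
import math


def effect_size_r(z, n):
    """Effect size r for a non-parametric test: r = Z / sqrt(N)."""
    return z / math.sqrt(n)


# Z is taken from the Wilcoxon example above; N = 31 is an assumption.
r = effect_size_r(4.21, 31)
print(f"r = {r:.2f}")
```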

3. t-tests: report the t statistic, the degrees of freedom, p, and the effect size, Cohen’s d.

One-sample example: A one-sample t-test indicated that femininity preferences were greater than the chance level of 3.5 for female faces (M = 4.50, SD = 0.70), t(30) = 8.01, p < .001, d = 1.44, but not for male faces (M = 3.46, SD = 0.73), t(30) = -0.32, p = .75, d = 0.057.

The number of masculine faces chosen out of 20 possible was compared to the chance value of 10 using a one-sample t-test. Masculine faces were chosen more often than chance, t(76) = 4.35, p = .004, d = 0.35.

4. Paired-samples t-test: reported in the same way as the independent-samples t-test.

A paired-samples t-test indicated that scores were significantly higher for the pathogen subscale (M = 26.4, SD = 7.41) than for the sexual subscale (M = 18.0, SD = 9.49), t(721) = 23.3, p < .001, d = 0.87.

Scores on the pathogen subscale (M = 26.4, SD = 7.41) were higher than scores on the sexual subscale (M = 18.0, SD = 9.49), t(721) = 23.3, p < .001, d = 0.87. A one-tailed p-value is reported due to the strong prediction of this effect.

5. ANOVA: report both degrees of freedom, between-groups first and then within-groups, separated by a comma and a space, e.g. F(1, 237) = 3.45.

a. One-way ANOVAs and post-hoc tests: Analysis of variance showed a main effect of self-rated attractiveness (SRA) on preferences for femininity in female faces, F(2, 1279) = 6.15, p = .002, ηp2 = .010. Post-hoc analyses using Tukey’s HSD indicated that femininity preferences were lower for participants with low SRA than for participants with average SRA (p = .014) and high SRA (p = .004), but femininity preferences did not differ significantly between participants with average and high SRA (p = .82).

b. 2-way factorial ANOVAs: A 3x2 ANOVA with self-rated attractiveness (low, average, high) and oral contraceptive use (true, false) as between-subjects factors revealed main effects of SRA, F(2, 1276) = 6.11, p = .002, ηp2 = .009, and oral contraceptive use, F(1, 1276) = 4.38, p = .037, ηp2 = .003. These main effects were not qualified by an interaction between SRA and oral contraceptive use, F(2, 1276) = 0.43, p = .65, ηp2 = .001.

c. 3-way ANOVAs and ANOVAs with more factors: Although some textbooks tell you to report every main effect and interaction even when it is not significant, doing so makes the results of a complex design (3-way or higher) harder to follow. Report all significant effects and all predicted effects, even if the predicted ones are not significant. If more than two non-significant effects are irrelevant to your main hypotheses (for example, you predicted a three-way interaction but no main effects or two-way interactions), you can summarize them as follows: A mixed-design ANOVA with sex of face (male, female) as a within-subjects factor and self-rated attractiveness (low, average, high) and oral contraceptive use (true, false) as between-subjects factors revealed a main effect of sex of face, F(1, 1276) = 1372, p < .001, ηp2 = .52. This was qualified by interactions between sex of face and SRA, F(2, 1276) = 6.90, p = .001, ηp2 = .011, and between sex of face and oral contraceptive use, F(1, 1276) = 5.02, p = .025, ηp2 = .004. The predicted interaction among sex of face, SRA and oral contraceptive use was not significant, F(2, 1276) = 0.06, p = .94, ηp2 < .001. All other main effects and interactions were non-significant and irrelevant to our hypotheses, all F ≤ 0.94, p ≥ .39, ηp2 ≤ .001.

Note 1: Effects that are relevant to the hypotheses must be reported in detail whether or not they are significant; the remaining results can be summarized.

Note 2: Violations of sphericity and Greenhouse-Geisser corrections: ANOVA is not robust to violations of sphericity, but the violation is easy to correct. When a within-subjects factor has more than two levels, check whether Mauchly’s test is significant. If it is, report chi-squared (χ2), the degrees of freedom, p and epsilon (ε), and then report the Greenhouse-Geisser corrected values for every effect involving that factor (rounded to an appropriate number of decimal places). When the within-subjects factor has only two levels, chi-squared (χ2) is .000 with no p-value, and no correction is needed. For example: Data were analysed using a mixed-design ANOVA with a within-subjects factor of subscale (pathogen, sexual, moral) and a between-subject factor of sex (male, female). Mauchly’s test indicated that the assumption of sphericity had been violated (χ2(2) = 16.8, p < .001), therefore degrees of freedom were corrected using Greenhouse-Geisser estimates of sphericity (ε = 0.98). Main effects of subscale, F(1.91, 1350.8) = 378, p < .001, ηp2 = .35, and sex, F(1, 709) = 78.8, p < .001, ηp2 = .10, were qualified by an interaction between subscale and sex, F(1.91, 1351) = 30.4, p < .001, ηp2 = .041.

d. ANCOVA (analysis of covariance): An ANCOVA [between-subjects factor: sex (male, female); covariate: age] revealed no main effects of sex, F(1, 732) = 2.00, p = .16, ηp2 = .003, or age, F(1, 732) = 3.25, p = .072, ηp2 = .004, and no interaction between sex and age, F(1, 732) = 0.016, p = .90, ηp2 < .001.

The predicted main effect of sex was not significant, F(1, 732) = 2.00, p = .16, ηp2 = .003, nor was the predicted main effect of age, F(1, 732) = 3.25, p = .072, ηp2 = .004. The interaction between sex and age was also not significant, F(1, 732) = 0.016, p = .90, ηp2 < .001.

6. Correlation: Preferences for femininity in male and female faces were positively correlated, Pearson’s r(1282) = .13, p < .001.

References:

American Psychological Association. (2005). Concise Rules of APA Style. Washington, DC: APA Publications.

Field, A. P., & Hole, G. J. (2003). How to design and report experiments. London: Sage Publications.

===============================================================================================
http://mcgraw-hill.co.uk/openup/harris/b5.html

Size according to Cohen (1988)	Eta squared (% variance explained by your IV)	Cohen's d (in standard deviations)
Small	.01 (1%)	.2
Medium	.06 (6%)	.5
Large	.14 (14%)	.8
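
The benchmarks in this table can be turned into a small lookup, sketched below in Python. The names BENCHMARKS and cohen_label are made up for this example, and the label "below small" for values under the smallest benchmark is my own wording, not Cohen's.

```python
# Cohen (1988) benchmarks copied from the table above.
BENCHMARKS = {
    "eta_squared": [(0.14, "large"), (0.06, "medium"), (0.01, "small")],
    "d": [(0.8, "large"), (0.5, "medium"), (0.2, "small")],
}


def cohen_label(value, measure):
    """Return the largest benchmark label that the absolute value reaches."""
    for threshold, label in BENCHMARKS[measure]:
        if abs(value) >= threshold:
            return label
    return "below small"  # assumed wording for values under the table


# d values taken from the t-test examples earlier in this post.
for d in (0.87, 0.35):
    print(d, cohen_label(d, "d"))
```

These cut-offs are conventions rather than hard rules, so the label is best read as a rough guide.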

===============================================================================================

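Computing the effect size f² in a regression model:

That is, take the R² of the second step minus the R² of the first step, and divide by 1 minus the R² of the second step: f² = (R² of step 2 − R² of step 1) / (1 − R² of step 2).

With three steps, take the R² of the third step minus the R² of the second step, and divide by 1 minus the R² of the third step.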

Cohen, J. (1988). Statistical power analysis for the behavioral sciences. Hillsdale, New Jersey: Lawrence Erlbaum Associates.

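For the example above, the effect size is: f² = (0.22 − 0.062) / (1 − 0.22) ≈ 0.20

For a three-step linear regression, the effect size of the third step is: f² = (0.267 − 0.22) / (1 − 0.267) ≈ 0.064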
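
The same arithmetic as a minimal Python sketch. The function name cohens_f2 and its argument names are made up for this example; the R² values are the ones used in the two calculations above.

```python
def cohens_f2(r2_reduced, r2_full):
    """Effect size f² for the predictors added between two nested models."""
    return (r2_full - r2_reduced) / (1 - r2_full)


# Step 2 versus step 1, then step 3 versus step 2.
print(round(cohens_f2(0.062, 0.22), 3))
print(round(cohens_f2(0.22, 0.267), 3))
```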
