Sargan+Hansen: Overidentification Tests and Their Implementation in Stata

Reposted article. Posted 2021-07-14 11:20. Category: Stata basics

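1.1 The instrumental variables method

OLS rests on a classical assumption: the explanatory variables are uncorrelated with the random error term, i.e. Cov(x, ε) = 0. If any explanatory variable violates this assumption, the estimated parameters are biased and also inconsistent.

The instrumental variables (IV) method offers a workable solution to the "endogenous explanatory variable" problem. To use it, we need to find an "exogenous variable (z)" that satisfies the following conditions:

It is correlated with the endogenous explanatory variable, i.e. Cov(z, x) ≠ 0;
It is uncorrelated with the random error term, i.e. Cov(z, ε) = 0.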

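Depending on how the number of instruments compares with the number of endogenous explanatory variables, three cases arise:

Unidentified: fewer instruments than endogenous explanatory variables;
Just (exactly) identified: as many instruments as endogenous explanatory variables;
Overidentified: more instruments than endogenous explanatory variables.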

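In the just-identified case the parameters can be estimated directly with the IV estimator, whereas in the overidentified case they are estimated by two-stage least squares (Two Stage Least Squares, 2SLS or TSLS). 2SLS can of course also be used in the just-identified case. In the unidentified case, however, none of these methods works. 2SLS proceeds in two stages:

Stage 1: regress the endogenous explanatory variable on the instruments (together with the exogenous explanatory variables);
Stage 2: regress the dependent variable on the fitted values from stage 1 (again together with the exogenous explanatory variables).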
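The two stages can be sketched by hand and compared with the built-in estimator. This is a minimal illustration, not the original author's code: it assumes the auto.dta dataset shipped with Stata, and the roles of the variables (turn as endogenous; weight, length, and headroom as instruments) are chosen purely for demonstration.

```stata
* Sketch: 2SLS by hand versus ivregress 2sls (illustrative variable roles)
sysuse auto, clear

* Reference: built-in 2SLS, with turn treated as endogenous
ivregress 2sls mpg gear_ratio (turn = weight length headroom)
scalar b_iv = _b[turn]

* Stage 1: endogenous regressor on the instruments and the exogenous regressor
regress turn weight length headroom gear_ratio
predict double turn_hat, xb

* Stage 2: dependent variable on the stage-1 fitted values
regress mpg turn_hat gear_ratio
scalar b_manual = _b[turn_hat]

display "ivregress: " b_iv "   by hand: " b_manual
```

The point estimates agree, but the standard errors printed by the stage-2 regress are not valid 2SLS standard errors, because they are based on residuals computed with the fitted values rather than the actual regressor.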

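Note that 2SLS is the most efficient estimator only under homoskedasticity; when the model is overidentified and the errors are heteroskedastic, the generalized method of moments (GMM) is more efficient. For an introduction to GMM, see "Stata: An Introduction to GMM with Worked Examples" and "An Introduction to GMM and Its Stata Implementation".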

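Before relying on instruments, several tests are still required:

A test for endogeneity of the explanatory variable;
A weak-instrument test;
An overidentification test.

In the just-identified case the exogeneity of the instruments cannot be tested; we can only fall back on "qualitative discussion or expert opinion" (see "IV estimation: usable even when the instruments are not exogenous!"). We therefore focus on overidentification tests and how to run them in Stata.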

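1.2 Two-stage least squares

Two-stage least squares is in essence an instrumental variables method; it takes its name from the fact that the regression is carried out in two stages. The mechanics are:

Step 1: rewrite the structural equation as a reduced-form model. No equation of the reduced form has a stochastic (endogenous) explanatory variable, so each can be estimated directly by ordinary least squares.

Step 2: replace the endogenous explanatory variable Y in the structural equation with its fitted value from step 1. The resulting equation no longer contains a stochastic explanatory variable, so it too can be estimated directly by ordinary least squares.

Example: the general IV regression model is:

Take 2SLS with a single endogenous regressor as an example. When there is one endogenous regressor X and some additional included exogenous variables, the equation of interest is: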
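In common textbook notation, the two models can be written as follows. The symbols are assumptions made here for illustration: X for endogenous regressors (q of them in the general case), W for the r included exogenous variables, Z for the m instruments, and u, v for error terms.

```latex
% General IV regression model
Y_i = \beta_0 + \beta_1 X_{1i} + \dots + \beta_q X_{qi}
      + \beta_{q+1} W_{1i} + \dots + \beta_{q+r} W_{ri} + u_i

% Single endogenous regressor X
Y_i = \beta_0 + \beta_1 X_i + \beta_2 W_{1i} + \dots + \beta_{1+r} W_{ri} + u_i

% First stage for the single-regressor case, with instruments Z_1, ..., Z_m
X_i = \pi_0 + \pi_1 Z_{1i} + \dots + \pi_m Z_{mi}
      + \pi_{m+1} W_{1i} + \dots + \pi_{m+r} W_{ri} + v_i
```

With a single endogenous regressor, the model is overidentified when m > 1.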

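2. Overidentification tests

2.1 How overidentification tests work

As noted above, IV estimation is possible only when the model is just identified or overidentified. Let k be the number of parameters to be estimated and l the number of moment conditions. When k = l the model is "just identified"; when k < l it is "overidentified".

An important proposition: the exogeneity of the instruments can be tested only under overidentification; under just identification it cannot.

The tests are built on the estimated residuals of the 2SLS regression that uses all the instruments. (These are approximations to the true errors, not exact values, because of sampling variation. Note that the residuals are computed with the actual values of X, not with its first-stage fitted values.)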

2.2.1 Sargan test
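As a minimal sketch of the standard construction (not necessarily the original author's presentation): under homoskedasticity, the Sargan statistic equals n·R² from a regression of the 2SLS residuals on all exogenous variables (instruments plus included exogenous regressors), and it is compared with a χ² distribution whose degrees of freedom equal the number of overidentifying restrictions. The dataset (auto.dta) and variable roles below are illustrative assumptions; r(sargan) is the stored result of estat overid after ivregress 2sls.

```stata
* Sketch: Sargan statistic by hand (illustrative variables from auto.dta)
sysuse auto, clear

* 3 instruments for 1 endogenous regressor: 2 overidentifying restrictions
ivregress 2sls mpg gear_ratio (turn = weight length headroom)
predict double uhat, residuals      // uses actual turn, not fitted turn

* Built-in test for comparison (after 2sls: Sargan and Basmann statistics)
estat overid
scalar S_builtin = r(sargan)

* Auxiliary regression of the residuals on all exogenous variables
regress uhat weight length headroom gear_ratio
scalar S_manual = e(N) * e(r2)
scalar p_manual = chi2tail(2, S_manual)

display "Sargan by hand = " S_manual "   p-value = " p_manual
display "estat overid   = " S_builtin
```

A large statistic (small p-value) is evidence against the joint exogeneity of the instruments.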

2.2.2 Hansen J test
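As a sketch in standard GMM notation (the symbols are assumptions made here): let z_i be the vector of all l instruments, x_i the vector of k regressors, and let the moment conditions be E[z_i u_i] = 0. The J statistic is the efficient GMM objective function evaluated at the estimate, where \hat S is a consistent estimate of the covariance matrix of the moment conditions that allows for heteroskedasticity.

```latex
\bar g(\hat\beta) = \frac{1}{n} \sum_{i=1}^{n} z_i \,\bigl(y_i - x_i'\hat\beta\bigr),
\qquad
J = n \, \bar g(\hat\beta)' \, \hat S^{-1} \, \bar g(\hat\beta)
\;\xrightarrow{d}\; \chi^2(l - k)
```

Because \hat S must be inverted, the statistic cannot be computed when \hat S is not of full rank.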

2.2.3 C statistic
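As a sketch of the usual definition (the notation is an assumption made here): the C statistic, also called the difference-in-Sargan or difference-in-Hansen statistic, compares the overidentification statistic of the model using all instruments with that of a model that does not treat the q suspect instruments as exogenous.

```latex
C = J_{\text{full}} - J_{\text{restricted}}
\;\xrightarrow{d}\; \chi^2(q)
```

The null hypothesis is that the q suspect instruments are exogenous, maintained on the assumption that the remaining instruments are valid.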

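3. Overidentification tests in Stata

3.1 The ivreg2 command

We use the griliches76.dta dataset as an example. lw is the log wage, s years of schooling, expr work experience, tenure years with the current employer, rns a dummy for the US South (lives in the South = 1), smsa a dummy for metropolitan areas (lives in a large city = 1), iq the IQ score, med mother's years of schooling, kww a vocational test score (score on knowledge in world of work test), age the age, and mrt marital status (married = 1).

When studying the effect of IQ on wages, IQ is usually regarded as an endogenous explanatory variable, so we need instruments for it. Exogenous explanatory variables can of course serve as their own instruments. Here we use mother's years of schooling (med), the vocational test score (kww), age (age), and marital status (mrt) as instruments for IQ and run overidentification tests.

By default, IV regression with ivreg2 reports the Sargan statistic; when options such as robust, bw, or cluster are added, it reports the Hansen J statistic instead. To obtain the C statistic, add the option orthog(varlist_ex), where varlist_ex lists the variables whose exogeneity is to be tested. For more on ivreg2, see help ivreg2.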

*- Install the command
  ssc install ivreg2, replace 
  
*- Sargan test
  use http://fmwww.bc.edu/ec-p/data/hayashi/griliches76.dta, clear
  ivreg2 lw s expr tenure rns smsa i.year (iq=med kww age mrt)

*- Hansen J test
  ivreg2 lw s expr tenure rns smsa i.year (iq=med kww age mrt), robust
  
*- C statistic (tests the exogeneity of age)
  ivreg2 lw s expr tenure rns smsa i.year (iq=med kww age mrt), orthog(age)

.   ivreg2 lw s expr tenure rns smsa i.year (iq=med kww age mrt)

IV (2SLS) estimation
--------------------

Estimates efficient for homoskedasticity only
Statistics consistent for homoskedasticity only

                                                      Number of obs =      758
                                                      F( 12,   745) =    45.91
                                                      Prob > F      =   0.0000
Total (centered) SS     =  139.2861498                Centered R2   =   0.4255
Total (uncentered) SS   =  24652.24662                Uncentered R2 =   0.9968
Residual SS             =   80.0182337                Root MSE      =    .3249

------------------------------------------------------------------------------
          lw |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          iq |   .0001747   .0039035     0.04   0.964     -.007476    .0078253
           s |   .0691759   .0129366     5.35   0.000     .0438206    .0945312
        expr |    .029866   .0066393     4.50   0.000     .0168533    .0428788
      tenure |   .0432738   .0076271     5.67   0.000     .0283249    .0582226
         rns |  -.1035897    .029481    -3.51   0.000    -.1613715   -.0458079
        smsa |   .1351148   .0266573     5.07   0.000     .0828674    .1873623
             |
        year |
         67  |   -.052598   .0476924    -1.10   0.270    -.1460734    .0408774
         68  |   .0794686   .0447194     1.78   0.076    -.0081797    .1671169
         69  |   .2108962   .0439336     4.80   0.000     .1247878    .2970045
         70  |   .2386338   .0509733     4.68   0.000     .1387281    .3385396
         71  |   .2284609   .0437436     5.22   0.000     .1427251    .3141967
         73  |   .3258944   .0407181     8.00   0.000     .2460884    .4057004
             |
       _cons |    4.39955   .2685443    16.38   0.000     3.873213    4.925887
------------------------------------------------------------------------------
Underidentification test (Anderson canon. corr. LM statistic):          52.436
                                                   Chi-sq(4) P-val =    0.0000
------------------------------------------------------------------------------
Weak identification test (Cragg-Donald Wald F statistic):               13.786
Stock-Yogo weak ID test critical values:  5% maximal IV relative bias    16.85
                                         10% maximal IV relative bias    10.27
                                         20% maximal IV relative bias     6.71
                                         30% maximal IV relative bias     5.34
                                         10% maximal IV size             24.58
                                         15% maximal IV size             13.96
                                         20% maximal IV size             10.26
                                         25% maximal IV size              8.31
Source: Stock-Yogo (2005).  Reproduced by permission.
------------------------------------------------------------------------------
Sargan statistic (overidentification test of all instruments):          87.655
                                                   Chi-sq(3) P-val =    0.0000
------------------------------------------------------------------------------
Instrumented:         iq
Included instruments: s expr tenure rns smsa 67.year 68.year 69.year 70.year
                      71.year 73.year
Excluded instruments: med kww age mrt
------------------------------------------------------------------------------

 *-Hansen J test
.   ivreg2 lw s expr tenure rns smsa i.year (iq=med kww age mrt), robust

IV (2SLS) estimation
--------------------

Estimates efficient for homoskedasticity only
Statistics robust to heteroskedasticity

                                                      Number of obs =      758
                                                      F( 12,   745) =    46.94
                                                      Prob > F      =   0.0000
Total (centered) SS     =  139.2861498                Centered R2   =   0.4255
Total (uncentered) SS   =  24652.24662                Uncentered R2 =   0.9968
Residual SS             =   80.0182337                Root MSE      =    .3249

------------------------------------------------------------------------------
             |               Robust
          lw |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          iq |   .0001747   .0041241     0.04   0.966    -.0079085    .0082578
           s |   .0691759   .0132907     5.20   0.000     .0431266    .0952253
        expr |    .029866   .0066974     4.46   0.000     .0167394    .0429926
      tenure |   .0432738   .0073857     5.86   0.000     .0287981    .0577494
         rns |  -.1035897    .029748    -3.48   0.000    -.1618947   -.0452847
        smsa |   .1351148    .026333     5.13   0.000     .0835032    .1867265
             |
        year |
         67  |   -.052598   .0457261    -1.15   0.250    -.1422195    .0370235
         68  |   .0794686   .0428231     1.86   0.063    -.0044631    .1634003
         69  |   .2108962   .0408774     5.16   0.000     .1307779    .2910144
         70  |   .2386338   .0529825     4.50   0.000     .1347901    .3424776
         71  |   .2284609   .0426054     5.36   0.000     .1449558     .311966
         73  |   .3258944   .0405569     8.04   0.000     .2464044    .4053844
             |
       _cons |    4.39955    .290085    15.17   0.000     3.830994    4.968106
------------------------------------------------------------------------------
Underidentification test (Kleibergen-Paap rk LM statistic):             41.537
                                                   Chi-sq(4) P-val =    0.0000
------------------------------------------------------------------------------
Weak identification test (Cragg-Donald Wald F statistic):               13.786
                         (Kleibergen-Paap rk Wald F statistic):         12.167
Stock-Yogo weak ID test critical values:  5% maximal IV relative bias    16.85
                                         10% maximal IV relative bias    10.27
                                         20% maximal IV relative bias     6.71
                                         30% maximal IV relative bias     5.34
                                         10% maximal IV size             24.58
                                         15% maximal IV size             13.96
                                         20% maximal IV size             10.26
                                         25% maximal IV size              8.31
Source: Stock-Yogo (2005).  Reproduced by permission.
NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.
------------------------------------------------------------------------------
Hansen J statistic (overidentification test of all instruments):        74.165
                                                   Chi-sq(3) P-val =    0.0000
------------------------------------------------------------------------------
Instrumented:         iq
Included instruments: s expr tenure rns smsa 67.year 68.year 69.year 70.year
                      71.year 73.year
Excluded instruments: med kww age mrt
------------------------------------------------------------------------------

. *-C statistic
.   ivreg2 lw s expr tenure rns smsa i.year (iq=med kww age mrt), orthog(age)

IV (2SLS) estimation
--------------------

Estimates efficient for homoskedasticity only
Statistics consistent for homoskedasticity only

                                                      Number of obs =      758
                                                      F( 12,   745) =    45.91
                                                      Prob > F      =   0.0000
Total (centered) SS     =  139.2861498                Centered R2   =   0.4255
Total (uncentered) SS   =  24652.24662                Uncentered R2 =   0.9968
Residual SS             =   80.0182337                Root MSE      =    .3249

------------------------------------------------------------------------------
          lw |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          iq |   .0001747   .0039035     0.04   0.964     -.007476    .0078253
           s |   .0691759   .0129366     5.35   0.000     .0438206    .0945312
        expr |    .029866   .0066393     4.50   0.000     .0168533    .0428788
      tenure |   .0432738   .0076271     5.67   0.000     .0283249    .0582226
         rns |  -.1035897    .029481    -3.51   0.000    -.1613715   -.0458079
        smsa |   .1351148   .0266573     5.07   0.000     .0828674    .1873623
             |
        year |
         67  |   -.052598   .0476924    -1.10   0.270    -.1460734    .0408774
         68  |   .0794686   .0447194     1.78   0.076    -.0081797    .1671169
         69  |   .2108962   .0439336     4.80   0.000     .1247878    .2970045
         70  |   .2386338   .0509733     4.68   0.000     .1387281    .3385396
         71  |   .2284609   .0437436     5.22   0.000     .1427251    .3141967
         73  |   .3258944   .0407181     8.00   0.000     .2460884    .4057004
             |
       _cons |    4.39955   .2685443    16.38   0.000     3.873213    4.925887
------------------------------------------------------------------------------
Underidentification test (Anderson canon. corr. LM statistic):          52.436
                                                   Chi-sq(4) P-val =    0.0000
------------------------------------------------------------------------------
Weak identification test (Cragg-Donald Wald F statistic):               13.786
Stock-Yogo weak ID test critical values:  5% maximal IV relative bias    16.85
                                         10% maximal IV relative bias    10.27
                                         20% maximal IV relative bias     6.71
                                         30% maximal IV relative bias     5.34
                                         10% maximal IV size             24.58
                                         15% maximal IV size             13.96
                                         20% maximal IV size             10.26
                                         25% maximal IV size              8.31
Source: Stock-Yogo (2005).  Reproduced by permission.
------------------------------------------------------------------------------
Sargan statistic (overidentification test of all instruments):          87.655
                                                   Chi-sq(3) P-val =    0.0000
-orthog- option:
Sargan statistic (eqn. excluding suspect orthogonality conditions):     47.413
                                                   Chi-sq(2) P-val =    0.0000
C statistic (exogeneity/orthogonality of suspect instruments):          40.242
                                                   Chi-sq(1) P-val =    0.0000
Instruments tested:   age
------------------------------------------------------------------------------
Instrumented:         iq
Included instruments: s expr tenure rns smsa 67.year 68.year 69.year 70.year
                      71.year 73.year
Excluded instruments: med kww age mrt
------------------------------------------------------------------------------

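As the output shows, both the Sargan test and the Hansen J test reject the null hypothesis that all instruments are exogenous, which indicates that at least some of the instruments are endogenous. We then use the C statistic to test the exogeneity of the instrument age; the test strongly rejects the null hypothesis that age is exogenous.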
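The degrees of freedom and p-values in the output can be reproduced with a few lines of arithmetic. The sketch below only re-uses numbers printed in the ivreg2 logs above; the scalar names are made up for illustration.

```stata
* Sketch: reproduce the degrees of freedom and p-values reported by ivreg2
scalar n_excluded   = 4                 // excluded instruments: med kww age mrt
scalar n_endogenous = 1                 // endogenous regressor: iq
scalar df_overid    = n_excluded - n_endogenous

scalar sargan_all   = 87.655            // Sargan statistic, all instruments
scalar sargan_sub   = 47.413            // Sargan statistic excluding age
scalar c_stat       = sargan_all - sargan_sub

display "df of Sargan test   = " df_overid
display "p-value of Sargan   = " chi2tail(df_overid, sargan_all)
display "C statistic         = " c_stat
display "p-value of C (1 df) = " chi2tail(1, c_stat)
```

The C statistic has one degree of freedom because a single instrument (age) is being tested.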

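3.2 Overidentification testing after GMM: `estat overid`

After ivregress 2sls, estat overid reports the Sargan and Basmann statistics; after ivregress gmm, as in the example that follows, it reports Hansen's J. In either case the null hypothesis is that all instruments are exogenous. If the p-value is greater than 0.05, the null cannot be rejected and the instruments are treated as exogenous, i.e. valid; if the p-value is below 0.05, the null is rejected and at least one instrument is judged invalid.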

sysuse auto,clear
ivregress gmm mpg gear_ratio (turn =weight length headroom),wmatrix(robust) small 
estat overid

The overidentification test (Hansen's J) gives:

Test of overidentifying restriction: 
Hansen's J chi2(2) =  .54848 (p = 0.7601)

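The p-value of Hansen's J statistic is greater than 0.05, so we cannot reject the null hypothesis that all instruments are exogenous; the instruments are treated as valid.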

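3.3 The xtabond2 command

We use mus08psidextract.dta as an example. The data cover wage-related variables for 595 Americans over 1976-1982 (n = 595, T = 7): lwage is the log wage, wks weeks worked, ms marital status, union whether the wage is set by a union contract, occ whether the person is a blue-collar worker, south whether the person lives in the US South, smsa whether the person lives in a large city, and ind whether the person works in manufacturing.

When GMM regression is run with xtabond2, Stata reports the Sargan test, the Hansen J test, and the difference-in-Hansen (C) statistics together. Of the Sargan and Hansen J tests, the Hansen J test is generally regarded as the more robust.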

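*- Install the command
  ssc install xtabond2, replace  
  
*- Overidentification tests for a dynamic panel
*  (mus08psidextract.dta does not ship with Stata; it must be in the working directory)
  use mus08psidextract.dta, clear
  xtset id t
  xtabond2 lwage L(1/2).lwage L(0/1).wks ms union occ south smsa ind, ///
           gmm(lwage, lag(2 4)) gmm(wks ms union, lag(2 3))  ///
           iv(occ south smsa ind) nolevel twostep robust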

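In the command above, L(1/2).lwage stands for L1.lwage L2.lwage and L(0/1).wks stands for wks L1.wks; gmm(lwage, lag(2 4)) uses lags 2 through 4 of lwage as GMM-style instruments; gmm(wks ms union, lag(2 3)) uses lags 2 through 3 of wks, ms, and union as GMM-style instruments; and iv(occ south smsa ind) uses occ, south, smsa, and ind as instruments for themselves.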

Dynamic panel-data estimation, two-step difference GMM
------------------------------------------------------------------------------
Group variable: id                              Number of obs      =      2380
Time variable : t                               Number of groups   =       595
Number of instruments = 39                      Obs per group: min =         4
Wald chi2(10) =   1287.77                                      avg =      4.00
Prob > chi2   =     0.000                                      max =         4
------------------------------------------------------------------------------
             |              Corrected
       lwage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       lwage |
         L1. |    .611753   .0373491    16.38   0.000     .5385501    .6849559
         L2. |   .2409058   .0319939     7.53   0.000     .1781989    .3036127
             |
         wks |
         --. |  -.0159751   .0082523    -1.94   0.053    -.0321493     .000199
         L1. |   .0039944   .0027425     1.46   0.145    -.0013807    .0093695
             |
          ms |   .1859324    .144458     1.29   0.198       -.0972    .4690649
       union |  -.1531329   .1677842    -0.91   0.361    -.4819839    .1757181
         occ |  -.0357509   .0347705    -1.03   0.304    -.1038999     .032398
       south |  -.0250368   .2150806    -0.12   0.907     -.446587    .3965134
        smsa |  -.0848223   .0525243    -1.61   0.106     -.187768    .0181235
         ind |   .0227008   .0424207     0.54   0.593    -.0604422    .1058437
------------------------------------------------------------------------------
Instruments for first differences equation
  Standard
    D.(occ south smsa ind)
  GMM-type (missing=0, separate instruments for each period unless collapsed)
    L(2/3).(wks ms union)
    L(2/4).lwage
------------------------------------------------------------------------------
Arellano-Bond test for AR(1) in first differences: z =  -4.52  Pr > z =  0.000
Arellano-Bond test for AR(2) in first differences: z =  -1.60  Pr > z =  0.109
------------------------------------------------------------------------------
Sargan test of overid. restrictions: chi2(29)   =  59.55  Prob > chi2 =  0.001
  (Not robust, but not weakened by many instruments.)
Hansen test of overid. restrictions: chi2(29)   =  39.88  Prob > chi2 =  0.086
  (Robust, but weakened by many instruments.)

Difference-in-Hansen tests of exogeneity of instrument subsets:
  gmm(lwage, lag(2 4))
    Hansen test excluding group:     chi2(18)   =  23.59  Prob > chi2 =  0.169
    Difference (null H = exogenous): chi2(11)   =  16.29  Prob > chi2 =  0.131
  gmm(wks ms union, lag(2 3))
    Hansen test excluding group:     chi2(5)    =   6.43  Prob > chi2 =  0.266
    Difference (null H = exogenous): chi2(24)   =  33.44  Prob > chi2 =  0.095
  iv(occ south smsa ind)
    Hansen test excluding group:     chi2(25)   =  28.00  Prob > chi2 =  0.308
    Difference (null H = exogenous): chi2(4)    =  11.87  Prob > chi2 =  0.018

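The output shows 39 instruments in total and 10 estimated coefficients, which leaves 39 - 10 = 29 overidentifying restrictions. The first-differenced errors show first-order serial correlation but no second-order serial correlation, so the null hypothesis that the errors are serially uncorrelated cannot be rejected and difference GMM can be used.

Both the Sargan statistic (p = 0.001) and the Hansen statistic (p = 0.086) reject the null hypothesis that all instruments are exogenous at the 10% level, although the Hansen test does not reject at the 5% level. Note that the Sargan statistic is not robust but is not weakened by many instruments, whereas the Hansen statistic is robust but is weakened by many instruments.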
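The degrees of freedom and p-values of these tests follow directly from the numbers in the xtabond2 output. The sketch below re-uses only those printed values; the scalar names are made up for illustration.

```stata
* Sketch: degrees of freedom and p-values of the xtabond2 overid tests
scalar n_instruments = 39               // "Number of instruments" in the output
scalar n_coefs       = 10               // estimated coefficients
scalar df_overid     = n_instruments - n_coefs

display "df             = " df_overid
display "Sargan p-value = " %5.3f chi2tail(df_overid, 59.55)
display "Hansen p-value = " %5.3f chi2tail(df_overid, 39.88)

* Difference-in-Hansen for iv(occ south smsa ind):
* full-model J minus the J obtained without that instrument group
* (the log reports 11.87; the inputs used here are already rounded)
display "Difference     = " %5.2f (39.88 - 28.00) "  on " (29 - 25) " df"
```

The same subtraction applies to the two gmm() groups in the difference-in-Hansen table.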

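4. When the overidentification statistic cannot be computed

4.1 Cause

When estimating with ivreg2, we often find that the Sargan or Hansen J test simply cannot be obtained. A likely cause is "too many instruments", for example when the model includes dummy variables for year, region, and industry fixed effects.

Specifically, when "number of dummy variables < (number of exogenous variables + number of instruments)", the variance-covariance matrix of the orthogonality conditions may not be of full rank. Its inverse then cannot be computed, so the overidentification statistic cannot be computed either. The warning that ivreg2 prints in this case names two possible causes: too few clusters to compute the robust covariance matrix, and singleton dummy variables. See help ivreg2 for details.

4.2 Solution

By the Frisch-Waugh-Lovell (FWL) theorem, we can try to "partial out" some of the exogenous variables (usually mainly the dummy variables) so that the matrix has full rank. When running 2SLS or GMM with ivreg2, add the partial() option: list all the exogenous dummy variables first and, if necessary, add further exogenous explanatory variables.

4.3 Stata implementation

The following examples briefly illustrate how "partialling out" works.

Example 1: partialling out a continuous variable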

sysuse "auto.dta", clear
    rename (price length weight) (Y X1 X2)
      
    ivregress 2sls Y X1 X2
    est store m0  // original result
      
    *- Partial out X2 
    ivregress 2sls Y X2      // remove the effect of X2 from Y
    predict e_y, res  
      
    ivregress 2sls X1 X2     // remove the effect of X2 from X1
    predict e_x1, res  
      
    ivregress 2sls e_y e_x1  // regression after partialling out
    est store m1
       
    esttab m0 m1, nogap      // esttab is user-written: ssc install estout

 --------------------------------------------
                          (1)             (2)   
                            Y             e_y   
    --------------------------------------------
    X1                 -97.96*                  
                      (-2.55)                   
    X2                  4.699***                
                       (4.27)                   
    e_x1                               -97.96*  
                                      (-2.55)   
    _cons             10386.5*       2.04e-12   
                       (2.46)          (0.00)   
    --------------------------------------------
    N                      74              74   
    --------------------------------------------
    t statistics in parentheses
    * p<0.05, ** p<0.01, *** p<0.001          
    
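    *-Notes: 
    (1) With regress, the SEs differ slightly, 
        mainly because regress applies a small-sample degrees-of-freedom adjustment.
    (2) IV/GMM estimation, i.e. the ivregress command, does not have this issue.

Example 2: partialling out dummy variables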

sysuse auto, clear 
  drop if rep78==.
  global yx "price wei len mpg"

  ivregress 2sls $yx i.rep78
  est store m0
      
  * center is user-written: ssc install center
  bysort rep78: center $yx, inplace //prefix(c_)
  ivregress 2sls $yx
  est store m1  
      
  esttab m0 m1, nogap nobase

  --------------------------------------------
                            (1)             (2)   
                          price           price   
      --------------------------------------------
      weight              5.187***        5.187***
                         (4.74)          (4.74)   
      length             -124.2***       -124.2***
                        (-3.29)         (-3.29)   
      mpg                -126.8          -126.8   
                        (-1.60)         (-1.60)   
      2.rep78            1137.3                   
                         (0.67)                   
      3.rep78            1254.6                   
                         (0.80)                   
      4.rep78            2267.2                   
                         (1.42)                   
      5.rep78            3850.8*                  
                         (2.29)                   
      _cons             14614.5*      0.0000116   
                         (2.52)          (0.00)   
      --------------------------------------------
      N                      69              69   
      --------------------------------------------
      t statistics in parentheses
      * p<0.05, ** p<0.01, *** p<0.001        
    
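     *-Note: the FE model in effect first partials out the firm dummies and then runs OLS on the transformed data.

Example 3: adding the `partial()` option makes no difference

*- Data download address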
* https://gitee.com/arlionn/data/blob/master/data01/Acem_data_done.dta

  use Acem_data_done.dta, clear
  global y "change_gdp"
  global x "change_dependency"
  global z "devo1990 lpop1990 base_dependency" // control variables
  global IVs "birthrate1960_1965 birthrate1965_1970 birthrate1970_1975 birthrate1975_1980 birthrate1980_1985 birthrate1985_1990" // instruments
 
  ivregress 2sls $y ($x=$IVs) $z i.region_code, robust  // official command
  est store a1

  ivreg2 $y ($x=$IVs) $z i.region_code, robust    // equivalent user-written command
  est store a2   // without -partial()- option
      
  ivreg2 $y ($x=$IVs) $z i.region_code, robust ///
  partial(lpop1990 i.region_code base_dependency) 
  est store c5   // with -partial()- option      
      
*- By hand: (note: the SEs from this approach are wrong!)
* Step1: 
    reg $x $IVs $z i.region_code 
    cap drop xhat
    predict xhat

* Step2:    
    reg $y xhat $z i.region_code
    est store a3
      
*- Compare the results:
  local m "a1 a2 c5 a3"
  esttab `m', nogap  replace        ///
  b(%6.3f) s(N r2_a rkf j jp)    ///
  star(* 0.1 ** 0.05 *** 0.01)   ///
  order(change_dependency xhat)  ///
  indicate("Region Dummies =*.region_code")  ///
  addnotes("*** 1% ** 5% * 10%") nobase

----------------------------------------------------------------------------
                   (a1)            (a2)             (c5)            (a3)   
                no-partial      no-partial       partial-out       by-hand   
CMD             ivregress         ivreg2          ivreg2          regress
----------------------------------------------------------------------------
change_dep~y        1.703***        1.703***        1.703***                
                   (4.14)          (4.14)          (4.14)                   
xhat                                                                1.703***
                                                                   (3.80)   
devo1990           -0.190***       -0.190***       -0.190***       -0.190***
                  (-4.22)         (-4.22)         (-4.22)         (-4.58)   
lpop1990           -0.017          -0.017                          -0.017   
                  (-0.83)         (-0.83)                         (-0.96)   
base_depen~y       -0.041          -0.041                          -0.041   
                  (-0.14)         (-0.14)                         (-0.13)   
_cons               1.899***        1.899***                        1.899***
                   (4.99)          (4.99)                          (5.54)   
Region Dum~s          Yes             Yes              No             Yes   
----------------------------------------------------------------------------
N                 169.000         169.000         169.000         169.000   
r2_a                0.179           0.179          -0.024           0.261   
rkf                                19.365          19.365                   
j                                   4.280           4.280                   
jp                                  0.510           0.510                   
----------------------------------------------------------------------------
t statistics in parentheses; * p<0.1, ** p<0.05, *** p<0.01

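Notes: (1) The results of a1 and c5 are identical, so partialling out some variables does not change the coefficient estimates of the remaining variables; (2) the purpose of partialling out is to reduce the dimension of the variance-covariance matrix of the orthogonality conditions so that the Sargan and Hansen J statistics can be computed properly.

Example 4: adding the partial option makes a substantial difference

The following example shows what changes when the partial option is added.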

use http://fmwww.bc.edu/ec-p/data/hayashi/griliches76.dta, clear
ivreg2 lw s expr tenure rns smsa i.year (iq=med kww age), cluster(year)

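Running the command above produces the warning shown next: because the covariance matrix of the moment conditions is not of full rank, the overidentification test result is not reported. In this situation some dummy variables can be partialled out.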

Warning: estimated covariance matrix of moment conditions not of full rank.
         overidentification statistic not reported, and standard errors and
         model tests should be interpreted with caution.
Possible causes:
         number of clusters insufficient to calculate robust covariance matrix
         singleton dummy variable (dummy with one 1 and N-1 0s or vice versa)
partial option may address problem.

Re-running the regression with the partial() option to partial out the year dummies then yields the Hansen J test result.

ivreg2 lw s expr tenure rns smsa i.year (iq=med kww age), cluster(year) partial(i.year)
