GWAS Study Notes: The Meaning of Imputation (Truth of Imputation)
Do not impute another's sin to yourself
scheme of imputation in statistics
by Baoyu
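In July I attended the first GWAS training workshop hosted by Fudan, which gave me a systematic, in-depth view of GWAS work in China and abroad. Along the way it became clear that a few key concepts have to be understood precisely, or one is left with only half an understanding.
Important GWAS terms include effect heterogeneity, pooling, replication/joint/pooled analysis, geographic variation, hierarchical clustering, imputation, marginal effects, Manhattan Forest, and GWAS consortium (e.g., the Genetic Association Information Network, GAIN). Quite a few lecturers at this workshop renamed the Manhattan Forest the Pudong Forest (Pudong's towers are no less tall or dense than Manhattan's), haha.
In a GWAS, imputation sits between high-throughput sequencing and data analysis. Imputing the missing data is a prerequisite for the analysis, so its importance is self-evident.
The original meaning of "impute" is to blame, to ascribe, to attribute, to reproach, or to disparage. The Merriam-Webster dictionary gives exactly this sense: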
Main Entry: impute
Function: transitive verb
Inflected Form: imputed; imputing
Etymology: Middle English inputen, from Latin imputare, from in- + putare to consider
Date: 14th century
1: to lay the responsibility or blame for often falsely or unjustly
2: to credit to a person or a cause: ATTRIBUTE *our vices as well as our virtues have been imputed to bodily derangement* (B. N. Cardozo)
synonyms see ASCRIBE.
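In statistical genetics it means to predict or fill in: unknown genotypes are predicted from known genotypes, and missing data are filled in, as in this sentence: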
This imputation method uses the dense genotype data available from the HapMap CEU samples and the linkage disequilibrium (LD) relationships of the SNPs to impute (predict) genotypes for a large number of SNPs that were not measured experimentally in our Finnish cases and controls.
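The idea in the quoted sentence can be illustrated with a toy sketch. It is not how IMPUTE or MACH actually work (those tools use hidden Markov models on unphased genotypes); it only shows the principle of borrowing LD structure from a reference panel. The reference haplotypes, the SNP indices, and the function name `impute_allele` are all made up for illustration.

```python
from collections import Counter

# Hypothetical reference panel (standing in for something like HapMap CEU):
# each haplotype lists alleles (0/1) at four SNPs.
# In the study sample only SNPs 0, 1 and 3 were genotyped; SNP 2 is untyped.
REFERENCE_HAPLOTYPES = [
    (0, 0, 0, 0),
    (0, 0, 0, 0),
    (1, 1, 1, 0),
    (1, 1, 1, 1),
    (0, 1, 0, 1),
]


def impute_allele(observed, untyped_index, reference=REFERENCE_HAPLOTYPES):
    """Predict the allele at untyped_index from the closest reference haplotypes.

    observed maps SNP index -> observed allele at the typed SNPs.
    Returns (predicted_allele, fraction of the closest haplotypes carrying it).
    """
    def mismatches(hap):
        # Number of typed SNPs at which this reference haplotype disagrees.
        return sum(hap[i] != allele for i, allele in observed.items())

    best = min(mismatches(h) for h in reference)
    closest = [h for h in reference if mismatches(h) == best]
    counts = Counter(h[untyped_index] for h in closest)
    allele, n = counts.most_common(1)[0]
    return allele, n / len(closest)


if __name__ == "__main__":
    # A study haplotype typed at SNPs 0, 1 and 3 only.
    print(impute_allele({0: 1, 1: 1, 3: 0}, untyped_index=2))
```

The returned fraction is a crude stand-in for the per-genotype uncertainty that real imputation software reports.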
The three main roles of imputation in statistical genetics:
Allows testing of untyped variation,
Allows easy combination of data across genotyping platforms,
Provides complete data for analysis with multiple SNPs.
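As a small illustration of the first role, an imputed SNP is usually carried into an association test as an expected allele count (dosage) rather than as a hard genotype call. The sketch below assumes that each sample comes with three genotype probabilities for AA, AB and BB; this format and the function names are assumptions for illustration, not something taken from the workshop material.

```python
def expected_dosage(p_aa, p_ab, p_bb):
    """Expected number of B alleles given genotype probabilities for AA, AB, BB."""
    total = p_aa + p_ab + p_bb
    if total <= 0:
        raise ValueError("genotype probabilities must sum to a positive value")
    # Normalise in case the three probabilities do not sum exactly to 1.
    return (p_ab + 2.0 * p_bb) / total


def mean_dosage(samples):
    """Average expected dosage over a list of (p_aa, p_ab, p_bb) tuples."""
    return sum(expected_dosage(*s) for s in samples) / len(samples)


if __name__ == "__main__":
    # Made-up genotype probabilities at one untyped SNP.
    cases = [(0.0, 0.0, 1.0), (0.25, 0.5, 0.25)]
    controls = [(1.0, 0.0, 0.0), (0.5, 0.5, 0.0)]
    print(mean_dosage(cases), mean_dosage(controls))
```

Comparing mean dosage between cases and controls is only a rough stand-in for a proper association test, but it shows how an untyped SNP becomes testable once it has been imputed.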
Commonly used software for imputation:
1. IMPUTE
Developed by Jonathan Marchini
Nature Genetics, Advance online publication
http://www.stats.ox.ac.uk/~marchini/#software
2. Mach 1.0, Markov Chain Haplotyping
Developed by Goncalo Abecasis
http://www.sph.umich.edu/csg/abecasis/MACH/
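Attachment 1 is an introduction to imputation and other analysis methods by Scott, an up-and-coming researcher at U Michigan.
Attachment 2 is a recent paper from Eric E. Schadt's lab (Rosetta Inpharmatics) in BMC Genetics on the accuracy of imputation in GWAS and its effect on the statistical power of association analysis. It is worth reading.
Attachment 1: Scott_Handling and analyzing data of GWAS
Attachment 2: 09 Accuracy of genome-wide imputation of untyped markers and...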