Databases Available for GWAS Research (updated 2020-04-24)


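1. For each entry, the list gives the database name, the phenotypes covered, whether genotypes can be downloaded, and whether GWAS result files (P values, effect sizes, SNP loci) can be downloaded. The databases collected so far are listed below.

A paper that refers to these databases: Genome-wide association study identifies 74 loci associated with educational attainment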
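As a rough illustration of what working with such GWAS result files looks like, the sketch below reads a tab-separated summary-statistics table and keeps the SNPs below the conventional genome-wide significance threshold of 5e-8. The column names `SNP`, `BETA` and `P`, the function name `significant_hits`, and the example rows are all assumptions made for illustration; each database above uses its own column layout, so check the file header first.

```python
import csv
import io

GENOME_WIDE_P = 5e-8  # conventional genome-wide significance threshold


def significant_hits(handle, snp_col="SNP", beta_col="BETA", p_col="P",
                     threshold=GENOME_WIDE_P):
    """Yield (snp, effect_size, p_value) for rows with p_value < threshold.

    The column names are assumptions; adjust them to the file being read.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        try:
            p_value = float(row[p_col])
            effect = float(row[beta_col])
        except (TypeError, ValueError):
            continue  # skip rows with missing or non-numeric values such as "NA"
        if p_value < threshold:
            yield row[snp_col], effect, p_value


# Made-up rows, purely for illustration (not real results)
example = io.StringIO(
    "SNP\tBETA\tP\n"
    "rs0000001\t0.021\t3e-9\n"
    "rs0000002\t-0.004\t0.12\n"
    "rs0000003\t0.015\tNA\n"
)
hits = list(significant_hits(example))
```

For a real file, pass an open file handle instead of the `io.StringIO` object; because the function is a generator, the file is processed one row at a time.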

 

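2. The Japanese Genotype-phenotype Archive (JGA): holds individual-level genotype and phenotype data; access requires an application. GWAS have already been carried out on these data. Database link: https://www.ddbj.nig.ac.jp/jga/index-e.html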

 

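3. ExAC: does not provide individual-level genotypes, but does provide VCF, CNV, coverage and other files. Phenotypes are limited to those that have already been published, such as type 2 diabetes.

Populations and sample counts in ExAC: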

| Population | Male Samples | Female Samples | Total |
|---|---|---|---|
| African/African American (AFR) | 1,888 | 3,315 | 5,203 |
| Latino (AMR) | 2,254 | 3,535 | 5,789 |
| East Asian (EAS) | 2,016 | 2,311 | 4,327 |
| Finnish (FIN) | 2,084 | 1,223 | 3,307 |
| Non-Finnish European (NFE) | 18,740 | 14,630 | 33,370 |
| South Asian (SAS) | 6,387 | 1,869 | 8,256 |
| Other (OTH) | 275 | 179 | 454 |
| Total | 33,644 | 27,062 | 60,706 |

Data available for download from ExAC:

| FTP Link | Description |
|---|---|
| Sites VCF | VCF of Variant Sites |
| CNV | CNV Counts and Intolerance Scores |
| Coverage | Per Base Coverage |
| Functional Gene Constraint | Functional Gene Constraint Scores for ExAC and Subsets |
| Manuscript Data | Variant Tables Used in Manuscript |
| Resources | Exome Calling and Purcell5k Intervals |
| Subsets | Non-TCGA VCF Subset |

Database link: http://exac.broadinstitute.org/downloads
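Since ExAC ships site-level VCFs rather than individual genotypes, the usual way to use them is to read allele counts from the INFO column. The sketch below computes a per-allele frequency as AC/AN for a single VCF data line. It assumes the INFO column carries the standard VCF keys `AC` (allele count per ALT allele) and `AN` (total number of called alleles); the function name `allele_frequencies` and the example line are made up for illustration, and the actual ExAC file may offer additional, more suitable fields.

```python
def allele_frequencies(vcf_line):
    """Return {alt_allele: AC/AN} for one tab-separated VCF data line.

    Assumes the INFO column (8th field) contains the standard AC and AN keys.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    alts = fields[4].split(",")  # ALT column; may list several alleles

    # Parse "KEY=VALUE;FLAG;KEY=VALUE" into a dictionary
    info = {}
    for item in fields[7].split(";"):
        key, _, value = item.partition("=")
        info[key] = value

    total_alleles = int(info["AN"])
    counts = [int(x) for x in info["AC"].split(",")]
    if total_alleles == 0:
        return {alt: None for alt in alts}  # no called genotypes at this site
    return {alt: count / total_alleles for alt, count in zip(alts, counts)}


# Made-up site, purely for illustration (not a real ExAC record)
line = "1\t12345\t.\tA\tG,T\t100\tPASS\tAC=5,1;AN=1000"
freqs = allele_frequencies(line)
```

When looping over a real file, skip header lines that start with `#` before calling the function.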

 

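4. Simons Genome Diversity Project (SGDP)

Provides 279 samples from populations in the Americas, Africa, East Asia, South Asia, Western Europe and Oceania. Available data include VCF files, phased genotypes, STRs, and BAMs for Y chromosomes.

Link: http://reichdata.hms.harvard.edu/pub/datasets/sgdp/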

 

5. CHINESE MILLIONOME DATABASE

URL: https://db.cngb.org/cmdb/

The Chinese Millionome Database (CMDB) is a unique large-scale Chinese genomics database produced by BGI and hosted in the National GeneBank. The CMDB delivers periodic and useful variation information and scientific insights derived from the analysis of millions of Chinese sequencing data. The results aim to promote genetic research and precision medicine actions in China.

The information delivered includes the detected variants and their corresponding allele frequencies, annotations, frequency comparisons with global populations from existing databases, etc.

In short, it provides variant frequencies, annotations, and frequency comparisons with other populations.

 

6. UK Biobank

UK Biobank GWAS summary data: https://ctg.cncr.nl/documents/p1651/ukb2_sumstats.tar.gz

This file is very large, so think twice before downloading it.
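Given the size of the archive, it can be useful to check what it contains before unpacking everything. The sketch below lists the files inside a gzip-compressed tar archive in streaming mode, so nothing is written to disk. The function name `list_archive_members` is hypothetical, and the tiny in-memory archive is only a stand-in for the real download, whose contents are not described here.

```python
import io
import tarfile


def list_archive_members(fileobj):
    """Return (name, size_in_bytes) for each regular file in a .tar.gz stream."""
    members = []
    # "r|gz" reads the archive sequentially, without seeking or extracting
    with tarfile.open(fileobj=fileobj, mode="r|gz") as archive:
        for member in archive:
            if member.isfile():
                members.append((member.name, member.size))
    return members


# Build a tiny in-memory archive as a stand-in for the real file
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
    payload = b"SNP\tP\nrs0000001\t0.5\n"  # made-up content
    info = tarfile.TarInfo(name="toy_sumstats.txt")
    info.size = len(payload)
    archive.addfile(info, io.BytesIO(payload))
buffer.seek(0)

listing = list_archive_members(buffer)
```

For a file on disk, open it with `open(path, "rb")` and pass the handle in. Note that streaming avoids extraction but still has to read through the whole compressed file to list every member.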

 

7. Summary-statistics database for insomnia, Alzheimer's disease, various psychiatric disorders, intelligence, and other traits

https://ctg.cncr.nl/software/summary_statistics

 

8. Japan's public database: National Bioscience Database Centre (NBDC) Human Database

https://humandbs.biosciencedbc.jp/

 

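9. CVDKP Datasets

Phenotypes: anthropometric traits, cardiovascular disease, electrocardiogram (ECG) traits, atrial fibrillation, blood lipids, blood glucose, psychiatric disorders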

http://www.kp4cd.org/datasets/mi

 

10. CARDIoGRAMplusC4D Consortium

Phenotypes: coronary artery disease, cardiovascular disease

http://www.cardiogramplusc4d.org/data-downloads/

 

11. DIAGRAM Consortium

Phenotype: type 2 diabetes (T2D)

http://diagram-consortium.org/downloads.html

 

12. Repository of public GWAS data and code

https://data.mendeley.com/research-data/

 

13. Japanese GWAS summary data

http://jenger.riken.jp/en/result

