BWA-Samtools-Bcftools SNP calling


SNP鑒定的標准流程有三個步驟:

一、mapping

   使用BWA MEM將每個樣本的測序數據比對到組裝的參考序列。建立索引和比對:

   $ bwa index ref.fa

   $ bwa mem –t 40 ref.fa read1.fq read2.fq >aln-pe.sam

二、轉換

   $ samtools view -bS aln-pe.sam > aln-pe.bam

   $ samtools sort -@ 40 aln-6.bam > aln-6.sorted.bam

                            $ samtools sort –o aln-pe-sorted.bam aln-pe.bam

   $ samtools index aln-pe.sorted.bam

三、SNP calling

   $ samtools mpileup -uf Trinity-longest.fasta aln-6.sorted.bam | bcftools call -Ou -mv > var.raw.bcf

  $ bcftools view var.raw.bcf | bcftools filter -e 'QUAL<30 || DP<20' > val.filter.vcf

  $ bcftools view val.filter.vcf | bcftools filter -i 'TYPE="snp"' > val.snp.vcf

                                        $ bcftools mpileup –Ou –f ref.fa aln-pe.sorted.bam | bcftools call –Ou –mv | bcftools filter -s LowQual -e ‘%QUAL<20 || DP<10’ > var.flt.vcf

https://wikis.utexas.edu/display/bioiteam/variant+calling+using+samtools

 The basic Command line

Suppose we have reference sequences in ref.fa, indexed by samtools faidx, and position sorted alignment files aln1.bam and aln2.bam, the following command lines call SNPs and short INDELs:

  1. samtools mpileup -uf ref.fa aln1.bam aln2.bam | bcftools view -bvcg - > var.raw.bcf  
  2. bcftools view var.raw.bcf | vcfutils.pl varFilter -D100 > var.flt.vcf  

where the -D option sets the maximum read depth to call a SNP. SAMtools acquires sample information from the SM tag in the @RG header lines. One alignment file can contain multiple samples; reads from one sample can also be distributed in different alignment files. SAMtools will regroup the reads anyway. In addition, if no @RG lines are present, each alignment file is taken as one sample.

Since r865, it is possible to generate the consensus sequence with

  1. samtools mpileup -uf ref.fa aln.bam | bcftools view -cg - | vcfutils.pl vcf2fq > cns.fq 

 


免責聲明!

本站轉載的文章為個人學習借鑒使用,本站對版權不負任何法律責任。如果侵犯了您的隱私權益,請聯系本站郵箱yoyou2525@163.com刪除。



 
粵ICP備18138465號   © 2018-2025 CODEPRJ.COM