Alignment Software: A Topic Overview

Reposted article, dated 2016-12-20 15:29. Tag: alignment.

Figure: An illustration of relationships between alignment methods. The applications / corresponding computational restrictions shown are (green) short pairwise alignment / detailed edit model; (yellow) database search / divergent homology detection; (red) whole genome alignment / alignment of long sequences with structural rearrangements; and (blue) short read mapping / rapid alignment of massive numbers of short sequences. Although solely illustrative, methods with more similar data structures or algorithmic approaches are on closer branches. The BLASR method combines data structures from short read alignment with optimization methods from whole genome alignment.

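I have not worked with many aligners in depth. I only know the simple global and local alignment algorithms, and I know almost nothing about how the aligners themselves work internally.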
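To make the difference between global and local alignment concrete, here is a minimal sketch of the two classic dynamic-programming scores (Needleman-Wunsch for global, Smith-Waterman for local). The function name `align_score` and the scoring values (match +1, mismatch -1, linear gap -1) are assumptions chosen for illustration; real aligners use tuned scoring schemes and affine gap penalties.

```python
def align_score(a, b, local=False, match=1, mismatch=-1, gap=-1):
    """Return the best alignment score between sequences a and b.

    local=False: global alignment (Needleman-Wunsch), both sequences
                 must be aligned end to end.
    local=True:  local alignment (Smith-Waterman), the best-scoring
                 pair of substrings is reported.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    if not local:
        # Global alignment pays for leading gaps; local alignment does not.
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap

    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap      # gap in b
            left = score[i][j - 1] + gap    # gap in a
            cell = max(diag, up, left)
            if local:
                # A local alignment may restart anywhere, so scores never go below 0.
                cell = max(cell, 0)
                best = max(best, cell)
            score[i][j] = cell

    return best if local else score[-1][-1]


if __name__ == "__main__":
    print(align_score("ACGT", "AGT"))                        # global score
    print(align_score("TTTACGTTTT", "GGACGGG", local=True))  # local score
```

The only differences between the two modes are the initialisation of the first row and column, the floor at zero, and where the final score is read from.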

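Aligners I have used so far: bwa, bowtie, blasr, SHRiMP, DALIGNER, MHAP, blast, blat, SOAP, Subread, NovoAlign, Maq

Others worth knowing: MEGABLAST, Mummer, GMAP, STAR, DIAMOND, ELAND, RMAP, ZOOM, SeqMap, CloudBurst

I plan to build up these notes gradually and compare how the tools differ, because alignment is the lowest layer of bioinformatics: once sequencing hands you a pile of sequences, the first thing to do is align them.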

A good article to start with: Aligner tutorial: GMAP, STAR, BLAT, and BLASR

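What are the common kinds of nucleotide sequence alignment?

Second-generation short reads aligned to a genome

Third-generation long reads aligned to a genome

Spliced alignment

Second-generation reads aligned against third-generation reads

Genome-to-genome alignment

Multiple sequence alignment

Database search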

BWA

Burrows-Wheeler Aligner

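Use case: fast alignment of second-generation sequencing data to a genome.

bwa is the model program of the sequence alignment world: compact, efficient, and suitable for many situations. It is well worth understanding its internal alignment algorithm, and ideally how that algorithm is implemented.

Fast and accurate short read alignment with Burrows–Wheeler transform - 2009 (online PDF, original paper)

lh3/bwa – Github Burrows-Wheeler Aligner for pairwise alignment between DNA sequences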

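BWA-backtrack: for Illumina reads up to 100bp (aln/samse/sampe)
BWA-SW: for long reads from 70bp to 1Mbp; supports split alignment (bwasw)
BWA-MEM: the newest and most widely used; same features as BWA-SW but faster and more accurate, and it also outperforms BWA-backtrack on 70-100bp reads (mem)

There are three main academic papers on BWA: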

Li H. and Durbin R. (2009) Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics, 25, 1754-1760. [PMID: 19451168]. (if you use the BWA-backtrack algorithm)
Li H. and Durbin R. (2010) Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics, 26, 589-595. [PMID: 20080505]. (if you use the BWA-SW algorithm)
Li H. (2013) Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv:1303.3997v2 [q-bio.GN]. (if you use the BWA-MEM algorithm or the fastmap command, or want to cite the whole BWA package)

The design ideas behind BWA

Short-read alignment and assembly algorithms for next-generation sequencing technology (master's thesis)
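The core idea named in BWA's title, exact matching by backward search over the Burrows-Wheeler transform, can be sketched in a few lines. This is a teaching sketch only, not BWA's implementation: it builds the suffix array by naive sorting and counts occurrences by scanning the string, whereas real FM-index implementations store sampled occurrence tables and a sampled suffix array. The function names are my own, and the text is assumed to end with a `$` sentinel that sorts before every other character.

```python
def bwt_and_suffix_array(text):
    """Return (BWT string, suffix array) for text, which must end with '$'."""
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # The BWT character for each sorted suffix is the character just before it
    # (text[-1], the sentinel, for the suffix starting at position 0).
    bwt = "".join(text[i - 1] for i in sa)
    return bwt, sa


def backward_search(pattern, bwt, sa):
    """Return sorted start positions of pattern in the original text."""
    # first_row[c] = number of characters in the text strictly smaller than c,
    # i.e. the first row of the sorted suffix list that starts with c.
    counts = {}
    for ch in bwt:
        counts[ch] = counts.get(ch, 0) + 1
    first_row, total = {}, 0
    for ch in sorted(counts):
        first_row[ch] = total
        total += counts[ch]

    lo, hi = 0, len(bwt)  # current half-open range of suffix array rows
    for ch in reversed(pattern):  # extend the match one character to the left
        if ch not in first_row:
            return []
        lo = first_row[ch] + bwt.count(ch, 0, lo)
        hi = first_row[ch] + bwt.count(ch, 0, hi)
        if lo >= hi:
            return []
    return sorted(sa[k] for k in range(lo, hi))


if __name__ == "__main__":
    bwt, sa = bwt_and_suffix_array("ACGTACGT$")
    print(bwt)
    print(backward_search("CGT", bwt, sa))
```

Each step narrows a range of suffix array rows, so the number of steps depends on the pattern length rather than on the size of the reference, which is why this family of indexes suits mapping many short reads to one genome.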

Program: bwa (alignment via Burrows-Wheeler transformation)
Version: 0.7.15-r1140
Contact: Heng Li <lh3@sanger.ac.uk>

Usage:   bwa <command> [options]

Command: index         index sequences in the FASTA format
         mem           BWA-MEM algorithm
         fastmap       identify super-maximal exact matches
         pemerge       merge overlapping paired ends (EXPERIMENTAL)
         aln           gapped/ungapped alignment
         samse         generate alignment (single ended)
         sampe         generate alignment (paired ended)
         bwasw         BWA-SW for long queries

         shm           manage indices in shared memory
         fa2pac        convert FASTA to PAC format
         pac2bwt       generate BWT from PAC
         pac2bwtgen    alternative algorithm for generating BWT
         bwtupdate     update .bwt to the new format
         bwt2sa        generate SA from BWT and Occ

Note: To use BWA, you need to first index the genome with `bwa index'.
      There are three alignment algorithms in BWA: `mem', `bwasw', and
      `aln/samse/sampe'. If you are not sure which to use, try `bwa mem'
      first. Please `man ./bwa.1' for the manual.
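As a how-to sketch of the workflow described in the usage note (index first, then align), the helper below builds the command lines for a single-end run without executing anything. The subcommands and argument order follow the usage text above; the function name, the output file names, and the 70bp cutoff for choosing between `mem` and `aln/samse` are assumptions made for illustration, the cutoff being taken from the read-length ranges listed earlier in these notes.

```python
def bwa_single_end_plan(ref_fasta, reads_fastq, read_length):
    """Return a list of (argv, stdout_file) steps for a single-end bwa run.

    stdout_file is where the step's standard output should be redirected,
    or None if the step writes its own files (as `bwa index` does).
    Nothing is executed here; the file names are hypothetical.
    """
    plan = [(["bwa", "index", ref_fasta], None)]

    if read_length >= 70:  # assumed cutoff: prefer BWA-MEM from 70bp upward
        plan.append((["bwa", "mem", ref_fasta, reads_fastq],
                     reads_fastq + ".sam"))
    else:
        # BWA-backtrack is a two-step process: aln produces .sai coordinates,
        # samse converts them to SAM.
        sai = reads_fastq + ".sai"
        plan.append((["bwa", "aln", ref_fasta, reads_fastq], sai))
        plan.append((["bwa", "samse", ref_fasta, sai, reads_fastq],
                     reads_fastq + ".sam"))
    return plan


if __name__ == "__main__":
    for argv, out in bwa_single_end_plan("ref.fa", "reads.fq", 150):
        line = " ".join(argv)
        print(line if out is None else f"{line} > {out}")
```

To actually run a step, each `argv` could be passed to `subprocess.run` with `stdout` opened on the listed file.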

Practical algorithm implementations, part 8: suffix trees and suffix arrays [1. Introduction]
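For readers new to suffix arrays, the following minimal sketch shows why they are useful for sequence lookup: once the suffixes are sorted, every occurrence of a pattern sits in one contiguous block that binary search can find. The construction here is the naive sort, fine for a demonstration but far too slow and memory-hungry for a genome; the function names are my own.

```python
def build_suffix_array(text):
    """Return the start positions of all suffixes of text in sorted order."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_all(text, sa, pattern):
    """Return sorted start positions of pattern in text using binary search."""
    m = len(pattern)

    def prefix(k):
        # First m characters of the k-th smallest suffix.
        return text[sa[k]:sa[k] + m]

    # Lower bound: first row whose prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first row whose prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(sa[start:lo])


if __name__ == "__main__":
    genome = "GATTACATTAC"
    sa = build_suffix_array(genome)
    print(find_all(genome, sa, "TAC"))
```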

bwa mem

These days almost everyone uses only the mem algorithm of bwa.

It deserves a separate note of its own.

SOAPaligner/soap2

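soap2 - official site

The SOAP tools are not released as source code, only as binary executables, so there is nothing to compile. As with bwa, you build an index first and then align.

SOAP is not very memory-hungry: loading the 3G human genome takes roughly 7G of memory, and the alignment that follows needs little additional memory.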

./2bwt-builder ~/human_genome.fa
./soap -a <reads_a> -D <index.files> -o <output>
./soap -a <reads_a> -b <reads_b> -D <index.files> -o <PE_output> -2 <SE_output> -m <min_insert_size> -x <max_insert_size>

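I previously had no impression of SOAP at all, but quite a few colleagues use the SOAP family of tools.

What changed my mind was a slide deck showing that SOAP has its own strengths as an aligner.

According to the slides, SOAP tolerates a fairly high error rate and handles indels well, which is exactly what I need right now, so I may try using SOAP to align second-generation reads to third-generation reads. Source: Mapping.ppt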

BLASR

Basic Local Alignment with Successive Refinement

Mapping single molecule sequencing reads using basic local alignment with successive refinement (BLASR): application and theory - BMC Bioinformatics

To be continued.
