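1. Download and installation
Download the binary release from the official site and unzip it. The extracted jar file (trimmomatic-0.36.jar or trimmomatic-0.38.jar, depending on the version downloaded) is the program itself.
Official site: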
http://www.usadellab.org/cms/index.php?page=trimmomatic
wget http://www.usadellab.org/cms/uploads/supplementary/Trimmomatic/Trimmomatic-0.38.zip
unzip Trimmomatic-0.38.zip
wget http://www.usadellab.org/cms/uploads/supplementary/Trimmomatic/Trimmomatic-0.36.zip
unzip Trimmomatic-0.36.zip
[Trimmomatic-0.38]# tree
.
├── adapters
│ ├── NexteraPE-PE.fa
│ ├── TruSeq2-PE.fa
│ ├── TruSeq2-SE.fa
│ ├── TruSeq3-PE-2.fa
│ ├── TruSeq3-PE.fa
│ └── TruSeq3-SE.fa
├── LICENSE
└── trimmomatic-0.38.jar
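2. Running the software
The standard parameters are usually sufficient; for detailed usage see the official site: http://www.usadellab.org/cms/?page=trimmomatic
Run the program with the standard parameters: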
sudo java -jar trimmomatic-0.36.jar PE \
    -phred33 ~/SRR733/SRR2854733_1.fastq ~/SRR733/SRR2854733_2.fastq \
    ~/SRR733/clsseq/SRR2854733_1_paired.fq ~/SRR733/clsseq/SRR2854733_1_unpaired.fq \
    ~/SRR733/clsseq/SRR2854733_2_paired.fq ~/SRR733/clsseq/SRR2854733_2_unpaired.fq \
    ILLUMINACLIP:/usr/local/src/Trimmomatic/Trimmomatic-0.36/adapters/TruSeq3-PE.fa:2:30:10 \
    LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 HEADCROP:8 MINLEN:36
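The same command can be assembled for any sample by parameterising the paths. The sketch below only prints the command and does not run Trimmomatic. The function name build_trim_cmd and its argument order are hypothetical, the file naming simply follows the example above, and sudo is left out on the assumption that the output directory is writable by the current user.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) a Trimmomatic PE command for one sample.
# Arguments: sample name, input dir, output dir, path to jar, adapter FASTA.
build_trim_cmd() {
    sample=$1; indir=$2; outdir=$3; jar=$4; adapters=$5
    printf 'java -jar %s PE -phred33' "$jar"
    # Forward and reverse input files.
    printf ' %s/%s_1.fastq %s/%s_2.fastq' "$indir" "$sample" "$indir" "$sample"
    # Paired and unpaired outputs for the forward reads, then the reverse reads.
    printf ' %s/%s_1_paired.fq %s/%s_1_unpaired.fq' "$outdir" "$sample" "$outdir" "$sample"
    printf ' %s/%s_2_paired.fq %s/%s_2_unpaired.fq' "$outdir" "$sample" "$outdir" "$sample"
    # Trimming steps, in the same order as the example command.
    printf ' ILLUMINACLIP:%s:2:30:10' "$adapters"
    printf ' LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 HEADCROP:8 MINLEN:36\n'
}

build_trim_cmd SRR2854733 ~/SRR733 ~/SRR733/clsseq trimmomatic-0.36.jar \
    /usr/local/src/Trimmomatic/Trimmomatic-0.36/adapters/TruSeq3-PE.fa
```

Once the printed line looks right, it can be pasted into the shell to run it. Paths containing spaces would need extra quoting, which this sketch does not handle.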
Output:
Input Read Pairs: 23396043
Both Surviving: 20842668 (89.09%)
Forward Only Surviving: 2537100 (10.84%)
Reverse Only Surviving: 13969 (0.06%)
Dropped: 2306 (0.01%)
TrimmomaticPE: Completed successfully
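As a sanity check on the summary, the four categories should add up to the number of input read pairs, and each percentage is the category count divided by that input. A minimal sketch of the arithmetic, with the counts copied from the output above (the variable names and the pct helper are made up for illustration):

```shell
#!/bin/sh
# Counts copied from the Trimmomatic summary.
input=23396043
both=20842668
forward_only=2537100
reverse_only=13969
dropped=2306

# The four categories should account for every input pair.
total=$((both + forward_only + reverse_only + dropped))
echo "sum of categories: $total (input pairs: $input)"

# Percentage of the input pairs, rounded to two decimals.
pct() {
    awk -v n="$1" -v d="$input" 'BEGIN { printf "%.2f\n", 100 * n / d }'
}

echo "both surviving:         $(pct "$both")%"
echo "forward only surviving: $(pct "$forward_only")%"
echo "reverse only surviving: $(pct "$reverse_only")%"
echo "dropped:                $(pct "$dropped")%"
```

Only the "Both Surviving" pairs end up in the two paired output files; the forward-only and reverse-only reads go to the corresponding unpaired files.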
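3. Common parameters
PE/SE
Selects whether paired-end or single-end reads are processed; the input and output arguments differ slightly between the two modes.
-threads
Number of threads to use.
-phred33
Sets the base quality encoding; the alternative is -phred64.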
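If it is unclear which encoding a file uses, the quality lines can be inspected directly. The sketch below relies on the fact that characters with ASCII codes 33 to 58 ("!" through ":") can only occur in phred33 data. It assumes a plain four-line-per-record FASTQ file; the function name guess_phred is hypothetical. The absence of such characters does not prove phred64, it only makes it likely.

```shell
#!/bin/sh
# Hypothetical helper: guess the quality encoding of a 4-line-per-record FASTQ.
guess_phred() {
    # Quality strings are every 4th line; "!" to ":" only occur in phred33.
    if awk 'NR % 4 == 0' "$1" | LC_ALL=C grep -q '[!-:]'; then
        echo "phred33"
    else
        echo "likely phred64"
    fi
}

# Small demonstration on a throwaway one-read file.
demo=$(mktemp)
printf '@read1\nACGT\n+\nIII#\n' > "$demo"
guess_phred "$demo"    # prints: phred33 (because of the "#")
rm -f "$demo"
```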
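ILLUMINACLIP:TruSeq3-PE.fa:2:30:10
Removes adapter sequences. The colon-separated values after the keyword are, in order: the FASTA file of adapter sequences, the maximum number of mismatches allowed in the seed match, the match score threshold in palindrome mode, and the match score threshold in simple mode.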
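To make the field order concrete, the step string can be split on ":" in the shell. This is only an illustration of how the values line up; the variable names are made up here and are not Trimmomatic option names. It also assumes the adapter path itself contains no colon.

```shell
#!/bin/sh
step='ILLUMINACLIP:TruSeq3-PE.fa:2:30:10'

# Split on ":" into the step name and its four values.
IFS=: read -r name fasta seed_mismatches palindrome_threshold simple_threshold <<EOF
$step
EOF

echo "step:                 $name"
echo "adapter FASTA:        $fasta"
echo "seed mismatches:      $seed_mismatches"
echo "palindrome threshold: $palindrome_threshold"
echo "simple threshold:     $simple_threshold"
```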
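LEADING:3
Removes bases from the start of the read while their quality is below 3.
TRAILING:3
Removes bases from the end of the read while their quality is below 3.
SLIDINGWINDOW:4:15
Scans the read from the 5' end with a sliding window and cuts the read once the average base quality within the window falls below the threshold. Here the window is 4 bases wide, and the read is cut when the window's average quality drops below 15.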
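The sliding-window idea can be sketched on a list of numeric quality scores. This is a simplified illustration, not Trimmomatic's own code: it cuts at the start of the first window whose mean is below the threshold, and the real implementation may place the cut slightly differently within that window. The function name sliding_window_keep is hypothetical.

```shell
#!/bin/sh
# Simplified sliding-window trim over numeric Phred scores (illustrative only).
# Usage: sliding_window_keep WINDOW THRESHOLD Q1 Q2 ...
# Prints how many bases from the 5' end would be kept.
sliding_window_keep() {
    window=$1; threshold=$2; shift 2
    printf '%s\n' "$*" | awk -v w="$window" -v t="$threshold" '{
        keep = NF                                  # default: keep the whole read
        for (i = 1; i + w - 1 <= NF; i++) {        # slide from the 5 prime end
            sum = 0
            for (j = i; j < i + w; j++) sum += $j  # total quality in this window
            if (sum / w < t) { keep = i - 1; break }   # cut where the window starts
        }
        print keep
    }'
}

# The window covering 30 10 10 2 has mean 13 (< 15), so the first 4 bases are kept.
sliding_window_keep 4 15 30 30 30 30 30 10 10 2
```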
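MINLEN:50
Minimum read length; reads shorter than this after trimming are discarded.
CROP:<length>
Keeps the read up to the specified length, removing bases from the end.
HEADCROP:<length>
Removes the specified number of bases from the start of the read.
TOPHRED33
Converts base qualities to the phred33 encoding.
TOPHRED64
Converts base qualities to the phred64 encoding.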
Question: Which truseq trimmomatic adapters file to use when removing truseq adapters?
It depends mostly on which TruSeq protocol was used (V2, which is old at this stage and usually means data from the GAII, or V3, which covers everything from the HiSeq or later machines), and on whether the data is single-end or paired-end (SE or PE). The only exception is TruSeq3-PE, which has two sets: TruSeq3-PE.fa works fine for high-quality libraries, while TruSeq3-PE-2.fa contains some additional sequences that find partial adapters in unusual locations or orientations.
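The rule in this answer can be written down as a small lookup. The sketch below is only a restatement of the answer; the function name pick_adapters and its arguments (protocol version 2 or 3, layout SE or PE) are hypothetical, and the file names are those shipped in the adapters directory.

```shell
#!/bin/sh
# Hypothetical helper: map TruSeq protocol version and read layout
# to one of the adapter files in the adapters/ directory.
pick_adapters() {
    case "$1:$2" in
        2:SE) echo TruSeq2-SE.fa ;;
        2:PE) echo TruSeq2-PE.fa ;;
        3:SE) echo TruSeq3-SE.fa ;;
        3:PE) echo TruSeq3-PE.fa ;;   # or TruSeq3-PE-2.fa to also catch partial adapters
        *)    echo "unknown combination: $1 $2" >&2; return 1 ;;
    esac
}

pick_adapters 3 PE
```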
ref:
https://www.jianshu.com/p/7b5591673255
https://www.biostars.org/p/323087/