7. Installing and Using the SRA Toolkit

Reposted article. 2017-12-04 15:57. Category: bioinformatics software installation.

References: http://blog.csdn.net/Cs_mary/article/details/78378552 ### explanation of the prefetch parameters

https://www.ncbi.nlm.nih.gov/books/NBK158900/#SRA_download.how_do_i_use_the_sra_toolki ### converting data into a particular format (fastq-dump, etc.)

https://github.com/ncbi/sra-tools/wiki/Downloads ### sra-tools downloads for different systems (CentOS, Ubuntu, Windows)

http://blog.csdn.net/xubo245/article/details/50513201 ### downloading SRA-format data from NCBI with Aspera Connect

https://indexofire.gitbooks.io/notebook_of_analyzing_pathogen_ngs_data/content/chapter_1/sra.html

http://boyun.sh.cn/bio/?p=1933

I. Windows

1. Aspera Connect download address:

http://downloads.asperasoft.com/connect2/

2. Downloading data:

Data download address:

http://www.ncbi.nlm.nih.gov/projects/faspftp/1000genomes/

Alternative address:

http://www.1000genomes.org/aspera

II. Linux

1. Download and install

http://downloads.asperasoft.com/

curl -O http://download.asperasoft.com/download/sw/connect/3.6.1/aspera-connect-3.6.1.110647-linux-64.tar.gz

tar zxf aspera-connect-3.6.1.110647-linux-64.tar.gz

sh aspera-connect-3.6.1.110647-linux-64.sh
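The archive and installer names have to match the file that was actually downloaded, so it can help to derive them from the download URL instead of typing them by hand. A minimal sketch, assuming the tarball unpacks to an installer script with the same base name and a .sh suffix; installer_name is a hypothetical helper, not part of Aspera:

```shell
#!/bin/sh
# Hypothetical helper: derive the installer script name from the tarball URL.
# Assumption: the tarball unpacks to a script with the same base name plus ".sh".
installer_name() {
    tarball="${1##*/}"                       # strip everything up to the last "/"
    printf '%s.sh\n' "${tarball%.tar.gz}"    # swap ".tar.gz" for ".sh"
}

url="http://download.asperasoft.com/download/sw/connect/3.6.1/aspera-connect-3.6.1.110647-linux-64.tar.gz"
installer_name "$url"
```

The printed name is the one to pass to sh after unpacking the archive.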

2. Make ascp available in the shell (here via an alias)

echo "alias ascp=/home/sxuan/.aspera/connect/bin/ascp" >> ~/.bashrc

3. Look up download addresses at: http://www.ncbi.nlm.nih.gov/Traces/study/

1) Single download: ascp -i /your-path-to/.aspera/connect/etc/asperaweb_id_dsa.openssh anonftp@ftp-private.ncbi.nlm.nih.gov:/sra/sra-instant/reads/ByRun/sra/SRR/SRR689/SRR689250/SRR689250.sra ./

2) Batch download: write the remote paths, one per line in the format below, into the text file SRR_Download_List_file_list.txt:

/sra/sra-instant/reads/ByRun/sra/SRR/SRR689/SRR689250/SRR689250.sra

/sra/sra-instant/reads/ByRun/sra/SRR/SRR893/SRR893046/SRR893046.sra

nohup ascp -i /share/home/jialj/.aspera/connect/etc/asperaweb_id_dsa.openssh --mode recv --host ftp-private.ncbi.nlm.nih.gov --user anonftp --file-list SRR_Download_List_file_list.txt ./ &
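The two example paths follow a regular pattern (first three characters, first six characters, then the full accession), so the file list can be generated from a plain list of run accessions. A minimal sketch, assuming every run sits under the same /sra/sra-instant/... layout as the examples; sra_remote_path and make_file_list are hypothetical helper names:

```shell
#!/bin/sh
# Hypothetical helper: build the remote path for one run accession,
# assuming the /sra/sra-instant/... layout seen in the two example paths.
sra_remote_path() {
    acc="$1"
    prefix=$(printf '%s' "$acc" | cut -c1-3)   # e.g. SRR
    group=$(printf '%s' "$acc" | cut -c1-6)    # e.g. SRR689
    printf '/sra/sra-instant/reads/ByRun/sra/%s/%s/%s/%s.sra\n' \
        "$prefix" "$group" "$acc" "$acc"
}

# Read accessions (one per line) from stdin and print one remote path per line.
make_file_list() {
    while IFS= read -r acc; do
        [ -n "$acc" ] && sra_remote_path "$acc"
    done
    return 0
}

printf 'SRR689250\nSRR893046\n' | make_file_list
```

Redirecting the output (make_file_list < accessions.txt > SRR_Download_List_file_list.txt) produces a list in the format expected by --file-list.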

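Aspera usage: $ ascp [options] source destination

Common ascp options:

-T Disable encryption. Without this option the download may fail.

-i string Private key file. After installation, Aspera places several private keys in ~/.aspera/connect/etc/; on a Linux server, asperaweb_id_dsa.openssh is normally the one to use.

--host string Host name of the server. For NCBI it is ftp-private.ncbi.nlm.nih.gov; for EBI it is fasp.sra.ebi.ac.uk.

--user string User name. For NCBI it is anonftp; for EBI it is era-fasp.

--mode string Transfer mode: send for upload, recv for download.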

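-l string Maximum transfer rate. For example, 200M sets the maximum to 200 Mbit/s. Without this option the transfer typically reaches only about 10 Mbit/s; setting it allows higher speeds.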
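To see how these options fit together, the command can be assembled and printed before it is run. A minimal dry-run sketch using only the options listed above; ascp_batch_cmd is a hypothetical helper, and it assumes none of the arguments contain spaces:

```shell
#!/bin/sh
# Hypothetical dry-run helper: print an ascp batch-download command without running it.
# Arguments: private key, host, user, file list, destination, maximum rate.
# Assumption: none of the arguments contain spaces.
ascp_batch_cmd() {
    key="$1"; host="$2"; user="$3"; list="$4"; dest="$5"; rate="$6"
    printf 'ascp -T -l %s -i %s --mode recv --host %s --user %s --file-list %s %s\n' \
        "$rate" "$key" "$host" "$user" "$list" "$dest"
}

ascp_batch_cmd "${HOME}/.aspera/connect/etc/asperaweb_id_dsa.openssh" \
    ftp-private.ncbi.nlm.nih.gov anonftp \
    SRR_Download_List_file_list.txt ./ 200M
```

Printing first makes it easy to check the key path and host before starting a long transfer under nohup.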

III. prefetch calls ascp directly; if ascp is not installed, it downloads over HTTP

-f | --force Force object download. One of: no, yes, all. no [default]: skip the download if the object is found and complete; yes: download it even if it is found and complete; all: ignore lock files (stale locks, or the object is currently being downloaded: use at your own risk!).

-t | --transport Transport to use. One of: ascp (only), http (only), both (first try ascp, fall back to http). Default: both.

-l | --list List the contents of a kart file.
-s | --list-sizes List the contents of a kart file with target file sizes.
-N | --min-size Minimum file size to download in KB (inclusive).
-X | --max-size Maximum file size to download in KB (exclusive). Default: 20G.
-o | --order Kart prefetch order. One of: kart (in kart order), size (by file size, smallest first). Default: size.
-a | --ascp-path Path to the ascp program and the private key file, given as "ascp-binary|private-key-file".

prefetch -a "/opt/aspera/bin/ascp|/opt/aspera/etc/asperaweb_id_dsa.openssh" SRR390728

When the toolkit is unable to locate an installed version of Aspera, the location of ascp and the SSH key can be provided explicitly (-a "/opt/aspera/bin/ascp|/opt/aspera/etc/asperaweb_id_dsa.openssh").

prefetch -t ascp -a "/opt/aspera/bin/ascp|/opt/aspera/etc/asperaweb_id_dsa.openssh" --list SRR.file
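Because the -a value is a single "binary|key" string, a wrong path on either side is easy to miss. A minimal sketch that checks both files before building the string, assuming Aspera Connect is installed under ~/.aspera/connect with the binary in bin/ and the key in etc/; ascp_path_arg is a hypothetical helper, not part of the toolkit:

```shell
#!/bin/sh
# Hypothetical helper: print the "ascp-binary|private-key" string for prefetch -a.
# Assumption: Aspera Connect lives under the given root (default ~/.aspera/connect)
# with the binary in bin/ and the key in etc/.
ascp_path_arg() {
    root="${1:-$HOME/.aspera/connect}"
    bin="$root/bin/ascp"
    key="$root/etc/asperaweb_id_dsa.openssh"
    if [ ! -x "$bin" ]; then
        echo "ascp not found or not executable: $bin" >&2
        return 1
    fi
    if [ ! -f "$key" ]; then
        echo "private key not found: $key" >&2
        return 1
    fi
    printf '%s|%s\n' "$bin" "$key"
}

# Print the prefetch command only when both files exist.
if arg=$(ascp_path_arg); then
    echo "prefetch -t ascp -a \"$arg\" SRR390728"
fi
```

If either file is missing, the helper reports which one on stderr and nothing is printed, so the problem shows up before prefetch is started.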

prefetch -c SRR390728

This command checks the availability of all reference sequences needed for the given accession (-c), that is, whether the run can be downloaded.

=====================================

Batch downloading SRRxxxxxx runs

# How to download multiple files? Create a file containing the SRR run IDs.

echo SRR1553608 > sra.ids

echo SRR1553605 >> sra.ids

# Use this file to prefetch the corresponding runs.

prefetch --option-file sra.ids

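# Unpack all downloaded files. Note that the approach below is not ideal: besides the runs downloaded via sra.ids, the folder may contain other files fetched by earlier prefetch calls.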

fastq-dump --split-files ~/ncbi/public/sra/SRR15536*
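One way around the wildcard problem is to build the list of files from sra.ids itself. A minimal sketch, assuming prefetch stored each run as ~/ncbi/public/sra/<ID>.sra as in the command above; sra_paths_from_ids is a hypothetical helper:

```shell
#!/bin/sh
# Hypothetical helper: print the local .sra path for every run ID in an ID file.
# Assumption: prefetch stored the runs as <dir>/<ID>.sra (default ~/ncbi/public/sra).
sra_paths_from_ids() {
    ids_file="$1"
    sra_dir="${2:-$HOME/ncbi/public/sra}"
    while IFS= read -r id || [ -n "$id" ]; do
        [ -z "$id" ] && continue          # skip blank lines
        printf '%s/%s.sra\n' "$sra_dir" "$id"
    done < "$ids_file"
}

# Usage (commented out because it needs the downloaded runs and fastq-dump):
# sra_paths_from_ids sra.ids | while IFS= read -r f; do
#     fastq-dump --split-files "$f"
# done
```

Only the runs named in sra.ids are converted, regardless of what else is in the download folder.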

--split-files: With this option, a single paired-end SRR file is dumped as SRRxxx_1.fastq and SRRxxx_2.fastq.

--split-3: Splits the SRR run into three files: one for read 1, one for read 2, and one for any orphan reads (i.e. reads that aren't present in both files). This is important for downstream analysis, as some aligners require paired reads to be in sync (i.e. present in each file at the same line number), and orphan reads can throw this order off.
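A quick sanity check after dumping is to compare the number of reads in the two mate files. A minimal sketch, assuming standard four-line FASTQ records; fastq_read_count and pairs_in_sync are hypothetical helpers, and an equal count is a necessary but not sufficient sign that the pairs are in sync:

```shell
#!/bin/sh
# Hypothetical helper: count reads in a FASTQ file, assuming the usual
# four lines per record (header, sequence, "+", qualities).
fastq_read_count() {
    lines=$(wc -l < "$1")
    echo $((lines / 4))
}

# Hypothetical check: succeed only when both mate files hold the same number of reads.
pairs_in_sync() {
    [ "$(fastq_read_count "$1")" -eq "$(fastq_read_count "$2")" ]
}

# Usage (file names follow the SRRxxx_1.fastq / SRRxxx_2.fastq pattern):
# pairs_in_sync SRR1553608_1.fastq SRR1553608_2.fastq && echo "read counts match"
```

A mismatch suggests that orphan reads ended up in one of the mate files.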
