The BLAST programs are widely used tools for searching DNA and protein databases for sequence similarity to identify homologs to a query sequence. Although often referred to simply as "BLAST", it is really a set of programs: blastp, blastn, blastx, tblastn, and tblastx.
Availability & Restrictions
Versions
The following versions of BLAST are available on OSC systems:
Version | Owens | Pitzer | Cardinal |
---|---|---|---|
2.4.0+ | X | | |
2.8.0+ | X | | |
2.8.1+ | X | | |
2.10.0+ | X* | X* | |
2.11.0+ | X | X | |
2.13.0+ | X | X | |
2.16.0 | X* |
To view the modules available on a given machine, run:
module spider blast
The module name on Cardinal differs, so run the following there instead:
module spider blast-plus
Feel free to contact OSC Help if you need other versions for your work.
If you need to use blastx, you will need to load one of the C++ implementation modules of BLAST (any version with a "+").
Access
BLAST is available to all OSC users. If you have any questions, please contact OSC Help.
Publisher/Vendor/Repository and License Type
National Institutes of Health, Open source
Usage
Set-up
To load BLAST, type the following into the command line:
module load blast
Then create a resource file .ncbirc, and put it under your home directory.
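The text above does not say what the resource file should contain. As a minimal sketch, assuming the goal is to tell BLAST+ where its databases live, the file can hold a [BLAST] section with a BLASTDB entry. The helper name write_ncbirc and the fallback directory $HOME/blastdb are hypothetical and only serve as an illustration; replace the directory with the real database location.

```shell
#!/bin/bash
# Hypothetical helper (not part of BLAST): write a minimal .ncbirc file.
# Usage: write_ncbirc <rc_file> <database_dir>
write_ncbirc() {
    local rc_file="$1" db_dir="$2"
    if [ -e "$rc_file" ]; then
        # Never clobber an existing configuration file.
        echo "$rc_file already exists; leaving it untouched" >&2
        return 0
    fi
    # [BLAST] section with a BLASTDB entry pointing at the database directory.
    printf '[BLAST]\nBLASTDB=%s\n' "$db_dir" > "$rc_file"
}

# Assumption: use $BLASTDB if it is set, otherwise a placeholder directory.
write_ncbirc "$HOME/.ncbirc" "${BLASTDB:-$HOME/blastdb}"
```

The helper skips the write when the file already exists, so running it again does not overwrite a hand-edited configuration.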
Using BLAST
The five flavors of BLAST mentioned above perform the following tasks:
- blastp: compares an amino acid query sequence against a protein sequence database
- blastn: compares a nucleotide query sequence against a nucleotide sequence database
- blastx: compares the six-frame conceptual translation products of a nucleotide query sequence (both strands) against a protein sequence database
- tblastn: compares a protein query sequence against a nucleotide sequence database dynamically translated in all six reading frames (both strands)
- tblastx: compares the six-frame translations of a nucleotide query sequence against the six-frame translations of a nucleotide sequence database. Due to the nature of tblastx, gapped alignments are not available with this option.
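The mapping in the list above can be summarized as a small lookup on the query type and the database type. The following is only an illustrative sketch: pick_blast_program is a hypothetical helper, not something BLAST or the OSC modules provide, and the optional third argument "translated" is an assumed convention for choosing tblastx over blastn.

```shell
#!/bin/bash
# Hypothetical helper (not part of BLAST): map query/database types to a program.
# Usage: pick_blast_program <query_type> <db_type> [translated]
#   query_type, db_type: "protein" or "nucleotide"
pick_blast_program() {
    local query_type="$1" db_type="$2" mode="${3:-direct}"
    case "${query_type}:${db_type}:${mode}" in
        protein:protein:*)                echo blastp  ;;
        nucleotide:nucleotide:translated) echo tblastx ;;  # both sides translated
        nucleotide:nucleotide:*)          echo blastn  ;;
        nucleotide:protein:*)             echo blastx  ;;  # query is translated
        protein:nucleotide:*)             echo tblastn ;;  # database is translated
        *)
            echo "unknown combination: $*" >&2
            return 1
            ;;
    esac
}

pick_blast_program protein nucleotide   # prints tblastn
```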
NCBI BLAST Database
Information on the NCBI BLAST database can be found at https://www.osc.edu/resources/available_software/scientific_database_list/blast_database
We provide local access to the nt and refseq_protein databases. You can access a database by loading the desired blast-database module. If you need other databases, please send a request email to OSC Help.
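After loading a blast-database module, it can help to check which databases are actually visible. The sketch below lists database names in a directory by looking for BLAST index and alias files (.nin and .pin, .nal and .pal). It rests on assumptions: list_blast_dbs is a hypothetical helper, and the code assumes the module exports a BLASTDB variable holding a single directory, which the text above does not confirm.

```shell
#!/bin/bash
# Hypothetical helper (not part of BLAST or the OSC modules): list database
# names in a directory, based on BLAST index (.nin/.pin) and alias (.nal/.pal) files.
# Usage: list_blast_dbs <directory>
list_blast_dbs() {
    local dir="$1" f name
    for f in "$dir"/*.nal "$dir"/*.pal "$dir"/*.nin "$dir"/*.pin; do
        [ -e "$f" ] || continue      # skip patterns that matched nothing
        name="${f##*/}"              # drop the directory part
        echo "${name%.*}"            # drop the file extension
    done | sed -E 's/\.[0-9]+$//' | sort -u   # fold volumes such as nt.00 into nt
}

# Assumption: BLASTDB holds one directory; fall back to the current directory.
list_blast_dbs "${BLASTDB:-.}"
```

The names printed this way are the values that would be passed to the -db option, for example nt.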
Batch Usage
A sample batch script on Owens and Pitzer is below:
    #!/bin/bash
    ## --ntasks-per-node can be increased up to 48 on Pitzer
    #SBATCH --nodes=1 --ntasks-per-node=28
    #SBATCH --time=00:10:00
    #SBATCH --job-name Blast
    #SBATCH --account=<project-account>

    module load blast
    module load blast-database/2018-08
    cp 100.fasta $TMPDIR
    cd $TMPDIR
    tblastn -db nt -query 100.fasta -num_threads 16 -out 100_tblastn.out
    cp 100_tblastn.out $SLURM_SUBMIT_DIR
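The sample script requests 28 tasks per node but passes a fixed thread count to tblastn. One possible variation, shown here only as a sketch, derives the -num_threads value from the Slurm environment so the two stay in step when the resource request changes. The helper name blast_threads is hypothetical; SLURM_NTASKS_PER_NODE and SLURM_CPUS_ON_NODE are standard Slurm job variables, and the fallback of 1 is an assumption for runs outside a job.

```shell
#!/bin/bash
# Hypothetical helper: choose a thread count from the Slurm job environment.
# Prefers the requested tasks per node, then the CPUs on the node, then 1.
blast_threads() {
    echo "${SLURM_NTASKS_PER_NODE:-${SLURM_CPUS_ON_NODE:-1}}"
}

threads="$(blast_threads)"

# Print the command that would be run; inside a batch script, drop the echo.
echo tblastn -db nt -query 100.fasta -num_threads "$threads" -out 100_tblastn.out
```

Whether using every allocated core actually speeds up a given search depends on the query and database, so the fixed value in the sample script may still be a reasonable choice.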