BLAST

The BLAST programs are widely used tools for searching DNA and protein databases for sequence similarity in order to identify homologs of a query sequence. Although often referred to simply as "BLAST", it is really a set of programs: blastp, blastn, blastx, tblastn, and tblastx.

Availability & Restrictions

Versions

The following versions of BLAST are available on OSC systems: 

Version    Owens    Pitzer    Cardinal
2.4.0+     X
2.8.0+              X
2.8.1+     X
2.10.0+    X*       X*
2.11.0+    X        X
2.13.0+    X        X
2.16.0                        X*
* Current Default Version

You can use module spider blast to view available modules for a given machine. The module name differs on Cardinal; use module spider blast-plus to view the modules available there. Feel free to contact OSC Help if you need other versions for your work.

If you need to use blastx, you will need to load one of the C++ implementation modules of BLAST (any version with a "+").

Access

BLAST is available to all OSC users. If you have any questions, please contact OSC Help.

Publisher/Vendor/Repository and License Type

National Institutes of Health, Open source

Usage

Set-up

To load BLAST, type the following into the command line:

module load blast

Then create a resource file named .ncbirc and place it in your home directory.
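As a minimal sketch of that step, the commands below write a .ncbirc using the [BLAST] section and BLASTDB key described in the NCBI BLAST+ configuration documentation. The database directory /path/to/blast/db is a hypothetical placeholder, and the BLASTDB_DIR and NCBIRC variables are names chosen for this example only; substitute the location that applies to your account.

```shell
# Hypothetical database directory; replace with the real location on your system.
BLASTDB_DIR="${BLASTDB_DIR:-/path/to/blast/db}"
# Resource file location; BLAST+ looks for .ncbirc in the home directory.
NCBIRC="${NCBIRC:-$HOME/.ncbirc}"

# Keep a copy of any existing resource file before overwriting it.
if [ -e "$NCBIRC" ]; then
  cp "$NCBIRC" "$NCBIRC.bak"
fi

# Write the [BLAST] section with the database search path.
cat > "$NCBIRC" <<EOF
[BLAST]
BLASTDB=$BLASTDB_DIR
EOF

# Show the result for a quick visual check.
cat "$NCBIRC"
```

If a blast-database module already configures the database path for you, this file may only need to exist; that behavior is an assumption worth confirming with OSC Help.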

Using BLAST

The five flavors of BLAST mentioned above perform the following tasks:

  • blastp: compares an amino acid query sequence against a protein sequence database

  • blastn: compares a nucleotide query sequence against a nucleotide sequence database

  • blastx: compares the six-frame conceptual translation products of a nucleotide query sequence (both strands) against a protein sequence database

  • tblastn: compares a protein query sequence against a nucleotide sequence database dynamically translated in all six reading frames (both strands).

  • tblastx: compares the six-frame translations of a nucleotide query sequence against the six-frame translations of a nucleotide sequence database. (Due to the nature of tblastx, gapped alignments are not available with this option)
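The mapping above from query and database type to program can be sketched as a small shell helper. The function name pick_blast_program and the labels nucl, prot, and translated are hypothetical names used only for this illustration; they are not part of BLAST itself.

```shell
# pick_blast_program QUERY_TYPE DB_TYPE [translated]
# QUERY_TYPE and DB_TYPE are "nucl" or "prot" (labels chosen for this sketch).
# Pass "translated" as a third argument to compare six-frame translations
# of a nucleotide query against a nucleotide database (tblastx).
pick_blast_program() {
  case "$1:$2" in
    prot:prot) echo blastp ;;
    nucl:nucl)
      if [ "${3:-}" = "translated" ]; then
        echo tblastx
      else
        echo blastn
      fi
      ;;
    nucl:prot) echo blastx ;;
    prot:nucl) echo tblastn ;;
    *)
      echo "unknown combination: $1 $2" >&2
      return 1
      ;;
  esac
}

# Example: a nucleotide query against a protein database.
pick_blast_program nucl prot
# Example: a protein query against a nucleotide database.
pick_blast_program prot nucl
```

Whichever program is selected is then run with the usual -query, -db, and -out options.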

NCBI BLAST Database

Information on the NCBI BLAST database can be found at https://www.osc.edu/resources/available_software/scientific_database_list/blast_database

We provide local access to the nt and refseq_protein databases. You can access a database by loading the desired blast-database module. If you need other databases, please send a request email to OSC Help.

Batch Usage

A sample batch script on Owens and Pitzer is below:

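#!/bin/bash
## --ntasks-per-node can be increased up to 48 on Pitzer
#SBATCH --nodes=1 --ntasks-per-node=28
#SBATCH --time=00:10:00
#SBATCH --job-name=Blast
#SBATCH --account=<project-account>

# Load BLAST and the local database module
module load blast
module load blast-database/2018-08

# Stage the query file into the job's temporary directory
cp 100.fasta "$TMPDIR"
cd "$TMPDIR"

# Run tblastn against the nt database with 16 threads
tblastn -db nt -query 100.fasta -num_threads 16 -out 100_tblastn.out

# Copy the results back to the directory the job was submitted from
cp 100_tblastn.out "$SLURM_SUBMIT_DIR"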
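
The sample script requests 28 tasks per node but passes a fixed -num_threads 16. One possible way to keep the two in step, sketched here under the assumption that Slurm exports SLURM_NTASKS_PER_NODE or SLURM_CPUS_ON_NODE inside the job, is to derive the thread count from the allocation. The variable name threads is chosen for this example, and the guard only skips the search when tblastn or the query file is missing.

```shell
# Prefer the requested tasks per node, then the CPUs Slurm reports for the node,
# and fall back to 1 thread when run outside a Slurm job.
threads="${SLURM_NTASKS_PER_NODE:-${SLURM_CPUS_ON_NODE:-1}}"
echo "Running tblastn with $threads thread(s)"

# Only run the search when tblastn is on the PATH and the query file exists.
if command -v tblastn >/dev/null 2>&1 && [ -f 100.fasta ]; then
  tblastn -db nt -query 100.fasta -num_threads "$threads" -out 100_tblastn.out
else
  echo "tblastn or 100.fasta not found; skipping the search" >&2
fi
```

Whether using every allocated core actually speeds up a given search depends on the query and database, so it may be worth timing a short run before settling on a value.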
