Fitmodel

"'Fitmodel' estimates the parameters of various codon-based models of substitution, including those described in Guindon, Rodrigo, Dyer and Huelsenbeck (2004).  These models are especially useful as they accommodate site-specific switches between selection regimes without a priori knowledge of the positions in the tree where changes of selection regimes occurred.

The program will ask for two input files: a tree file and a sequence file.  The tree should be unrooted and in NEWICK format.  The sequences should be in PHYLIP interleaved or sequential format.  If you are planning to use codon-based models, the sequence length should be a multiple of 3.  The program provides four types of codon models: M1, M2, M2a, and M3 (see PAML manual).  Moreover, M2, M2a and M3 can be combined with 'switching' models (option 'M').  Two switching models are implemented: S1 and S2.  S1 constrains the rates of change between dN/dS values to be uniform (e.g., the rate of change between negative and positive selection is constrained to be the same as the rate of change between neutrality and positive selection) while S2 allows for different rates of change between the different classes of dN/dS values.

If you are using a 'switching' model, 'fitmodel' will output files with the following names: your_sequence_file_trees_w1, your_sequence_file_trees_w2, your_sequence_file_trees_w3 and your_sequence_file_trees_wbest.  The w1, w2 and w3 files give the estimated tree with probabilities of w1, w2, and w3 (three maximum likelihood dN/dS ratio estimates) calculated on each edge of the tree and for each site.  Hence, the first tree in one of these files reports the probabilities calculated at the first site of the alignment.  Instead of probabilities, the wbest file allows you to identify which of the three dN/dS ratios is the most probable on any given edge, at any given site.  A branch with label 0.0 means that w1 is the most probable class, 0.5 indicates that w2 is the most probable and 1.0 means that w3 has the highest posterior probability." (README.txt)
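
The output layout described in the README can be explored with a few small shell helpers. This is only a sketch: the function names are hypothetical, it assumes each output file holds one NEWICK tree per line in site order, and it assumes the wbest labels are written exactly as 0.0, 0.5 and 1.0 (the real files may format them differently).

```shell
# List the output file names expected for a sequence file,
# following the naming pattern quoted from the README.
switching_outputs() {
    for suffix in w1 w2 w3 wbest; do
        echo "${1}_trees_${suffix}"
    done
}

# Print the tree reported for a given 1-based alignment site.
# Assumption: one NEWICK tree per line, in site order.
tree_for_site() {
    sed -n "${2}p" "$1"
}

# Translate a wbest branch label into the dN/dS class it denotes.
# Assumption: labels appear literally as 0.0, 0.5 or 1.0.
wbest_class() {
    case $1 in
        0.0) echo w1 ;;
        0.5) echo w2 ;;
        1.0) echo w3 ;;
        *)   echo "unknown label: $1" >&2; return 1 ;;
    esac
}
```

For example, `tree_for_site brown.phylip_trees_wbest 10` would print the tree for the tenth site, provided the file follows the assumed layout.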

Availability & Restrictions

Fitmodel is available to all OSC users without restriction.

The following versions of fitmodel are available on OSC systems:

Version    Glenn    Oakley
0.5.3      X

Usage

Set-up

On the Glenn Cluster fitmodel is accessed by executing the following commands:

module load biosoftw
module load fitmodel

Using fitmodel

Loading the module adds fitmodel to the user's PATH.  It can then be run with the following command:

fitmodel -treefile treefilename -seqfile seqfilename [options]

Options

-type nt or aa (default=nt)
-freq empirical or ml or uniform or F3X4 (default=empirical or F3X4)
-codon no or yes (default=no)
-model JC69, K80, F81, HKY85, F84, TN93, GTR, Dayhoff, JTT, MtREV, WAG, DCMut, M2 or M3 (default=HKY85)
-pinvar [0.0;1.0]
-optpinvar no or yes (default=no)
-kappa [0.01;100.0]
-optkappa no or yes (default=no)
-ncatg integer > 0
-alpha [0.01;100.0]
-optalpha no or yes (default=no)
-code 1,2,3,4,5,6,9,10,11,12,13,14,15,16,21,22,23 (see NCBI Taxonomy webpage) (default=yes)
-p1 [0.0;1.0]
-p2 [0.0;1.0]
-p3 [0.0;1.0]
-w1 [1E-7;1E+7]
-w2 [1E-7;1E+7]
-w3 [1E-7;1E+7]
-switches no or S1 or S2 (default=no)
-optpw yes or no (default=yes)
-multiple integer > 0
-interleaved yes or no (default=yes)
-optall yes or no (default=yes)
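
As an illustration of how these options combine, the sketch below assembles a command line for the M3 codon model with the S1 switching model. It uses only flags from the list above; that codon models need `-codon yes`, and the file names themselves, are assumptions. The command is printed rather than executed so it can be inspected first.

```shell
# Input files (placeholder names; substitute your own).
tree=brown.newick
seqs=brown.phylip

# M3 codon model combined with the S1 switching model.
# Assumption: codon models require "-codon yes".
cmd="fitmodel -treefile $tree -seqfile $seqs -codon yes -model M3 -switches S1"

# Show the command before running it.
echo "$cmd"

# To run it, answering the confirmation prompt as in the batch example:
#   echo "y" | $cmd
```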

Batch Usage

The following batch script uses PAML's example files brown.trees and brown.nuc, converted to NEWICK and PHYLIP formats respectively (saved here as brown.newick and brown.phylip).

#PBS -N fitmodel_test
#PBS -l walltime=00:05:00
#PBS -l nodes=1:ppn=4
#PBS -j oe

module load biosoftw
module load fitmodel
cd $PBS_O_WORKDIR
echo "y" | fitmodel -treefile brown.newick -seqfile brown.phylip
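
Before submitting a codon-model job, it may be worth confirming that the alignment length is a multiple of 3, as the README requires. The helper below is a minimal sketch with a hypothetical function name. It assumes the first line of the PHYLIP file is the usual header giving the number of taxa followed by the number of sites.

```shell
# Check that the site count in a PHYLIP header is a multiple of 3.
# Assumption: the first line reads "<number of taxa> <number of sites>".
check_codon_length() {
    # Read the header fields; ntaxa is read only to reach nsites.
    read -r ntaxa nsites rest < "$1" || true
    case $nsites in
        ''|*[!0-9]*)
            echo "could not read site count from $1" >&2
            return 2 ;;
    esac
    if [ $(( nsites % 3 )) -eq 0 ]; then
        echo "ok: $nsites sites ($(( nsites / 3 )) codons)"
    else
        echo "warning: $nsites sites is not a multiple of 3" >&2
        return 1
    fi
}
```

Calling `check_codon_length brown.phylip` in the batch script, ahead of the fitmodel line, would flag an unsuitable alignment early.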
