Commit ac87a674 authored by Aaron Petkau

Instructions and files for phyml

parent 06af939a
Galaxy Core Phylogenomics Pipeline
This repository contains the Galaxy tool definitions and workflow definitions needed to install the [Core Phylogenomics Pipeline][Core Phylogenomis Pipeline] into Galaxy. It has two main sections: a set of tools under `tools/` and a workflow implementing the core phylogenomics pipeline under `workflows/`. These can be packaged and uploaded to a [Galaxy Tool Shed][] and later installed into an instance of Galaxy. Instructions for setting up your own local Galaxy Tool Shed and Galaxy can be found at [IRIDA Galaxy Setup][].
Philip Mabon, Aaron Petkau
Installing the Tools
The tools under the `tools/` directory can be installed to a Galaxy Tool Shed and then to Galaxy using the following steps.
Step 1: Building Tool Shed Packages
......@@ -53,7 +53,31 @@ Once you have uploaded the packages to a Galaxy Tool Shed, you can install to a
3. Click on the arrow next to the tool and click on **Preview and install**.
4. Wait for Galaxy to install your tool.
Step 5: Test out your tool in Galaxy
Step 5: Install Additional Dependencies
The following additional dependency packages included in this repository need to be installed to the Galaxy Tool Shed, and then into Galaxy. These are located under `tools/`. The full list of dependencies is:
* phyml
In order to install the dependencies, use the following steps.
1. Build dependencies using ``.
2. Log into a Galaxy Tool Shed and create repositories for each dependency.
3. Upload tarball packages for each dependency.
4. Log into Galaxy and install dependency packages.
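
The build step can be sketched roughly as follows. This mirrors the `tar -C <dir> -czf <out> .` pattern used by this repository's build script; the function name `build_dependency_tarballs` and its arguments are illustrative assumptions, not something the repository provides.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repository): package each named
# dependency directory under <tools_dir> into <build_dir>/<name>.tar.gz.
build_dependency_tarballs() {
    tools_dir=$1
    build_dir=$2
    shift 2
    mkdir -p "$build_dir" || return 1
    for dep in "$@"; do
        if [ ! -d "$tools_dir/$dep" ]; then
            echo "Missing dependency directory: $tools_dir/$dep" >&2
            return 1
        fi
        # Archive the directory contents rather than the directory itself,
        # matching the tar invocation in the build script.
        tar -C "$tools_dir/$dep" -czf "$build_dir/$dep.tar.gz" . || return 1
    done
    echo "Successfully built tarballs"
}
```

Assuming it is run from the repository root, a call such as `build_dependency_tarballs tools build phyml` would produce `build/phyml.tar.gz`, which can then be uploaded in step 3.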
Step 6: Install Dependency Packages to Galaxy
Some dependency packages need to be installed separately to Galaxy. This can be accomplished with the following steps from within your running Galaxy instance.
1. Go to **Admin** and then **Search and browse tool sheds**.
2. Find the **Galaxy main tool shed** and click on **Browse valid repositories**.
3. Search for `msa_datatypes`. This should return a package containing a collection of datatypes for multiple sequence alignments.
4. Install this package to Galaxy. The particular data type we are looking for is the `phylip` datatype.
Step 7: Test out your tool in Galaxy
Once you've finished installing your tool, you should be able to test it out within Galaxy. You can automate this by running the functional tests with the following commands, which are adapted from the [Testing Installed Tools][] document.
......@@ -66,6 +90,29 @@ $ sh -installed
This should generate a report in the file `run_functional_tests.html`.
Installing the Workflow
Once the tools are installed, the workflow under `workflows/` can be installed using the following steps.
Step 1: Upload Workflow to Tool Shed
The workflow can be uploaded to a Galaxy Tool Shed using the following steps.
1. Run the script ``. This will generate a file `build/core_phylogenomics_pipeline_workflow.tar.gz` containing the workflow.
2. In the Galaxy Tool Shed, create a new repository to contain your workflow.
3. Click the **Upload files to repository** button at the top right and upload the file containing the workflow, `build/core_phylogenomics_pipeline_workflow.tar.gz`.
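
Before uploading, it can help to confirm that the archive is readable and actually contains files. The sketch below is one way to do that; `check_tarball` is a hypothetical helper, not part of this repository, and it assumes the archive was packed with `tar -C <dir> -czf <out> .` as in the build script, so the top-level entry is `./`.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repository): succeed only if the
# given .tar.gz can be listed and holds at least one entry besides "./".
check_tarball() {
    tarball=$1
    if [ ! -f "$tarball" ]; then
        echo "No such file: $tarball" >&2
        return 1
    fi
    # Listing fails if the file is not a valid gzip-compressed tar archive.
    entries=$(tar -tzf "$tarball") || return 1
    # Count entries, ignoring blank lines and the bare top-level directory.
    count=$(printf '%s\n' "$entries" |
        awk '$0 != "" && $0 != "./" { n++ } END { print n + 0 }')
    if [ "$count" -eq 0 ]; then
        echo "Archive is empty: $tarball" >&2
        return 1
    fi
    echo "$tarball: $count entries"
}
```

For example, `check_tarball build/core_phylogenomics_pipeline_workflow.tar.gz` returns a non-zero status if the build step produced an empty or unreadable archive.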
Step 2: Install Workflow from Tool Shed to Galaxy
To install a workflow from the Tool Shed into a running Galaxy instance please use the following steps.
1. From the Galaxy instance go to **Admin** and then to **Search and browse tool sheds**.
2. Find the particular tool shed containing your workflow and then find the workflow repository you just uploaded.
3. Install this workflow into Galaxy.
[Core Phylogenomis Pipeline]:
[Galaxy Tool Shed]:
[Testing Installed Tools]:
......@@ -14,6 +14,8 @@ then
tar -C $TOOLS_DIR/core_phylogenomics_pipeline -czf $BUILD_DIR/core_phylogenomics_pipeline.tar.gz .
tar -C $TOOLS_DIR/phyml -czf $BUILD_DIR/phyml.tar.gz .
tar -C $WORKFLOWS_DIR/core_phylogenomics_pipeline_workflow -czf $BUILD_DIR/core_phylogenomics_pipeline_workflow.tar.gz .
echo "Successfully built tarballs"
shift 3
directory=$(dirname "$0")
phyml_3.1 -i "$input_tree" "$@";
mv "${input_tree}${tree_suffix}" "${output_tree}";
mv "${input_tree}${stat_suffix}" "${output_stat}";
<tool id="phyml1" name="PhyML" version="3.1">
<requirement type="package" version="3.1">phyml</requirement>
<description>, a ML tree builder</description>
<command interpreter="bash">./ $input $output_tree $output_stats -d $datatype_condition.type -m $datatype_condition.model -v $prop_invar -s $search
#if $datatype_condition.type == "nt":
-t $datatype_condition.tstv
#end if
#if $gamma_condition.gamma == "yes":
-c $gamma_condition.categories -a $gamma_condition.shape
#else
-c 1
#end if
#if $ == "sh":
-b -4
#end if
#if $ == "boot":
-b $support_condition.boot_number
#end if
#if $ == "no":
-b 0
#end if
#if $random_condition.random == "yes":
--rand_start 0 --n_rand_starts $random_condition.points
#end if
<param format="phylip" name="input" type="data" label="Alignment in phylip format" help="Alignment in phylip format"/>
<conditional name="datatype_condition">
<param type="select" name="type" label="Data type">
<option value="nt">Nucleic acids</option>
<option value="aa">Amino acids</option>
<when value="nt">
<param type="select" name="model" label="Evolution model">
<option value="HKY85">HKY85</option>
<option value="JC69">JC69</option>
<option value="K80">K80</option>
<option value="F81">F81</option>
<option value="F84">F84</option>
<option value="TN93">TN93</option>
<option value="GTR">GTR</option>
<param type="text" name="tstv" help="Must be a positive number, or 'e' to have PhyML estimate it" value="e" label="Transition/transversion ratio" />
<when value="aa">
<param type="select" name="model" label="Evolution model">
<option value="LG">LG</option>
<option value="WAG">WAG</option>
<option value="JTT">JTT</option>
<option value="MtREV">MtREV</option>
<option value="Dayhoff">Dayhoff</option>
<option value="DCMut">DCMut</option>
<option value="RtREV">RtREV</option>
<option value="CpREV">CpREV</option>
<option value="VT">VT</option>
<option value="Blosum62">Blosum62</option>
<option value="MtMam">MtMam</option>
<option value="MtArt">MtArt</option>
<option value="HIVw">HIVw</option>
<option value="HIVb">HIVb</option>
<conditional name="gamma_condition">
<param type="select" name="gamma" label="Discrete gamma model">
<option value="yes">Use a gamma model</option>
<option value="no">Don't use a gamma model</option>
<when value="yes">
<param type="text" name="categories" help="1 means no gamma model" value="4" label="Number of categories for the discrete gamma model" />
<param type="text" name="shape" help="'e' if you want PhyML to estimate it" value="e" label="Shape parameter of the gamma model" />
<when value="no">
<conditional name="support_condition">
<param type="select" name="support" label="Branch support">
<option value="sh">SH-like aLRT</option>
<option value="boot">Bootstrap</option>
<option value="no">No branch support</option>
<when value="sh">
<when value="boot">
<param type="text" name="boot_number" help="Must be a positive integer" value="100" label="Number of bootstrap replicates" />
<when value="no">
<param type="text" name="prop_invar" help="0.0 to ignore this parameter, 'e' if you want PhyML to estimate it" value="0.0" label="Proportion of invariant sites" />
<param type="select" name="search" label="Tree topology search operation">
<option value="NNI">NNI (Nearest Neighbor Interchange)</option>
<option value="SPR">SPR (Subtree Pruning and Regrafting)</option>
<option value="BEST">Best of NNI and SPR</option>
<conditional name="random_condition">
<param type="select" name="random" label="Random starting points">
<option value="no">Don't add random starting points</option>
<option value="yes">Add random starting points</option>
<when value="yes">
<param type="text" name="points" help="A number greater than 0" value="4" label="Number of random starting points" />
<when value="no">
<data format="txt" name="output_tree" label="Newick Tree"/>
<data format="txt" name="output_stats" label="Phyml statistics output"/>
.. class:: infomark
**Program encapsulated in Galaxy by Southgreen**
.. class:: infomark
**PhyML version 3.0, 2010**
Please cite:
"New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0", **Guindon S., Dufayard JF., Anisimova M., Hordijk W., Lefort V., Gascuel O.**, Systematic Biology 2010 59(3):307-321; doi:10.1093/sysbio/syq010.
"A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood.", **Guindon S., Gascuel O.**, Systematic Biology, 52(5):696-704, 2003.
PhyML is a phylogeny software based on the maximum-likelihood principle. Early PhyML versions used a fast algorithm to perform Nearest Neighbor Interchanges (NNIs), in order to improve a reasonable starting tree topology. Since the original publication (Guindon and Gascuel 2003), PhyML has been widely used (~4,000 citations in ISI Web of Science), due to its simplicity and a fair accuracy/speed compromise. In the meantime, research around PhyML has continued.
We designed an efficient algorithm to search the tree space using Subtree Pruning and Regrafting (SPR) topological moves (Hordijk and Gascuel 2005), and proposed a fast branch test based on an approximate likelihood ratio test (Anisimova and Gascuel 2006). However, these novelties were not included in the official version of PhyML, and we found that improvements were still needed in order to make them effective in some practical cases. PhyML 3.0 achieves this task.
It implements new algorithms to search the space of tree topologies with user-defined intensity. A non-parametric, Shimodaira-Hasegawa-like branch test is also available. The program provides a number of new evolutionary models and its interface was entirely re-designed. We tested PhyML 3.0 on a large collection of real data sets to ensure that the new version is stable, ready-to-use and still reasonably fast and accurate.
For further information, please visit the PhyML_ website.
.. _PhyML:
<?xml version="1.0"?>
<package name="phyml" version="3.1">
<install version="1.0">
<action type="download_by_url"></action>
<action type="shell_command">mv PhyML-3.1_linux64 phyml_3.1</action>
<action type="move_file">
<action type="chmod">
<file mode="750">$INSTALL_DIR/bin/phyml_3.1</file>
<action type="set_environment">
<environment_variable name="PATH" action="prepend_to">$INSTALL_DIR/bin</environment_variable>
FreeBayes requires g++ and the standard C and C++ development libraries.
Additionally, cmake is required for building the BamTools API.