Commit a650faec authored by Philip Mabon's avatar Philip Mabon

Merge branch 'feature/core-pipeline-workflow' into 'development'

Feature/core pipeline workflow

Branch containing all tools + instructions for installing the core snp pipeline.
parents 06af939a 5ca985ab
build/*
workflows/core_phylogenomics_pipeline_workflow/Galaxy-Workflow-Core_SNP_Pipeline.ga
Galaxy Core Phylogenomics Pipeline
==================================
This contains the Galaxy tool definitions and workflow definitions needed to install the [Core Phylogenomics Pipeline][] into Galaxy. This repository contains two main sections: a set of tools under `tools/` and a workflow implementing the core phylogenomics pipeline under `workflows/`. These can be packaged up and uploaded into a [Galaxy Tool Shed][] and then later installed to an instance of Galaxy. Instructions on how to install your own local Galaxy Tool Shed and Galaxy can be found at [IRIDA Galaxy Setup][].
Authors
=======
Philip Mabon, Aaron Petkau
Install
=======
Installing the Tools
====================
The Core Phylogenomics Pipeline makes use of a mixture of local (installed in a local Galaxy Tool Shed) tools as well as tools in the main [Galaxy Tool Shed][]. To install these dependency tools please follow the steps in both sections below.
Installing Local Tools
----------------------
The local tools are meant to be installed in a local Galaxy Tool Shed. These are located under the `tools/` directory and include the following:
* **phyml**
* **smalt_collection**
* **freebayes**
* **core-pipeline tools**
These can be installed to a Galaxy Tool Shed and then to Galaxy using the following steps.
### Step 1: Building Tool Shed Packages
In order to build packages that can be uploaded to a Galaxy Tool Shed please run the following command.
```bash
$ ./build_for_toolshed.sh
```
This will build a number of `.tar.gz` files within the `build/` directory that can then be uploaded into a Galaxy Tool Shed.
### Step 2: Creating Repositories for Tool Shed Packages
In order to install tools into your own local instance of a Galaxy Tool Shed, you first need to create empty repositories. This can be accomplished by:
1. Log into your Galaxy Tool Shed. On the left panel please find and click on the **Create new repository** link.
2. Fill out the name of the repository, for example for `core_phylogenomics_pipeline.tar.gz` you can fill out **core_phylogenomics_pipeline** (please make sure to name the repository the same as the tarball minus `.tar.gz`). Fill out any other information.
3. Click on **Save**.
4. Repeat for any other files under `build/`.
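Since each repository must use the same name as its tarball minus `.tar.gz`, the expected repository name can be derived from the filename. A minimal sketch (the example tarball path assumes the packages were built into `build/` by `build_for_toolshed.sh`):

```shell
# Sketch: derive the Tool Shed repository name from a built package filename.
# The repository name is the tarball filename with the ".tar.gz" suffix removed.
tarball="build/core_phylogenomics_pipeline.tar.gz"
name=$(basename "$tarball" .tar.gz)
echo "$name"   # core_phylogenomics_pipeline
```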
### Step 3: Upload Tool Shed Packages
1. Find and click on one of your new empty repositories.
2. In the upper right click on **Upload files to repository**.
3. From here **Browse** to one of the tool shed packages under `build/` and upload this package.
4. In the upper right corner under **Repository Actions** click on **Reset all repository metadata**. You should now see a screen listing the tools and dependencies of this repository.
### Step 4: Install Packages to Galaxy
Once you have uploaded the packages to a Galaxy Tool Shed, you can install to a local version of Galaxy linked up to the Tool Shed by following the below steps.
1. In Galaxy go to **Admin** and then **Search and browse tool sheds**.
2. Find your local tool shed and browse to the repository you wish to install.
3. Click on the arrow next to the tool and click on **Preview and install**.
4. Wait for Galaxy to install your tool.
Installing External Tools
-------------------------
Once the above local dependency packages have been installed to the Tool Shed, we can begin to install the external dependencies into Galaxy. The list of packages that need to be installed includes:
* http://toolshed.g2.bx.psu.edu/repos/devteam/sam_to_bam/sam_to_bam/1.1.4
* http://toolshed.g2.bx.psu.edu/repos/devteam/samtools_mpileup/samtools_mpileup/0.0.3
* http://toolshed.g2.bx.psu.edu/repos/gregory-minevich/bcftools_view/bcftools_view/0.0.1
* http://toolshed.g2.bx.psu.edu/view/iuc/msa_datatypes
This can be accomplished with the following steps from within your running Galaxy instance.
1. Go to **Admin** and then **Search and browse tool sheds**.
2. Find the **Galaxy main tool shed** and click on **Browse valid repositories**.
3. Do a search for each one of the packages. This should give you a page to install the package.
4. Install the package into Galaxy.
5. Repeat for each of the packages listed above.
Testing Tools in Galaxy
-----------------------
Once you've finished installing both your local and external tools, you should be able to test them out within Galaxy. This can be automated by running the functional tests with the following commands, adapted from the [Testing Installed Tools][] document.
```bash
$ export GALAXY_TOOL_DEPENDENCY_DIR=/path/to/tool-dependencies
$ sh run_functional_tests.sh -installed
```
This should generate a report in the file `run_functional_tests.html`.
Installing the Workflow
=======================
Once the tools are installed, the workflow under `workflows/` can be installed. This can be accomplished using the following steps.
Generating a Galaxy workflow file
---------------------------------
The Core SNP Pipeline workflow is stored as a Galaxy workflow file, which contains references to all of the tools used as well as the tool sheds used to install these tools. For example, freebayes is referred to as `galaxy-shed.corefacility.ca/repos/phil/freebayes/freebayes/0.0.4`. If you have installed any of the local tools in a differently named tool shed, then this full path will not work. To work around this issue, a template file for the workflow is included at `workflows/core_phylogenomics_pipeline_workflow/Galaxy-Workflow-Core_SNP_Pipeline.ga.tt`. We can generate the Galaxy-usable workflow file from this template file by using a command similar to:
```bash
$ perl generate_galaxy_workflow.pl --local-toolshed localhost:9009/repos/aaron workflows/core_phylogenomics_pipeline_workflow/Galaxy-Workflow-Core_SNP_Pipeline.ga.tt > workflows/core_phylogenomics_pipeline_workflow/Galaxy-Workflow-Core_SNP_Pipeline.ga
```
Please replace `localhost:9009/repos/aaron` with the location and user of the tools under your local tool shed. Once this Galaxy workflow file has been generated, we can either directly upload the workflow to a Galaxy instance or upload it to a Galaxy tool shed using the steps below.
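To illustrate the substitution (the exact attribute layout inside the `.ga` file may differ from this sketch), a tool reference in the template file uses the `[% LOCAL_REPOSITORY %]` placeholder, which the script fills in with the value passed to `--local-toolshed`:

```
"tool_id": "[% LOCAL_REPOSITORY %]/freebayes/freebayes/0.0.4"
```

after running with `--local-toolshed localhost:9009/repos/aaron` becomes:

```
"tool_id": "localhost:9009/repos/aaron/freebayes/freebayes/0.0.4"
```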
Upload Workflow to a Tool Shed
------------------------------
The workflow can be uploaded to a local Tool Shed and then installed to Galaxy using the following steps.
### Step 1: Upload Workflow to Tool Shed
1. Run the script `build_for_toolshed.sh`. This will generate a file `build/core_phylogenomics_pipeline_workflow.tar.gz` containing the workflow.
2. In the Galaxy Tool Shed, create a new repository to contain your workflow.
3. From the button at the top right that says **Upload files to repository** please upload the file containing the workflow `build/core_phylogenomics_pipeline_workflow.tar.gz`.
### Step 2: Install Workflow from Tool Shed to Galaxy
1. From the Galaxy instance go to **Admin** and then to **Search and browse tool sheds**.
2. Find the particular tool shed containing your workflow and then find the workflow repository you just uploaded.
3. Install this workflow into Galaxy.
4. Once the workflow is installed, you should be able to view the workflow from **Admin > Manage installed tool shed repositories**.
5. From here you should be able to view the workflow repository details. You should then be able to click on the workflow **Core SNP Pipeline** to view an image of this workflow.
6. Then, click on **Repository Actions** in the top right corner and there should be an option **Import workflow to Galaxy**.
Upload Workflow Directly to a Galaxy Instance
---------------------------------------------
To upload the workflow directly to a running Galaxy instance the following steps can be performed.
1. Log into Galaxy and in the top menu click on **Workflow**.
2. Click on the button to **Upload or import workflow**.
3. Find and browse for the workflow file `Galaxy-Workflow-Core_SNP_Pipeline.ga`.
4. Upload this workflow into Galaxy.
Updating Workflow
-----------------
If you wish to update the workflow, the template file can be generated with a command like:
```bash
$ perl -pe 's/"[^"]+?core_pipeline\//"[% LOCAL_REPOSITORY %]\/core_pipeline\//; s/"[^"]+?smalt_collection\//"[% LOCAL_REPOSITORY %]\/smalt_collection\//; s/"[^"]+?phyml\//"[% LOCAL_REPOSITORY %]\/phyml\//; s/"[^"]+?freebayes\//"[% LOCAL_REPOSITORY %]\/freebayes\//' path/to/Galaxy-Workflow-Core_SNP_Pipeline.ga > workflows/core_phylogenomics_pipeline_workflow/Galaxy-Workflow-Core_SNP_Pipeline.ga.tt
```
[Core Phylogenomis Pipeline]: https://github.com/apetkau/core-phylogenomics
[Core Phylogenomics Pipeline]: https://github.com/apetkau/core-phylogenomics
[Galaxy Tool Shed]: https://wiki.galaxyproject.org/ToolShed
[Testing Installed Tools]: https://wiki.galaxyproject.org/TestingInstalledTools
then
mkdir $BUILD_DIR
fi
# build all tools under $TOOLS_DIR
for i in $TOOLS_DIR/*
do
name=`basename $i`
tar -C $TOOLS_DIR/$name -czf $BUILD_DIR/$name.tar.gz .
done
tar -C $WORKFLOWS_DIR/core_phylogenomics_pipeline_workflow -czf $BUILD_DIR/core_phylogenomics_pipeline_workflow.tar.gz .
echo "Successfully built tarballs"
#!/usr/bin/env perl
use warnings;
use strict;
use Template;
use Pod::Usage;
use Getopt::Long;
my $pod_sections = "SYNOPSIS|EXAMPLE";
my $local_toolshed;
my $galaxy_template_file;
if (!GetOptions('l|local-toolshed=s' => \$local_toolshed)) {
pod2usage(-verbose => 99, -sections => $pod_sections, -exitval => 1);
}
if (not defined $local_toolshed) {
pod2usage(-message => "no galaxy local toolshed location defined", -sections => $pod_sections, -verbose => 99, -exitval => 1);
}
($galaxy_template_file) = @ARGV;
if (not defined $galaxy_template_file) {
pod2usage(-message=> "no galaxy template file defined", -sections => $pod_sections, -verbose => 99, -exitval => 1);
} elsif (not -e $galaxy_template_file) {
pod2usage(-message=> "galaxy template file $galaxy_template_file does not exist",
-sections => $pod_sections, -verbose => 99, -exitval => 1);
}
my $template = Template->new();
$template->process($galaxy_template_file, { 'LOCAL_REPOSITORY' => $local_toolshed });
=pod
=head1 NAME
generate_galaxy_workflow.pl: Script to generate a Galaxy workflow given the template workflow file.
=head1 SYNOPSIS
=over
=item generate_galaxy_workflow.pl --local-toolshed [local toolshed] [galaxy_workflow_template.ga.tt] > [galaxy_workflow.ga]
=back
=head1 EXAMPLE

=over

=item generate_galaxy_workflow.pl --local-toolshed localhost:9009/repos/aaron workflows/core_phylogenomics_pipeline_workflow/Galaxy-Workflow-Core_SNP_Pipeline.ga.tt > workflows/core_phylogenomics_pipeline_workflow/Galaxy-Workflow-Core_SNP_Pipeline.ga

=back
=head1 DESCRIPTION
=over
=item -l|--local-toolshed: A URL to a local toolshed where the local tools will be installed.
=item [galaxy_workflow_template.ga.tt]: The Galaxy workflow template.
=item [galaxy_workflow.ga]: The output Galaxy workflow.
=back
=head1 AUTHORS
Aaron Petkau <aaron.petkau@phac-aspc.gc.ca>
=cut
#This is a sample file distributed with Galaxy that enables tools
#to use a directory of Samtools indexed sequences data files. You will need
#to create these data files and then create a sam_fa_indices.loc file
#similar to this one (store it in this directory) that points to
#the directories in which those files are stored. The sam_fa_indices.loc
#file has this format (white space characters are TAB characters):
#
#index <seq> <location>
#
#So, for example, if you had hg18 indexed stored in
#/depot/data2/galaxy/sam/,
#then the sam_fa_indices.loc entry would look like this:
#
#index hg18 /depot/data2/galaxy/sam/hg18.fa
#
#and your /depot/data2/galaxy/sam/ directory
#would contain hg18.fa and hg18.fa.fai files:
#
#-rw-r--r-- 1 james universe 830134 2005-09-13 10:12 hg18.fa
#-rw-r--r-- 1 james universe 527388 2005-09-13 10:12 hg18.fa.fai
#
#Your sam_fa_indices.loc file should include an entry per line for
#each index set you have stored. The file in the path does actually
#exist, but it should never be directly used. Instead, the name serves
#as a prefix for the index file. For example:
#
#index hg18 /depot/data2/galaxy/sam/hg18.fa
#index hg19 /depot/data2/galaxy/sam/hg19.fa
phiX174,1411,allele,phiX174,phiX174,A,60,100
phiX174,1412,allele,phiX174,phiX174,G,60,100
phiX174,1413,allele,phiX174,phiX174,C,60,100
phiX174,1414,allele,phiX174,phiX174,G,60,100
phiX174,1415,allele,phiX174,phiX174,C,60,100
phiX174,1416,allele,phiX174,phiX174,C,60,100
phiX174,1417,allele,phiX174,phiX174,G,60,100
phiX174,1418,allele,phiX174,phiX174,T,60,100
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT A
>phiX174
GAGTTTTATCGCTTCCATGACGCAGAAGTTAACACTTTCGGATATTTCTGATGAGTCGAAAAATTATCTT
GATAAAGCAGGAATTACTACTGCTTGTTTACGAATTAAATCGAAGTGGACTGCTGGCGGAAAATGAGAAA
ATTCGACCTATCCTTGCGCAGCTCGAGAAGCTCTTACTTTGCGACCTTTCGCCATCAACTAACGATTCTG
TCAAAAACTGACGCGTTGGATGAGGAGAAGTGGCTTAATATGCTTGGCACGTTCGTCAAGGACTGGTTTA
GATATGAGTCACATTTTGTTCATGGTAGAGATTCTCTTGTTGACATTTTAAAAGAGCGTGGATTACTATC
TGAGTCCGATGCTGTTCAACCACTAATAGGTAAGAAATCATGAGTCAAGTTACTGAACAATCCGTACGTT
TCCAGACCGCTTTGGCCTCTATTAAGCTCATTCAGGCTTCTGCCGTTTTGGATTTAACCGAAGATGATTT
CGATTTTCTGACGAGTAACAAAGTTTGGATTGCTACTGACCGCTCTCGTGCTCGTCGCTGCGTTGAGGCT
TGCGTTTATGGTACGCTGGACTTTGTGGGATACCCTCGCTTTCCTGCTCCTGTTGAGTTTATTGCTGCCG
TCATTGCTTATTATGTTCATCCCGTCAACATTCAAACGGCCTGTCTCATCATGGAAGGCGCTGAATTTAC
GGAAAACATTATTAATGGCGTCGAGCGTCCGGTTAAAGCCGCTGAATTGTTCGCGTTTACCTTGCGTGTA
CGCGCAGGAAACACTGACGTTCTTACTGACGCAGAAGAAAACGTGCGTCAAAAATTACGTGCAGAAGGAG
TGATGTAATGTCTAAAGGTAAAAAACGTTCTGGCGCTCGCCCTGGTCGTCCGCAGCCGTTGCGAGGTACT
AAAGGCAAGCGTAAAGGCGCTCGTCTTTGGTATGTAGGTGGTCAACAATTTTAATTGCAGGGGCTTCGGC
CCCTTACTTGAGGATAAATTATGTCTAATATTCAAACTGGCGCCGAGCGTATGCCGCATGACCTTTCCCA
TCTTGGCTTCCTTGCTGGTCAGATTGGTCGTCTTATTACCATTTCAACTACTCCGGTTATCGCTGGCGAC
TCCTTCGAGATGGACGCCGTTGGCGCTCTCCGTCTTTCTCCATTGCGTCGTGGCCTTGCTATTGACTCTA
CTGTAGACATTTTTACTTTTTATGTCCCTCATCGTCACGTTTATGGTGAACAGTGGATTAAGTTCATGAA
GGATGGTGTTAATGCCACTCCTCTCCCGACTGTTAACACTACTGGTTATATTGACCATGCCGCTTTTCTT
GGCACGATTAACCCTGATACCAATAAAATCCCTAAGCATTTGTTTCAGGGTTATTTGAATATCTATAACA
ACTATTTTAAAGCGCCGTGGATGCCTGACCGTACCGAGGCTAACCCTAATGAGCTTAATCAAGATGATGC
TCGTTATGGTTTCCGTTGCTGCCATCTCAAAAACATTTGGACTGCTCCGCTTCCTCCTGAGACTGAGCTT
TCTCGCCAAATGACGACTTCTACCACATCTATTGACATTATGGGTCTGCAAGCTGCTTATGCTAATTTGC
ATACTGACCAAGAACGTGATTACTTCATGCAGCGTTACCGTGATGTTATTTCTTCATTTGGAGGTAAAAC
CTCTTATGACGCTGACAACCGTCCTTTACTTGTCATGCGCTCTAATCTCTGGGCATCTGGCTATGATGTT
GATGGAACTGACCAAACGTCGTTAGGCCAGTTTTCTGGTCGTGTTCAACAGACCTATAAACATTCTGTGC
CGCGTTTCTTTGTTCCTGAGCATGGCACTATGTTTACTCTTGCGCTTGTTCGTTTTCCGCCTACTGCGAC
TAAAGAGATTCAGTACCTTAACGCTAAAGGTGCTTTGACTTATACCGATATTGCTGGCGACCCTGTTTTG
TATGGCAACTTGCCGCCGCGTGAAATTTCTATGAAGGATGTTTTCCGTTCTGGTGATTCGTCTAAGAAGT
TTAAGATTGCTGAGGGTCAGTGGTATCGTTATGCGCCTTCGTATGTTTCTCCTGCTTATCACCTTCTTGA
AGGCTTCCCATTCATTCAGGAACCGCCTTCTGGTGATTTGCAAGAACGCGTACTTATTCGCCACCATGAT
TATGACCAGTGTTTCCAGTCCGTTCAGTTGTTGCAGTGGAATAGTCAGGTTAAATTTAATGTGACCGTTT
ATCGCAATCTGCCGACCACTCGCGATTCAATCATGACTTCGTGATAAAAGATTGAGTGTGAGGTTATAAC
GCCGAAGCGGTAAAAATTTTAATTTTTGCCGCTGAGGGGTTGACCAAGCGAAGCGCGGTAGGTTTTCTGC
TTAGGAGTTTAATCATGTTTCAGACTTTTATTTCTCGCCATAATTCAAACTTTTTTTCTGATAAGCTGGT
TCTCACTTCTGTTACTCCAGCTTCTTCGGCACCTGTTTTACAGACACCTAAAGCTACATCGTCAACGTTA
TATTTTGATAGTTTGACGGTTAATGCTGGTAATGGTGGTTTTCTTCATTGCATTCAGATGGATACATCTG
TCAACGCCGCTAATCAGGTTGTTTCTGTTGGTGCTGATATTGCTTTTGATGCCGACCCTAAATTTTTTGC
CTGTTTGGTTCGCTTTGAGTCTTCTTCGGTTCCGACTACCCTCCCGACTGCCTATGATGTTTATCCTTTG
AATGGTCGCCATGATGGTGGTTATTATACCGTCAAGGACTGTGTGACTATTGACGTCCTTCCCCGTACGC
CGGGCAATAATGTTTATGTTGGTTTCATGGTTTGGTCTAACTTTACCGCTACTAAATGCCGCGGATTGGT
TTCGCTGAATCAGGTTATTAAAGAGATTATTTGTCTCCAGCCACTTAAGTGAGGTGATTTATGTTTGGTG
CTATTGCTGGCGGTATTGCTTCTGCTCTTGCTGGTGGCGCCATGTCTAAATTGTTTGGAGGCGGTCAAAA
AGCCGCCTCCGGTGGCATTCAAGGTGATGTGCTTGCTACCGATAACAATACTGTAGGCATGGGTGATGCT
GGTATTAAATCTGCCATTCAAGGCTCTAATGTTCCTAACCCTGATGAGGCCGCCCCTAGTTTTGTTTCTG
GTGCTATGGCTAAAGCTGGTAAAGGACTTCTTGAAGGTACGTTGCAGGCTGGCACTTCTGCCGTTTCTGA
TAAGTTGCTTGATTTGGTTGGACTTGGTGGCAAGTCTGCCGCTGATAAAGGAAAGGATACTCGTGATTAT
CTTGCTGCTGCATTTCCTGAGCTTAATGCTTGGGAGCGTGCTGGTGCTGATGCTTCCTCTGCTGGTATGG
TTGACGCCGGATTTGAGAATCAAAAAGAGCTTACTAAAATGCAACTGGACAATCAGAAAGAGATTGCCGA
GATGCAAAATGAGACTCAAAAAGAGATTGCTGGCATTCAGTCGGCGACTTCACGCCAGAATACGAAAGAC
CAGGTATATGCACAAAATGAGATGCTTGCTTATCAACAGAAGGAGTCTACTGCTCGCGTTGCGTCTATTA
TGGAAAACACCAATCTTTCCAAGCAACAGCAGGTTTCCGAGATTATGCGCCAAATGCTTACTCAAGCTCA
AACGGCTGGTCAGTATTTTACCAATGACCAAATCAAAGAAATGACTCGCAAGGTTAGTGCTGAGGTTGAC
TTAGTTCATCAGCAAACGCAGAATCAGCGGTATGGCTCTTCTCATATTGGCGCTACTGCAAAGGATATTT
CTAATGTCGTCACTGATGCTGCTTCTGGTGTGGTTGATATTTTTCATGGTATTGATAAAGCTGTTGCCGA
TACTTGGAACAATTTCTGGAAAGACGGTAAAGCTGATGGTATTGGCTCTAATTTGTCTAGGAAATAACCG
TCAGGATTGACACCCTCCCAATTGTATGTTTTCATGCCTCCAAATCTTGGAGGCTTTTTTATGGTTCGTT
CTTATTACCCTTCTGAATGTCACGCTGATTATTTTGACTTTGAGCGTATCGAGGCTCTTAAACCTGCTAT
TGAGGCTTGTGGCATTTCTACTCTTTCTCAATCCCCAATGCTTGGCTTCCATAAGCAGATGGATAACCGC
ATCAAGCTCTTGGAAGAGATTCTGTCTTTTCGTATGCAGGGCGTTGAGTTCGATAATGGTGATATGTATG
TTGACGGCCATAAGGCTGCTTCTGACGTTCGTGATGAGTTTGTATCTGTTACTGAGAAGTTAATGGATGA
ATTGGCACAATGCTACAATGTGCTCCCCCAACTTGATATTAATAACACTATAGACCACCGCCCCGAAGGG
GACGAAAAATGGTTTTTAGAGAACGAGAAGACGGTTACGCAGTTTTGCCGCAAGCTGGCTGCTGAACGCC
CTCTTAAGGATATTCGCGATGAGTATAATTACCCCAAAAAGAAAGGTATTAAGGATGAGTGTTCAAGATT
GCTGGAGGCCTCCACTATGAAATCGCGTAGAGGCTTTACTATTCAGCGTTTGATGAATGCAATGCGACAG
GCTCATGCTGATGGTTGGTTTATCGTTTTTGACACTCTCACGTTGGCTGACGACCGATTAGAGGCGTTTT
ATGATAATCCCAATGCTTTGCGTGACTATTTTCGTGATATTGGTCGTATGGTTCTTGCTGCCGAGGGTCG
CAAGGCTAATGATTCACACGCCGACTGCTATCAGTATTTTTGTGTGCCTGAGTATGGTACAGCTAATGGC
CGTCTTCATTTCCATGCGGTGCATTTTATGCGGACACTTCCTACAGGTAGCGTTGACCCTAATTTTGGTC
GTCGGGTACGCAATCGCCGCCAGTTAAATAGCTTGCAAAATACGTGGCCTTATGGTTACAGTATGCCCAT
CGCAGTTCGCTACACGCAGGACGCTTTTTCACGTTCTGGTTGGTTGTGGCCTGTTGATGCTAAAGGTGAG
CCGCTTAAAGCTACCAGTTATATGGCTGTTGGTTTCTATGTGGCTAAATACGTTAACAAAAAGTCAGATA
TGGACCTTGCTGCTAAAGGTCTAGGAGCTAAAGAATGGAACAACTCACTAAAAACCAAGCTGTCGCTACT
TCCCAAGAAGCTGTTCAGAATCAGAATGAGCCGCAACTTCGGGATGAAAATGCTCACAATGACAAATCTG
TCCACGGAGTGCTTAATCCAACTTACCAAGCTGGGTTACGACGCGACGCCGTTCAACCAGATATTGAAGC
AGAACGCAAAAAGAGAGATGAGATTGAGGCTGGGAAAAGTTACTGTAGCCGACGTTTTGGCGGCGCAACC
TGTGACGACAAATCTGCTCAAATTTATGCGCGCTTCGATAAAAATGATTGGCGTATCCAACCTGCA
<!-- Use the file tool_data_table_conf.xml.oldlocstyle if you don't want to update your loc files as changed in revision 4550:535d276c92bc-->
<tables>
<!-- Location of SAMTools indexes and other files -->
<table name="sam_fa_indexes" comment_char="#">
<columns>line_type, value, path</columns>
<file path="tool-data/sam_fa_indices.loc" />
</table>
</tables>
<?xml version="1.0"?>
<tool_dependency>
<package name="freebayes" version="0.9.9_296a0fad5b73377e0c4498c65d53ad743225d6b9">
<install version="1.0">
<actions>
<action type="shell_command">git clone --recursive git://github.com/ekg/freebayes.git</action>
<action type="shell_command">git checkout 296a0fad5b73377e0c4498c65d53ad743225d6b9</action>
<action type="shell_command">git submodule update --recursive</action>
<action type="shell_command">make || ( make clean &amp;&amp; sed -i.bak -e &apos;s:LIBS = -lz -lm -L./ -L../vcflib/tabixpp/ -L$(BAMTOOLS_ROOT)/lib -ltabix:LIBS = -lm -L./ -L../vcflib/tabixpp/ -L$(BAMTOOLS_ROOT)/lib -ltabix -lz:g&apos; src/Makefile &amp;&amp; make )</action>
<action type="move_directory_files">
<source_directory>bin</source_directory>
<destination_directory>$INSTALL_DIR/bin</destination_directory>
</action>
<action type="set_environment">
<environment_variable name="PATH" action="prepend_to">$INSTALL_DIR/bin</environment_variable>
</action>
</actions>
</install>
<readme>
FreeBayes requires g++ and the standard C and C++ development libraries.
Additionally, cmake is required for building the BamTools API.
</readme>
</package>
</tool_dependency>
#!/bin/bash
input_tree=$1
output_tree=$2
output_stat=$3
tree_suffix=_phyml_tree.txt
stat_suffix=_phyml_stats.txt
shift 3
phyml_3.1 -i "$input_tree" "$@"
mv "${input_tree}${tree_suffix}" "$output_tree"
mv "${input_tree}${stat_suffix}" "$output_stat"
<tool id="phyml1" name="PhyML" version="3.1">
<requirements>
<requirement type="package" version="3.1">phyml</requirement>
</requirements>
<description>, a ML tree builder</description>
<command interpreter="bash">./phyml.sh $input $output_tree $output_stats -d $datatype_condition.type -m $datatype_condition.model -v $prop_invar -s $search
#if $datatype_condition.type == "nt":
-t $datatype_condition.tstv
#end if
#if $gamma_condition.gamma == "yes":
-c $gamma_condition.categories -a $gamma_condition.shape
#else:
-c 1
#end if
#if $support_condition.support == "sh":
-b -4
#end if
#if $support_condition.support == "boot":
-b $support_condition.boot_number
#end if
#if $support_condition.support == "no":
-b 0
#end if
#if $random_condition.random == "yes":
--rand_start 0 --n_rand_starts $random_condition.points
#end if
--quiet
</command>
<inputs>
<param format="phylip" name="input" type="data" label="Alignment in phylip format" help="Alignment in phylip format"/>
<conditional name="datatype_condition">
<param type="select" name="type" label="Data type">
<option value="nt">Nucleic acids</option>
<option value="aa">Amino acids</option>
</param>
<when value="nt">
<param type="select" name="model" label="Evolution model">
<option value="HKY85">HKY85</option>
<option value="JC69">JC69</option>
<option value="K80">K80</option>
<option value="F81">F81</option>
<option value="F84">F84</option>
<option value="TN93">TN93</option>
<option value="GTR">GTR</option>
</param>
<param type="text" name="tstv" help="Must be a positive integer, 'e' if you want PhyML to estimate it" value="e" label="Transition/transversion ratio" />
</when>
<when value="aa">
<param type="select" name="model" label="Evolution model">
<option value="LG">LG</option>
<option value="WAG">WAG</option>
<option value="JTT">JTT</option>
<option value="MtREV">MtREV</option>
<option value="Dayhoff">Dayhoff</option>
<option value="DCMut">DCMut</option>
<option value="RtREV">RtREV</option>
<option value="CpREV">CpREV</option>
<option value="VT">VT</option>
<option value="Blosum62">Blosum62</option>
<option value="MtMam">MtMam</option>
<option value="MtArt">MtArt</option>
<option value="HIVw">HIVw</option>
<option value="HIVb">HIVb</option>
</param>
</when>
</conditional>
<conditional name="gamma_condition">
<param type="select" name="gamma" label="Discrete gamma model">
<option value="yes">Use a gamma model</option>
<option value="no">Don't use a gamma model</option>
</param>
<when value="yes">
<param type="text" name="categories" help="1 means no gamma model" value="4" label="Number of categories for the discrete gamma model" />
<param type="text" name="shape" help="'e' if you want PhyML to estimate it" value="e" label="Shape parameter of the gamma model" />
</when>
<when value="no">
</when>
</conditional>
<conditional name="support_condition">
<param type="select" name="support" label="Branch support">
<option value="sh">SH-like aLRT</option>
<option value="boot">Bootstrap</option>
<option value="no">No branch support</option>
</param>
<when value="sh">
</when>
<when value="boot">
<param type="text" name="boot_number" help="Must be a positive integer" value="100" label="Number of bootstrap replicates" />
</when>
<when value="no">
</when>
</conditional>
<param type="text" name="prop_invar" help="0.0 to ignore this parameter, 'e' if you want PhyML to estimate it" value="0.0" label="Proportion of invariant sites" />
<param type="select" name="search" label="Tree topology search operation">
<option value="NNI">NNI (Nearest Neighbor Interchange)</option>
<option value="SPR">SPR (Subtree Pruning and Regrafting)</option>
<option value="BEST">Best of NNI and SPR</option>
</param>
<conditional name="random_condition">
<param type="select" name="random" label="Random starting points">
<option value="no">Don't add random starting points</option>
<option value="yes">Add random starting points</option>
</param>
<when value="yes">
<param type="text" name="points" help="A number greater than 0" value="4" label="Number of random starting points" />
</when>
<when value="no">
</when>
</conditional>
</inputs>
<outputs>
<data format="txt" name="output_tree" label="Newick Tree"/>
<data format="txt" name="output_stats" label="Phyml statistics output"/>
</outputs>
<help>
.. class:: infomark
**Program encapsulated in Galaxy by Southgreen**
.. class:: infomark
**PhyML version 3.0, 2010**
-----
==============
Please cite:
==============
"New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0", **Guindon S., Dufayard JF., Anisimova M., Hordijk W., Lefort V., Gascuel O.**, Systematic Biology 2010 59(3):307-321; doi:10.1093/sysbio/syq010.
"A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood.", **Guindon S., Gascuel O.**, Systematic Biology, 52(5):696-704, 2003.
-----
===========
Overview:
===========
PhyML is a phylogeny software based on the maximum-likelihood principle. Early PhyML versions used a fast algorithm to perform Nearest Neighbor Interchanges (NNIs), in order to improve a reasonable starting tree topology. Since the original publication (Guindon and Gascuel 2003), PhyML has been widely used (~4,000 citations in ISI Web of Science), due to its simplicity and a fair accuracy/speed compromise. In the meantime, research around PhyML has continued.
We designed an efficient algorithm to search the tree space using Subtree Pruning and Regrafting (SPR) topological moves (Hordijk and Gascuel 2005), and proposed a fast branch test based on an approximate likelihood ratio test (Anisimova and Gascuel 2006). However, these novelties were not included in the official version of PhyML, and we found that improvements were still needed in order to make them effective in some practical cases. PhyML 3.0 achieves this task.
It implements new algorithms to search the space of tree topologies with user-defined intensity. A non-parametric, Shimodaira-Hasegawa-like branch test is also available. The program provides a number of new evolutionary models and its interface was entirely re-designed. We tested PhyML 3.0 on a large collection of real data sets to ensure that the new version is stable, ready-to-use and still reasonably fast and accurate.
-----
For further information, please visit the PhyML_ website.
.. _PhyML: http://www.atgc-montpellier.fr/phyml/
</help>
</tool>
<?xml version="1.0"?>
<tool_dependency>
<package name="phyml" version="3.1">
<install version="1.0">
<actions>
<action type="download_by_url">http://www.atgc-montpellier.fr/download/binaries/phyml/PhyML-3.1.zip</action>
<action type="shell_command">mv PhyML-3.1_linux64 phyml_3.1</action>
<action type="move_file">
<source>phyml_3.1</source>
<destination>$INSTALL_DIR/bin</destination>
</action>
<action type="chmod">
<file mode="750">$INSTALL_DIR/bin/phyml_3.1</file>
</action>
<action type="set_environment">
<environment_variable name="PATH" action="prepend_to">$INSTALL_DIR/bin</environment_variable>
</action>
</actions>
</install>
<readme>
PhyML 3.1 is installed from the precompiled 64-bit Linux binary (PhyML-3.1_linux64) downloaded from the PhyML website.
</readme>
</package>
</tool_dependency>
#!/bin/bash
# determine if we were given 1 or 2 fastq files
# usage: smalt_check.sh <output> <fastq1> [<fastq2>]
if [ $# -eq 3 ] || [ $# -eq 2 ]; then
	#get the output file name
	output=$1
	#remove it from the arguments list
	shift
	$SMALT/smalt check "$@" > "$output"
	#remove the header line
	sed -i -e "1d" "$output"
else
	#unknown arguments given
	exit 1
fi
exit 0
<tool id="smalt_check" name="smalt check" version="0.0.3" >
<requirements>
<requirement type="package" version="1.1">smalt</requirement>
</requirements>
<command interpreter="bash">
smalt_check.sh $output
#if $singlePaired.sPaired == "single":
$singlePaired.sInput1
#elif $singlePaired.sPaired == "paired":
$singlePaired.pInput1 $singlePaired.pInput2
#elif $singlePaired.sPaired == "matePairs":
$singlePaired.mInput1 $singlePaired.mInput2
#end if
</command>
<description>Determine number of reads in fastq(s) and if they are mate-pair</description>
<inputs>
<conditional name="singlePaired">
<param name="sPaired" type="select" label="Is this library mate-paired?">
<option value="single">Single-end</option>
<option value="paired">Paired-end</option>
<option value="matePairs">MatePairs</option>
</param>
<when value="single">
<param name="sInput1" type="data" format="fastq" label="Single end illumina fastq file" optional="false"/>
</when>
<when value="paired">
<param name="pInput1" type="data" format="fastq,fastqsanger,fastqillumina,fastqsolexa" label="Forward FASTQ file" help="Must have ASCII encoded quality scores"/>
<param name="pInput2" type="data" format="fastq,fastqsanger,fastqillumina,fastqsolexa" label="Reverse FASTQ file" help="File format must match the Forward FASTQ file"/>
</when>
<when value="matePairs">
<param name="mInput1" type="data" format="fastq,fastqsanger,fastqillumina,fastqsolexa" label="Forward FASTQ file" help="Must have ASCII encoded quality scores"/>
<param name="mInput2" type="data" format="fastq,fastqsanger,fastqillumina,fastqsolexa" label="Reverse FASTQ file" help="File format must match the Forward FASTQ file"/>
</when>
</conditional>
</inputs>
<outputs>
<data format="tabular" name="output" />
</outputs>
<stdio>
<exit_code range="1" level="fatal" description="Unknown argument given." />
<exit_code range="2:" level="fatal" description="Unknown error." />
</stdio>
<help>
**What it does**
Check FASTA/FASTQ read files. If &#060;mate_file&#062; is specified, the reads are in pairs.
------
Please cite the website "http://www.sanger.ac.uk/resources/software/smalt/".
</help>
</tool>
<tool id="smalt_index" name="smalt index" version="0.0.3">
<requirements>
<requirement type="package" version="1.1">smalt</requirement>
</requirements>
<command>
\$SMALT/smalt index
#if $k:
-k "$k"
#end if
#if $s:
-s "$s"
#end if
'temp' "$reference"
</command>
<description>Index a reference </description>
<inputs>
<param name="reference" type="data" format="fasta" label="Fasta reference file"/>
<param name="k" type="integer" value="13" label="K-mer size" help="Specifies the word length, an integer between 2 and 19. The default word length is 13."/>
<param name="s" type="text" label="Step size" help="Specifies how many bases are skipped between indexed words."/>
</inputs>
<outputs>
<data name="output" label="SMI" from_work_dir="temp.smi"/>
<data name="output2" label="SMA" from_work_dir="temp.sma"/>
</outputs>