Using the SISTR Pipeline

This guide describes how to use the Salmonella in-silico Typing Resource (SISTR) pipeline within IRIDA. The pipeline identifies the serovar and cgMLST types of Salmonella whole genome sequencing (WGS) data by comparing them against a large database (10,000+) of Salmonella genomes within NCBI.

Pipeline Overview

The SISTR pipeline implemented within IRIDA uses the following steps to translate WGS data into typing information:

  1. Paired-end reads are merged with FLASH.
  2. Merged and un-merged reads are assembled de novo using SPAdes.
  3. Low-coverage and small contigs are removed from the generated assembly.
  4. The assembled genome is passed to sistr_cmd, a command-line program for comparing genomes against the SISTR database.
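
As an illustration of step 3, the following is a minimal Python sketch of removing small and low-coverage contigs from an assembly. It is not the filter that IRIDA itself runs. It assumes that contigs carry SPAdes-style names of the form NODE_<n>_length_<len>_cov_<cov>; the function names and the thresholds (min_length, min_coverage) are hypothetical and chosen only for the example.

```python
import re
from typing import Iterable, Iterator, Tuple

# SPAdes-style contig names look like: NODE_1_length_250432_cov_35.7281
COVERAGE_RE = re.compile(r"_cov_(\d+(?:\.\d+)?)")


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from the lines of a FASTA file."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_contigs(
    records: Iterable[Tuple[str, str]],
    min_length: int = 1000,      # hypothetical threshold
    min_coverage: float = 5.0,   # hypothetical threshold
) -> Iterator[Tuple[str, str]]:
    """Keep contigs that are long enough and have enough k-mer coverage."""
    for header, sequence in records:
        match = COVERAGE_RE.search(header)
        if match is None:
            raise ValueError(f"no coverage value in contig name: {header!r}")
        coverage = float(match.group(1))
        if len(sequence) >= min_length and coverage >= min_coverage:
            yield header, sequence
```

With a file on disk, this could be used as `kept = list(filter_contigs(parse_fasta(open("contigs.fasta"))))`, where contigs.fasta is a placeholder file name.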

Running the Pipeline

The SISTR pipeline can be set up to run in one of two ways.

1. Automated Execution

The SISTR pipeline can be set to run automatically on upload of new sequencing data to particular projects. This can be set either on the creation of a new project, or from the project settings page.

a. Creation of New Project

On creation of a new project, the below option can be selected to enable the automated execution of SISTR on upload of new data to this project.


b. Project Settings

Automated SISTR analysis can also be enabled (or disabled) after a project has been created, from the project Settings page.


If automated execution of SISTR has been enabled for a project, a SISTR analysis is scheduled whenever new sequencing data is uploaded to that project. The results are accessible from the page of the corresponding sample.


Clicking the Automated SISTR Typing link brings you to the appropriate analysis page for SISTR.


2. Manual Execution

To execute SISTR manually, please refer to the IRIDA/SISTR Tutorial.

SISTR Results

Status of PASS

A successful SISTR run (with status of PASS) should produce the following page as output.


Status of WARNING

A SISTR run with a status of WARNING should produce the below output.


Status of FAIL

An unsuccessful SISTR run (with status of FAIL) should produce the following as output.



Interpretation of the produced output is as follows:

1. SISTR Information

Basic information on the sample and quality of the SISTR results.

2. Serovar Predictions

The in silico serovar predictions generated from SISTR.

3. cgMLST330

The results of additional predictions made using the SISTR cgMLST330 schema.

4. Mash

The results of predictions made through comparisons using the software Mash. Generally, cgMLST results are preferred over Mash.
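
The preference for cgMLST over Mash can be sketched in Python as follows. This is only an illustration of the rule of thumb stated above, not SISTR's own decision logic. The field names serovar_cgmlst and mash_serovar are assumptions about how a SISTR result record is keyed, and preferred_serovar is a hypothetical helper.

```python
from typing import Mapping, Optional, Tuple


def preferred_serovar(
    record: Mapping[str, object],
) -> Tuple[Optional[str], Optional[str]]:
    """Return (serovar, source), preferring the cgMLST call over the Mash call.

    The keys "serovar_cgmlst" and "mash_serovar" are assumed field names.
    Returns (None, None) when neither prediction is present.
    """
    for field, source in (("serovar_cgmlst", "cgMLST"), ("mash_serovar", "Mash")):
        value = record.get(field)
        if value is not None and str(value).strip():
            return str(value).strip(), source
    return None, None
```

For a record in which both predictions are present, the cgMLST-based serovar is returned; the Mash-based serovar is used only when the cgMLST field is missing or empty.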

Output Files

In addition to the report, the SISTR pipeline produces the following files, which are available for download.

More information on the interpretation of these files is available on the sistr_cmd page.