Managing Samples

Each project in IRIDA may contain a collection of samples, each of which corresponds to an isolate. Each sample may contain one or more of the following types of files: sequencing files in paired-end or single-end format, or assembled genomes. This section of the user guide describes how you can view samples, manage samples (merging, copying, renaming, exporting), and search for samples by name.

Viewing samples in a project

Start by viewing the project details of a project. The list of samples in the project is shown in the middle of the
project details screen:

Project samples listing.

The samples listing shows high-level sample details, such as:

Viewing individual sample details

All of the sample details that are in IRIDA are currently provided by a user with the project Manager role. To view details about an individual sample, start by viewing the samples in a project, then click on the sample name in the samples table:

Sample name button.

The sample details page shows all of the details that are currently known about a sample:

Sample details page.

Editing sample details

Start by viewing the details of an individual sample. On the sample details page, click on the “Edit” button in the top right-hand corner:

Sample details edit button.

You can provide as many or as few sample details as you want. The sample details are not used by any workflows in IRIDA (except the sample name in the SNVPhyl workflow), and none of the sample details other than the sample name are required fields. When you’ve finished updating the sample details, click on the “Update” button at the bottom right-hand side of the page.

Viewing contained files

Samples can contain two types of files: Sequence Files, which are produced by a sequencing instrument, and Assemblies, which consist of the genome reconstructed from the sequence reads.

sample-contained-files

Viewing Sequence Files

Start by viewing the details of an individual sample. On the sample details page, click on the “Files” tab, just above the sample details panel:

Sample details file tab.

Sequence files may have been uploaded as paired-end files or as single-end files, depending on how the isolate was sequenced.

Single-end files will appear in the sample alone:

Single-end sequencing file.

Paired-end files will appear in a pair:

Paired-end sequencing file.

Quality control information for a sequence file may appear below the file:

File QC

Uploading Sequence Files

Sequence files can be uploaded by clicking on the “Upload Sequence Files” button, on the left-hand side of the sequence file table. Files must have the extension .fastq or .fastq.gz; all other formats will be ignored.

Upload sequence file.
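Before selecting a large batch of files, it can help to check which ones would be skipped by the extension rule above. The following is a minimal sketch and not IRIDA's own code; the function names are illustrative, and it assumes the extension check is case-sensitive, which this guide does not state.

```python
# Extensions named in this guide as accepted by the upload dialog.
ACCEPTED_SUFFIXES = (".fastq", ".fastq.gz")


def is_uploadable(filename: str) -> bool:
    """Return True if the file name ends in .fastq or .fastq.gz.

    Assumption: the comparison is case-sensitive.
    """
    return filename.endswith(ACCEPTED_SUFFIXES)


def split_by_extension(filenames):
    """Split file names into (accepted, ignored) lists, preserving order."""
    accepted = [name for name in filenames if is_uploadable(name)]
    ignored = [name for name in filenames if not is_uploadable(name)]
    return accepted, ignored


# Hypothetical file names, used only for illustration.
accepted, ignored = split_by_extension(
    ["sample-1_R1.fastq", "sample-1_R2.fastq.gz", "notes.txt", "reads.fq"]
)
print("would upload:", accepted)
print("would be ignored:", ignored)
```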

You can select single or multiple files in the system file selection window.

Upload File Selection.

Files will upload immediately and a progress bar will be displayed. If you need to cancel an upload, click the Cancel Upload button.

Cancel upload

Downloading a sequence file

You can download a sequence file by clicking on the download icon, on the right-hand side of the row for the sequence file.

You can download all sequence files in a sample by following the instructions in the exporting samples section about downloading samples.

Deleting a sequence file

If you need to delete a sequence file from IRIDA, you can do so by clicking on the Delete icon, on the right-hand side of the row for the sequence file.

You can only delete a sequence file from a sample if you have the Manager role on the project.

Concatenating sequence files

In cases where a top-up run or any other additional data is added to a sample, you may want to combine the sequence files into a single concatenated file. IRIDA allows you to do this under the Concatenate Files page.

Concatenate link

In the concatenation page you must select 2 or more sequence file objects of the same type to concatenate. If you have selected a collection of files which cannot be concatenated, a warning will be displayed.

Concatenate page

Once you have selected your files to concatenate, you have the following options:

Once you have selected your files and selected your options, click Submit to begin the concatenation. This may take a while, so you should stay on this page until the process is complete. Once your files are concatenated, you will be redirected back to the sample-files page.
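To illustrate what concatenation means for sequence data outside of IRIDA, here is a minimal sketch that appends files byte for byte. It is not IRIDA's implementation. It interprets "same type" as "same file format" (.fastq or .fastq.gz), which is an assumption; IRIDA's own check may instead refer to paired-end versus single-end file objects. Byte-wise joining works for gzip-compressed files because a gzip stream may consist of several members. The function names are illustrative.

```python
import shutil
from pathlib import Path


def file_kind(path) -> str:
    """Classify a file as 'fastq' or 'fastq.gz' by its extension."""
    name = Path(path).name
    if name.endswith(".fastq.gz"):
        return "fastq.gz"
    if name.endswith(".fastq"):
        return "fastq"
    raise ValueError(f"not a FASTQ file name: {name}")


def concatenate(sources, destination) -> Path:
    """Append the source files, in the given order, into destination.

    Assumption: all sources must share one format, so that plain and
    gzip-compressed data are never mixed in a single output file.
    """
    if len(sources) < 2:
        raise ValueError("select 2 or more files to concatenate")
    kinds = {file_kind(source) for source in sources}
    if len(kinds) != 1:
        raise ValueError("files of different types cannot be concatenated")
    with open(destination, "wb") as out:
        for source in sources:
            with open(source, "rb") as chunk:
                shutil.copyfileobj(chunk, out)
    return Path(destination)
```

For paired-end data, the forward files and the reverse files would each be concatenated separately, in the same order, so that reads stay matched between the two outputs.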

Viewing genome assemblies

Samples can also contain assembled genomes.

sample-automated-assembly

Genome assemblies can be linked to samples in two ways:

  1. By enabling automated assemblies, which will be triggered on upload of sequencing files in the appropriate project.
  2. Or by selecting the option to save assemblies back to a sample from the Launch Pipelines page.

The assembled genome file can be downloaded by clicking the download icon.

Deleting genome assemblies

Assembled genomes may be deleted from a sample by selecting the Delete icon.

delete-sample-assembly

Viewing automated assemblies

If the project manager has enabled automated assemblies for uploaded data, an assembly will be shown alongside the sequence files that were used to generate the assembled genome.

Automated assembly

The assembly status will be displayed along with a link to view the assembly results page. On completion, the assembled genome will be saved back to the sample. For more information on viewing pipeline results, see the pipeline documentation.

See the project documentation for information on enabling automated assembly.

Adding a new sample

You can add a new sample to the project if you have the project Manager role on the project. To add a new sample to the project, click on the “Add New Sample” button in the “Samples” menu:

New sample button

Clicking this button will take you to the Create New Sample page. When creating a sample, you must define the sample name (only upper and lowercase letters, numbers, and the special characters !, @, #, $, %, _, -, and ` are allowed) and optionally choose an organism for the sample:

Create new sample page
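If you generate sample names in a script before creating samples, you can check them against the character rule described above. This is a minimal sketch built only from the characters listed in this section; the function name is illustrative, and IRIDA may apply additional rules (such as a minimum length) that are not checked here.

```python
import re

# Letters, digits, and the special characters listed above: ! @ # $ % _ - `
ALLOWED_NAME = re.compile(r"[A-Za-z0-9!@#$%_\-`]+")


def is_valid_new_sample_name(name: str) -> bool:
    """Return True if every character in the name is in the allowed set."""
    return ALLOWED_NAME.fullmatch(name) is not None


# Hypothetical names, used only for illustration.
for candidate in ["sample-1", "sample 1", "E.coli_01"]:
    print(candidate, is_valid_new_sample_name(candidate))
```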

If you choose to set a sample organism, click on the “Organism” drop-down menu and begin typing the name of the organism. For example, if you wanted to specify a sample organism of “Escherichia coli O26:NM”, you would begin to type “Esc” and the menu would allow you to choose from a set of well-defined organism names:

Taxonomic terms

When you’ve finished choosing the name and organism for the sample, click on the “Create Sample” button.

Create Sample

Searching and filtering samples

You can search and filter samples in a project in IRIDA by sample name, organism, and/or date range using the filters at the top of the samples list:

Samples filter area.

Search Field

Samples search input.

You can perform a general search on sample names using the search field. This will filter samples that have the search string anywhere in the name or organism field. So, for example, if you’re searching for a sample that has the numeral 2 in its name, enter 2 into the search input, and you would find samples with names like:

Advanced Filtering

Samples advanced filters.

Clicking the filter button opens a dialog where you can filter by sample name and/or date modified.

Samples advanced filter dialogue.

Filtering by sample name will match the same as the search field, so the filter name will match anywhere in the sample name.
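The "match anywhere" behaviour of the search field can be pictured as a substring test against the name and organism. The sketch below is illustrative only and is not IRIDA's code: the sample records are hypothetical dictionaries, and the case-insensitive comparison is an assumption, since this guide does not say whether the search is case-sensitive.

```python
def matches_search(sample: dict, query: str) -> bool:
    """Return True if the query occurs anywhere in the name or organism.

    Assumption: matching is case-insensitive.
    """
    needle = query.lower()
    name = sample.get("name") or ""
    organism = sample.get("organism") or ""
    return needle in name.lower() or needle in organism.lower()


# Hypothetical sample records, used only for illustration.
samples = [
    {"name": "03-3333", "organism": "Escherichia coli"},
    {"name": "10-6966", "organism": None},
    {"name": "15-7569", "organism": "Salmonella enterica"},
]

by_name = [s["name"] for s in samples if matches_search(s, "69")]
by_organism = [s["name"] for s in samples if matches_search(s, "esc")]
print(by_name)
print(by_organism)
```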

Samples advanced filter dialogue daterange.

To search samples by a date range, click on the date range field. A drop-down will be displayed with predetermined ranges:

Or you can enter a custom date range by selecting the dates in the calendar.

Samples advanced filter dialogue apply.

To apply the selected filters click the ‘Filter’ button.

Samples advanced filter applied state.

Once the filter is applied, the samples table will be updated with the filtered samples. When an advanced filter is applied, a tag is created below the filter button to allow the user to know what filters are currently applied. To remove a specific filter click on the tag itself.

Clearing Filters

Samples clear filters button.

To clear all currently applied filters and search, click on the clear button to the right of the filter area.

Filtering and Selecting by File

As projects become larger, selecting a large subset of samples by hand becomes unwieldy. To make this easier, use the ‘Filter by File’ option, which filters the samples table using a text file that lists one sample name per line.

Example (project_5_filter.txt):

03-3333
10-6966
15-7569

Filter by File Button

If all sample names are found, a green success notification will appear in the upper right corner of the window. This notification will disappear after 2 seconds.

Filter by File all found

If sample names are not found, the samples will be filtered by the available names and a notification will appear telling you which samples could not be found. This notification will not go away until it is clicked.

Example: if the file contained an additional sample name, 12-4598_a, which does not exist in the project, the following will be displayed.

Filter by File missing samples
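The found/missing behaviour described above can be sketched as follows. This is an illustration rather than IRIDA's implementation. It assumes one sample name per line (as in the example file) and exact name matching. The project sample list is hypothetical, and `08-1111` is an invented name.

```python
def read_sample_names(text: str):
    """Return the non-blank lines of the file, stripped of whitespace."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def filter_by_names(project_samples, wanted):
    """Split the wanted names into those present in the project and those not.

    Assumption: names must match exactly.
    """
    available = set(project_samples)
    found = [name for name in wanted if name in available]
    missing = [name for name in wanted if name not in available]
    return found, missing


# Contents of the example file plus the extra, non-existent name.
file_text = "03-3333\n10-6966\n15-7569\n12-4598_a\n"
# Hypothetical list of sample names in the project.
project = ["03-3333", "08-1111", "10-6966", "15-7569"]

found, missing = filter_by_names(project, read_sample_names(file_text))
print("shown in table:", found)
print("reported as not found:", missing)
```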

Viewing associated samples

You can quickly create an aggregated view of all of the samples in this project with all of the samples from both local and associated projects. To view associated samples, click the “Associated Projects” button. All projects associated with the current project will be displayed here. Select the projects you would also like to see in the view. Project managers may choose which samples will appear here by adding or removing associated projects.

Sample type selector

Associated samples will be displayed in the project samples table designated with the same colours.

Sample table with associated and remote samples

Modifying samples

Only user accounts that have the Manager role on a project can modify the samples in a project.

Selecting samples

All sample modification actions require that samples be selected. You can select individual samples by clicking anywhere on the row (except on the sample name itself):

Selected sample.

Multiple Sample Selection

You can also select multiple samples at once by selecting the first sample, holding down the shift key, and then selecting the last sample that you want selected.

You can always see how many samples are selected at the top left of the samples table.

Selected sample counts.

Selecting Groupings of Samples

All samples in the project can be selected at once using the checkbox in the table header.

Select All Checkbox

Selected sample counts.

Alternatively, there is a dropdown next to the select all checkbox that allows you to select/deselect samples in the entire project, or on the current page of the sample table.

Selected sample counts.

Sharing samples between projects

Samples may be shared between projects. A sample that is shared into multiple projects is effectively linked between those projects – the files contained within the sample are not physically duplicated, and any sample metadata changes in one project are reproduced in the sample for all projects.

You must be a project Manager on both the project that you are sharing the sample from, and the project that you are sharing the sample to.

Start by selecting the samples that you want to share with another project. When you’ve selected the samples that you want to share, click on the “Samples” button just above the samples list, and select “Share”:

Share samples button.

In the dialog that appears you will be presented with a list of the samples that are going to be shared, and an option to choose the project that the samples should be shared to:

Copy samples dialog.

When you click on the drop-down box to select a project, you can either visually find the project that you want, or you can filter the projects by their name by typing into the text field.

After selecting the project to share samples to, you may select whether users on the new project should have modification access to those samples. If users on the new project should not be able to modify the samples, uncheck the Allow modification access to samples in the new project box. If you have selected samples that are non-modifiable in your current project you will be unable to give modification access to any of the samples you have selected.

Locked Sample.

Once you’ve selected the project that you want to share the samples to, click on the “Share Samples” button.

Moving samples between projects

An alternative to sharing samples between projects is to move a sample between projects. Unlike sharing, when a sample is moved, the original sample is removed.

As with sharing samples, you must be a project Manager on both the project that you are moving the sample from and the project that you are moving the sample to. In addition, the source project that you are moving samples from must not be a remote project.

Start by selecting the samples that you want to move to the other project. When you’ve selected the samples that you want to move, click on the “Samples” button just above the samples list and select “Move Samples”:

Move samples button.

In the dialog that appears you will be presented with a list of the samples that are going to be moved, and an option to choose the project that the samples should be moved to:

Move samples dialog.

When you click on the drop-down box to select a project, you can either visually find the project that you want, or you can filter the projects by their name by typing into the text field.

If you have selected samples that are non-modifiable in your current project, you will be shown a warning that the samples you are moving will also be non-modifiable in the new project.

Once you’ve selected the project that you want to move the samples to, click on the “Move Samples” button.

Merging samples within a project

If a sample was created when sequencing data was uploaded with an incorrect name, you may want to merge two samples together. When you merge two samples, you will move all of the sequencing files and assembled genomes from one sample to another, then delete the original sample. None of the sample metadata will be copied between the merged samples; instead, you will select one sample as the target for the sample merge. Only users with the Manager role can merge samples in a project, and samples cannot be merged within remote projects.

Start by selecting the samples that you want to merge. You must select more than one sample to enable the merge samples button. Once you’ve selected the two or more samples that you would like to merge, click on the “Samples” button just above the samples list and select “Merge Samples”:

Merge samples button.

In the dialog that appears you will be presented with a list of the samples that are going to be merged, and an option to choose the target sample of the merge:

Merge samples dialog.

Click on the sample name under “Select a sample to merge into” to choose which sample will be used as the target for all of the sequencing data.

You may also (optionally) rename the target sample by entering a new sample name under “Rename sample”. The sample name must be at least 3 characters long, and must not contain white space characters (tab or space) or any of the following characters: ? ( ) [ ] / \ = + < > : ; " , * ^ | & ' .. If you do not want to rename the target sample, leave this field blank.

Once you’ve finished choosing the sample to merge into, click on the “Complete Merge” button at the bottom of the dialog.

Note: if you select samples that are non-modifiable, a warning will be displayed that you cannot merge the selected samples.

Exporting samples

The pipelines available in IRIDA may not be enough for the types of analysis that you want to run on your sequencing data. You can export your sample data from IRIDA in a number of different ways:

  1. Downloading samples
  2. Linking to files from the command-line
  3. Exporting directly to Galaxy
  4. Uploading to NCBI

All export options require that you select the samples for export before you are able to export the samples.

Tip: For all types of export, you can export all of the data in a project using the Select All feature.

Downloading samples

You can download an individual sequence file from a sample by navigating to the file, then clicking on the download icon (see: Downloading a sequence file).

You may download all of the files in a sample, or even download the files from multiple samples, by selecting the samples that you want to download, clicking on the “Export” button just above the samples list and clicking on “Download”:

Samples download button.

IRIDA will provide you with a zip file containing the sequencing data for all of the selected samples. You can extract the files from the zip archive using the command-line program unzip, using the built-in Windows extractor tool, or using a program like 7-zip.
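If you would rather extract the archive from a script than with one of the tools above, Python's standard `zipfile` module can do the same job. This is a minimal sketch: the function name is illustrative, and the layout of files inside the archive is whatever IRIDA produced, since the code makes no assumption about it.

```python
import zipfile
from pathlib import Path


def extract_download(archive_path, destination):
    """Extract every member of the zip archive into destination.

    Returns the paths of the extracted files (directory entries are skipped).
    """
    destination = Path(destination)
    destination.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(destination)
        names = archive.namelist()
    return [destination / name for name in names if not name.endswith("/")]
```

For example, `extract_download("download.zip", "extracted")` would unpack a downloaded archive named `download.zip` (a hypothetical file name) into a new `extracted` folder.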

WARNING: sequencing data can make for a very large download, especially when downloading all of the sequencing data for a project. We strongly recommend that you do not download data to your PC, especially if you are going to be using Linux command-line tools and the command-line export tool option is available.

Command-line export

The IRIDA package comes with a Linux command-line utility for linking to files in your current working directory. If you are working on a Linux workstation, we strongly encourage you to use the command-line utility for working with the sequencing data stored in IRIDA.

Start by selecting the samples that you want to export to the command-line, clicking on the “Export” button just above the samples list and clicking “Command-line Linker”:

Command-line linker button.

The dialog that appears will provide you with a command that you can copy and paste into a terminal window:

Command-line linker dialog.

Copy and paste the command into a terminal window and use the username and password that you use to log in to IRIDA:

[user@waffles ~]$ ngsArchiveLinker.pl -p 2 -s 5
Writing files to /home/user
Enter username: user
Enter password: 
Reading samples 5 from project 2
Created 2 files for 1 samples in /home/user/Project
[user@waffles ~]$

The folder structure that will be created in the current working directory will match the structure present in IRIDA:

[user@waffles ~]$ tree Project/
Project/
└── sample-1
    ├── sample-1_S1_L001_R1_001.fastq -> /opt/irida/sequence-files/1/sample-1_S1_L001_R1_001.fastq
    └── sample-1_S1_L001_R2_001.fastq -> /opt/irida/sequence-files/2/sample-2_S1_L001_R2_001.fastq

1 directory, 2 files

Importantly, the files that are stored in your directory structure are links and not copies of the files. The purpose of links is to reduce the use of disk space on shared resources. An unfortunate side effect of the link structure is that you cannot change the contents of the files.
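If you need to see where the links point, or you need a version of a file that you can modify, the following sketch may help. It is illustrative only and the function names are invented. Note that making a copy uses additional disk space, which is what the links are designed to avoid, so copy only the files you actually need to change.

```python
import shutil
from pathlib import Path


def list_links(project_dir):
    """Map each symbolic link under project_dir to the file it points at."""
    return {
        path: path.resolve()
        for path in sorted(Path(project_dir).rglob("*"))
        if path.is_symlink()
    }


def editable_copy(link_path, copy_path):
    """Copy the data behind a link into a regular file that can be modified."""
    # shutil.copyfile follows symbolic links by default, so the result is
    # a real file containing the data, not another link.
    shutil.copyfile(link_path, copy_path)
    return Path(copy_path)
```

For example, `list_links("Project")` would report the targets of the links created by the linker in the folder shown above.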

Galaxy export

Samples can also be exported directly to Galaxy. Samples exported from IRIDA into Galaxy are loaded into a Galaxy data library that can be easily shared with multiple Galaxy users.

To export data from IRIDA to Galaxy, start in Galaxy and find the “IRIDA server” tool in the “Get Data” section:

IRIDA server import tool.

If you are not already logged into IRIDA, you will be required to log in using your IRIDA username and password:

IRIDA login.

After you log in to IRIDA (or if you were already logged in), you will be directed to the list of projects that you have permission to view. Choose the project containing the samples you wish to export:

Galaxy IRIDA projects list.

Navigate to the list of samples that you’re interested in exporting by clicking on the project name. Then, select the samples that you want to export, click on the “Export” button just above the samples list, then click “Send to Galaxy”:

Export to Galaxy button.

The dialog that appears will allow you to choose the e-mail address that should be assigned ownership of the data library. The e-mail address should be the e-mail address that you use as your username in Galaxy. You may also choose the name of the data library that the sequencing data should be exported to:

Export to Galaxy dialog.

You will also be presented with two optional methods of exporting your data; you can choose whether or not you want your exported data to show up in your Galaxy history by using the “Add samples to history” checkbox. If you opt to show your data in your Galaxy history, you can additionally specify whether or not you want your data to be organized into collections of paired items, depending on your use case.

After you’ve entered your e-mail address and the name of the data library, click the “Upload Samples” button. You will be redirected back into Galaxy and a new history item will appear (if you opted to show your exported data in your history):

Export to Galaxy history item.

Additionally, if you opted to organize your data into collections of paired items, you will see the collections in your history:

Export to Galaxy history item.

You can view a report of the exported samples by clicking on the name of the history item. You can find your data library by clicking on “Shared Data” at the top of Galaxy and clicking on “Data Libraries”:

Galaxy data libraries button.

NCBI Upload

IRIDA can assist in uploading sequence files to NCBI’s Sequence Read Archive. IRIDA requires that BioProjects and BioSamples be created before uploading, and will assign uploaded sequence files to the given BioProject and BioSample identifiers. More information about the metadata which must be entered during the upload process can be found at NCBI Submission Quick Start Guide.

To begin submitting sequence files, select the samples that you want to upload from the project samples page, then click on the “Export” button and select “Upload to NCBI SRA”.

Upload NCBI samples button

You will be forwarded to a page where you must enter metadata about the uploaded files. Start by entering information about the upload:

NCBI project metadata

Next you must fill in information about the samples to be uploaded. For more detailed information about these fields see NCBI’s SRA Handbook (Library Information, Sequencing Platform Description).

After entering this metadata you can select which files should be uploaded from each sample. Only files selected with checkboxes will be uploaded to NCBI.

NCBI sample metadata

Click the Submit button at the bottom of the page when the information is complete.

After submitting you will be redirected to a page showing the information you have entered for the upload and the status of the upload. IRIDA will periodically check the status of uploads in the SRA and update their status as necessary. After NCBI has assigned an accession number to your upload it will be displayed on this page.

NCBI submission details
