The Arteria Project

Join the chat at https://gitter.im/arteria-project/arteria-project

Why?

Handling data from massively parallel sequencing can be a daunting task! And while the process of handling sequencing data shares many characteristics across centers, the current norm is one center, one solution. This makes reuse difficult to achieve, and the wheel is reinvented over and over again. We hope the Arteria project can remedy this situation.

What is this?

The Arteria project provides components to automate analysis and data-management tasks at a next-generation sequencing center. It leverages a micro-service based architecture together with StackStorm to create an event-driven automation system, which is both flexible and scalable.

Arteria has two main components:

StackStorm packs

StackStorm’s concept of packs makes it easy to redistribute automation components and stitch them together into workflows. The Arteria packs can be found here: https://github.com/arteria-project/arteria-packs. They provide an excellent starting point for building your own Arteria system.

Single responsibility micro-services

These provide different functionality, such as running the Illumina bcl2fastq program, checking whether a runfolder is ready to be analyzed, or removing data once certain criteria are met. The separation of responsibilities between services means that you can pick and choose the ones that suit your particular workflow and infrastructure configuration.

The catalog of services of general interest currently includes:

arteria-runfolder
Manages the state of an Illumina runfolder by monitoring the status files output by the instrument. It also allows the state to be changed via the REST interface.

arteria-bcl2fastq
Provides a REST interface for Illumina’s bcl2fastq software. It includes a simple scheduler to efficiently manage multiple bcl2fastq instances on a single server.

arteria-checksum
Runs md5sum checking and lets you know whether all files have preserved their integrity.
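
To illustrate the kind of integrity check described for arteria-checksum, here is a minimal Python sketch. It is not the service's actual implementation or API. It assumes a checksum file in the plain format written by the md5sum command (one "digest  filename" pair per line, with paths relative to the checksum file), and the function names md5_of_file and verify_checksums are hypothetical.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksums(checksum_file):
    """Check every entry of an md5sum-style file (assumed format).

    Returns a dict mapping each listed name to True (digest matches)
    or False (file is missing or the digest differs).
    """
    checksum_file = Path(checksum_file)
    base_dir = checksum_file.parent
    results = {}
    for line in checksum_file.read_text().splitlines():
        if not line.strip():
            continue  # skip blank lines
        expected, name = line.split(maxsplit=1)
        if name.startswith("*"):
            name = name[1:]  # md5sum marks binary mode with a leading '*'
        target = base_dir / name
        results[name] = target.is_file() and md5_of_file(target) == expected.lower()
    return results
```

Calling verify_checksums("path/to/checksums.md5") (a hypothetical path) returns one entry per listed file, so a runfolder has preserved its integrity only if every value is True.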

Who can use it?

You can! We’ve open sourced the Arteria project under the MIT licence, and we hope that more organizations and individuals will join us in developing Arteria in the future.

Publications, Presentations, Blog posts, etc

We have published a paper describing Arteria, which can be found here: https://academic.oup.com/gigascience/article/8/12/giz135/5673459

Roman Valls has written two excellent blog posts related to Arteria which can be found here:

We have been featured by StackStorm:

Arteria has been presented at scientific conferences:

Who are we?

The Arteria project originated in the SNP&SEQ Technology Platform node of the National Genomics Infrastructure at SciLifeLab. It has since been adopted at Clinical Genomics Uppsala at SciLifeLab and at the University of Melbourne Centre for Cancer Research.