Getting the most out of Sentieon

         September 26, 2019

Customers are always asking for ways to improve their experience with Sentieon, our partner’s secondary analysis tools that process genomics data with high computing efficiency, fast turnaround time, exceptional accuracy and 100% consistency. We have a few tips to get the most out of Sentieon. In this article I will be going over:

  1. Basic system requirements
  2. Custom scripts vs default scripts

Before you set up your environment, you will need the minimum hardware and software required to run Sentieon. Hardware plays a large part in whether your pipeline finishes in a reasonable amount of time. An improperly configured server could add hours to each run, costing you both time and money.

System Minimums

Hardware requirements:

  • CPU: 8+ cores
  • RAM: 16GB for small panels and 64GB for whole-genome
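As a quick sanity check, the hardware minimums above can be compared against a Linux server with a few lines of shell. This is only a sketch: the function name `check_minimums` and the workload labels `panel` and `wgs` are made up for illustration and are not part of Sentieon.

```shell
#!/usr/bin/env bash
# Sketch: compare a machine against the hardware minimums listed above.
# check_minimums and the "panel"/"wgs" labels are illustrative names only.

# check_minimums CORES MEM_GB WORKLOAD
#   WORKLOAD "panel" requires 16 GB of RAM, "wgs" requires 64 GB.
check_minimums() {
    local cores=$1 mem_gb=$2 workload=$3 need_mem
    case "$workload" in
        panel) need_mem=16 ;;
        wgs)   need_mem=64 ;;
        *)     echo "unknown workload: $workload" >&2; return 2 ;;
    esac
    if [ "$cores" -ge 8 ] && [ "$mem_gb" -ge "$need_mem" ]; then
        echo "OK"
    else
        echo "BELOW MINIMUM"
        return 1
    fi
}

# Detect cores and RAM on Linux. MemTotal is reported in kB and is slightly
# lower than the installed RAM, so a borderline machine may read as too small.
if command -v nproc >/dev/null 2>&1 && [ -r /proc/meminfo ]; then
    cores=$(nproc)
    mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
    mem_gb=$(( mem_kb / 1024 / 1024 ))
    check_minimums "$cores" "$mem_gb" wgs || true
fi
```

Swap `wgs` for `panel` on the last call if you only plan to run small panels.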

Operating System requirements:

Linux:

  • RedHat/CentOS 7.x
  • Debian 7.7
  • OpenSUSE 13.2
  • Ubuntu 18.04
  • Or higher versions of the above

Scripts and their capabilities

Now that we have the hard part out of the way, let’s talk about what we can do with the scripts in the Sentieon suite of tools. The default variant calling script works well, but it only handles one sample at a time and relies on the tools’ default settings for system resources.

Let’s say you have well over 100 samples in a repository. With the default script, you would have to run each sample individually. We have written a script to save you that time. This “batch” script recursively goes through each sample in a given repository and writes the BAM, VCF and metrics outputs for each sample to its own output folder, in a location of your choosing.
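The general shape of such a batch driver can be sketched in shell. This is not the Golden Helix script itself. The file naming convention (`*_R1.fastq.gz` and `*_R2.fastq.gz`), the function name `process_all`, and the `PIPELINE` variable are assumptions made for illustration, and by default the per-sample step only prints what it would run.

```shell
#!/usr/bin/env bash
# Sketch of a batch driver. The FASTQ naming pattern and the PIPELINE
# command are assumptions for illustration, not the actual batch script.

# Command run once per sample. Defaults to a dry run that prints its arguments.
PIPELINE=${PIPELINE:-echo would-run}

# process_all INPUT_DIR OUTPUT_DIR
process_all() {
    local in_dir=$1 out_dir=$2 r1 r2 sample
    # Recursively find forward reads, sorted for a stable processing order.
    find "$in_dir" -type f -name '*_R1.fastq.gz' | sort | while IFS= read -r r1; do
        r2=${r1%_R1.fastq.gz}_R2.fastq.gz
        sample=$(basename "$r1" _R1.fastq.gz)
        if [ ! -f "$r2" ]; then
            echo "skipping $sample: no R2 file" >&2
            continue
        fi
        # Each sample gets its own output folder.
        mkdir -p "$out_dir/$sample"
        # stdin is redirected so the command cannot consume the file list.
        $PIPELINE "$sample" "$r1" "$r2" "$out_dir/$sample" < /dev/null
    done
}
```

Calling `process_all /data/fastq /data/results` (both paths are placeholders) prints one `would-run` line per complete sample. Setting `PIPELINE` to your own per-sample script turns the dry run into a real one.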

Another parameter worth considering is how much memory to use during alignment. The batch script also lets you define how much memory the BWA-MEM process uses. The default is 20GB-24GB, but you can raise it if your machine can accommodate more.

For example:
If your server has 196GB of RAM, adding “export bwt_max_mem=120G” before calling the BWA command allows the BWA process to allocate 120GB of RAM instead of the default 20GB-24GB.
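In a script, that setting might look like the sketch below. Only the `bwt_max_mem` variable comes from the example above. The helper `suggest_bwt_max_mem` and its idea of reserving a fixed amount of RAM for everything else are hypothetical, shown only as one way to arrive at a value.

```shell
#!/usr/bin/env bash
# Sketch: choose a bwt_max_mem value and export it before alignment.
# suggest_bwt_max_mem and its reserve heuristic are illustrative assumptions.

# suggest_bwt_max_mem TOTAL_GB RESERVE_GB
#   Prints TOTAL_GB minus RESERVE_GB with a "G" suffix. Prints nothing and
#   returns 1 if the result would not exceed the 24GB default ceiling.
suggest_bwt_max_mem() {
    local total_gb=$1 reserve_gb=$2 usable
    usable=$(( total_gb - reserve_gb ))
    if [ "$usable" -le 24 ]; then
        return 1
    fi
    echo "${usable}G"
}

# 196GB server, 76GB held back for the OS and other pipeline stages.
if value=$(suggest_bwt_max_mem 196 76); then
    # The export must happen in the same shell that later starts BWA,
    # so that the alignment process inherits the variable.
    export bwt_max_mem=$value
fi
echo "bwt_max_mem=${bwt_max_mem:-default}"

# ... run the BWA alignment step here, after the export ...
```

With the numbers from the example, the helper yields `120G`. On a machine where the result would not exceed the default, the variable is left unset and the tool's own default applies.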

With these tweaks, you can start the script and leave it alone until it has processed all your samples, and the increased RAM allocation reduces the required processing time.

We have implemented these features in the following sample types:

  • Non-amplicon
  • Amplicon
  • Tumor/Normal
  • Tumor Only
  • Germline

If you are interested in these features, or in customizing your current environment to better fit your workflows, please reach out to our support team here at Golden Helix, Inc.
