Somatic Variant Calling with Sentieon: Tumor-Only Workflow

November 19, 2019

To recap what we have covered in this blog series thus far, Sentieon allows users to call somatic variants either against a matched normal sample or in a tumor-only analysis. Part I of this series covered the variant calling workflow for tumors with matching normal samples. However, a common situation is having to call variants for a tumor sample without the normal. This situation is tricky because the normal sample supplies the germline variants that would otherwise be excluded from the tumor calls. Without that normal sample, how does one best isolate and remove obvious germline variants? Thankfully, Sentieon provides a solution to this issue.

Utilization of a Tumor-Only Workflow

For those of you following this blog series, you will recognize the process in Figure 1. The first step aligns the reads in the FASTQ (Alignment Sorting), followed by deduplication; this step is optional, which is ideal for amplicon data. After the quality control steps (Indel Realignment and Base Quality Score Recalibration), the final stage of variant calling is performed. Without a Tumor-Normal pair, the variant calling process instead relies on a user-built Panel of Normal VCF.

Figure 1. Workflow for the Tumor-Only analysis with Sentieon with the utilization of a Panel of Normal VCF to eliminate germline variants.

This use of the Panel of Normal (PoN) VCF is the only general difference from the Tumor-Normal workflow. The PoN is composed of variants from whichever germline samples the user selects. The user can build the PoN from as many germline samples as they like, so as to filter out as many germline variants as possible. Sentieon provides several somatic callers, and it is recommended to build a PoN specific to each caller (e.g., a PoN built for TNHaplotyper is distinct from one built for TNScope).
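To make the PoN idea concrete, here is a minimal Python sketch of the underlying logic: collect the variants seen in a set of germline samples, then drop any tumor call that matches one of them. This is an illustration only and not how Sentieon implements its PoN. The function names, the `min_samples` threshold, the exact-match rule on chromosome, position, REF, and ALT, and the toy coordinates are all assumptions made for the example.

```python
from collections import Counter


def parse_vcf_records(lines):
    """Yield one (chrom, pos, ref, alt) key per ALT allele in VCF data lines."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and VCF header lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref, alts = fields[0], int(fields[1]), fields[3], fields[4]
        for alt in alts.split(","):
            yield (chrom, pos, ref, alt)


def build_pon(normal_vcfs, min_samples=1):
    """Hypothetical PoN builder: keep variants seen in >= min_samples normals."""
    counts = Counter()
    for lines in normal_vcfs:
        # Use a set so each normal sample counts a given variant only once.
        counts.update(set(parse_vcf_records(lines)))
    return {key for key, n in counts.items() if n >= min_samples}


def filter_tumor_calls(tumor_lines, pon):
    """Return tumor variant keys that are absent from the PoN."""
    return [key for key in parse_vcf_records(tumor_lines) if key not in pon]


# Toy data with made-up coordinates (columns: CHROM, POS, ID, REF, ALT).
normal_a = ["#CHROM\tPOS\tID\tREF\tALT", "chr1\t100\t.\tA\tG", "chr2\t200\t.\tC\tT"]
normal_b = ["chr1\t100\t.\tA\tG"]
tumor = ["chr1\t100\t.\tA\tG", "chr7\t500\t.\tT\tG"]

pon = build_pon([normal_a, normal_b])
somatic_candidates = filter_tumor_calls(tumor, pon)
print(somatic_candidates)
```

In this toy example the chr1 call is removed because it also appears in the normals, while the chr7 call survives as a somatic candidate. Adding more normal samples grows the set and removes more germline calls, which is the motivation for building the PoN from as many samples as are available.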

The Sentieon script will list the file path to the PoN VCF once it has been built. Additionally, the VCF can be annotated with other variant data, such as COSMIC identifiers or dbSNP rsIDs. Beyond the variant calling process in Sentieon, filtering for high-quality, clearly somatic, and clinically relevant variants (including AMP Guideline automation) can be handled within Golden Helix’s tertiary analysis software, VarSeq.
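As a rough illustration of what annotating a VCF with identifiers such as dbSNP rsIDs involves, the sketch below fills the ID column of each call from a lookup table. It is a simplified stand-in rather than Sentieon's annotation logic. The `annotate_ids` function, the lookup keyed on chromosome, position, REF, and ALT, and the placeholder identifier `rsEXAMPLE1` are all hypothetical.

```python
def annotate_ids(vcf_lines, known_ids):
    """Fill the VCF ID column from a {(chrom, pos, ref, alt): id} lookup."""
    annotated = []
    for line in vcf_lines:
        if line.startswith("#"):
            annotated.append(line)  # pass header lines through unchanged
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref, alt = fields[0], int(fields[1]), fields[3], fields[4]
        if fields[2] == ".":
            # Only fill in a missing ID; never overwrite an existing one.
            fields[2] = known_ids.get((chrom, pos, ref, alt), ".")
        annotated.append("\t".join(fields))
    return annotated


# Placeholder lookup and made-up coordinates, for illustration only.
known_ids = {("chr1", 100, "A", "G"): "rsEXAMPLE1"}
calls = ["#CHROM\tPOS\tID\tREF\tALT", "chr1\t100\t.\tA\tG", "chr7\t500\t.\tT\tG"]

annotated = annotate_ids(calls, known_ids)
for line in annotated:
    print(line)
```

Calls without a match keep the `.` placeholder in the ID column, so unannotated variants remain easy to spot in downstream filtering.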

In addition to supporting tumor samples with and without a matched normal, Sentieon lets users customize their pipeline even further. If you are interested in generating your VCFs with Sentieon’s high-quality somatic callers, you have access to high-sensitivity scripts designed to capture more somatic variant calls, as well as support for the GRCh38 assembly. Continue reading the final blog of this series, where I discuss these two helpful features.

