VS-CNV Command-Line CNV Tool

December 9, 2021

If you follow the development of Golden Helix features, you are aware of how substantially our copy number detection and evaluation capabilities in VarSeq have evolved. CNV detection and evaluation is typically handled through the VarSeq graphical user interface. In some cases, however, users benefit from running this process from the command line. Golden Helix supports both approaches, and the VS-CNV command-line tool ships with the VarSeq software. The purpose of this blog is to introduce the VS-CNV command-line tool as an alternative approach to CNV calling.

First Step: Accessing the Tools

Once you’ve installed the VarSeq software, you’ll find the VS-CNV tool in the installation folder. You can navigate to this folder from a terminal (Figure 1) or from the user interface (Figure 2).

Figure 1. VarSeq installation folder accessed via terminal in MobaXterm for a Linux environment.
Figure 2. Accessing the VarSeq installation folder directly from the graphical user interface of the software.

Second Step: Building the CNV Reference Samples

Entering a command with no arguments prints a help message. Figure 3 shows the command entered at the top to run VS-CNV, the user login credentials, and the command to build the CNV reference set. The foundation of the CNV approach in VarSeq is the use of coverage data from a collection of reference samples to create a normalized diploid coverage profile for all samples in your cohort.

Figure 3. Help message when running a command with no argument in VS-CNV
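To make the idea of a normalized reference coverage profile more concrete, here is a minimal Python sketch. It is not the VS-CNV algorithm, only an illustration under simple assumptions: each sample is a list of mean read depths per target region, each sample is scaled by its own mean depth, and the sample is then compared to the reference set as a ratio and a z-score per target. The function names and the toy numbers are hypothetical.

```python
from statistics import mean, stdev


def normalize(depths):
    """Scale per-target depths so the sample's mean depth becomes 1.0."""
    sample_mean = mean(depths)
    return [d / sample_mean for d in depths]


def coverage_metrics(sample, references):
    """Return one (ratio, z_score) pair per target for a sample.

    ratio   = normalized sample depth / mean normalized reference depth
    z_score = distance from the reference mean in reference standard deviations
    """
    norm_sample = normalize(sample)
    norm_refs = [normalize(ref) for ref in references]
    results = []
    for i, value in enumerate(norm_sample):
        ref_values = [ref[i] for ref in norm_refs]
        ref_mean = mean(ref_values)
        ref_sd = stdev(ref_values)  # needs at least two reference samples
        ratio = value / ref_mean
        z_score = (value - ref_mean) / ref_sd if ref_sd > 0 else 0.0
        results.append((ratio, z_score))
    return results


# Toy data (made up): four targets, three reference samples, and one
# test sample whose third target has about half the expected depth.
references = [
    [100, 200, 150, 120],
    [110, 210, 160, 125],
    [90, 190, 140, 115],
]
sample = [100, 200, 75, 120]

metrics = coverage_metrics(sample, references)
for index, (ratio, z_score) in enumerate(metrics):
    print(f"target {index}: ratio={ratio:.2f} z={z_score:.1f}")
```

In this toy example the third target stands out with a low ratio and a strongly negative z-score, which is the kind of signal a coverage-based caller looks for when flagging a possible deletion.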

Figure 4 shows an example run of the reference sample coverage calculation, which defines the path to the CNV reference folder, the path to the reference BAMs, and the BED file listing all target regions (exons) for the genes in this TruSight panel.

Figure 4. VS-CNV commands to build the reference samples.

Once the command completes, you can browse to the CNV reference folder to see the coverage calculations for each sample stored in TSF files. These references must be calculated before any CNVs are called for individual samples, which is our next step.

Figure 5. The completed CNV reference sample calculations are stored as TSF files for CNV detection.

Third Step: Calling CNVs

Now that our reference sample coverage statistics are complete, we can run the CNV detection commands. Figure 6 shows the “target” command, which detects CNVs from targeted region coverage for both the panel and exome CNV methods. The final output is a TSV file containing the CNV calls for sample 1.

Figure 6. Commands to run the CNV caller algorithm.

As the CNV caller runs, its progress is reported in the terminal, as shown in Figure 7. The last lines of the output show where the sample TSV file is written. Example CNV calls for this TruSight panel appear in Figure 8.

Figure 7. CNV algorithm progress represented in the terminal.
Figure 8. CNV output presented in the VS Code text editor.
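Because the output is a plain TSV file, it can also be loaded into a script for downstream review. The sketch below uses only the Python standard library and makes no assumption about which columns VS-CNV writes: it reads whatever header row the file has. The file name and column name in the usage comment are hypothetical placeholders, so substitute the ones from your own output.

```python
import csv
from collections import Counter


def read_tsv(path):
    """Read a tab-separated file with a header row into a list of dicts."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


def count_by_column(rows, column):
    """Count how many rows share each value in the given column."""
    return Counter(row[column] for row in rows)


# Example usage (hypothetical file and column names):
#   rows = read_tsv("sample1_cnvs.tsv")
#   print(rows[0].keys())                    # inspect the real column names
#   print(count_by_column(rows, "<column>")) # tally calls by a column you pick
```

Inspecting the header first is the safer route, since the exact columns depend on the CNV method and the version of the tool.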

The example above was a simple approach to calling CNVs for a smaller panel. Many of our users may also wish to run this process on their exome or whole-genome data. Figure 9 is a screenshot of additional CNV calling steps: including LOH calls to exclude non-normal coverage regions, and the binned approach for whole genomes. All of the steps described in this blog can be referenced here.

Figure 9. Additional features for exome and genome-level CNV calling with VS-CNV.
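For readers unfamiliar with binning, the general idea is that whole-genome data has no predefined target regions, so coverage is summarized over consecutive fixed-width windows instead. The Python sketch below shows that idea only. It is not how VS-CNV implements its binned method, and the function name, the bin size, and the input (a list of per-base depths for one contig) are assumptions chosen for illustration.

```python
def bin_coverage(depths, bin_size):
    """Average per-base depths over consecutive fixed-width bins.

    Returns a list of (start, end, mean_depth) tuples using 0-based,
    half-open coordinates. The final bin may be shorter than bin_size.
    """
    if bin_size <= 0:
        raise ValueError("bin_size must be a positive integer")
    bins = []
    for start in range(0, len(depths), bin_size):
        window = depths[start:start + bin_size]
        bins.append((start, start + len(window), sum(window) / len(window)))
    return bins


# Toy per-base depths (made up) summarized in bins of 2 bases.
binned = bin_coverage([10, 10, 20, 20, 30], 2)
for start, end, mean_depth in binned:
    print(f"[{start}, {end}): {mean_depth:.1f}")
```

The binned values can then be treated the same way as per-target coverage, for example by comparing them to the same bins in a set of reference samples.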

If you would like to learn more about this command-line approach to CNV calling, or would prefer to run this process through the graphical user interface, please reach out to support@goldenhelix.com for a training session. If you are interested in scheduling a call to find out which Golden Helix product would best suit your needs, please email us at info@goldenhelix.com.
