Springtime for SVS: Updates to PhoRank, Platform Support and Genotype Imputation

         April 11, 2017

Recently, we added a natively supported Genotype Phasing and Imputation capability in SNP & Variation Suite 8.7.0. Since then we have had fantastic feedback and adoption as folks take advantage of the BEAGLE 4.0 and 4.1 algorithms from within their existing SNP GWAS and agrigenomic workflows.

One piece of feedback we heard at PAG and ACMG, and in our ongoing conversations with customers using this new capability, is that the imputation algorithm should, when the information is available, take into account the known family relationships among the samples (i.e., the pedigree table information).

We have been hard at work to add this capability to our natively ported BEAGLE algorithm and we are happy to announce this will be available in SVS 8.7.1 coming out this week!

We will follow up with a detailed blog post describing the methodology of pedigree-aware genotype imputation, the novelty of the approach we implemented, and what output to expect from this feature.
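To make the idea of a pedigree table concrete, here is a minimal Python sketch of the kind of family-relationship information such a table carries. The `PedigreeRow` fields and the `find_trios` helper are hypothetical names chosen for illustration; they are not the SVS API and say nothing about how the imputation algorithm itself uses the pedigree.

```python
from typing import List, NamedTuple, Optional, Tuple


class PedigreeRow(NamedTuple):
    """One sample in a hypothetical pedigree table (illustrative layout only)."""
    family_id: str
    sample_id: str
    father_id: Optional[str]  # None when the father is unknown or not genotyped
    mother_id: Optional[str]  # None when the mother is unknown or not genotyped


def find_trios(rows: List[PedigreeRow]) -> List[Tuple[str, str, str]]:
    """Return (child, father, mother) for samples whose parents are both in the table."""
    known = {row.sample_id for row in rows}
    trios = []
    for row in rows:
        if row.father_id in known and row.mother_id in known:
            trios.append((row.sample_id, row.father_id, row.mother_id))
    return trios


# Made-up example data: one complete family and one unrelated sample.
pedigree = [
    PedigreeRow("F1", "dad", None, None),
    PedigreeRow("F1", "mom", None, None),
    PedigreeRow("F1", "kid", "dad", "mom"),
    PedigreeRow("F2", "solo", None, None),
]

print(find_trios(pedigree))
```

Relationships like these are what a pedigree-aware method can draw on in addition to the genotypes themselves.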

Riding the Python Data Science Wave

One infrastructure choice we made early in the SVS platform maturation was to leverage the fantastic support of the Python ecosystem for doing scientific computing and data science.

SVS 8.7.1 represents a massive update to our underlying library infrastructure, including updates to the latest versions of the scientific Python packages NumPy, SciPy, Matplotlib and pandas.

While the user experience and analysis results will remain the same, these libraries power some of our advanced analysis features, as well as many of the scripts in our Add-On Scripts Repository freely available for SVS users.

We do expect some noticeable speed improvements for these algorithms, especially for Linux and Mac users!

The linear algebra and matrix operations used in the following features were already capable of running multi-threaded, but only Windows had the libraries to run in that mode. With this release, Linux and Mac users will notice these operations taking full advantage of all available physical cores on the host machine!

  • Mixed Linear Model Analysis
  • Genomic BLUP (GBLUP) computation, including usage from genomic prediction and K-Fold Cross Validation
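As a rough illustration of the kind of dense matrix work involved, the sketch below builds a genomic relationship matrix with NumPy. It is not the SVS implementation: the function name is hypothetical, the data are random, and the scaling shown is the commonly used VanRaden formulation, which is an assumption here. The point is only that the heavy step is a single matrix product, which NumPy hands to its underlying BLAS library; whether that product runs on several threads depends on the BLAS build installed.

```python
import os

import numpy as np


def genomic_relationship_matrix(genotypes):
    """Relationship matrix from a samples x markers array coded 0/1/2.

    Uses the VanRaden scaling G = Z Z^T / (2 * sum(p * (1 - p))), where Z is
    the genotype matrix centered by twice the allele frequency p.
    """
    m = np.asarray(genotypes, dtype=float)
    p = m.mean(axis=0) / 2.0             # per-marker allele frequency
    z = m - 2.0 * p                      # center each marker column
    denom = 2.0 * np.sum(p * (1.0 - p))  # assumes at least one polymorphic marker
    return (z @ z.T) / denom             # dense product, delegated to BLAS


# Random illustrative genotypes: 50 samples, 2000 markers.
rng = np.random.default_rng(0)
genotypes = rng.integers(0, 3, size=(50, 2000))
G = genomic_relationship_matrix(genotypes)

# os.cpu_count() reports logical cores, which can exceed the physical core count.
print("Logical cores visible to Python:", os.cpu_count())
print("Relationship matrix shape:", G.shape)
```

Calling `np.show_config()` prints which BLAS/LAPACK libraries a given NumPy build is linked against, which is one way to check whether a threaded backend is in use.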

Updates to PhoRank, Added OMIM and Transcript Annotations Fields

Also in this SVS release will be a significant update to the PhoRank gene ranking algorithm. The algorithm now better differentiates between highly relevant genes and genes that are connected to the input phenotypes only through very common “supernodes” in the gene pathway networks (such as cell membrane mechanisms, metabolic process pathways, etc.). We will follow up with a blog post describing these changes in more detail.

In addition, there is a new ability to add OMIM Phenotype terms to the existing Human Phenotype Ontology (HPO) when inputting phenotypes for the PhoRank algorithm. This is especially useful for more syndromic phenotypes that exist in OMIM, but not in HPO. The premium OMIM annotation must be added to your SVS license to enable this extra feature.

Finally, as I noted in our recent VarSeq release announcement blog post, we recently updated our transcript annotation algorithm as well as the default RefSeq genes annotation track. Both changes are reflected in this upcoming SVS release.
