Author Archives: Gabe Rudy

About Gabe Rudy

Meet Gabe Rudy, GHI’s Vice President of Product and Engineering and team member since 2002. Gabe thrives in the dynamic and fast-changing field of bioinformatics and genetic analysis. Leading a killer team of Computer Scientists and Statisticians in building powerful products and providing world-class support, Gabe puts his passion into enabling Golden Helix’s customers to accelerate their research. When not reading or blogging, Gabe enjoys the outdoor Montana lifestyle. But most importantly, Gabe truly loves spending time with his sons, daughter, and wife. Follow Gabe on Twitter @gabeinformatics.

  

Top 10 Posts for Understanding Clinical Annotation of Genomic Variants

The VarSeq clinical platform is built on a strong foundation of data curation and annotation algorithms to ensure the variants identified have all the information required to make the correct clinical assessments.  It’s easy to make light of “variant annotation”, but the details run very deep into the roots of how we represent genomic data, how public data is aggregated, stored… Read more »

Understanding Your GWAS Signal with LD Scores

When studying complex diseases that may have multi-genic contributions from across the genome, it is not uncommon to see what appears to be elevated correlation between your trait or other test variable and the SNPs across the genome. The problem is that, at first glance, you won’t be able to tell if this is due to a population structure in your… Read more »

CNV Caller Updates and More with VarSeq 1.4.5

We have been heads down doing the detailed and careful work to improve our CNV caller algorithm in the three months since we launched our exome-capable CNV caller, and we are very excited about the massive step forward we have made with the VarSeq 1.4.5 release. Additionally, we have added the all new Whole Genome large-event caller capable… Read more »

Annotating with gnomAD: Frequencies from 123,136 Exomes and 15,496 Genomes

When the Broad Institute team led by Dan MacArthur announced at ASHG 2016 that the successor to the popular ExAC project (frequencies of 61,486 exomes) was live at http://gnomad.broadinstitute.org/, I thought their servers would have a melt-down as everyone immediately jumped on and started looking up their favorite genes and… Read more »

Springtime for SVS: Updates to PhoRank, Platform Support and Genotype Imputation

Recently, we added a natively supported Genotype Phasing and Imputation capability in SNP & Variation Suite 8.7.0. Since then we have had fantastic feedback and adoption as folks take advantage of the BEAGLE 4.0 and 4.1 algorithms from within their existing SNP GWAS and agrigenomic workflows. One piece of feedback we heard from our time at PAG, ACMG and our… Read more »

Updating VarSeq’s Transcript Annotation along with NCBI RefSeq Genes Interim Release

It may be possible to say that annotating a variant correctly and accurately against gene transcripts is the most important job of a variant annotation and interpretation tool. We take it very seriously at Golden Helix as we support VarSeq and its use by our customers in both research and clinical contexts. It has been a source of frustration that… Read more »

Paying Attention to the Quality Fields in ExAC: A Case Study

In the past couple of weeks, the topic of the Filter and Quality fields in the popular ExAC population catalog has come up a number of times. It turns out that unlike the 1000 Genomes project, which decided to very heavily filter their variant list to only contain variants they consider high quality, ExAC chose to include more dubious variants… Read more »

PhoRank in SVS: Gene Ranking for Your Research Genotypes

Since we released our Phenotype Gene Ranking algorithm in VarSeq, it has become a staple of the way people conduct their analysis. It allows for a combination of filtering with ranking to prioritize follow-up interpretations of analysis results. Our PhoRank algorithm will be available in our upcoming SVS release to also aid in the numerous research workflows performed on SNPs… Read more »

Massive Variant Boost to ClinVar & PubMed Citation Fields

It may have been easy to miss in the drum-beat of monthly annotation updates we do here at Golden Helix, but there are a couple of things that are very special about the January update to the ClinVar database. We added new fields, including HGVS names of variants and citations in PubMed for variants. ClinVar nearly doubled in size by… Read more »

ExAC CNVs: The First Large Scale Public Exome CNV Variant Set

ExAC CNVs were released publicly with a recent publication, providing the full set of rare CNVs called on ~60K human exomes. While there are many public CNV databases out there, this is the first one that was derived from exome data, and thus includes both extremely rare and very small CNV events. With the recent release of Golden Helix’s CNV calling… Read more »

Annotating Cancer Mutations with CIViC

Gabe Rudy, November 15, 2016

While clinical assessments of germline mutations have been collected in ClinVar under the stewardship of the NCBI and the collaborative effort of many testing labs, the same type of resource has been missing for mutations that could inform clinical care in cancer. Or at least, that is what I thought until I started to work with CIViC. With the stewardship of… Read more »

Genotype Imputation and Phasing now in SNP & Variation Suite

One of the tools at the top of the toolbox for researchers working with microarray data is genotype imputation. Genotype imputation is the process of inferring the genotype of one or more markers based on the correlation pattern (aka linkage disequilibrium or LD) of the surrounding markers for which genotypes are known. We have now integrated a natively ported version of BEAGLE into Golden… Read more »
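
To make the idea of inferring a marker from its neighbors concrete, here is a minimal sketch in Python. It is not the BEAGLE algorithm and not how SVS works internally; it is a deliberately naive stand-in that copies missing alleles from whichever reference haplotypes best match the observed markers. The function name `impute_missing` and the toy reference panel are hypothetical and exist only for illustration.

```python
from collections import Counter


def impute_missing(sample, reference_haplotypes):
    """Fill in missing alleles (None) in `sample`.

    Naive illustration only: find the reference haplotypes that agree
    with the sample at the most observed markers, then take the most
    common allele among them at each missing position.
    """
    observed = [i for i, allele in enumerate(sample) if allele is not None]

    def matches(hap):
        # Number of observed markers where this haplotype agrees with the sample.
        return sum(hap[i] == sample[i] for i in observed)

    best = max(matches(hap) for hap in reference_haplotypes)
    closest = [hap for hap in reference_haplotypes if matches(hap) == best]

    imputed = list(sample)
    for i, allele in enumerate(sample):
        if allele is None:
            votes = Counter(hap[i] for hap in closest)
            imputed[i] = votes.most_common(1)[0][0]
    return imputed


# Toy reference panel (hypothetical data): each string is one haplotype.
panel = ["ACGT", "ACGA", "TTCA"]
# The third marker was not genotyped.
result = impute_missing(["A", "C", None, "T"], panel)
```

Here the surrounding markers A, C and T match the first panel haplotype best, so the missing marker is filled in with that haplotype's allele. Real imputation methods model recombination and uncertainty rather than picking a single best match.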

Why Call CNVs: Getting More from your NGS Data

Copy Number Variants have been important to clinical genetics for quite a while now. So, what has made now the right time to be looking at calling CNVs from NGS data? Well, there are a number of good reasons. The dominant one is simply that the NGS data you are already creating for calling variants can be used in many cases… Read more »

Let There Be Genomes: Big Data Genomics Webcast Teaser

Big data is here, but fear not, you don’t need a Hadoop cluster to analyze your genomes or your cohorts of tens of thousands of samples! It turns out, for the kind of algorithms employed in variant annotation and filtering, running optimized local programs is often faster anyway. As we support our diverse customer base, we have definitely seen the… Read more »

Compute Kinship Matrices & GBLUP on Very Large Sample Sets

Now available in SVS! Increasingly important in the analysis of the genotype to phenotype relationship is accurately accounting for the relatedness of samples. This is especially important to model correctly in plant and animal populations where man-directed breeding shapes the relationship structure. Along with trait association, one of the high-value use cases for genotyping animals and plants is to estimate… Read more »

Variant Normalization: Underappreciated Critical Infrastructure

It may surprise you to learn that every variant in the human genome has an infinite number of representations! Of course, although true, I’m being a bit hyperbolic to prove a point. Even seemingly simple mutations like single letter substitutions are legitimately represented differently in the local context of other mutations that can be described… Read more »
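
As a rough illustration of why multiple representations matter, the sketch below applies the widely used trim-and-left-align convention to a single biallelic variant. It is an assumption-laden toy, not the VarSeq implementation: the function name `normalize`, the 1-based coordinate convention, and the ten-base example sequence are all made up for this example.

```python
def normalize(pos, ref, alt, genome):
    """Trim and left-align one biallelic variant.

    pos is 1-based; genome is the reference sequence as a string.
    Sketch of the common trim-and-left-align approach, for illustration only.
    """
    if ref == alt:
        raise ValueError("ref and alt are identical, nothing to normalize")

    # Step 1: drop shared trailing bases; whenever an allele becomes
    # empty, extend both alleles one base to the left.
    while True:
        if ref and alt and ref[-1] == alt[-1]:
            ref, alt = ref[:-1], alt[:-1]
        elif ref and alt:
            break
        if not ref or not alt:
            if pos == 1:
                raise ValueError("cannot extend past the start of the sequence")
            pos -= 1
            base = genome[pos - 1]
            ref, alt = base + ref, base + alt

    # Step 2: drop shared leading bases, keeping at least one base per allele.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1

    return pos, ref, alt


# Hypothetical sequence with a CA repeat at positions 3 to 8.
genome = "GGCACACATT"
# Two different descriptions of deleting one CA unit from the repeat.
right_shifted = normalize(6, "ACA", "A", genome)
mid_repeat = normalize(3, "CAC", "C", genome)
# A substitution padded with extra matching context.
padded_snp = normalize(3, "CAC", "CAT", genome)
```

Both deletion descriptions collapse to the same left-aligned form at position 2, and the padded substitution reduces to a single-base change at position 5, which is what lets two call sets or annotation sources be compared record for record.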

Much Love for VarSeq: ESHG 2016 Success

I’m very glad I had the chance to attend ESHG 2016 in Barcelona and talk to so many people about Golden Helix and our software at our booth. ESHG may be the little sibling in size compared to ASHG, but my impression is that it punches above its weight in terms of advancing human genetics applicability to human health and… Read more »

CADD Scores: Rank and Filter in Harmony!

There used to be much energy expended at conferences, bioinformatics forums and even publications about what was the better strategy for interpreting variants of clinical significance: Rule-based filtering and classification mechanisms or rank-based prioritization through all-encompassing “pathogenicity” scores. Both have been shown to be effective. Rule-based systems, as exemplified in this filtering diagram in Baylor’s ground-breaking paper on clinical whole-exome sequencing… Read more »

ICGC: The Next Generation Cancer Mutation Database Now Available

For quite a while, COSMIC has been synonymous with the catalog of “known somatic mutations”. It is the ClinVar of cancer mutations and invests heavily in “expert curation” (having human experts read papers and pull out references to published somatic mutations). But it turns out there is a new kid on the block, and he… Read more »

Scaling is in our DNA: Making Genomics Accessible

One of the things I absolutely love about the work we do at Golden Helix is keeping up with the changes in data analysis driven by the iterative and generational leaps in technology. But one thing has always been a constant since day one: we break preconceived notions of what scale of… Read more »