Category Archives: Best practices in genetic analysis

Runs of Homozygosity Updated

         August 12, 2014

For the SVS 8.2 release we decided to improve upon the existing ROH feature. The improvements include new parameters to define a run and a new clustering algorithm to aid in finding more stringent clusters of runs. The improvements were motivated by customer comments and a recent research paper by Zhang 2013, “cgaTOH: Extended Approach for Identifying Tracts of Homozygosity,”… Read more »

Have you ever had a bad experience with a VCF file?

         August 5, 2014

“Who has ever had a bad experience with a VCF file?” I like to ask that question to the audience when I present data analysis workshops for Golden Helix. The question invariably draws laughter as many people raise their hands in the affirmative. It seems that just about everybody who has ever worked with VCF files has encountered some sort of… Read more »

The State of Variant Annotation: A Comparison of AnnoVar, snpEff and VEP

         June 25, 2014

Up until a few weeks ago, I thought variant classification was basically a solved problem. I mean, how hard can it be? We look at variants all the time and say things like, “Well that one is probably not too detrimental since it’s a 3 base insertion, but this frameshift is worth looking into.” What we fail to recognize is… Read more »

The New Human Genome Reference and Clinical Grade Annotations: It’s All About the Coordinates

         February 17, 2014

On my flight back from this year’s Molecular Tri-Conference in San Francisco, I couldn’t help but ruminate over the intriguing talks, engaging round table discussions, and fabulous dinners with fellow speakers. And I kept returning to the topic of how we aggregate, share, and update data in the interest of understanding our genomes. Of course, there were many examples of… Read more »

Public Data? What’s that good for anyway?

         February 12, 2014

Dr. Bryce Christensen recently gave a webcast on Maximizing Public Data Sources for Sequencing and GWAS Studies in which he covered options for getting GWAS and sequence information online, tips for working with these datasets and what you’ll see in terms of data quality and usefulness, how to use public data sources in conjunction with your GWAS or sequence study… Read more »

Guest Post: Finding Rare Pieces of Hay in a Haystack

         August 19, 2013

Utilizing Identical Twins Discordant for Schizophrenia to Uncover de novo Mutations
We are living in exciting times – the reality of high-resolution and individual genome sequencing now offers renewed hope in the search for the causes of complex diseases. When this technology is combined with genetic relationships, individual sequences add unrivaled proficiency. Our lab is located in London, Ontario, Canada… Read more »

More Mixed Model Methods!

         June 6, 2013

Thanks to everyone for the great webcast yesterday. We had over 850 people register for the event and actually broke the record! Take that, Bryce and Gabe! If you would like to see the recording, view it at: Mixed Models: How to Effectively Account for Inbreeding and Population Structure in GWAS. While preparing for this webcast, we chose to focus… Read more »

Upcoming webcast – Mixed Models: How to Effectively Account for Inbreeding and Population Structure in GWAS

         May 22, 2013

Presenter: Greta Linse Peterson, Senior Statistician
Date: Wednesday, June 5th, 2013
Time: 12:00 pm EDT, 60 minutes

Abstract: Population structure and inbreeding can confound results from a standard genome-wide association test. Accounting for the random effect of relatedness can lead to lower false discovery rates and identify the causative markers without over-correcting and dampening the true signal. This presentation will… Read more »

Population Structure + Genetic Background + Environment = Mixed Model

         March 22, 2013

A few months ago, our CEO, Christophe Lambert, directed me toward an interesting commentary published in Nature Reviews Genetics by authors Bjarni J. Vilhjalmsson and Magnus Nordborg.  Population structure is frequently cited as a major source of confounding in GWAS, but the authors of the article suggest that the problems often blamed on population structure actually result from the environment… Read more »

Follow Along on an Analyst’s Journey to Filter Whole Genome Data to Four Candidate Variants in SVS

         March 14, 2013

Last week Khanh-Nhat Tran-Viet, Manager/Research Analyst II at Duke University, presented the webcast: Insights: Identification of Candidate Variants using Exome Data in Ophthalmic Genetics. (That link has the recording if you are interested in viewing.) In it, Khanh-Nhat highlighted tools available in SVS that might be underused or were recently updated. These tools were used in his last three… Read more »

GATK is a Research Tool. Clinics Beware.

         December 3, 2012

In preparation for a webcast I’ll be giving on Wednesday on my own exome, I’ve been spending more time with variant callers and the myriad false positives one has to wade through to get to interesting, or potentially significant, variants. So recently, I was happy to see a message in my inbox from the 23andMe exome team saying they had… Read more »

Dr. Ken Kaufman’s Webcast on Exome Sequencing Wildly Successful

         August 9, 2012

Thank you to everyone who joined us yesterday for a webcast by Dr. Ken Kaufman of Cincinnati Children’s Hospital: “Identification of Candidate Functional Polymorphism Using Trio Family Whole Exome DNA Data.” Over 750 people registered for this event and 430 attended – a new Golden Helix record! If you missed the webcast (or would like to watch it again), the… Read more »

One Track to Rule Them All: Close but not quite from the 1000 Genomes Project

         July 31, 2012

I recently curated the latest population frequency catalog from the 1000 Genomes Project onto our annotation servers, and I had very high hopes for this track. First of all, I applaud 1000 Genomes for the amount of effort they have put into providing the community with the largest set of high-quality whole genome controls available. My high hopes are… Read more »

Why You Should Care About Segmental Duplications

         June 6, 2012

My work in the GHI analytical services department gives me the opportunity to handle data from a variety of sources.  I have learned over time that every genotyping platform has its own personality.  Every time we get data from a new chip, I tend to learn something new about the quirks of genotyping technology.  I usually discover these quirks the… Read more »

DNA Variant Analysis of Complete Genomics’ Next-Generation Sequencing Data

         August 17, 2011

As I’ve mentioned in previous blog posts, one of the great aspects of our scientific community is the sharing of public data. With a mission of providing powerful and accurate tools to researchers, we at Golden Helix especially appreciate the value of having rich and extensive public data to test and calibrate those tools. Public data allow us to… Read more »

Best Practices for Incorporating Public Genotype Data in Your Study

         October 12, 2010

The Golden Helix sales team recently came to me for recommendations regarding best practices for incorporating public controls in SNP GWAS.  It seems that there has been a surge of questions regarding this practice over the past few weeks from our customers.  Initially, I laughed at the irony of being asked to outline the best practices for what I see… Read more »

Stop Ignoring Experimental Design (or my head will explode)

         September 29, 2010

Over the past 3 years, Golden Helix has analyzed dozens of public and customer whole-genome and candidate gene datasets for a host of studies.  Though genetic research certainly has a number of complexities and challenges, the number one problem we encounter, which also has the greatest repercussions, is born of problematic experimental design. In fact, about 95% of the studies… Read more »

Enhanced ROH Analysis Improves Effectiveness to Identify Rare, Penetrant Recessive Loci

         July 22, 2010

In the paper Runs of homozygosity reveal highly penetrant recessive loci in schizophrenia, Todd Lencz, Ph.D. introduced a new way of doing association testing using SNP microarray platforms. The method, which he termed “whole genome homozygosity association”, first identifies patterned clusters of SNPs demonstrating extended homozygosity (runs of homozygosity or “ROHs”) and then employs both genome-wide and regionally-specific statistical tests… Read more »