Category Archives: Technology review

The Clinical Genome Conference 2015 Highlights

This past week I had the pleasure of attending the fourth annual Clinical Genome Conference (TCGC) in Japantown, San Francisco, and kicking off the conference by teaching a short course on Personal Genomics Variant Analysis and Interpretation. Some highlights of the conference from my perspective: Talking about clinical genomics is no longer a wonder-fest of individual case studies, but a… Read more »

VarSeq: A bioinformatics Swiss Army knife

If you’ve seen the recent webinars given by Gabe Rudy and Bryce Christensen, you’ve no doubt been impressed by the capabilities of VarSeq when it comes to annotation and filtering. However, we sometimes forget that the power that enables all this complex analysis can also be used in more mundane tasks like VCF subsetting. And although these day-to-day tasks don’t… Read more »

Updates to ClinVar and dbSNPs: Fresh charts for Cromonaughts!

I’m sitting in the Smithsonian Air and Space Museum basking in the incredible product of human innovation and the hard work of countless engineers. My volunteer tour guide started us off at the Wright brothers’ Flyers and made a point of saying it was only 65 years from liftoff at Kitty Hawk to the landing of a man on the moon…. Read more »

RefSeq Genes: Updated to NCBI Provided Alignments and Why You Care

You probably haven’t spent much time thinking about how we represent genes in a genomic reference sequence context. And by genes, I really mean transcripts since genes are just a collection of transcripts that produce the same product. But in fact, there is more complexity here than you ever really wanted to know about. Andrew Jesaitis covered some of this… Read more »

The State of Variant Annotation: A Comparison of AnnoVar, snpEff and VEP

Up until a few weeks ago, I thought variant classification was basically a solved problem. I mean, how hard can it be? We look at variants all the time and say things like, “Well that one is probably not too detrimental since it’s a 3 base insertion, but this frameshift is worth looking into.” What we fail to recognize is… Read more »

All I Want for Christmas Is a New File Format for Genomics

’Tis the season of quiet, productive hours. I’ve been spending a lot of mine thinking about file formats. Actually, I’ve been spending mine implementing a new one, but more on that later. File formats are amazingly important in big data science. In genomics, it is hard not to be awed by how successful the BAM file format is. I thought… Read more »

Comparing BEAGLE, IMPUTE2, and Minimac Imputation Methods for Accuracy, Computation Time, and Memory Usage

Genotype imputation is a common and useful practice that allows GWAS researchers to analyze untyped SNPs without the cost of genotyping millions of additional SNPs.  In the Services Department at Golden Helix, we often perform imputation on client data, and we have our own software preferences for a variety of reasons.  However, other imputation software packages have their own advantages… Read more »

More Mixed Model Methods!

Thanks to everyone for the great webcast yesterday. We had over 850 people register for the event and actually broke the record! Take that, Bryce and Gabe! If you would like to see the recording, view it at: Mixed Models: How to Effectively Account for Inbreeding and Population Structure in GWAS. While preparing for this webcast, we chose to focus… Read more »

The Murky Waters of Variant Nomenclature – You Could Be Missing Vital Information


When researchers realized they needed a way to report genetic variants in scientific literature using a consistent format, the Human Genome Variation Society (HGVS) mutation nomenclature was developed and quickly became the standard method for describing sequence variations. Increasingly, HGVS nomenclature is being used to describe variants in genetic variant databases as well. There are some practical issues that researchers… Read more »

The State of NGS Variant Calling: DON’T PANIC!!


I’m a believer in the signal. Whole genomes and exomes have lots of signal. Man, is it cool to look at a pile-up and see a mutation as clear as day that you arrived at after filtering through hundreds of thousands or even millions of candidates. When these signals sit right in the genomic “sweet spot” of mappable regions with… Read more »