A few months ago in Golden Helix’s 2nd Annual Abstract Challenge, Dr. Raluca Mateescu tied for third place with her entry on the palatability of beef. We mentioned in our previous post highlighting all of the challenge winners that Dr. Mateescu would be presenting her work for the Golden Helix community, and the time has come! Next week for our monthly webinar, Wednesday July 8th, Raluca will present “Genomic Analyses on the Palatability of Beef”. (Want to join? Register here.) As a preface to the webcast, we wanted to provide you with a short introduction to Dr. Mateescu and her research goals.
As an Associate Professor in the Department of Animal Sciences at the University of Florida, Dr. Mateescu focuses her research on the molecular genetics of biological traits in beef cattle, sheep and goats. Her over-arching goal is to unravel the genetic basis for the phenotypic variability in biological traits of economic importance that have a complex inheritance. But what does that really mean?
This last week I had the pleasure of attending the fourth annual Clinical Genome Conference (TCGC) in Japantown, San Francisco and kicking off the conference by teaching a short course on Personal Genomics Variant Analysis and Interpretation.
Some highlights of the conference from my perspective:
- Talking about clinical genomics is no longer a wonder-fest of individual case studies, but a pragmatic discussion of standards, data sharing and using the right tools for the right job.
- Early detection, prevention, and understanding wellness versus disease states can leverage genomics but also involve longitudinal measurements and many human factors.
- Some cancer types, such as non-small cell carcinomas, clearly benefit from integrative analytics of multiple assays (WGS, mate-pair seq for SVs/CNVs, PCR for expression), but the complexity and cost are high. In other words, after the relatively simple clinical assays of oncogene panels to suggest targeted molecular therapies, it gets hard fast!
It is already almost halfway through 2015, and June has been especially busy as far as customer publications go. We wanted to pass on the articles to you and congratulate our customers on their success!
Golden Helix recently announced the addition of VSPipeline to our VarSeq software. VSPipeline is a command-line interface that gives high-throughput environments the ability to tap the full power of VarSeq’s algorithms and flexible project template system from any command-line context, including existing bioinformatics pipelines.
VSPipeline supports the need to efficiently generate VarSeq projects from workflow-encoding project templates. Out of these automated pipelines will come fully produced VarSeq projects, ready for the technical and medical staff to jump into variant interpretation and reporting. Bioinformatics core labs can also leverage the fast and flexible annotation algorithms in VarSeq and Golden Helix’s unmatched and up-to-date public annotation repository with VSPipeline.
We are very excited about this addition to VarSeq and the new capabilities it provides our users. Read the full press release here.
Just a few weeks ago we announced our partnership with MedGenome. The news was covered by a number of outlets including:
Let me explain the importance and impact of this announcement.
Since VarSeq was released, we have received strong interest from testing labs that are leveraging our product to implement cancer diagnostic pipelines. Please feel free to take a look at my e-Book, Genetic Testing for Cancer, on this subject. Now, at Golden Helix we specialize in state-of-the-art filtering and annotation capabilities. Our product ships with annotation sources such as COSMIC, which is a terrific first baseline.
The support team at Golden Helix is always on-hand to help with your SVS and VarSeq needs. We get some questions more often than others, and this blog will answer some of the most common questions we’ve been seeing lately regarding VarSeq.
A common question we receive is whether data can be filtered against a locally kept set of variants or genomic features in a BED file. The answer is – yes you can! The file doesn’t even need to be converted through the Data Source Library Convert Wizard. To set up a BED file, the minimum information needed in the file is the chromosome and the start and stop positions; the example in Figure 1 also includes a description of that particular genomic feature.
In recent months we have been updating our public annotation library to include the most recent versions of existing sources as well as to add new sources. Each of these annotation sources is compatible with our three major products (SVS, GenomeBrowse and VarSeq) and can be used for visualization, annotation and filtering.
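To make the minimal BED layout concrete, here is a small Python sketch of what filtering variants against such a file amounts to. It is only an illustration of the idea, not how VarSeq implements it: the function names `parse_bed` and `variants_in_regions`, the sample regions and the sample variants are all made up for this example. It assumes a tab-delimited file and the standard BED convention that the start coordinate is 0-based and the end coordinate is exclusive, with variant positions given as 1-based.

```python
import io


def parse_bed(handle):
    """Yield (chrom, start, end, name) tuples from tab-delimited BED lines.

    Only the first three columns are required; the fourth (a description
    of the feature) is optional and returned as None when absent.
    """
    for line in handle:
        line = line.strip()
        # Skip blank lines and the optional header/comment lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        name = fields[3] if len(fields) > 3 else None
        yield chrom, start, end, name


def variants_in_regions(variants, regions):
    """Keep variants (chrom, 1-based position) that fall inside a BED region."""
    regions = list(regions)
    kept = []
    for chrom, pos in variants:
        for r_chrom, start, end, name in regions:
            # BED start is 0-based and end is exclusive, so a 1-based
            # position lies in the interval when start < pos <= end.
            if chrom == r_chrom and start < pos <= end:
                kept.append((chrom, pos, name))
                break
    return kept


# Hypothetical example data: two regions, each with a description column.
bed_text = "chr1\t100\t200\tpromoter_A\nchr2\t500\t600\texon_B\n"
regions = list(parse_bed(io.StringIO(bed_text)))
variants = [("chr1", 150), ("chr1", 201), ("chr2", 600), ("chr3", 10)]

print(variants_in_regions(variants, regions))
```

The linear scan over regions keeps the sketch short; for a BED file with many thousands of features, an interval tree or sorted lookup would be a more sensible choice.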
NHLBI ESP6500SI-V2-SSA137 Exomes Variant Frequencies 0.0.30, GHI
Annotations are available for both GRCh_37 and GRCh_38 human builds. The current EVS data release (ESP6500SI-V2) is taken from 6503 samples drawn from multiple ESP cohorts and represents all of the ESP exome variant data including both SNPs and Indels.
1kG Phase 3 – Variant Frequencies 5, GHI and 1kG Phase 3 – CNVs and Large Variants 5, GHI
The variant frequency annotation source provides the catalog of single nucleotide variants (SNVs) “sites” called by the 1000 Genomes project for 2504 individuals from the 2013-05-02 sequence and alignment release that is mapped to GRCh_37. The CNVs and Large Variants source is a subset of only those with length greater than 200 base pairs.
Yes, I said it. “Them be fighting words” you may say.
Well, it’s worth putting a stake in the ground when you have worked hard to have a claim worth staking.
We have explored the landscape, surveyed the ravines and dangerous cliffs, laboriously removed the boulders and even dynamited a few tree stumps.
Ok, so now I’m going to back off a bit.
ANNOVAR, snpEff and VEP are broadly adopted toolsets with very friendly and responsive authors that engage their communities.
They are also solving a very narrow problem: annotating variant sites.
There are many approaches that one might use to define a variant as potentially deleterious. For example, we often see analysis workflows based on rare, non-synonymous variants, perhaps incorporating additional annotation sources that capture known or predicted consequences of coding variants. Annotations for coding regions of the genome are relatively abundant and familiar to genome scientists. We are comfortable in our ability to interpret SIFT scores, for example. Despite our collective comfort with exome analysis, there is no question that many of the secrets of the genome are found elsewhere. VarSeq users often ask about annotation sources to assist in the interpretation of non-coding regions. dbscSNV is a useful annotation source that may help you start to move beyond the coding exome by considering some of its nearest neighbors: splice site variants.
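As a rough illustration of the "rare, non-synonymous" style of workflow mentioned above, the following Python sketch applies that kind of filter to a handful of variant records. Everything specific here is an assumption made for the example: the field names (`alt_freq`, `effect`, `sift_score`), the 1% frequency cutoff, the set of effects treated as non-synonymous, and the choice to treat variants missing from the frequency catalog as rare. The SIFT cutoff of 0.05 follows the commonly used convention that scores at or below 0.05 are predicted deleterious. This is not a description of any VarSeq filter chain.

```python
# Assumed thresholds for this sketch only.
RARE_MAX_FREQ = 0.01          # alternate allele frequency below 1%
SIFT_DELETERIOUS_MAX = 0.05   # SIFT scores at or below 0.05 are predicted deleterious
NONSYNONYMOUS = {"missense", "nonsense"}  # illustrative choice of effects


def is_candidate(variant):
    """Return True for rare, non-synonymous variants not predicted tolerated."""
    freq = variant.get("alt_freq")
    # Assumption: a variant absent from the frequency catalog counts as rare.
    rare = freq is None or freq < RARE_MAX_FREQ
    coding_change = variant.get("effect") in NONSYNONYMOUS
    sift = variant.get("sift_score")
    # Variants without a SIFT score are not excluded on that basis.
    sift_ok = sift is None or sift <= SIFT_DELETERIOUS_MAX
    return rare and coding_change and sift_ok


# Hypothetical variant records.
variants = [
    {"id": "v1", "alt_freq": 0.002, "effect": "missense", "sift_score": 0.01},
    {"id": "v2", "alt_freq": 0.150, "effect": "missense", "sift_score": 0.01},
    {"id": "v3", "alt_freq": None, "effect": "nonsense", "sift_score": None},
    {"id": "v4", "alt_freq": 0.001, "effect": "synonymous", "sift_score": None},
    {"id": "v5", "alt_freq": 0.001, "effect": "missense", "sift_score": 0.40},
]

candidates = [v["id"] for v in variants if is_candidate(v)]
print(candidates)
```

A filter like this only ever sees coding variants with an annotated effect, which is exactly the limitation that splice-site sources such as dbscSNV are meant to address.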
This year the MAGES symposium has a new name, the Symposium on Advances in Genomics, Epidemiology and Statistics or SAGES! The NIH has also been added to the sponsor list and we’re excited to have the support for this informative symposium! Additionally, we were treated to the addition of a new poster session to accompany the fantastic speakers.
The trend for this year’s speakers was data integration: pulling various data types together into methods and interactions to try to understand the underlying biology of disease and discover previously unidentified associations. Here is a snapshot of the methods and speakers that caught my attention.