We had a great time showing users the new COSMIC database for future NGS cancer analyses within VarSeq. If you didn’t get a chance to join us for this live webcast, you can watch the recording below. I have summarized our live Q&A session for anyone who is curious about what others were asking.
The COSMIC track is free, so what do we get by paying for a premium track?
It’s true that, as an academic, you can download the raw COSMIC data as a flat file, but Golden Helix does not have the same level of access. While we do seek to modestly recoup costs, our experienced data curation team has also spent a long time working with this data to aggregate the raw sample-level information into a normalized genomic variant representation. Three major things separate the final track available through us from the raw form:
- We aggregate and count the samples in which each variant was observed, broken down by a number of useful dimensions.
- In some cases, we work backward from the coding or protein representation when the genomic representation is insufficient.
- We perform variant normalization so that the data exactly matches variants called from NGS data.
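
To make the normalization and sample-counting steps above more concrete, here is a minimal Python sketch. It is not Golden Helix's curation pipeline. It assumes the commonly used "trim and left-align" approach to variant normalization and a toy single-contig reference, and every name in it (`TOY_GENOME`, `normalize`, `count_samples`, the record layout, the tissue labels) is hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical toy reference sequence for a single contig (1-based positions).
TOY_GENOME = "GGGCACACACAGGG"


def normalize(pos, ref, alt, genome):
    """Trim and left-align one variant (illustrative sketch only).

    pos is the 1-based position of the first base of ref on genome.
    Returns a (pos, ref, alt) tuple in left-aligned, minimal form.
    """
    if ref == alt:
        raise ValueError("ref and alt are identical; nothing to normalize")
    changed = True
    while changed:
        changed = False
        # Drop a shared trailing base.
        if ref and alt and ref[-1] == alt[-1]:
            ref, alt = ref[:-1], alt[:-1]
            changed = True
        # If an allele became empty, extend both to the left by one base.
        if not ref or not alt:
            if pos <= 1:
                raise ValueError("cannot left-extend past the contig start")
            pos -= 1
            base = genome[pos - 1]
            ref, alt = base + ref, base + alt
            changed = True
    # Drop shared leading bases while both alleles keep at least one base.
    while len(ref) >= 2 and len(alt) >= 2 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


def count_samples(records, genome):
    """Count distinct samples per (normalized variant, tissue).

    records is an iterable of (sample_id, tissue, pos, ref, alt) tuples;
    this layout is an assumption, not the COSMIC file format.
    """
    samples = defaultdict(set)
    for sample_id, tissue, pos, ref, alt in records:
        key = (normalize(pos, ref, alt, genome), tissue)
        samples[key].add(sample_id)  # a set avoids double-counting a sample
    return {key: len(ids) for key, ids in samples.items()}
```

In this toy reference, the same two-base deletion written at position 8, 6, or 3 collapses to the single key `(3, "GCA", "G")`, so records that describe one event in different ways are counted together and each sample is counted once. "Tissue" here stands in for whichever dimensions the counts are broken down by.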
COSMIC has no fixed release schedule, but we plan to update to each new version of COSMIC shortly after it is made available. We plan to have a quick turnaround on this, similar to how we update to the latest ClinVar and CIViC data at the beginning of every month.
Is this available for both 37 and 38 assemblies?
Yes. Although COSMIC now natively records data in GRCh38 coordinates, we lift over backward to GRCh37 and publish on both genome assemblies.
How much does it cost?
The cost of any premium annotation is always in the realm of other premium annotation tracks, and of