Our 2 SNPs is typically dedicated to informing our customers and the community on the latest in analysis methods, best practices, and the future of the industry. But for this blog post we thought it would be nice to give you the insider’s scoop on our company with a few things you probably didn’t know about us. Continue reading
Tutorials are ever-present in the world today, and for good reason. Why struggle through a complicated process yourself, when there is already a guide established to assist? While no one would suggest that a tutorial is the only way to complete a project, it is certainly a nice starting point.
This rings true with genetic software as well. There are many ways to analyze DNA-Seq and SNP data, but a starting point is helpful. With that in mind, Golden Helix has curated tutorials to help researchers with their analysis, on levels varying from beginner to advanced. Continue reading
SVS offers options for performing many different QC functions on genomic data. This blog takes you through some of the most commonly applied filters for various analysis types.
Filters for GWAS data vary depending on the type of association test you are performing. A typical common-variant GWAS requires filters to remove problematic or poorly called variants, and also to eliminate rare variants, which have limited statistical power. The default minor allele frequency (MAF) threshold in SVS is 5%, but users often prefer lower thresholds (1% or less), especially with larger sample sizes. The default call rate threshold in SVS is 0.95, but it can be adjusted to match the call rate that would be considered an outlier in your data. LD pruning to remove correlated SNPs is good practice before running principal components analysis, IBD analysis, or other population-level functions that might be biased by large blocks of redundant SNPs. Most of these functions, together with many others, can be found under the Genotype menu in any SVS spreadsheet. Continue reading
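To make the call rate and MAF filters concrete, here is a minimal Python sketch. It is not how SVS implements its filters; the function names and the genotype encoding (0, 1, or 2 copies of the alternate allele, `None` for a missing call) are assumptions made for illustration. Only the 0.95 and 5% defaults come from the text above.

```python
from typing import Dict, List, Optional

# Assumed encoding: number of alternate-allele copies (0, 1, 2), or None if the call is missing.
Genotype = Optional[int]


def call_rate(genotypes: List[Genotype]) -> float:
    """Fraction of samples with a non-missing genotype call at this marker."""
    if not genotypes:
        return 0.0
    called = sum(1 for g in genotypes if g is not None)
    return called / len(genotypes)


def minor_allele_frequency(genotypes: List[Genotype]) -> float:
    """Frequency of the less common allele among the called genotypes."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))  # each diploid sample carries two alleles
    return min(alt_freq, 1.0 - alt_freq)


def passes_qc(genotypes: List[Genotype],
              min_call_rate: float = 0.95,
              min_maf: float = 0.05) -> bool:
    """Keep a marker only if it is well called and not too rare."""
    return (call_rate(genotypes) >= min_call_rate
            and minor_allele_frequency(genotypes) >= min_maf)


def filter_markers(markers: Dict[str, List[Genotype]],
                   min_call_rate: float = 0.95,
                   min_maf: float = 0.05) -> List[str]:
    """Return the names of the markers that pass both filters, in input order."""
    return [name for name, genotypes in markers.items()
            if passes_qc(genotypes, min_call_rate, min_maf)]
```

Passing `min_maf=0.01` to `filter_markers` gives the looser threshold mentioned above for larger sample sizes. LD pruning is deliberately left out of this sketch because it needs pairwise correlations between neighboring markers.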
We had a lot to celebrate recently. Last year was the 300th anniversary of Jacob Bernoulli’s Ars Conjectandi. In this book he consolidated central ideas in probability theory, such as the very first version of the law of large numbers. It was also the 250th anniversary of Bayes’ theorem, named after Thomas Bayes (1701–1761), who first suggested using the theorem to update beliefs. Continue reading
I’m sitting in the Smithsonian Air and Space Museum, basking in the incredible product of human innovation and the hard work of countless engineers. My volunteer tour guide started us off at the Wright brothers’ fliers and made a point of saying it was only 65 years from liftoff at Kitty Hawk to the landing of a man on the moon.
I was dumbstruck thinking that within a single lifetime the technology of flight could go from barely existing to space travel. Continue reading
Genomic research is exploding. There is a plethora of new methods and workflows for research and clinical use. While we are a software company at heart, we find ourselves in the role of educators. Our customer interactions are about informing, teaching, and consulting. A few years back, we started with regular webcasts that took this idea to the next level. Over time we have assembled a lot of material that is useful for people in our field, whether they use our software or not. Continue reading
Genomic prediction uses several pieces of information when calculating its results. Genetic information is used to predict the phenotype or trait for the individuals. The phenotypic trait data can be provided for a subset or for all of the individuals being studied. The genomic prediction model (a single mixed model regression equation) also uses the contribution of each genetic locus to build the model, as well as to solve for the estimated breeding values (EBV) and allele substitution effects (ASE). Continue reading
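For readers who want to see the shape of such a model, one common formulation is the GBLUP mixed model below. This is a sketch of the standard textbook form; whether it matches the exact equation used in SVS is an assumption, and all symbols are introduced here for illustration.

```latex
% y: vector of observed phenotypes
% X, \beta: design matrix and coefficients for the fixed effects
% Z: incidence matrix linking phenotypes to individuals
% u: random genomic effects (breeding values), G: genomic relationship matrix built from marker genotypes
% \varepsilon: residual error
\begin{aligned}
  y &= X\beta + Zu + \varepsilon, \\
  u &\sim \mathcal{N}\!\left(0,\ \sigma_g^{2}\, G\right), \qquad
  \varepsilon \sim \mathcal{N}\!\left(0,\ \sigma_e^{2}\, I\right).
\end{aligned}
```

In this formulation, individuals without phenotype data can still receive a predicted value for u because G ties them to the phenotyped individuals through shared marker genotypes.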
Over the last decade, DNA sequencing technology has improved vastly. With the cost of sequencing decreasing significantly, sequencing has become a product for the masses. The sequencing technology and programs that were once used exclusively by major research institutions are now becoming available in many research facilities around the globe. These tools produce large data sets that require specialized processing before meaningful interpretation can begin. Continue reading
You probably haven’t spent much time thinking about how we represent genes in a genomic reference sequence context. And by genes, I really mean transcripts since genes are just a collection of transcripts that produce the same product.
But in fact, there is more complexity here than you ever really wanted to know about. Andrew Jesaitis covered some of this in detail as he dove deep into the analysis of variant annotation against transcripts in his recent post The State of Variant Annotation: A Comparison of AnnoVar, snpEff and VEP. Continue reading
For the SVS 8.2 release we decided to improve upon the existing runs of homozygosity (ROH) feature. The improvements include new parameters to define a run and a new clustering algorithm to aid in finding more stringent clusters of runs. The improvements were motivated by customer comments and a recent research paper by Zhang et al. (2013), “cgaTOH: Extended Approach for Identifying Tracts of Homozygosity,” that outlined a new approach to identify clusters of runs. Continue reading
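As a rough illustration of what "defining a run" means, the Python sketch below finds stretches of consecutive homozygous genotypes in one sample. It is neither the SVS algorithm nor cgaTOH. The function name, the genotype encoding (0 or 2 for homozygous, 1 for heterozygous, `None` for missing), and the single `min_run_length` parameter are assumptions for this example. Production tools typically also consider physical length and tolerate a limited number of heterozygous or missing calls.

```python
from typing import List, Optional, Tuple


def find_homozygous_runs(genotypes: List[Optional[int]],
                         min_run_length: int = 3) -> List[Tuple[int, int]]:
    """Return (start, end) marker indices, inclusive, of homozygous runs.

    A run is a maximal stretch of consecutive homozygous calls (0 or 2).
    Any heterozygous (1) or missing (None) call ends the current run.
    Runs shorter than min_run_length markers are discarded.
    """
    runs: List[Tuple[int, int]] = []
    start: Optional[int] = None  # index where the current run began, if any

    for i, g in enumerate(genotypes):
        if g in (0, 2):
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_run_length:
                runs.append((start, i - 1))
            start = None

    # Close a run that extends to the last marker.
    if start is not None and len(genotypes) - start >= min_run_length:
        runs.append((start, len(genotypes) - 1))

    return runs
```

Clustering runs across samples, which is the part the cgaTOH paper addresses, would operate on the per-sample intervals this function returns and is not attempted here.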