New Technical Support Bulletins (Forthcoming!), and a Known Bug with Filter Samples by Call Rate

As Product Quality Manager, I have been spending quite a bit of time lately thinking about the best way to communicate with our Golden Helix customers with regard to product quality and customer support issues. Most of our customers have already seen several emails from me this year! Email is certainly the fastest way to get an announcement out to our customer base. However, an email may not always be required, and I certainly do not want to start filling up inboxes with content that may not be immediately relevant. I don’t know about you, but the more emails I get from an organization, the more I start to ignore them.

With this in mind, I will be using this blog, as well as a new section of our website that is currently under construction, for technical support bulletins. The new tech support bulletins will provide additional information that (a) is not urgent, and (b) regards new features designed to make workflows easier or more accurate. Each bulletin will make clear which software versions its information applies to.

Using tech support bulletins, Golden Helix can also communicate more freely about our efforts to thwart known bugs and provide bug fixes. We can also post reviews of new features, add-on scripts, marker maps, and annotation tracks that we do not want our customers to miss out on.

So, when are emails warranted? Definitely for known bugs that have the potential to produce incorrect results and those that render the software unusable. It is also important to send out emails with release notes for each new version of the software.

Why am I blogging about this now, when the details regarding the support bulletins have not been finalized? Because there is information I need to get out that does not meet the criteria for an email blast to our customers. Namely, there is a known bug with a script that was featured, up until June 27, 2012, in one of our most popular tutorials, the SNP Analysis tutorial. This is a “failure to run” bug, not one of the more serious incorrect-results bugs that we hate so much! Details below…

Golden Helix will continue to send out emails when we discover a serious (results-changing) bug, but I hope that you like this format for communicating the merely pesky bugs and their fixes as well as exciting new features. As always, feedback is welcome at support@goldenhelix.com.


Known Bug: Filter Samples by Call Rate fails to do anything at all
Version(s) Affected: SVS 7.6.5 (all platforms)
Anticipated Release Date for Fix: Mid-July

We will have this bug fixed in the SVS 7.6.7 release. In the meantime you can do one of two things: 1) Follow the steps listed below to get around this problem, or 2) Request a copy of the fixed script to get you by until mid-July by emailing support@goldenhelix.com.

Workaround:

Instead of using Quality Assurance > Genotype > Filter Samples by Call Rate to select samples that meet a specified call rate, use the workflow below. In this workflow the goal will be to keep only the samples with a call rate >= 0.95 as in the SNP Analysis tutorial:

  1. Go to Quality Assurance > Genotype > Genotype Statistics by Sample and click OK.
  2. From the Statistics by Sample spreadsheet, click on the Call Rate column and select Activate by Threshold. Choose the inequality to be >= and set the threshold to be 0.95, then click OK.
  3. From the genotype spreadsheet, go to Select > Activate or Inactivate by Second Spreadsheet. Keep all defaults, select the Statistics by Sample spreadsheet, and click OK.
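
For readers who want to see the logic behind these steps outside of SVS, here is a minimal Python sketch. It is not the SVS script or its API: the function names, the dictionary layout, and the use of None for a missing genotype call are all assumptions made for illustration. It computes each sample's call rate (called genotypes divided by total markers) and keeps the samples at or above the 0.95 threshold.

```python
from typing import Dict, List, Optional

# Hypothetical layout: sample name -> list of genotype calls, one per marker.
# A missing call is represented as None (an assumption for this sketch).
GenotypeTable = Dict[str, List[Optional[str]]]


def call_rate(calls: List[Optional[str]]) -> float:
    """Fraction of markers with a non-missing genotype call."""
    if not calls:
        return 0.0
    called = sum(1 for genotype in calls if genotype is not None)
    return called / len(calls)


def filter_samples_by_call_rate(
    genotypes: GenotypeTable, threshold: float = 0.95
) -> List[str]:
    """Return the samples whose call rate is >= threshold, in input order."""
    return [
        sample
        for sample, calls in genotypes.items()
        if call_rate(calls) >= threshold
    ]


# Toy data: 20 markers per sample with 0, 1, and 2 missing calls.
example: GenotypeTable = {
    "S1": ["A_A"] * 20,
    "S2": ["A_A"] * 19 + [None],
    "S3": ["A_A"] * 18 + [None, None],
}
kept = filter_samples_by_call_rate(example, threshold=0.95)
```

In this toy data, S1 and S2 have call rates of 1.0 and 0.95 and are kept, while S3 (0.90) is dropped, which mirrors activating rows by threshold and then applying that selection to the genotype spreadsheet.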

What’s Up?
Why did this bug happen? Well, that is a long story, but the short version is that we drastically changed what output you can get when you run Genotype Statistics by Sample (a feature that is used by the Filter Samples by Call Rate function).

What changed? Now with Genotype Statistics by Sample (in SVS 7.6.5), in addition to the Call Rate and Hardy-Weinberg P-values (for autosomal chromosomes) that you could get before, you can also get:

  • Output on a per group basis if a categorical or binary dependent variable is selected
  • Heterozygosity Rates (overall, per chromosome and per group)
  • Variant Statistics if there is an applied marker map that contains a “Reference” field
    • Number of variant (non-reference) genotypes
    • Number of singletons (variant genotype present only in given sample)
    • Mean Transition/Transversion Ratio of variant genotypes
  • X Chromosome Statistics:
    • Gender inference and X statistics
  • Verbose output:
    • Count and variant statistics for each autosomal chromosome
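
To make a few of these per-sample statistics concrete, here is a rough Python sketch. It is an illustration only, not how SVS computes them: the definitions used (heterozygosity rate as heterozygous calls over called genotypes, a variant genotype as any call carrying a non-reference allele, a singleton as a variant genotype carried by exactly one sample at a marker) are assumptions, as are all names and the data layout. The manual is the authority on the exact definitions.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical layout: a genotype is a pair of alleles, or None if missing.
Genotype = Optional[Tuple[str, str]]


def heterozygosity_rate(calls: List[Genotype]) -> float:
    """Heterozygous calls divided by non-missing calls (assumed definition)."""
    called = [g for g in calls if g is not None]
    if not called:
        return 0.0
    return sum(1 for a, b in called if a != b) / len(called)


def is_variant(genotype: Genotype, ref: str) -> bool:
    """True if the call carries at least one non-reference allele."""
    return genotype is not None and any(allele != ref for allele in genotype)


def variant_counts(
    genotypes: Dict[str, List[Genotype]], reference: List[str]
) -> Dict[str, Dict[str, int]]:
    """Per-sample counts of variant genotypes and of singletons."""
    samples = list(genotypes)
    counts = {s: {"variants": 0, "singletons": 0} for s in samples}
    for i, ref in enumerate(reference):
        carriers = [s for s in samples if is_variant(genotypes[s][i], ref)]
        for s in carriers:
            counts[s]["variants"] += 1
        if len(carriers) == 1:  # variant seen in exactly one sample
            counts[carriers[0]]["singletons"] += 1
    return counts


# Toy data: three markers whose reference alleles are A, C, and G.
reference = ["A", "C", "G"]
genotypes = {
    "S1": [("A", "A"), ("C", "T"), ("G", "G")],
    "S2": [("A", "G"), ("C", "T"), None],
    "S3": [("A", "A"), ("C", "C"), ("G", "G")],
}
stats = variant_counts(genotypes, reference)
```

Here S2 carries two variant genotypes, one of which no other sample shares, so it is the only sample credited with a singleton. The reference list plays the role of the marker map’s “Reference” field mentioned above.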

This overhaul of sample statistics combined the original Genotype Statistics by Sample feature, the Autosomal Heterozygosity script, and the X Chromosome Gender Inference script, and added Variant Statistics as well! Talk about a one-stop sample statistics shop! I really can’t describe these features better than our manual does, so I’ll just point you there for more information: Genotype Statistics by Sample.

Check out these useful, new statistical tools, and please suggest any other new sample statistics that you need for your research that you would like to see incorporated into SVS!
