How SVS Treats Gender in Calculating Genotype Statistics

         May 9, 2012

Recently several customers have asked how SNP & Variation Suite (SVS) treats gender when calculating genotype statistics. In this blog post, I will cover SVS’ current capabilities, what we have available through Python scripts, and what is coming in the near future. We thank all of our customers who have inquired about these capabilities and have given us valuable feedback for improvements.

Currently in the software…
SVS does not adjust any statistics for gender on non-autosomal markers, and that includes Hardy-Weinberg Equilibrium (HWE) calculations. SVS does warn against using non-autosomal markers for PCA and for filtering: when you launch either function on a marker-mapped spreadsheet that contains active non-autosomal markers, a warning message appears.

However, there are no such warning messages when calculating other marker statistics, such as HWE, or when running association tests (both Genotypic and Numeric). The software expects all markers to be diploid. Because this is not true for the sex chromosomes in males (even though calling algorithms represent them as diploid), we advocate filtering non-autosomal markers prior to any downstream analysis, as outlined in the first step of our SNP GWAS tutorial. This includes inactivating markers from the Y chromosome (and from the X chromosome, at the discretion of the user). The SVS manual also has the formulas used for all of the marker statistics and association tests if you ever need more information!
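To make the filtering step concrete, here is a minimal sketch in plain Python. It does not use the SVS scripting API; the `marker_map` dictionary, the `split_markers` helper, and the assumption that autosomes are the human chromosomes 1 through 22 are all hypothetical and only illustrate the idea of separating autosomal from non-autosomal markers before analysis.

```python
# Assumption: human data, so the autosomes are chromosomes 1-22.
AUTOSOMES = {str(n) for n in range(1, 23)}


def split_markers(marker_map):
    """Split a {marker_name: chromosome_label} dict into two lists:
    (autosomal_markers, non_autosomal_markers), preserving input order."""
    autosomal, non_autosomal = [], []
    for name, chrom in marker_map.items():
        label = str(chrom).strip()
        # Accept both "X" and "chrX" style labels.
        if label.lower().startswith("chr"):
            label = label[3:]
        if label in AUTOSOMES:
            autosomal.append(name)
        else:
            non_autosomal.append(name)
    return autosomal, non_autosomal


# Hypothetical marker map for illustration only.
marker_map = {"rs1": "1", "rs2": "chrX", "rs3": "22", "rs4": "Y", "rs5": "MT"}
keep, drop = split_markers(marker_map)
```

In this example `keep` holds the markers you would leave active and `drop` holds the ones you would inactivate (or analyze separately with a method that accounts for gender).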

Now available through add-on scripts…
Recently, my colleague, Autumn Laughbaum, wrote a script (Recode Genotypes with X Chromosome Adjustment) to recode the genotypes into an additive model adjusting for male subjects on the X chromosome. (See her blog post about this for more information: New Features in SVS: Accounting for Sex Chromosomes and Filter Columns by Variant Type.) With the data recoded, numeric analysis methods or regression analysis can be used to analyze the data.
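For readers who want a feel for what such a recoding looks like, here is a rough sketch. It is not the add-on script itself, and the script's actual encoding may differ. The `recode_additive` function, the `"A/G"` genotype string format, and the choice to code hemizygous males as 0 or 2 (one common convention, adjustable through `male_hemizygous_value`) are assumptions made for illustration.

```python
def recode_additive(genotype, minor, male_on_x=False, male_hemizygous_value=2):
    """Recode a genotype string such as "A/G" as a minor-allele count.

    Females and autosomal markers get the usual 0/1/2 additive coding.
    For a male on the X chromosome (male_on_x=True) the call is treated
    as hemizygous: 0 if he carries the major allele, otherwise
    male_hemizygous_value (assumed default 2; some analyses prefer 1).
    Missing genotypes (None) stay missing.
    """
    if genotype is None:
        return None
    alleles = genotype.split("/")
    if len(alleles) != 2:
        raise ValueError(f"expected two alleles, got {genotype!r}")
    count = alleles.count(minor)
    if male_on_x:
        if alleles[0] != alleles[1]:
            # A heterozygous call in a hemizygous male is most likely a
            # genotyping error, so this sketch treats it as missing.
            return None
        return male_hemizygous_value if count else 0
    return count
```

For example, `recode_additive("G/G", "G", male_on_x=True)` gives 2 under the default convention, while passing `male_hemizygous_value=1` gives 1. Which convention is appropriate depends on assumptions about X inactivation, so treat the default here as a placeholder rather than a recommendation.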

In the next SVS bug fix release (7.6.5)…
We will add more warning messages to remind users not to use autosomal statistics for X chromosome markers. These will be temporary messages until the next SVS feature release.

Coming soon…
This summer, SVS 7.7 will include X chromosome adjustments! Fundamentally, these techniques for X chromosome adjustment require two things: a gender specification for each sample and an understanding of which chromosomes are hemizygous for males. Once those pieces of information are available, the primary allele counting code can take advantage of them, and all derived statistics and numeric encodings of genotypes can be properly adjusted. In SVS 7.7, we plan to use the per-project genome build to distinguish autosomes from sex chromosomes and thus, by default, provide X chromosome adjustments for all statistics, including HWE.
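The allele-counting idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the SVS 7.7 implementation: the `count_alleles` function, the `"M"`/`"F"` sex codes, and the tuple genotype format are hypothetical, and the sketch ignores pseudoautosomal regions, where males are in fact diploid.

```python
from collections import Counter


def count_alleles(genotypes, sexes, chrom, hemizygous_in_males=("X", "Y")):
    """Count alleles at one marker, letting males contribute a single
    allele on chromosomes where they are hemizygous.

    genotypes: list of (allele, allele) tuples, or None for missing calls
    sexes:     list of "M" / "F" codes, one per sample (assumed encoding)
    chrom:     chromosome label of the marker, e.g. "1" or "X"
    """
    counts = Counter()
    for genotype, sex in zip(genotypes, sexes):
        if genotype is None:
            continue
        first, second = genotype
        if sex == "M" and chrom in hemizygous_in_males:
            # Callers report hemizygous males as homozygous diploids,
            # so count the allele once. Heterozygous male calls are
            # skipped here as probable genotyping errors.
            if first == second:
                counts[first] += 1
        else:
            counts[first] += 1
            counts[second] += 1
    return counts


# Hypothetical data: two females, two males, one missing call.
genotypes = [("A", "A"), ("A", "G"), ("G", "G"), ("A", "A"), None]
sexes = ["F", "F", "M", "M", "F"]
x_counts = count_alleles(genotypes, sexes, "X")
autosomal_counts = count_alleles(genotypes, sexes, "1")
```

With the same calls, the X chromosome count gives A=4 and G=2, while the unadjusted diploid count gives A=5 and G=3. Allele frequencies, and anything derived from them, shift accordingly, which is why the adjustment has to happen at the counting step.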

Of course, adjusting for gender requires having a gender classification for samples. We have traditionally not placed restrictions on the input dataset for marker statistics and tests other than the dependent variable used for association testing. And that should remain the case in the future. However, we will do our best to detect a phenotypic column encoding Gender (and provide documentation on how to encode this column). If a gender column is not detected, we will present friendly warnings to remind the user that they are using formulas designed for autosomal diploid markers on their data.
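As a purely hypothetical illustration of what detecting a gender column might involve, the sketch below looks for a column named "gender" or "sex" whose values all fall within a small set of recognized labels, and warns otherwise. The column names, the accepted labels, and the `find_gender_column` helper are assumptions; the encoding SVS will actually expect is not specified here and will be covered in the documentation.

```python
import warnings

# Assumed labels, for illustration only.
MALE_LABELS = {"m", "male"}
FEMALE_LABELS = {"f", "female"}
CANDIDATE_NAMES = {"gender", "sex"}


def find_gender_column(columns):
    """Return the name of the first column that looks like a gender
    column, or None (with a warning) if no such column is found.

    columns: dict mapping column name -> list of values
    """
    for name, values in columns.items():
        if name.strip().lower() not in CANDIDATE_NAMES:
            continue
        # Ignore missing entries, then require every remaining value
        # to be a recognized male or female label.
        observed = {str(v).strip().lower() for v in values if v not in (None, "")}
        if observed and observed <= MALE_LABELS | FEMALE_LABELS:
            return name
    warnings.warn(
        "No gender column detected; statistics will assume "
        "autosomal diploid markers."
    )
    return None
```

A column that mixes recognized labels with anything else (say, "unknown") is deliberately rejected in this sketch, on the theory that a warning is safer than silently guessing.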

We are very excited to have these new features in our software in a way that benefits the analysis of all species types and use cases. Although gender and sex chromosomes are well defined in mammalian species, we are also soliciting feedback on if and when such gender adjustment makes sense for plant genome analysis (plant genomes are currently treated as entirely autosomal).

In closing…
Of course there are other ways to analyze non-autosomal data. If any of these methods would be useful to you, let us know and we can look into adding this capability either via a Python add-on script or directly in the software.

If you have any further questions regarding how SVS treats non-autosomal chromosomes or the changes coming to SVS, please do not hesitate to contact the support team!

…And that’s my 2 SNPs.

One thought on “How SVS Treats Gender in Calculating Genotype Statistics”

  1. Robert Kleta

    Yes, we surely could use that. X is as important as the autosomes.
    See NEJM 2011 (Stanescu H et al.), where we used SVS and could not analyse X.
    Best, Robert
