Fast Annotations Around the Globe: Our Sydney and Frankfurt Data Mirrors

December 3, 2019

If you have followed this blog over time, it will be no surprise that Golden Helix invests heavily in curating genomic annotations for use with our clinical and research analysis products. We often spend considerable time on the details needed to ensure the best experience with each data source: cleaning, normalizing, documenting, and then distributing it through our data annotation server. Many annotations are updated on a monthly schedule, and we are always publishing newly curated versions of sources (recently gnomAD, dbSNP, and COSMIC).

New Global Servers, Automatic Configuration

With our growing presence in Europe and East Asia, we wanted to ensure a consistently fast experience for users getting started with VarSeq and VSClinical. To that end, we have added two mirrors of our public and licensed annotation sources, one in Germany and one in Australia, as alternatives to the primary server in Northern Virginia.

No configuration change is required of users: the geographically closest server is selected automatically when VarSeq looks up the domain name data.goldenhelix.com. Any new annotation sources will be synchronized to the mirrors within 24 hours.
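To see which mirror a given machine is being directed to, you can resolve the domain name yourself and compare the result with the addresses listed in the whitelisting table later in this post. The following is a minimal sketch, not part of VarSeq: the function names are hypothetical, and the hard-coded addresses are the ones published in this post, which may change over time.

```python
import socket

# IP addresses published in the whitelisting table of this post.
# Assumption: these may change after publication, so treat them as a snapshot.
KNOWN_MIRRORS = {
    "52.206.56.216": "US",
    "3.124.185.245": "EU",
    "3.106.142.242": "AU/Asia",
}


def resolve_ipv4(hostname):
    """Return the sorted, de-duplicated IPv4 addresses the local resolver gives."""
    infos = socket.getaddrinfo(
        hostname, 443, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4 the sockaddr is an (address, port) pair.
    return sorted({info[4][0] for info in infos})


def identify_mirror(addresses, known=KNOWN_MIRRORS):
    """Map each address to its mirror label, or 'unknown' if it is not listed."""
    return {addr: known.get(addr, "unknown") for addr in addresses}
```

Calling `identify_mirror(resolve_ipv4("data.goldenhelix.com"))` from a workstation should then show which of the three locations that workstation's resolver selects. An "unknown" result most likely means the published addresses have since been updated.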

In our testing and in the last couple of weeks of user-reported experiences, these mirrors have made a dramatic real-world difference! Within the EU, we have seen downloads of tracks like the 800MB human reference sequence drop to just a few seconds. Similarly, bandwidth limitations between the US and Australia previously meant waiting overnight to get offline copies of tracks like CADD (54GB). Now such a track can be downloaded within a quick coffee break.

Improvements for VSClinical 

VSClinical requires sources that are heavily used during the local analysis to be downloaded locally, while sources that are only referenced for the current variant can be queried on demand.

Large annotation sources can be queried on demand with VSClinical

As VSClinical opens the analysis of a variant, any On-Demand sources that have not been downloaded will be loaded just for the current variant. With these new servers, the latency of these queries drops dramatically, giving the user a very quick response and loading experience.

Continued Support for Complete Offline Analysis in VarSeq and VSClinical

While many users run VarSeq from internet-connected laptops, workstations, and servers, Golden Helix has been the leader in supporting complete offline analysis and interpretation workflows in the clinical testing market. This means that once our annotation sources are downloaded, not a single query to the internet is required to complete an entire FASTQ to Clinical Report workflow following the ACMG or AMP guidelines. This also means you choose when to validate a new analysis workflow that includes updates to the latest annotation sources. For our customers and institutions with the most restrictive firewalls and security measures, we can even support setting up a private mirror of our annotation server within the institutional firewall. Please reach out to your account manager if this is of interest, or email us at info@goldenhelix.com!

Updates to Server Whitelisting for Firewall Configurations 

Finally, if your institution has explicitly whitelisted the Golden Helix servers at the firewall or internet gateway level, you may need to update that configuration with the new servers.

Server / Location                 Ports                    IP Address for Whitelisting
data.goldenhelix.com (US)         80, 443 (http, https)    52.206.56.216
data.goldenhelix.com (EU)         80, 443 (http, https)    3.124.185.245
data.goldenhelix.com (AU/Asia)    80, 443 (http, https)    3.106.142.242
update-public.goldenhelix.com     443 (https)              35.168.118.137
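After a firewall change, it can help to confirm that each listed address and port is actually reachable from inside the network. Below is a rough sketch of such a check using plain TCP connections. It is not a Golden Helix tool, the function names are hypothetical, and the endpoint list is simply copied from the table above.

```python
import socket

# Endpoints copied from the whitelisting table in this post:
# (label, IP address, ports). Assumption: still current when you run this.
ENDPOINTS = [
    ("data.goldenhelix.com (US)", "52.206.56.216", (80, 443)),
    ("data.goldenhelix.com (EU)", "3.124.185.245", (80, 443)),
    ("data.goldenhelix.com (AU/Asia)", "3.106.142.242", (80, 443)),
    ("update-public.goldenhelix.com", "35.168.118.137", (443,)),
]


def expand_targets(endpoints=ENDPOINTS):
    """Flatten the table into one (label, ip, port) tuple per port to test."""
    return [(label, ip, port) for label, ip, ports in endpoints for port in ports]


def port_is_reachable(ip, port, timeout=3.0):
    """Return True if a TCP connection to ip:port succeeds within the timeout."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False


def report(endpoints=ENDPOINTS, timeout=3.0):
    """Yield (label, ip, port, reachable) for every whitelisted target."""
    for label, ip, port in expand_targets(endpoints):
        yield label, ip, port, port_is_reachable(ip, port, timeout)
```

Iterating over `report()` from a machine behind the firewall gives one line of evidence per address and port. A successful TCP connection only shows that the port is open. It does not verify TLS or that annotation downloads will succeed through a proxy.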

Providing the best experience to our global customer base will always be our priority, and we are happy to see our global data servers already making a difference in supporting the clinical testing work of VarSeq users everywhere. 
