In an era when cloud-based solutions are the default for the modern office, it may not be obvious why many laboratories and testing centers choose to host their data and analysis pipelines on-premises or on self-managed cloud services. In the recent webcast Evaluating Cloud vs On-Premises for NGS Clinical Workflows, I explored how to make infrastructure decisions when planning to scale the volume of exomes and genomes a laboratory handles.
The webcast covered the different types of analytical workloads required to process NGS samples, and explained why the bursty nature of secondary analysis and of the automated portion of tertiary analysis calls for an evaluation of both on-premises and cloud scenarios for work orchestration and data management.
At the end of the webcast, a number of questions were asked that we did not have time to answer.
Q: Can VarSeq handle DRAGEN output?
Yes, VarSeq imports VCF files generated by any variant-calling secondary analysis pipeline. We have specifically tested and confirmed compatibility with the small variant and CNV calling outputs of DRAGEN. CNV calls imported from DRAGEN can be annotated with CNV algorithms, including the CNV scoring and classification algorithm, and can also be added to VSClinical.
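For readers who want a quick look at what a caller's VCF contains before importing it, here is a minimal Python sketch. It is not part of VarSeq and does not reflect how VarSeq reads files. It assumes only the VCF convention that structural calls such as CNVs use symbolic ALT alleles written in angle brackets (for example `<DEL>` or `<DUP>`), and the sample records are made up for illustration.

```python
from collections import Counter


def classify_vcf_record(line: str) -> str:
    """Label a VCF data line by the kind of ALT allele it carries.

    Records with any symbolic ALT allele (written in angle brackets,
    e.g. <DEL> or <DUP>) are labeled "symbolic_allele"; everything else
    is labeled "small_variant". This is a rough heuristic, not a validator.
    """
    fields = line.rstrip("\n").split("\t")
    alt_alleles = fields[4].split(",")  # ALT is the fifth VCF column
    if any(a.startswith("<") and a.endswith(">") for a in alt_alleles):
        return "symbolic_allele"
    return "small_variant"


def summarize_vcf(lines) -> Counter:
    """Count record kinds, skipping header lines and blank lines."""
    counts = Counter()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        counts[classify_vcf_record(line)] += 1
    return counts


# Illustrative records only; positions and values are invented.
SAMPLE_VCF = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t10500\t.\tA\tG\t50\tPASS\t.",
    "chr2\t20500\t.\tCT\tC\t60\tPASS\t.",
    "chr3\t30500\t.\tN\t<DEL>\t.\tPASS\tSVTYPE=CNV;END=45000",
]

summary = summarize_vcf(SAMPLE_VCF)
print(dict(summary))
```

To run it on a real file, pass an open file handle to `summarize_vcf` instead of the sample list (compressed files would need `gzip.open` in text mode).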
Q: We are looking at TSO-500 for our cancer test. How long does it take to process a single sample?
The VSClinical AMP guidelines support clinical workflows for any NGS cancer panel, including the new comprehensive genomic profiling panels such as TSO-500. VSClinical allows adding not only small variants but also CNVs and fusions such as those called by these comprehensive genomic profiling tests. Our flexible report feature supports integrating all of these variant types, as well as genomic signatures such as TMB and MSI status, into a single clinical report. These tests generally come with pre-built bioinformatics pipelines specialized to the specific kit, so the processing time for a sample is mostly determined by those pipelines. The remaining hands-on interpretation time in VSClinical is made more efficient by our automatic somatic variant classification as well as the auto-matching of interpretations at the gene and biomarker level from the Golden Helix CancerKB database.
Stay tuned for upcoming announcements of more capabilities for these tests later this year.
Q: Does this work with the Microsoft cloud?
The Microsoft Azure cloud has all the capabilities needed for scaling bursty analysis pipelines, including the ability to spin up Linux machines programmatically using a job orchestration layer. What is required when using public cloud infrastructure, whether it’s Microsoft Azure, Amazon AWS, or Google Cloud, is a security engineering focus on ensuring there is no public access to the cloud-hosted NGS data or to the virtual machine instances that access the data. This usually requires configuring a Virtual Private Cloud (VPC) network that makes the cloud machines appear to be connected to the local network.
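As a rough illustration of the "no public access" requirement, the Python sketch below shows the kind of pre-launch check a job orchestration layer could run before starting worker machines. It does not call any cloud provider's API. The `VmSpec` class, the `find_exposure_problems` function, the machine names, and the address range are all hypothetical and stand in for whatever your orchestration layer actually records about each machine.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import List, Optional


@dataclass
class VmSpec:
    """Hypothetical description of a worker VM about to be launched."""
    name: str
    private_ip: str
    public_ip: Optional[str] = None  # None means no public address requested


def find_exposure_problems(vms: List[VmSpec], private_cidr: str) -> List[str]:
    """Return human-readable problems; an empty list means the plan looks private.

    Two checks are made per VM: it must not request a public IP, and its
    private IP must fall inside the private network range given in CIDR form.
    """
    network = ip_network(private_cidr)
    problems = []
    for vm in vms:
        if vm.public_ip is not None:
            problems.append(f"{vm.name}: has public IP {vm.public_ip}")
        if ip_address(vm.private_ip) not in network:
            problems.append(
                f"{vm.name}: private IP {vm.private_ip} is outside {network}"
            )
    return problems


# Illustrative plan: names and addresses are invented for this example.
planned_vms = [
    VmSpec("worker-1", "10.20.0.5"),
    VmSpec("worker-2", "10.20.0.6", public_ip="203.0.113.9"),
    VmSpec("worker-3", "192.168.1.4"),
]

problems = find_exposure_problems(planned_vms, "10.20.0.0/16")
for problem in problems:
    print(problem)
```

A check like this only catches mistakes in the launch plan. Storage access policies and network firewall rules still need to be reviewed in the cloud provider's own tooling.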
If you are interested in this topic and will be attending ESHG in June, please come talk to us at our booth or attend our break-out session on a similar topic. We will, of course, have our famous t-shirts with us and are ready to answer all your questions regarding NGS data analysis. Come visit us at Booth 476, Floor X5, in Vienna this June, and be sure to register for our Corporate Satellite Talk.