The Summer 2020 Release of the Human Gene Mutation Database (HGMD) Professional is available, expanding the world’s largest collection of human inherited disease mutations to 289,346 entries. That’s 6,451 more than the previous release.
For over 30 years, HGMD Professional has been used worldwide by researchers, clinicians, diagnostic laboratories and genetic counselors as an essential tool for the annotation of next-generation sequencing (NGS) data in routine clinical and translational research. Founded and maintained by the Institute of Medical Genetics at Cardiff University, HGMD Professional provides users with a unique resource containing expert-curated mutations all backed by peer-reviewed publications where there is evidence of clinical impact.
Whether you are searching for an overview of known mutations associated with a particular disease, interpreting clinical test results, looking for the likely causal mutation in a list of variants, or seeking to integrate mutation content into your custom NGS pipeline or data repository, HGMD is the de facto standard repository for heritable mutations and can be adapted to a broad range of applications.
HGMD is powered by a team of expert curators at Cardiff University. Data are collected weekly by a combination of manual and computerized search procedures. In excess of 250 journals are scanned for articles describing germline mutations causing human genetic disease. The required data are extracted from the original articles and augmented with the necessary supporting data.
The number of disease-associated germline mutations published per year has more than doubled in the past decade (Figure 1). As rare and novel genetic mutations continue to be uncovered, having access to the latest scientific evidence is critical for timely interpretations of next-generation sequencing (NGS) data.
View the complete HGMD Professional statistics here.
Read more about the importance of having access to the most up-to-date and comprehensive database for human disease mutations in our white paper.
HGMD Professional helps clinical testing labs analyze and annotate next-generation sequencing (NGS) data with current and trusted information. Unlike other mutation databases, HGMD mutations are all backed by peer-reviewed publications where there is evidence of clinical impact.
To get the most out of your HGMD Professional subscription, visit our Resources webpage for case studies, technical notes, and video tutorials.
NEW! HGMD on-demand webinar
In our new on-demand webinar, you will discover the power of HGMD Professional and see why reference labs around the world, such as LabCorp and Genomics England, use HGMD Professional in their clinical test interpretation. Through a live demonstration, you will learn how to use the online interface and downloadable database, as well as how to use HGMD Professional and ANNOVAR to produce a powerful in-house variant interpretation solution.
Watch the webinar here.
An updated version of ANNOVAR is also available.
Learn more about how ANNOVAR can be used with HGMD for variant annotation. Watch a recorded webinar featuring ANNOVAR here.
Genome Trax™ 2020.2 is now available. Updated tracks have been released with HGMD 2020.1 content for all HGMD-related tracks. Additional major updates include TRANSFAC® release 2020.2 and PROTEOME™ release 2020.2.
For labs looking to generate clinician-grade reports for germline or somatic NGS testing, QIAGEN Clinical Insight (QCI) Interpret reproducibly translates highly complex NGS data into standardized reports using current clinical evidence from the QIAGEN Knowledge Base, which consists of over 40 public and proprietary databases, including HGMD Professional.
Click here for a free demonstration of QCI Interpret.
Genetic disease is the leading cause of infant death in the United States, accounting for approximately 20% of annual infant mortality.1 Screening for genetic disease has been a long-established part of preconception and prenatal care, with a community-wide screening program for Tay-Sachs disease (TSD) dating back to the 1970s; however, traditional methods of carrier screening have been offered gene-by-gene, disorder-by-disorder.
Recent developments in laboratory technologies have led to the commercial availability of expanded carrier screening (ECS) panels capable of assessing hundreds of mutations associated with genetic diseases. ECS panels have the ability to identify mutations that would otherwise not be detected. While many of the disorders on these panels are individually rare, the overall risk of having an affected offspring is 1 in 280, which is higher than the risk of having a child with a neural tube defect, for which screening is universal.2
In 2012, one of the first DNA testing and genetic counselling companies to offer ECS in the United States launched a flagship ECS panel that used next-generation sequencing (NGS) technology to assess thousands of mutations associated with more than 175 of the most relevant recessive diseases. For cancer-focused screens, the lab developed a 36-gene panel for hereditary cancer risk assessment.2
In the first three years of offering ECS, the lab screened over 400,000 individuals.3 By 2016, the lab served a network of more than 10,000 health professionals, and demand for preconception screening was soaring, owing to the increasing public awareness of the ill effects related to the transfer of genetic disease.4 Unique to the lab's ECS offering was the company’s “real-time manual curation” to support the classification of each genetic variant they encountered. Extremely thorough and highly accurate, the lab's manual literature curation enabled the company to elevate the actionable information provided to the ordering physicians and the patients they served. However, this process was labor-intensive and costly, which was ironic given the dwindling cost of DNA sequencing and the supporting technology. The question became how to scale up without cutting corners.
Clinical decision support solutions have long been touted as the way of the future for clinical genetic testing laboratories. Combining big data analytics with advanced tools and knowledge bases, clinical decision support solutions are designed to organize, filter, and present useful information at the appropriate point in time to the person who can use it to make a decision. In 2017, the lab evaluated the use of a clinical decision support solution to help scale their genomic interpretation processes: QIAGEN Clinical Insight (QCI).*
QCI is QIAGEN’s clinical decision support solution for genetic testing laboratories. Software that reproducibly converts highly complex NGS data into clinician-ready reports, QCI is the tool through which actionable information is extracted from the sequencing results. Unlike any other clinical decision support solution on the market, QCI is largely powered by manual curation.
The knowledge base inside QCI is maintained by hundreds of Ph.D. scientists certified in clinical case curation who are committed to reading and recording all publications for a given mutation. This information is then mapped to over 2.8 million ontology classes contained within the QIAGEN Knowledge Base, providing further context by establishing relationships between variants, genes, tissue types, and pathways. When a genetic testing lab runs NGS data through QCI, the software computes the ACMG classification based on evidence curated from full-text articles and from public and private data sources. The knowledge extracted from full-text articles includes observed genes, variants, function, phenotype, drug, dose, clinical cases, etc. With all this information stored in a structured knowledge base, the QIAGEN KB can quickly retrieve the relevant evidence that triggers all 28 ACMG criteria to more accurately compute an ACMG classification. Further, this evidence is presented at the clinician’s fingertips for quick reference. Additionally, using natural language processing, the QIAGEN KB can auto-generate a one-sentence “finding” that is representative of the relevant evidence found in the published article.
This critical feature, automated curation of manually sourced content, saves genetic testing labs considerable time and effort when searching for variant-specific articles to satisfy the levels of evidence needed to definitively determine a classification. Especially for ECS, which is a testing practice that frequently encounters novel rare variants, automation is fast becoming a necessity. To accurately and robustly appraise a novel rare variant’s pathogenicity, lab personnel must manually curate multiple lines of evidence to assess clinical significance. Therefore, if the majority of this information were autogenerated, the genomic interpretation process could be economically shortened.
The lab recognized the opportunity of integrating QCI into their curation workflow and designed a study to evaluate the concordance between the clinical evidence that QCI automatically retrieves for each observed variant classification and the clinical evidence that the lab’s curation team locates and ultimately uses in the physician reports. If the results were comparable, QCI could introduce significant time and cost savings.
The lab's manual curation workflow is outlined in Figure 1. A semi-automated process, the workflow utilizes proprietary software to initially classify variants into three categories: those with high population frequency; those that have never been reported; and those needing more information before pathogenicity can be assessed. For those remaining variants, the curation team manually searches online databases, in-house article libraries, and other available resources to find variant-specific references.
Figure 1. The lab's curation workflow
The curation workflow used to determine clinical significance of variants is summarized graphically. (a) The curation process is shown in the context of the overall laboratory workflow, in which inbound samples are eventually transformed into patient reports. (b) The curation workflow contributes lines of primary evidence that are reviewed manually, which are then combined with multiple lines of autogenerated supporting evidence to assess clinical significance.
Once evidence is collected for a variant (if any is to be found), the information is then used to assess the variant’s potential pathogenicity. As recommended by the American College of Medical Genetics and Genomics (ACMG) and the Association for Molecular Pathology (AMP) published guidelines for the assessment of variants in genes associated with Mendelian diseases, the lab classifies variants following a two-step process:
First, the collected evidence is categorized into one of 28 defined criteria set forth by the ACMG-AMP guidelines and assigned a code that addresses the strength of evidence, such as population data, case-control analyses, functional data, computational predictions, allelic data, segregation studies, and de novo observations. Each code is assigned a weight (stand-alone, very strong, strong, moderate, or supporting) and direction (benign or pathogenic).
Next, the lab combines these evidence codes to arrive at one of five classifications: pathogenic (P), likely pathogenic (LP), variant of uncertain significance (VUS), likely benign (LB), or benign (B). Important in this step is the lab's ability to modify the strength of individual criteria based on expert discretion—a safeguard that goes away with computerized systems.
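The two-step process above can be sketched in code. This is a minimal illustration only: the rule table is a paraphrase of the combining rules in the published 2015 ACMG-AMP guidelines and should be checked against the guideline itself. The function name `classify` and the (direction, strength) tuples are assumptions made for this example; they are not the lab's software or QCI's implementation, and the sketch does not model the expert-discretion adjustments described above.

```python
from collections import Counter

def classify(evidence):
    """Combine (direction, strength) evidence codes into P, LP, VUS, LB, or B.

    Hypothetical sketch of step two; step one (assigning each piece of
    evidence a direction and strength) is assumed to have been done already.
    """
    counts = Counter(evidence)
    pvs = counts[("pathogenic", "very_strong")]
    ps = counts[("pathogenic", "strong")]
    pm = counts[("pathogenic", "moderate")]
    pp = counts[("pathogenic", "supporting")]
    ba = counts[("benign", "stand_alone")]
    bs = counts[("benign", "strong")]
    bp = counts[("benign", "supporting")]

    pathogenic = (
        (pvs >= 1 and (ps >= 1 or pm >= 2 or (pm >= 1 and pp >= 1) or pp >= 2))
        or ps >= 2
        or (ps >= 1 and (pm >= 3 or (pm >= 2 and pp >= 2) or (pm >= 1 and pp >= 4)))
    )
    likely_pathogenic = (
        (pvs >= 1 and pm >= 1)
        or (ps >= 1 and pm >= 1)
        or (ps >= 1 and pp >= 2)
        or pm >= 3
        or (pm >= 2 and pp >= 2)
        or (pm >= 1 and pp >= 4)
    )
    benign = ba >= 1 or bs >= 2
    likely_benign = (bs >= 1 and bp >= 1) or bp >= 2

    # Contradictory evidence, or too little evidence, yields VUS.
    if (pathogenic or likely_pathogenic) and (benign or likely_benign):
        return "VUS"
    if pathogenic:
        return "P"
    if likely_pathogenic:
        return "LP"
    if benign:
        return "B"
    if likely_benign:
        return "LB"
    return "VUS"

# One strong plus one moderate pathogenic code is not enough for P.
print(classify([("pathogenic", "strong"), ("pathogenic", "moderate")]))  # LP
```

Because the counts are fixed inputs here, the sketch also shows what a purely computerized system lacks: there is no step where a curator can raise or lower the strength of an individual criterion.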
To determine whether QCI could provide value to the lab’s curation team, the software was tasked with pulling a bibliography for 2,324 variants that had been recently detected by the lab’s ECS and hereditary cancer risk assessment panels. For each of these variants, the curation team had been able to match at least one published article with a specific disease-gene reference. QCI’s variant bibliography was expected to present the same quantity and quality of clinical evidence.
The study found that QCI’s variant bibliography was highly concordant with the lab’s manual curation efforts. Of the 2,324 unique article-variant pairs identified by the lab, QCI pulled 2,075 of the references (89.3%) and an additional 13,938 article-variant pairs not captured by the lab's curation team.
Figure 2. Overlap of bibliographic content
Figure 2 shows the overlap in content quantity between the two sources. As depicted, QCI (QIAGEN) presents significantly more data for the evaluated variants. This outcome reflects the comprehensive nature of QIAGEN’s article-centric approach, which aims to collect all publications for a given variant. While exhaustive and not always necessary, QCI’s ability to glean information from numerous sources affords the software greater accuracy in predicting variant classifications, which is seen in the second phase of the lab's evaluation.
More important than the number of bibliographic sources, accuracy of cited content ultimately dictates clinical significance. Counsyl measured the quality of QCI’s variant bibliography by looking at how the software would classify variants based on the information it pulled. The classifications were concordant for 98.8% of the pathogenic cases (Figure 2).
During the study period, a total of 682 variants were classified as pathogenic by the lab’s genetic scientists. Of these, only eight would be downgraded to VUS utilizing only QCI bibliographies. Therefore, the false negative rate for using QCI’s bibliographies was ~1.2% and is expected to decrease to <1%. Further, for a sample of 50 VUS variants examined, none would change classification with additional unique references in QCI, primarily because QCI includes secondary reports and studies for other disease contexts that may be listed as 'reviewed but not curated' in their curations.
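The percentages quoted in this section follow directly from the reported counts. The short check below reproduces them; the variable names and the `percent` helper are introduced here for illustration, and the only inputs are the figures stated in the text.

```python
# Counts quoted in the study write-up above.
lab_pairs = 2324          # article-variant pairs found by manual curation
qci_matched = 2075        # of those, pairs also retrieved by QCI
qci_extra = 13938         # pairs retrieved by QCI only
pathogenic_total = 682    # variants the lab classified as pathogenic
downgraded_to_vus = 8     # would become VUS using QCI bibliographies alone

def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

coverage = percent(qci_matched, lab_pairs)
false_negative_rate = percent(downgraded_to_vus, pathogenic_total)
concordance = percent(pathogenic_total - downgraded_to_vus, pathogenic_total)
qci_total_pairs = qci_matched + qci_extra

print(coverage, false_negative_rate, concordance, qci_total_pairs)
# 89.3 1.2 98.8 16013
```

The 98.8% concordance and the ~1.2% false negative rate are two views of the same eight variants out of 682.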
As a result of these positive findings, QCI bibliographies have been integrated into the lab’s manual curation workflow, eliminating the need for manual searches in the majority of cases. After several months, a comparison of the time taken for reference searches before and after the adoption of QCI was performed (Figure 3).
Figure 3. Before and after adopting QCI
The goal of this evaluation was to assess whether utilization of QIAGEN’s variant-specific bibliographies could match the level of accuracy and quality of the lab’s more time-intensive manual article selection approach. Investigators concluded that there are clear benefits to adopting QCI for reference identification: exceptionally high variant-specific article coverage, and significant time savings in a search process that can take up to about 45 minutes.
The results also serve to validate the efficacy of the lab’s previous article search and selection method, with the vast majority of variant classifications being unaltered by use of QIAGEN’s bibliographies. The lab now employs QCI bibliographies for every curated variant. Manual search methods are still employed at the lab, but can now be reserved for variants near the VUS/pathogenic evidence thresholds.
QCI has already proven a valuable resource for increasing the efficiency of the lab’s in-house curation. Work is underway to additionally incorporate QIAGEN’s continually updated bibliographies into the automated components of our variant classification workflows: the initial software-based auto-curation step for newly identified variants, and the identification of those requiring re-curation in response to new publications becoming available. Accordingly, we expect QCI to further contribute to the lab’s continuing efforts to improve turnaround time by increasing curation efficiency while maintaining classification accuracy in patient reports.
*Data taken from a joint study conducted by Counsyl and QIAGEN: Cox et al. ClinGen 2017. Counsyl has since been acquired.
Learn more about QIAGEN Clinical Insight here.
References
Day One kicked off with numerous informational sessions, including talks on the role of AI in clinical decision-making, the importance of standardization for reimbursement, and the tremendous potential of genomic profiling in disease prevention, diagnosis, and treatment.
Dan Richards, Vice President of Biomedical Informatics at QIAGEN, spoke about the clinician's current challenge of curating all the evidence he or she needs to confidently sign off on variant reports before they go to the prescribing physician. QIAGEN Clinical Insight (QCI) and N-of-One were featured as solutions providing options for either in-house curation with tailored workflows or on-demand curation services.
On Tuesday morning, the conversation continued with a panel hosted by Sean Scott, Chief Business Officer of Clinical Genomics and Bioinformatics at QIAGEN, that discussed the emergence and application of real-world evidence in the clinical setting, especially in precision diagnostics and clinical trial protocol design.
The panel consisted of Raju K Pillai, MD, Hematopathologist and Molecular Pathologist at City of Hope National Medical Center, James Hadfield, Director and Principal Diagnostic Scientist at AstraZeneca, and Sheryl Krevsky Elkin, Chief Scientific Officer of N-of-One.
Also on Tuesday morning, Mary Napier, Associate Director of NGS Strategy at QIAGEN, gave a timely talk on how diagnostic labs and pharma companies can gain a comprehensive understanding of tumor mutational burden signature by implementing our new QIAseq Tumor Mutational Burden panel.
What does she mean by "comprehensive"?
Find out here!
Thank you to everyone who visited the QIAGEN booth. We truly enjoyed talking with all of you about the industry challenges and the changes you see happening now and in the future.
See you at our next event:
Advances in Genome Biology and Technology (AGBT) 2019 in Marco Island, Florida!
February 27th - March 2nd
Want to know more about our clinical solutions and real-world evidence?
OncoLand is a sophisticated oncology-focused database designed to accelerate cancer research. Integrating published research and large consortium cancer datasets with robust data visualization and discovery tools, OncoLand saves you valuable time and resources in your pursuit of actionable discoveries.
Start your free trial of OncoLand today!
In the past several months, customers have performed remarkable work with CLC Genomics Workbench, a tool that offers customizable bioinformatics solutions for genomics, transcriptomics, epigenomics, and metagenomics. Many scientists rely on this tool for genome assembly and interpretation every day. Here, we look at a handful of recent publications that have been supported by CLC Genomics Workbench.
De novo transcriptome assembly and comprehensive expression profiling in Crocus sativus to gain insights into apocarotenoid biosynthesis
First author: Mukesh Jain
This paper in Nature’s Scientific Reports journal comes from scientists in New Delhi who conducted transcriptome sequencing to better understand important biological mechanisms in saffron. They identified transcription factors, differentially expressed genes, and more. The team used CLC Genomics Workbench in the transcriptome assembly and the analysis of differentially expressed genes.
Complete coding sequence of Zika virus from Martinique outbreak in 2015
First author: G. Piorkowski
Scientists in France studied a strain of the Zika virus isolated from a patient in the Caribbean. In this publication, they report the full coding sequence of the virus, an essential resource for the ongoing battle against Zika virus. They used CLC Genomics Workbench to analyze the data generated with an Ion Torrent sequencer.
Effective de novo assembly of fish genome using haploid larvae
First author: Yuki Iwasaki
Published in the journal Gene, this marine genomics study from scientists in Japan demonstrates the utility of sequencing haploid fish larvae — in this case, yellowtail fish — to obtain a diploid genome assembly. They used the Proton sequencer and compared the performance of overlap-layout-consensus (OLC) and de Bruijn graph (DBG) assemblers. CLC Genomics Workbench, which uses a DBG approach, outperformed another tool in the study.
Isolation and Characterization of a Novel Gammaherpesvirus from a Microbat Cell Line
First author: Reed S. Shabman
This study, an editor’s pick in the mSphere journal, comes from scientists at the J. Craig Venter Institute and other organizations who were analyzing the transcriptome of a microbat when they discovered RNA from a previously unidentified herpesvirus. They used CLC Genomics Workbench for assembly and finishing of the viral genome.
Use of whole-genome sequencing to trace, control and characterize the regional expansion of extended-spectrum β-lactamase producing ST15 Klebsiella pneumoniae
First author: Kai Zhou
In this Scientific Reports paper, scientists from China and the Netherlands tracked the transmission path of a Klebsiella pneumoniae strain using genome-based phylogenetic analysis. They used a mapping unit in CLC Genomics Workbench to find genes associated with antibiotic resistance and pathogen virulence, and detected SNPs by mapping isolate sequence data with the same tool.
Read more about CLC Genomics Workbench
Festival of Genomics takes place on January 19-21, 2016 in London, UK. The first day of the conference will be all about workshops, and on the following days you can attend presentations and more than 70 different sessions covering six streams of content, and of course do a lot of networking.
Need inspiration for your data analysis workflow?
Topic: QIAGEN Sample to Insight: An introduction to the QIAGEN Bioinformatics Portfolio and how we can help you with your data analysis workflow.
When: Wednesday, Jan 20, 12:50-13:00
Where: Tech Forum Stage
Speaker: Ruth Burton, PhD
You are also very welcome to come see us at booth #52 for a chat.
We're looking forward to seeing you in London!