Heidi Rehm on Variant Validation, Gene Validation, and Data Sharing

April 12, 2016

By Jennifer Kennedy 

April 12, 2016 | Making the case for industry-led standards and more open and transparent data sharing at the opening plenary session of last week’s Bio-IT World Conference and Expo in Boston, Heidi Rehm, Ph.D., director at Partners HealthCare Laboratory for Molecular Medicine and medical director at the Broad Institute Clinical Research Sequencing Platform, discussed the growing need for improved clarity and consistency of genetic data for clinical and research purposes.

“In order to improve our knowledge of DNA variation and consistency in variant classification, it will require a massive effort in data sharing,” Rehm said. Likening the idea to crowdsourcing—or building on a community-driven approach—Rehm presented a summary of national and international efforts to develop better standards for increased data sharing.

In genetic testing for diagnostics, a chief challenge is the process of identifying variants, in particular those of unknown significance. ClinVar, the public variant database that ClinGen builds on, aggregates interpretations of variants in one place, drawing on submissions from researchers, clinics, labs, expert groups, and patients, as well as direct links with other database systems.

In ClinVar today, only 11% of variants have two or more submitters; of those, 17% are interpreted differently. Rehm noted the differences in interpretation can be medically significant, meaning a patient’s treatment may be different based on the interpretation. Without more data sharing in a consistent, rigorous, and open manner, “there are very serious consequences,” Rehm said. “The world is watching and patients’ lives are at stake.”
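Read together, the two percentages imply that just under 2% of all variants in ClinVar carried conflicting interpretations at the time. The short Python sketch below works through that arithmetic; the total variant count is a hypothetical placeholder chosen for illustration, not a figure from the talk.

```python
# Shares quoted in the talk.
MULTI_SUBMITTER_SHARE = 0.11   # variants with two or more submitters
CONFLICT_SHARE = 0.17          # of those, interpreted differently


def conflicting_share(multi_submitter_share, conflict_share):
    """Fraction of ALL variants that have conflicting interpretations."""
    return multi_submitter_share * conflict_share


share = conflicting_share(MULTI_SUBMITTER_SHARE, CONFLICT_SHARE)
print(f"Share of all variants in conflict: {share:.2%}")

# Hypothetical database size, used only to make the share concrete.
hypothetical_total_variants = 100_000
estimated_conflicts = round(hypothetical_total_variants * share)
print(f"Out of {hypothetical_total_variants:,} variants: about {estimated_conflicts:,}")
```

The small overall share reflects how few variants had more than one submitter, which is the gap Rehm argued broader data sharing would close.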

Citing journal articles, a recent lawsuit related to the appropriate interpretation of genetic variants, and the potential for regulation by the U.S. Food and Drug Administration (FDA), Rehm stressed that the genomics community needs to come together and lead the way in developing its own standards to ensure safe and effective use of genetic and genomic medicine. But, she cautioned, it will require a lot of work.

One such project making headway is ClinGen, which is creating standards within the interpretive process for genetics and genomics. Rehm said it starts with sharing data; asking critical questions about the data and whether the information is actionable for the patient; and then documenting the answers in curated genomic knowledge bases, like ClinVar. “It’s creating transparency about the way we’re interpreting variants and the evidence we’re using to do so,” Rehm said. Tools like this, along with the Variant Explorer website, which organizes conflicting variant interpretations, help differentiate who said what about a variant, show the dates on which assertions were made, and identify the evidence each lab used to justify its classification.

Rehm also outlined initial steps taken by the American College of Medical Genetics and Genomics (ACMG) to establish standard language and guidelines for variant interpretations. Through a series of workgroups with members of the community, ACMG identified five terms to simplify classification of gene variants found through sequencing: Pathogenic; Likely pathogenic; Uncertain significance; Likely benign; and Benign. Alongside these terms, a set of rules was established to help combine criteria and weigh evidence for each classification.
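To make the idea of comparing interpretations concrete, here is a minimal Python sketch that models the five ACMG terms and flags variants on which two or more submitters disagree. It is an illustration only: the `Submission` record, the `find_conflicts` function, and the sample variants and labs are hypothetical, and neither ClinVar nor the Variant Explorer site is claimed to work this way. In particular, treating any difference in term as a conflict is a simplifying assumption.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from enum import Enum


class Classification(Enum):
    """The five ACMG classification terms."""
    PATHOGENIC = "Pathogenic"
    LIKELY_PATHOGENIC = "Likely pathogenic"
    UNCERTAIN_SIGNIFICANCE = "Uncertain significance"
    LIKELY_BENIGN = "Likely benign"
    BENIGN = "Benign"


@dataclass(frozen=True)
class Submission:
    """Hypothetical record: who said what about a variant, when, and why."""
    variant_id: str
    submitter: str
    classification: Classification
    asserted_on: date
    evidence: str


def find_conflicts(submissions):
    """Return variants with >= 2 submitters and more than one distinct term.

    Assumption: any difference in term counts as a conflict.
    """
    by_variant = defaultdict(list)
    for sub in submissions:
        by_variant[sub.variant_id].append(sub)

    conflicts = {}
    for variant_id, subs in by_variant.items():
        submitters = {s.submitter for s in subs}
        terms = {s.classification for s in subs}
        if len(submitters) >= 2 and len(terms) > 1:
            # Oldest assertion first, so the history is easy to read.
            conflicts[variant_id] = sorted(subs, key=lambda s: s.asserted_on)
    return conflicts


# Made-up example data, for illustration only.
submissions = [
    Submission("VAR-1", "Lab B", Classification.UNCERTAIN_SIGNIFICANCE,
               date(2015, 6, 1), "limited case data"),
    Submission("VAR-1", "Lab A", Classification.LIKELY_PATHOGENIC,
               date(2014, 3, 15), "segregation in two families"),
    Submission("VAR-2", "Lab A", Classification.BENIGN,
               date(2015, 1, 10), "high population frequency"),
    Submission("VAR-2", "Lab C", Classification.BENIGN,
               date(2015, 9, 30), "high population frequency"),
]

conflicts = find_conflicts(submissions)
for variant_id, subs in conflicts.items():
    print(variant_id)
    for s in subs:
        print(f"  {s.asserted_on} {s.submitter}: {s.classification.value} ({s.evidence})")
```

Keeping the submitter, date, and evidence on each record mirrors the transparency Rehm described: a disagreement can be traced back to who made each assertion and on what basis.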

Putting the guidelines to the test, Rehm described a “Variant Bakeoff,” done as part of the Clinical Sequencing Exploratory Research (CSER) consortium of projects, in which multiple labs interpreted the same set of variants using both their own lab rules and the ACMG rules. While the results did not show that the ACMG rules produced consistent interpretations, the rules did provide a valuable framework on which to build discussion and agreement. We learned, Rehm said, that “most differences [in variant classifications] can actually be resolvable with consensus and data sharing.”

Broad data sharing is becoming more common and is enabling greater success in the clinical application of genetic sequencing. Rehm touched on multiple initiatives in progress:

  • ClinGen Variant Curation Tool will launch this year to aid the community in applying a more standardized and objective interpretive process to help resolve discrepancies in ClinVar.
  • ClinVar Star system helps build a primary source of evidence, with submissions reviewed and rated at the 4-star “practice guideline” level or the 3-star “expert” level.
  • Global Alliance for Genomics and Health (GA4GH), which includes almost 400 members from 38 countries, develops standards for data sharing, convenes stakeholders, acts as a clearinghouse, and hosts working groups and demonstration projects to promote responsible data sharing.
  • BRCA Challenge and Research Portal is a GA4GH demonstration project to form a framework for international-level data sharing and collaboration using a gene with community-wide interest.
  • ClinGen Gene-Disease Validity Classification is a tiered standard used to systematically apply and classify gene-level evidence to different disease areas to guide the content of clinical tests.
  • Matchmaker Exchange is another GA4GH demonstration project building evidence for rare disease gene classification.


Rehm concluded by emphasizing that while participation in any data sharing effort is still largely voluntary, the community must lead the way, and that multiple stakeholders have important roles to play. Pointing to research organizations; lab accreditation organizations; hospitals, providers, and insurers; and the FDA, Rehm underscored that broad data sharing benefits the community as a whole and is necessary to move the field forward.