Lander’s Lessons Ten Years after the Human Genome Project
By Kevin Davies
November 3, 2010 | WASHINGTON, DC – If anyone was capable of distilling the lessons learned in the ten years since the first draft of the Human Genome Project (HGP) in 2000, it was Broad Institute director Eric Lander.
Opening the annual American Society of Human Genetics (ASHG) convention in Washington, D.C., Tuesday evening, Lander tried to meet the organizers’ challenge to sum up “what’s come of it?”
From a technical perspective, the HGP produced “a scaffold onto which information can be put,” said Lander, including cancer genes, epigenomics, evolutionary selection, disease association, 3-D folding maps, and much more. As for intellectual advances, Lander made a series of startling comparisons of geneticists’ knowledge around the time of the HGP in 2000 and today.
In 2000, for example, only four eukaryotic genomes (yeast, fly, worm, and Arabidopsis) had been sequenced, as well as a few dozen bacteria. Today, those numbers stand at 250 eukaryotic genomes, 4,000 bacteria and viruses, metagenomic projects and many hundreds of human genomes. By the end of this year, Lander expects the Broad Institute to have generated 100,000 Gigabases (Gb) of sequence.
“The cost [of sequencing] has fallen 100,000 fold in past decade, vastly faster than Moore’s Law,” said Lander. But the question remained: “How will this get used in clinical medicine? The costs need to drop to $1,000 and then $100,” said Lander.
“I no longer think these things are crazy.”
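To put the “vastly faster than Moore’s Law” comparison in numbers, here is a rough back-of-the-envelope sketch. The 100,000-fold figure and the ten-year span come from Lander’s remarks; the 18- and 24-month doubling periods are assumptions (two commonly cited readings of Moore’s Law), and the function names are purely illustrative.

```python
import math


def moores_law_fold(years: float, doubling_years: float) -> float:
    """Fold improvement after `years` if capacity doubles every `doubling_years`."""
    return 2 ** (years / doubling_years)


def implied_halving_months(fold: float, years: float) -> float:
    """Months per halving needed to achieve a `fold` reduction over `years`."""
    return years * 12 / math.log2(fold)


decade = 10
sequencing_fold = 100_000  # figure quoted by Lander

# Assumed doubling periods for Moore's Law: 24 months and 18 months.
print(moores_law_fold(decade, 2.0))   # 32.0
print(moores_law_fold(decade, 1.5))   # roughly 100-fold

# Halving period implied by a 100,000-fold drop in ten years.
print(implied_halving_months(sequencing_fold, decade))  # a little over 7 months
```

Under these assumptions, a Moore’s Law pace would deliver something like a 30- to 100-fold improvement in a decade, while the quoted drop in sequencing cost implies a halving roughly every seven months.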
In 2000, Lander and his HGP consortium colleagues estimated there were about 35,000 protein-coding genes, with a few classical non-coding RNAs. Repetitive DNA elements called transposons were regarded as just parasites and junk.
“Today, we know all that was completely wrong,” said Lander.
Based on patterns of evolutionary conservation in some 40 sequenced vertebrates, the human gene count is “21,000, give or take 1,000,” said Lander. “There are many fewer genes than we thought. Much more information is non-coding than we thought . . . 75% of the information that evolution cares about is non-coding information.”
The study of 29 mammalian genomes shows some 3 million conserved non-coding elements in the genome, covering about 4.7% of the genome. Some of these have regulatory functions, he said. Another exciting area was the generation of genome-wide 3-D maps, which has revealed that the genome resides in ‘open’ and ‘closed’ compartments. There was much more work to be done in the coming decade, but with new next-generation sequencing tools, “it will happen.”
Mendel Redux
In 2000, the genes for about 1,300 Mendelian genetic disorders had been identified. Today, that number is about 2,900, leaving “another 1,800 Mendelian disorders to go,” said Lander. He noted the success of some whole-genome sequencing projects in identifying rare Mendelian disease genes, although the approach was not trivial. “We all have about 150 rare coding variants,” he said, in other words, glitches in about 1% of a person’s genes. Those have to be carefully vetted and filtered, but in the case of recessive genes or a small number of patients, the whole-genome approach was very powerful.
Lander also addressed the progress in genome-wide association studies (GWAS) for common inherited disease, where he said “an entire village came together” to develop the array tools, haplotype maps, and a catalogue of more than 20 million single nucleotide polymorphisms (SNPs). “The vast majority of common variation is known,” said Lander. The tally stands at 1,100 loci associated with 165 common diseases and traits. For diseases such as inflammatory bowel disease and Crohn’s disease, 70-100 loci have been mapped, a pattern that Lander showed exists for lipid disorders, type 2 diabetes, height, and many other conditions.
Lander addressed the oft-publicized disappointment and criticisms expressed by some prominent geneticists, including ASHG president-elect Mary-Claire King, over the “missing heritability” and the net value extracted from GWAS papers. One widely voiced concern is that the effect size of individual GWAS “hits” is small. “I think that’s nonsense,” said Lander. “Effect size has nothing to do with biological or medical utility.” He pointed out that a drug acting on a target can have a much bigger effect than the effect of the common allele.
Some geneticists believe that the “missing heritability” so far untapped by GWAS must be explained by rare DNA variants. Not so fast, said Lander. For one thing, the proportion of heritability explained in disorders such as Crohn’s and diabetes is increasing. Population genetics theory suggests that for many common diseases, rare variants will explain less than common variants.
Lander also said that geneticists must take into account epistasis, the effects of modifier genes. Such effects cannot be found statistically in GWAS, he argued. Rather than moving from mapped loci to explaining heritability to understanding biology, Lander said we must understand biology first, and then explain the models of heritability.
Cancer Conclusion
In 2000, Lander said, some 80 cancer-related genes were known. The tally is now 240 genes, with genome sequencing studies revealing mutational hotspots in colon, lung, and skin cancers with therapeutic implications. As an example, Lander said his Broad Institute colleague Todd Golub, studying multiple myeloma tumors, had discovered mutations in four well-known cancer genes, but more excitingly, implicated a handful of new biological pathways, including protein synthesis and an extrinsic coagulation pathway.
The battle against cancer needed more sequencing. “We’ll need the equivalent of the 1 million genomes project. We better start thinking how to engage patients,” said Lander, suggesting social networking and other ideas had to be leveraged to get patients involved.
Lander concluded by presenting what he called “the path to the promise.” If the HGP provided the raw tools, scientists were still translating basic genome discoveries into more medically directed research. That’s how far we’ve progressed in ten years. But that still leaves the daunting tasks of clinical interventions, clinical testing, regulatory approval and widespread adoption.