When the human genome was sequenced a decade ago, scientists hailed the feat as a technical tour de force -- but they also knew it was just a start. Our DNA blueprint was finally laid bare, but no one knew what it all meant.
Now an international team -- including researchers from labs at Yale University in New Haven -- has taken the crucial next step by delivering the first in-depth report on what the endless loops and lengths of DNA inside our cells are up to.
"It is like opening a wiring closet and seeing a hairball of wires," Mark Gerstein, an ENCODE researcher from Yale, told the New York Times. "We tried to unravel this hairball and make it interpretable."
The findings, reported in a slew of papers Wednesday in the journals Nature, Science and other publications, move far beyond a straightforward list of genes. They tally, in a super-complicated catalog, all the places along our DNA strands that are biochemically active -- sites where proteins attach to DNA to control it, or where enzymes move in and make little alterations, and more besides.
Defining this hive of activity is essential, scientists said, because it transforms our picture of the human blueprint from a static list of 3 billion pairs of DNA building blocks into the dynamic master-regulator that it is.
The revelations will be key to understanding how genes are precisely controlled so that they leap into action at the right place and time in our bodies, allowing a whole, healthy human being to develop from a single fertilized egg. In addition, they will help explain how the carefully choreographed process can go awry, causing birth defects, diseases and aging.
"The human genome was a bit like getting 'War and Peace' in Russian: It's a great book containing all of human experience, but (if) I don't know any Russian it's very hard to read," said Ewan Birney, a computational biologist at the European Bioinformatics Institute in England and lead analysis coordinator for the project, which is known as ENCODE. The aim, he said, "is to take the human genome and try to make a usable translation."
The $123 million effort involved more than 400 scientists and more than 1,600 experiments during five years of work. The goal was to take the babel produced by the human genome project -- the sequence of 3.2 billion chemical "bases" or "letters" that constitute the human genome -- and make sense of it.
"We understood the meaning of only a small percentage of the genome's letters," said Dr. Eric Green, director of the National Human Genome Research Institute, which paid for the bulk of the study.
The best-known elements in the genome are the 21,000 or so genes that specify what proteins a cell makes. The dopamine gene makes dopamine in brain cells, for instance, and the insulin gene makes insulin in the pancreas.
Only about 1 percent of the genome codes for proteins, however, and the challenge has been to figure out the function of the other 99 percent, which for years was termed "junk DNA" because it did not code for proteins.
The ENCODE scientists are biology's version of the Occupy movement, said Yale's Gerstein, of the xxx lab, who led one of the ENCODE teams: "For years everyone focused on the 1 percent. ENCODE looks at the 99 percent."
In examining the overlooked part of the genome, the ENCODE scientists discovered that about 80 percent of the DNA once dismissed as junk performs a biological function. Primarily, the not-so-junky DNA constitutes the most sophisticated control panel this side of NASA's, with some 4 million bits of DNA controlling all the rest.
"The 'junk' DNA, the 99 percent, is actually in charge of running the genes," said Gerstein.
That's because "transcription factors" and other regulatory elements -- proteins made by this controlling DNA -- hopscotch across each cell's double helix, binding to it in a way that turns genes on and off or up and down like a toddler who has just discovered light switches and dimmer dials.
"We draw DNA out as this long, linear thing where you can read from one end to the other, but the reality in the cell is that molecule is folded tightly and compactly and jammed into the nucleus of the cell," said molecular geneticist Joseph Ecker of the Salk Institute for Biological Studies in La Jolla, who was not involved in ENCODE but wrote a commentary accompanying the report. When our DNA is crunched up that way, like a hairball, places far apart on a strand could end up very close to each other in physical space.
While exciting, ENCODE is still just a start, Green emphasized. He likened the findings to "grainy images being beamed back to Earth by the first satellite."
Reporting by Sharon Begley of Reuters news service and Los Angeles Times reporter Rosie Mestel.
Nature is making all of the ENCODE research freely available, at http://www.nature.com/encode and through an iPad app.