Scientists have learned that DNA can tell us a great deal about our risk factors for certain diseases. Since the sequencing of the human genome was completed in 2003, they have been able to figure out which mutations will make us vulnerable to, say, cancer or hair loss. As computing power has increased, they have been able to sequence the 19,599 genes in individual patients to provide a DNA map that helps drug companies target specific mutations. Herceptin, for example, targets patients who have what are known as HER2-positive genes, which increase the output of growth factors that drive breast cancer.
Although genetic information continues to improve medicine, there is data about the human body that goes one step beyond the genome. Genes by themselves don’t directly make us who we are. Instead, they produce proteins, which are dispatched into the body to execute the genetic will. If the genes are the blueprints, the proteins are the working parts, controlling every cell in your body. And just as the genes collectively make up the genome and have given rise to the science of genomics, so too do all your body’s proteins make up your proteome, which has its corresponding discipline: proteomics.
Proteomics is a far more complex field than genomics: it studies how proteins are structured and expressed, how they change and how they communicate. Tying genome sequencing to proteome sequencing adds billions of data points across millions of patients. That’s both good and bad.
With a fire hose of information that big, you can develop better drugs and look for better biomarkers: anything in a patient’s blood, urine or saliva—from proteins to enzymes to red-cell count—that indicates the presence of a disease. But fire hoses are hard to handle, and that’s where Big Data comes in.
The combination of massive computer power and sophisticated algorithms that can manage staggeringly complex problems—from predicting precisely where a tornado will touch down to making your Web search more efficient—is the next great wave of data processing. Companies like Roche, Illumina, Life Technologies, Pronota and Proteome Sciences are expanding their bioinformatics platforms to develop new diagnostics and new drugs based on them. Sometimes a diagnostic and a drug are developed in tandem, a model known as Dx/Rx. These new proteomic-derived agents are designed to target everything from sepsis to Alzheimer’s disease to cancer and offer the opportunity to deliver bespoke medicine, tailored to your molecular structure. It’s Savile Row biology.
Proteomics is, in some ways, a massive pattern-matching process. It works like this: Take 100 people who have lung cancer and 100 people who don’t. What is the difference in their genomic and proteomic profiles? Identify the specific proteins that signal the cancer cells to grow, and those pathways can be switched off with targeted drugs. If you are one of the unlucky 100 who have lung cancer, this kind of Big Data crunching can let doctors search proteomic data, compare it against your genome, which has all your personal mutations, and create a treatment map. The same thing will go for each of the other 99, all of whom have the same disease as you but all of whom might have arrived there by a slightly different genomic and proteomic route. “This global profiling of signal pathways will transform how we deal with cancer,” says Ian Pike, chief operating officer of Proteome Sciences, a 20-year-old company based in Cobham, England.
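For readers who want to see the pattern-matching idea in miniature, the comparison of patients and healthy people can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the method used by any company named here: the function name `rank_proteins`, the protein labels and the abundance values are all invented, and the score is a simple standardized difference in group means, one common way to flag measurements that differ between two groups.

```python
from statistics import mean, stdev


def rank_proteins(cases, controls):
    """Rank proteins by how strongly their levels differ between two groups.

    cases / controls: dict mapping a protein name to a list of measured
    abundance values, one value per person (hypothetical input format).
    Returns (protein, score) pairs, largest absolute score first.
    """
    scores = {}
    for protein in cases:
        sick, healthy = cases[protein], controls[protein]
        # Pooled standard deviation of the two groups (equal group sizes assumed).
        pooled = ((stdev(sick) ** 2 + stdev(healthy) ** 2) / 2) ** 0.5
        # Standardized mean difference; 0.0 if there is no spread at all.
        scores[protein] = (mean(sick) - mean(healthy)) / pooled if pooled else 0.0
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Made-up abundance values for two made-up proteins, four people per group.
cases = {
    "protein_A": [5.1, 5.3, 4.9, 5.2],
    "protein_B": [2.0, 2.1, 1.9, 2.0],
}
controls = {
    "protein_A": [3.0, 3.2, 2.9, 3.1],
    "protein_B": [2.0, 1.9, 2.1, 2.0],
}

ranked = rank_proteins(cases, controls)
for protein, score in ranked:
    print(f"{protein}: {score:.2f}")
```

In this invented data set, protein_A is clearly elevated in the patient group and rises to the top of the list, while protein_B looks the same in both groups. A real study would cover thousands of proteins and far more people, and would need formal statistical tests with correction for multiple comparisons before any protein could be called a marker.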
But that kind of data crunching plays out on a scale that makes the genome project seem like a math quiz. “What people haven’t appreciated is that the genome is not so dynamic. It tells you your likelihood of getting disease, not whether you actually have it,” says Christopher Pearce, Proteome Sciences’ CEO. You are born with one set of genes, in other words. Proteins are in a constant state of flux.
Laying Down a Marker
The field in proteomics currently attracting a lot of investment is biomarkers, which can predict with greater accuracy who is susceptible to a particular disease, help doctors diagnose and treat it earlier and track whether those treatments are working. “The next 10 years will dwarf the previous 60” in terms of what advanced sequencing can produce, says Ronnie Andrews, head of medical sciences at Life Technologies, which designs bioinformatics software platforms.
The market for biomarkers alone was about $13.5 billion in 2010, according to BCC Research, and could surpass $33 billion by 2015. Life Technologies, located in Carlsbad, Calif., is a $3.8 billion company that supplies scientists with instrumentation for gene synthesis, cell lines and more for use in genomic medicine and molecular diagnostics. Its proteomics portfolio helped make it attractive to Thermo Fisher Scientific, which is acquiring the company for $13.6 billion.