
Spotlight on Molecular Profiling
Spotlight on molecular profiling: "Integromic" analysis of the NCI-60 cancer cell lines
Genomics and Bioinformatics Group, Laboratory of Molecular Pharmacology, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, Maryland
Requests for reprints: John N. Weinstein, Laboratory of Molecular Pharmacology, National Cancer Institute, 37 Convent Drive, Room 5056B, Bethesda, MD 20892. Phone: 301-496-9571. E-mail: weinstein{at}dtpax2.ncifcrf.gov
"Our horizon is never quite at our elbows"
Henry David Thoreau, Walden (1854)
In this issue, Molecular Cancer Therapeutics launches a brave new experiment in the publication of pharmacogenomic and pharmacoproteomic information: a series of invited, refereed articles justified by the broad interest and utility of the molecular profile databases they present, rather than by the testing of a particular biological or pharmacological hypothesis. The initial articles under the rubric "Spotlight on Molecular Profiling" focus on molecular profiling of the 60 human cancer cell lines (the NCI-60) used by the National Cancer Institute's Developmental Therapeutics Program (DTP) to screen >100,000 chemically defined compounds and natural product extracts since 1990 (1–4). In statistical and machine-learning analyses, the screening data have proved rich in information about drug mechanisms of action and resistance (5–8). The NCI-60 panel already constitutes by far the most comprehensively profiled set of cells in existence (4, 9), and much more molecular profile information on them is coming. The data have already yielded considerable biological and biomedical insight, but we have only scratched the surface thus far. The real value is realized when biomedical scientists with particular domain expertise are able to integrate and use the information fluently for hypothesis generation, hypothesis testing, and what I would term "hypothesis-enrichment." Given the large drug activity database, the NCI-60 cell line panel provides a unique opportunity for the enrichment of pharmacologic hypotheses and for advances toward the oft-cited goal of personalized medicine.
Why is there a need for a series of articles like this? The broad, generic answer is clear. For almost half a century after Watson and Crick's brainstorm, the dominant paradigm of what might be called the "pregenomic era" was hypothesis-driven, R01-funded research focused on particular molecules or processes. That paradigm served us well. But now, thanks largely to technological advances in the "post-genomic era", we have access to information on 20,000 to 25,000 genes, >100,000 splice variants of those genes, an unknown number of regulatory RNAs, and perhaps a million protein states of possible functional significance if one counts posttranslational modifications such as the phosphorylations central to cell signaling. Then there are the many types of molecules that make up what have been termed the lipidome, glycome, metabolome, epigenome, immunome, and so forth. That multiplicity constitutes a challenge and an opportunity. To meet the challenge and take advantage of the opportunity, it will be necessary to create and exploit synergies between hypothesis-driven and "omic" modes of research (10, 11). Those synergies will be particularly important as researchers try to understand system-level interactions among the molecules of what Eric Lander aptly calls "the parts list" of the cell.
Those who generate omic data (10) share a common experience: many scientists want access to the data but few are ready to pay for them in academic coin of the realm. That's a major public loss. For example, after a microarray study is done, it typically takes months to find a hypothesis-driven "story" in the data to justify publication and many more months to flesh out and validate the story with functional experiments. One consequence is delay in public availability of the data. Another is that the tail ends up wagging the dog; the data are given short shrift, and the article focuses on downstream hypothesis testing (11).
That process runs counter to the current emphasis on availability and interoperability of molecular data. Publishing standards now dictate that the data should be deposited in a public repository such as the Gene Expression Omnibus, ArrayExpress, or Center for Information Biology Gene Expression Database, and that they should meet standards of content and interoperability such as the Minimum Information About a Microarray Experiment protocols. Why, then, should criteria for publication not reflect those aims?
Most molecular databases don't deserve prominent publication in their own right, of course. The technical quality must be high, and, equally important, the data must be of more than parochial significance. The Genome Project sequence is at one end of the spectrum, important to almost every laboratory doing biological or biomedical research; sparse molecular characteristics of particular cell types are at the other end, often of value only to the investigators themselves. Molecular profiling data on the NCI-60 fall somewhere between the two extremes. The data are of interest to thousands of laboratories, both for their basic biological uses and for their connection to cancer therapeutics.
This is not the place for a full-scale review of the NCI-60 panel and its molecular profiles. But a brief summary will be useful to motivate what follows. The panel was initially assembled in the late 1980s by Michael Boyd and colleagues at the DTP, under the aegis of Division Director Bruce Chabner to provide a tissue-specific screening capability (1). Largely through pioneering analyses by the late Kenneth Paull (5), it soon developed a second personality, as a system for profiling the compounds and natural product extracts tested against it. Studies in the laboratories of Tito Fojo, Susan Bates, and Robert Shoemaker added molecular characterization of the cells with respect to MDR1 and other drug resistance transporters (12–14). Broad omic profiling of the cells had its inception in a discussion in Bruce Chabner's office. I challenged him to list the molecules he would most like to see profiled in the cells. To my astonishment, he provided a list the next week. Our opening salvo, in the mid-1990s, was a two-dimensional gel study with Leigh Anderson that produced a database of 1,014 spots indexed over all 60 cell lines. The data were integrated through clustered heat map visualizations of the type that have since then become the ever-present visual icon of post-genomic biology (15). When we submitted the article to a prominent (non-AACR) journal that shall remain nameless, it was promptly rejected without review by an editor who asked, "Where's the hypothesis?" We later published the study elsewhere (16). Numerous such experiences that we and others have had highlight the need for the current spotlight series. Four of our database-heavy publications on the NCI-60 over the last decade have thus far accumulated >400 literature citations each (15, 17–19), but all four were initially dismissed by journals, largely because they were viewed as lacking a hypothesis.
Figure 1 shows a schematic we use to organize our thinking about the NCI-60 databases (15). The assay run by DTP produces a database (A) of activities, which can be mapped into molecular structures of the compounds tested (S) or into molecular targets and other characteristics of the cells (T). If other cell or tissue types (e.g., transfectants, knock-downs, knock-outs, clinical tumors) are profiled in a compatible way, then it's possible to extrapolate the phenomenal pharmacologic characterization of the NCI-60 panel to the additional sample types without actually doing the assays in those samples. Often, the additional assays would be impossible to do, especially if materials are limited, as they generally are with clinical tumors.
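As a rough illustration of how an activity database (A) might be related to a target database (T), the sketch below correlates each compound's activity pattern with each target's level across the same set of cell lines. The text above does not say which statistic is used for the mapping; Pearson correlation is an assumption made here for illustration, and the compound names, target names, and numbers are invented toy values, not NCI-60 data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def correlate_activity_with_targets(activity, targets):
    """Return {compound: {target: r}} for profiles measured over the same cell lines.

    activity: {compound: [activity in cell line 1, cell line 2, ...]}
    targets:  {target:   [level in cell line 1, cell line 2, ...]}
    Both use the same cell-line order (hypothetical layout for this sketch).
    """
    return {
        compound: {name: pearson(profile, levels) for name, levels in targets.items()}
        for compound, profile in activity.items()
    }


# Toy data over four hypothetical cell lines (values are made up).
A = {
    "compound_1": [1.0, 2.0, 3.0, 4.0],
    "compound_2": [4.0, 3.0, 2.0, 1.0],
}
T = {
    "target_x": [2.0, 4.0, 6.0, 8.0],
    "target_y": [1.0, 1.0, 2.0, 2.0],
}

AT = correlate_activity_with_targets(A, T)
for compound, row in AT.items():
    print(compound, {name: round(r, 3) for name, r in row.items()})
```

A strongly positive or negative entry in such a table is only a correlative lead; it says nothing by itself about causation.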
The value of molecular profile inventories is central to The Cancer Genome Atlas project, which is just getting under way as a joint 3-year pilot project of the NCI and the National Human Genome Research Institute. Originally, that project was to focus on the resequencing of a large number of human tumors. It was then realized that the value would be increased enormously by the inclusion of other types of profiling, at least at the genomic, epigenomic, and transcriptomic levels. Currently, the Cancer Genome Atlas's plans are much less expansive than those in Fig. 2, but they must function in the much more difficult context of clinical tumors (lung cancer, ovarian cancer, and glioma for the pilot project). In a sense, the NCI-60 profiling enterprise can be thought of as a stalking-horse for the Cancer Genome Atlas, doing in cell lines what will be much harder to do in clinical tumors.
Everyone knows the limitations of cell lines as surrogates for clinical cancers. Even primary cultures have been removed from the influence of cytokines, hormones, three-dimensional architecture, and the community of other cell types in a tumor. Furthermore, cell lines have been adapted or selected for survival and rapid proliferation on plastic. The NCI-60, in particular, have the disadvantage (or advantage) that they represent diverse lineages, and there are only 60 of them, more than enough for some analyses but too few for others. On the other side of the ledger, the lines are homogeneous in lineage, available in unlimited numbers, manipulable (e.g., by transfection), and useful for high-throughput drug assays. Furthermore, they make it possible to step into the same stream over and over again, and the screening data have a major legacy value. Thankfully, cell lines don't tend to raise issues of informed consent, and they rarely sue for intellectual property rights. The Cancer Genome Atlas does face issues of confidentiality and intellectual property, as well as the inevitably difficult decisions about how best to use the finite, irreplaceable clinical materials. It also faces the inhomogeneity of cell type in clinical tumors, a problem that will become even more acute if it turns out that we really want information on rare stem cells in a tumor or on cells at the proliferating, invading margin, or information on well-oxygenated cells near blood vessels.
In my view, then, the cell lines should be considered, first and foremost, as instances of biology in their own right. Most of our knowledge about cell biology, physiology, and pharmacology has come from a study of the cell lines. In that context, the NCI-60 metadata and molecular profiles often prove useful when one wants to choose a parental cell type with particular characteristics for transfection or other experimental manipulations. When we're predicting toward the clinic, however, it's caveat emptor. As with any model system, there will be leads and there will be mis-leads. It's necessary to find clues that generate testable hypotheses without worrying too much about the clues that don't work out. As usual in science, one has to find a personally comfortable balance between following up the most improbable observation (which may be the most important) and following up the prosaic ones that are more likely to bear fruit.
Figure 2 includes three levels of validation studies. The first, exemplified by real-time reverse transcription-PCR, simply tests the technical accuracy of microarray data. More substantively, the second level, small interfering RNA knock-down, provides a way to turn correlative information from the NCI-60 into causal information, and the third level, use of tissue arrays, tests in real tumors the hypotheses that arise from NCI-60 data. We most often derive biomedically useful knowledge from the NCI-60 by integrating the various data types with each other, that is, by integromic analysis (4, 5), and then working back and forth iteratively between those data, the validation data, and information on clinical cancers. That process is often "seat-of-the-pants" more than it is statistical; we scramble for clues to formulate new hypotheses, we try to corroborate old ones, or we find ways in which the old ones can be enriched.
To illustrate those ways of approaching and using the data, I'll briefly mention several published instances in which we and our collaborators have made that sort of extrapolation from the NCI-60 in the context of molecular therapeutics. Because this commentary isn't intended as a comprehensive review, with apologies I won't try to do justice to the many others around the world who have made excellent use of the information.
Each of those brief descriptions indicates how partial information can be put together from multiple sources, including the NCI-60 data, for basic molecular pharmacology, drug discovery, or biomarker identification. Two further examples are published in this issue as flagship articles for the Spotlight on Molecular Profiling series:
Biology, as exemplified by the 18th century taxonomics of Linnaeus, was once a primarily observational science. In the 19th century, the most influential insight in the history of science had its origin in observational studies of finch beaks in the Galapagos. In the 20th century, the pendulum (particularly in biomedical research) then swung decisively toward the hypothesis-driven. Now, in the 21st century, it's time for the pendulum to swing back toward the center. Comprehensive understanding of biological systems, and application of that understanding to biomedical problems, will require a synergistic combination of hypothesis-driven and omic, discovery-based research strategies (10).
The pendulum is indeed swinging, but slowly. Large institutions and scientific fields don't change their cultures overnight. Most editors, reviewers, study sections, and site visitors in the academic world are still addicted to the hypothesis-driven paradigm as a standard of judgment. That remains true despite the obvious practical and conceptual importance of the Genome Project and its aftermath. This innovative Spotlight on Molecular Profiling series nudges the pendulum back toward equilibrium. The articles in it will include various proportions of hypothesis-driven and omic research. But always, the emphasis will be on high quality and early availability of the data so that other researchers can search or mine the molecular profiles according to their interests and domain expertise. Most particularly, the promise is that the molecular profiling data highlighted will promote the overall goals of 21st century personalized medicine.
Acknowledgments
Past and present staff of the NCI DTP deserve the research community's gratitude for establishing and conducting the NCI-60 screen over the years. I particularly want to remember the contributions of Michael Boyd and Bruce Chabner, who originated the screen, and the late Kenneth Paull, who pioneered informatic analysis of the screen data. I also want to thank our many collaborators and other scientists around the world for the molecular profile databases they've generated on the NCI-60.
Footnotes
Grant support: The Genomics and Bioinformatics Group's research is supported by the Intramural Research Program of the NIH, National Cancer Institute, Center for Cancer Research.
Received 10/16/06; accepted 10/17/06.
References