As part of the Spotlight on Molecular Profiling series, we present here new profiling studies of mRNA and microRNA expression for the 60 cell lines of the National Cancer Institute (NCI) Developmental Therapeutics program (DTP) drug screen (NCI-60) using the 41,000-probe Agilent Whole Human Genome Oligo Microarray and the 15,000-feature Agilent Human microRNA Microarray V2. The expression levels of ∼21,000 genes and 723 human microRNAs were measured. These profiling studies include quadruplicate technical replicates for six and eight cell lines for mRNA and microRNA, respectively, and duplicates for the remaining cell lines. The resulting data sets are freely available and searchable online in our CellMiner database. The results indicate high reproducibility for both platforms and an essential biological similarity across the various cell types. The mRNA and microRNA expression levels were integrated with our previously published 1,429-compound database of anticancer activity obtained from the NCI DTP drug screen. Large blocks of both mRNAs and microRNAs were identified with predominantly unidirectional correlations to ∼1,300 drugs, including 121 drugs with known mechanisms of action. The data sets presented here will facilitate the identification of groups of mRNAs, microRNAs, and drugs that potentially affect and interact with one another. Mol Cancer Ther; 9(5); 1080–91. ©2010 AACR.
This article is featured in Highlights of This Issue, p. 1075
Micro RNAs (microRNAs) are small noncoding RNAs, approximately 18 to 24 nucleotides in length, that regulate the levels of their target mRNAs in a highly multiplexed way (1–5). More than 700 human microRNAs have been sequenced according to the latest miRBase release (Release 14). They are abundantly present in all human cells and have been estimated to target at least ∼60% of all genes (2, 6, 7). It is also well established that microRNAs regulate cell proliferation and apoptosis, both of which are critical processes in cancer (8, 9). MicroRNAs have been reported to play roles in a number of human malignancies, including leukemias and breast, lung, liver, brain, and colon cancers (1, 10–13). For example, mir-15a and mir-16-1 are frequently deleted or downregulated in chronic lymphocytic leukemia (14). More than half of the known microRNA sequences are located in chromosomal regions that are often genetically altered in human cancer, for example at fragile sites or regions of deletion or amplification (11). Overexpression of microRNAs (such as mir-155) in cancers implies their possible function as oncogenes through the negative regulation of tumor suppressor genes and/or genes that inhibit cell differentiation or apoptosis (8, 15, 16). Conversely, some microRNAs (such as let-7d and mir-127) that are underexpressed in cancers may function as tumor suppressors by inhibiting oncogenes and/or genes that control cell differentiation or apoptosis (15, 16).
To study the relationships among expression levels of the various mRNAs and microRNAs, as well as their correlations with drug activity, the National Cancer Institute (NCI)-60 panel of human tumor cell lines was used (17, 18). The panel, which consists of 60 cell lines from nine tissues of origin, includes melanomas (ME), leukemias (LE), and cancers of breast (BR), kidney (RE), ovary (OV), prostate (PR), lung (LC), central nervous system (CNS), and colon (CO) origin. Several mRNA expression profiling studies of the NCI-60 have been reported previously (17, 19–21), and microRNA profiling studies of the NCI-60 have also been done on 241 human microRNAs using stem-loop real-time PCR (22), and on 321 human microRNAs using microarray (23). Clustering analyses of mRNA and microRNA expression in those studies revealed that some cell types (leukemia, colon, CNS, and melanoma) are grouped in a manner reflecting their tissues of origin. Expression levels of mRNAs and microRNAs were also significantly associated with drug sensitivity or resistance (20, 22, 23).
To expand and improve our mRNA and microRNA expression data (24), we have profiled the NCI-60 using the Agilent Whole Human Genome Oligo Microarray (mRNA_Agilent) and Agilent Human microRNA Microarray (V2; microRNA_Agilent), which interrogate ∼21,000 genes and 723 microRNAs, respectively. Both profiles include technical replicates. This report describes the generation, quality-control, and integrative analyses of the profile data sets obtained from the two platforms, focusing on the correlations of mRNA and microRNA expression with drug activity (leaving detailed examination of the linkages between mRNAs and microRNAs for a future report). Previous molecular profiles of the NCI-60 by our research group have appeared in the Spotlight on Molecular Profiling series (17, 24–29) and elsewhere (20, 21, 30, 31). The profiling data, metadata, and SQL capabilities for querying them are freely available in our CellMiner relational database program (32), where a listing of our Spotlight series publications is kept current.
Materials and Methods
Frozen stocks of the NCI-60 were obtained from the NCI Developmental Therapeutics program (NCI DTP). The cells were thawed, placed in RPMI 1640 (Lonza Walkersville, Inc.) containing 5% fetal calf serum (Atlantic Biologicals) and 2 mM glutamine (Invitrogen Corporation), and cultured as described previously (20, 24). For compatibility with our other profiling studies, we used the same batch of serum used by DTP, and the procedures were done or overseen by the same researcher (WCR).
mRNA and microRNA purification and quality assessment
For profiling of mRNA, total RNA was extracted using the RNeasy purification kit (Qiagen, Inc.) according to the manufacturer's instructions. For profiling of microRNA, total RNA was extracted using Trizol (Invitrogen Corporation) following the manufacturer's recommended procedures. The samples were assayed using the 2100 Expert_Eukaryotic Total RNA Pico assay, and the 2100 Expert_Small RNA Bioanalyzer assay (Agilent Technologies). Most of the samples showed an RNA Integrity Number (RIN) greater than 9 according to the Total RNA Pico assay. Percentages of microRNA, as determined by the Small RNA assay, varied among samples.
RNA labeling, microarray hybridization, profiling, and quality control
The mRNA samples were labeled and processed following the Agilent Technologies One-Color Microarray-Based Gene Expression Analysis Protocol version 5.5 with a total RNA input of 200 ng. Six cell samples from the NCI-60 panel (BR:MCF7, CO:HCT116, CO:HT29, LE:K562, ME:SK-MEL-2, and RE:CAKI-1) were labeled in quadruplicate, and the remaining samples were labeled in duplicate. All 132 labeled samples were hybridized to the Agilent Whole Human Genome Oligo Microarray (G4112F, design ID 014850, Agilent Technologies), which contains 41,000 probes measuring around 21,000 unique genes.
For the microRNA profiling, 100 ng of total RNA was labeled as recommended by Agilent Technologies (microRNA Microarray System Protocol v.1.5), using T4 RNA ligase from USB. Most samples were labeled in duplicate, but eight samples were labeled in quadruplicate (the six mentioned above for mRNA plus LC:A549-ATCC and PR:PC-3). The 136 labeled samples were hybridized to the version 2 Human microRNA microarray (G4470B, design ID 019118, Agilent Technologies), which assesses 723 Human microRNAs and 23 Human viral microRNAs.
The microarrays for profiling both mRNA and microRNA were scanned and data extracted as recommended in the respective protocols. After data extraction with Feature Extraction (version 9.5, Agilent Technologies), the data quality was confirmed using the Agilent quality control metrics. All mRNA microarrays passed the quality control step, whereas two microRNA microarrays with borderline quality (one for LE:SR, the other for OV:NCI/ADR-RES) were excluded from further data analysis.
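The array counts quoted above follow directly from the replicate design. The short sketch below reproduces that arithmetic; the function name is hypothetical and is used only for illustration.

```python
def labeled_samples(total_lines, quad_lines, quad=4, dup=2):
    """Number of labeled samples when `quad_lines` cell lines are run in
    quadruplicate and all remaining cell lines are run in duplicate."""
    return quad_lines * quad + (total_lines - quad_lines) * dup

mrna_arrays = labeled_samples(60, 6)    # six lines in quadruplicate
mirna_arrays = labeled_samples(60, 8)   # eight lines in quadruplicate
mirna_after_qc = mirna_arrays - 2       # two borderline microRNA arrays excluded

print(mrna_arrays, mirna_arrays, mirna_after_qc)
```

This gives 132 mRNA arrays and 136 microRNA arrays, of which 134 microRNA arrays remain after the two exclusions.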
The drug database
A note about nomenclature: we use the term “drug” to indicate chemical compounds tested in the NCI-60 DTP human tumor cell line screen, which was designed to screen up to 3,000 compounds per year for potential anticancer activity (33). The screen uses the NCI-60 to prioritize drugs showing selective growth inhibition or cell killing of particular tumor cell lines. To assess potential associations between drug activity and the expression level of either mRNAs or microRNAs, we used the 50% growth inhibitory concentrations (GI50) determined by DTP. Those data were further curated by our research group and are available through CellMiner (34). In our analysis, we used our A1429 data set (35), and also our data set of activity for 121 drugs with putatively known mechanisms of action (36) extracted from the A4463 data set (37). A1429 and A4463 are available through CellMiner. The activity levels are expressed as the negative log of the 50% growth inhibitory concentration [−log10(GI50)], measured using a 48-hour sulphorhodamine B assay.
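As a small illustration of the activity scale, the sketch below converts a GI50 value to −log10(GI50). It assumes the concentration is expressed in molar units, which the text does not state explicitly; the function name is hypothetical.

```python
import math

def activity_from_gi50(gi50_molar):
    """Return -log10(GI50). The input is assumed to be in mol/L (assumption)."""
    return -math.log10(gi50_molar)

# Under the molar assumption, a GI50 of 1 micromolar maps to about 6 and a
# GI50 of 10 nanomolar maps to about 8: a larger value means a more potent drug.
one_micromolar = activity_from_gi50(1e-6)
ten_nanomolar = activity_from_gi50(1e-8)
```

On this scale a one-unit increase corresponds to a tenfold lower concentration needed for 50% growth inhibition.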
Data analysis
Note that there is a significant difference between the mRNA and microRNA microarrays: according to prior reports, the majority of the microRNAs are either not expressed or are expressed at low levels, whereas approximately 60 to 70% of mRNAs, in general, are expressed in any individual tissue (39, 40). Consequently, we included in our analysis all mRNA probes, but only a subset of the microRNAs. For correlation and clustering analysis, we chose the subset of 365 (of the 723 total) microRNAs that had detectable expression in at least 10% of the cell lines. We considered a microRNA to be expressed at a detectable level in a cell line if the corresponding value of the variable “Isdetected” from the Agilent Feature Extraction software was true in more than half of the corresponding technical replicates, where “Isdetected” is a Boolean variable indicating whether the signal is three times higher than the background noise. Data on all mRNA probes and the 365 microRNAs were normalized using GeneSpring GX by (i) setting any gProcessedSignal (mRNA_Agilent) or gTotalGeneSignal (microRNA_Agilent) value less than 5 to 5; (ii) transforming the gProcessedSignal or gTotalGeneSignal to log base 2; and (iii) normalizing per array to the 75th percentile of mRNA probes and microRNAs, respectively. After normalization, an intensity value ≥0 indicates that the probe was in the top quartile. Average intensities for all 41,000 mRNA probes and 365 microRNAs were obtained for each cell line by averaging the normalized data across replicates.
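The detection filter and the three normalization steps can be sketched in a few lines of Python. This is a minimal illustration, not the GeneSpring GX implementation: it assumes that normalizing to the 75th percentile means subtracting each array's 75th-percentile value in log2 space (consistent with the statement that values ≥0 lie in the top quartile), and all function names are hypothetical.

```python
import math

FLOOR = 5.0  # step (i): raw signal values below 5 are set to 5

def percentile(values, q):
    """Percentile (q in [0, 100]) with linear interpolation between ranks."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

def normalize_array(raw_signals):
    """Steps (i)-(iii) for one array: floor, log2, shift so the 75th percentile is 0.
    The subtraction in log space is an assumption about step (iii)."""
    logged = [math.log2(max(s, FLOOR)) for s in raw_signals]
    p75 = percentile(logged, 75)
    return [x - p75 for x in logged]

def is_expressed(detected_flags):
    """Detectable in a cell line: flag true in more than half of its replicates."""
    return sum(detected_flags) > len(detected_flags) / 2

def passes_detection_filter(expressed_per_line, min_percent=10):
    """Keep a microRNA if it is detectable in at least 10% of the cell lines."""
    return 100 * sum(expressed_per_line) >= min_percent * len(expressed_per_line)
```

With 60 cell lines, the 10% rule corresponds to detectable expression in at least six lines. For duplicate arrays, the more-than-half rule requires the flag to be true in both replicates.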
For the comparison of probe intensities (Fig. 1A-D), we used all 41,000 mRNA probes and the 365 microRNAs. For both the cell-cell (Fig. 1A and C) and tissue-tissue (Fig. 1B and D) comparisons, all possible pair-wise correlations for the available replicates were calculated, and then averaged.
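The averaging of pair-wise replicate correlations can be sketched as follows. This is an illustrative reading of the procedure, assuming each replicate is a vector of normalized intensities over the same probes; the function names are hypothetical.

```python
import math
from itertools import combinations, product

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def mean_replicate_correlation(reps_a, reps_b=None):
    """Average Pearson r over all replicate pairs.

    One argument: all pairs of distinct replicates of the same cell line
    (technical reproducibility). Two arguments: every replicate of line A
    against every replicate of line B (cell-cell comparison)."""
    if reps_b is None:
        pairs = list(combinations(reps_a, 2))
    else:
        pairs = list(product(reps_a, reps_b))
    return sum(pearson(x, y) for x, y in pairs) / len(pairs)
```

A duplicate pair contributes one within-line correlation and a quadruplicate set contributes six, so averaging keeps the two designs on the same scale.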
We limited our mRNA clustering (Fig. 2A) to probes that showed relatively high and diverse expression across the NCI-60. For each mRNA probe (p), we calculated two values, maximum probe intensity, max(p), and probe inter-quartile range, IQR(p), across the NCI-60. There were 3,032 probes (out of 41,000) that were contained in both the top quartile of max(p) and the top quartile of IQR(p). Those probes were used in our mRNA clustering analysis (Fig. 2A). Given the error structures routinely seen in microarray data (in the log frame), restriction of the clustering to only these high-expression and high-diversity data generally yields higher quality clustering. The general expression distribution analysis of mRNA probes was based on the number of cell lines having top quartile expression of the corresponding mRNA probe (Fig. 3A).
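The probe selection described above can be sketched as an intersection of two top-quartile filters. The code is a minimal illustration under the assumption that quartile cutoffs are computed with linear interpolation and that ties at the cutoff are kept; the names are hypothetical.

```python
import math

def percentile(values, q):
    """Percentile (q in [0, 100]) with linear interpolation between ranks."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

def iqr(values):
    """Inter-quartile range of one probe's intensities across the cell lines."""
    return percentile(values, 75) - percentile(values, 25)

def select_probes(expression):
    """expression: dict mapping probe id -> normalized intensities across cell lines.
    Returns the probes in the top quartile of both max(p) and IQR(p)."""
    max_by_probe = {p: max(v) for p, v in expression.items()}
    iqr_by_probe = {p: iqr(v) for p, v in expression.items()}
    max_cut = percentile(list(max_by_probe.values()), 75)
    iqr_cut = percentile(list(iqr_by_probe.values()), 75)
    return [p for p in expression
            if max_by_probe[p] >= max_cut and iqr_by_probe[p] >= iqr_cut]
```

If the two criteria were independent, the intersection would contain about one sixteenth of the probes; the 3,032 probes selected (about 7.4% of 41,000) suggest that high maximum intensity and high variability tend to occur together.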
The microRNA clustering analysis (Fig. 2B) was carried out on the 365 microRNAs that had detectable expression in at least 10% of the cell lines. To provide the complete range of data for the general expression distribution analysis, Fig. 3B shows the distribution of all microRNAs with detectable expression in at least one cell line, according to the number of cell lines in which each microRNA was detectable.
To assess relationships between drug activity and expression of mRNA (Fig. 4A), we generated CIMs (41) by: (i) forming a matrix of the Pearson correlation coefficient for each drug's 60-element vector of activities across the cell lines with each probe's 60-element vector of mRNA expression levels across the cell lines; (ii) clustering rows and columns of the resulting matrix; and (iii) quantile-color coding of the resulting matrix.
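Step (i) above, the drug-by-probe correlation matrix, can be sketched as follows. This minimal example assumes complete activity and expression vectors (missing values, which occur in real screening data, are not handled), and the function names are hypothetical. Clustering and quantile color coding, steps (ii) and (iii), are not shown.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(drug_activity, expression):
    """drug_activity: dict drug -> activity vector across the cell lines.
    expression: dict probe -> expression vector across the same cell lines.
    Returns a nested dict matrix[drug][probe] of Pearson coefficients."""
    return {
        drug: {probe: pearson(act, expr) for probe, expr in expression.items()}
        for drug, act in drug_activity.items()
    }

# Toy example with four cell lines instead of 60.
matrix = correlation_matrix(
    {"drugA": [1, 2, 3, 4]},
    {"x": [2, 4, 6, 8], "y": [4, 3, 2, 1]},
)
```

In the toy example, probe x rises with drug activity and probe y falls with it, so the two entries are close to +1 and −1, respectively.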
The clustering analysis shown in Fig. 4A was done on the A1429 drug set, and 2,566 probes selected as follows: (i) for each probe, obtain the maximum absolute correlation value across 1,429 drugs, and (ii) select the top 6.25% probes ranked according to their maximum absolute correlation values. Figure 4B shows the corresponding visualizations for the 1,429 drugs and the 365 microRNAs.
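The probe selection for Fig. 4A can be sketched as a ranking on maximum absolute correlation. The sketch assumes the number of probes kept is the truncated product of the probe count and the fraction; the rounding rule actually used is not stated, and the function name is hypothetical.

```python
def select_top_probes(corr_by_probe, fraction=0.0625):
    """corr_by_probe: dict probe -> list of correlations with every drug.
    Ranks probes by their maximum absolute correlation and keeps the top fraction.
    Truncating the count with int() is an assumption."""
    scores = {p: max(abs(r) for r in rs) for p, rs in corr_by_probe.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    keep = max(1, int(len(ranked) * fraction))
    return ranked[:keep]

# Toy example: 16 probes, so the top 6.25% is a single probe.
toy_corr = {f"p{i}": [0.01 * i, -0.02 * i] for i in range(16)}
top = select_top_probes(toy_corr)
```

Because the score uses the absolute value, a probe is retained whether its strongest association with any drug is positive or negative.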
For each of the 41,000 mRNA probes or 365 microRNAs, we calculated the number of drugs with significant positive correlations at P < 0.05 (correlation coefficient R > 0.254) and denoted it as Rpos. Similarly, we obtained Rneg, the number of drugs with significant negative correlations (R < −0.254) with either the mRNA or microRNA. Then the ratio of the maximum of Rpos and Rneg (Rmax) to the sum of Rpos and Rneg (Rsum) was calculated, that is, Rmax/Rsum. This calculation yields the fraction of significantly correlated drugs whose correlations lie along the dominant direction, either positive or negative. This ratio has a range between 0.5 and 1. A ratio close to 1 indicates that the majority of the significant correlations are in one direction. The two histograms in Supplementary Fig. S1 show the distribution of the number of mRNA probes and microRNAs with respect to the ratios.
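The ratio Rmax/Rsum can be computed per probe or microRNA as in the sketch below. The helper `critical_r` shows where the 0.254 threshold plausibly comes from: it is consistent with a two-sided test at P < 0.05 for 60 cell lines, using the tabulated Student t quantile of about 2.0017 for 58 degrees of freedom. The two-sided reading is an assumption, and both function names are hypothetical.

```python
import math

R_CRIT = 0.254  # significance threshold on |R| quoted in the text

def critical_r(t_crit, n):
    """Critical Pearson r for sample size n, given a Student t quantile
    with n - 2 degrees of freedom: r = t / sqrt(df + t^2)."""
    df = n - 2
    return t_crit / math.sqrt(df + t_crit ** 2)

def direction_ratio(correlations, r_crit=R_CRIT):
    """Rmax / Rsum for one probe or microRNA across all drugs.
    Returns None when no drug is significantly correlated."""
    r_pos = sum(1 for r in correlations if r > r_crit)
    r_neg = sum(1 for r in correlations if r < -r_crit)
    r_sum = r_pos + r_neg
    if r_sum == 0:
        return None
    return max(r_pos, r_neg) / r_sum

# Three significant positive and one significant negative correlation: 3/4.
example_ratio = direction_ratio([0.3, 0.4, -0.3, 0.1, 0.5])
```

Because the numerator is the larger of the two counts, the ratio cannot fall below 0.5, which is why the histograms span 0.5 to 1.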
All clustering analyses were average-linkage agglomerative hierarchical. The metric used for clustering based on expression levels or drug sensitivities across the NCI-60 was the Pearson correlation coefficient. For clustering based on correlations (i.e., between drug sensitivities and either mRNA or microRNA expression levels), the metric used was the Euclidean distance.
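The clustering scheme can be illustrated with a small pure-Python version of average-linkage agglomerative clustering that accepts either metric. This is a sketch, not the software used in the study: converting the Pearson coefficient to a distance as 1 − r is a common convention assumed here, and all names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_distance(x, y):
    """Distance for expression or drug-activity profiles (1 - r is an assumption)."""
    return 1.0 - pearson(x, y)

def euclidean_distance(x, y):
    """Distance for rows or columns of a correlation matrix."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def average_linkage(profiles, distance):
    """profiles: dict name -> vector. Repeatedly merges the two clusters with the
    smallest mean pairwise distance between their members.
    Returns the merges as (members_a, members_b, height) tuples, in order."""
    clusters = [(name,) for name in profiles]
    merges = []

    def avg_dist(c1, c2):
        total = sum(distance(profiles[a], profiles[b]) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = avg_dist(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges
```

This direct implementation recomputes every cluster-to-cluster distance at each step, which is adequate for a toy example but slow for thousands of probes; a library routine would normally be used at that scale.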
Cell and tissue-of-origin correlations of mRNA and microRNA expression levels
Microarray correlation comparisons were made for the designated cell-cell combinations for mRNAs and microRNAs in Fig. 1A and C, respectively. For the mRNA microarrays, all 41,000 probes were included. For the microRNA microarrays, the 365 microRNAs with detectable expression in at least 10% of the cell lines were included. These visualizations allow assessment of both the technical reproducibility and the variability among disparate cell lines.
For the mRNA technical replicates, seen as the gray diagonal in Fig. 1A (going from top left to bottom right), the mean correlation was 0.992, with a standard deviation of 0.004 and a range of 0.977 to 0.997. For the mRNA correlations from different cell lines, the mean correlation decreased to 0.892, with a standard deviation of 0.025 and a range of 0.824 to 0.987. For the microRNA technical replicates, seen as the gray diagonal in Fig. 1C (going from top left to bottom right), the mean correlation was 0.994 with a standard deviation of 0.007, and a range of 0.952 to 0.999. For the microRNA correlations from different cell lines, the mean decreased to 0.775, with a standard deviation of 0.082 and a range of 0.469 to 0.965.
The tissue-tissue mRNA and microRNA expression correlations in Fig. 1B and D, respectively, indicate the levels of variation both within and between tissues of origin. The same probe sets are used as in Fig. 1A and C. For mRNA expression comparisons of cells within a single tissue of origin, the mean correlation was 0.928, with a standard deviation of 0.017, and a range of 0.899 to 0.946. For mRNA expression for cells from different tissues of origin, the mean correlation dropped to 0.890, with a standard deviation of 0.020 and a range of 0.854 to 0.923. For microRNA expression comparisons for cells within a single tissue of origin, the mean correlation was 0.864, with a standard deviation of 0.043 and a range of 0.792 to 0.921. For microRNA expression for cells from different tissues of origin, the mean correlation dropped to 0.768, with a standard deviation of 0.068 and a range of 0.636 to 0.851.
These correlation results indicate high technical reproducibility of the Agilent mRNA and microRNA platforms. They also show generally coherent mRNA and microRNA expression within groups of cell lines. The most coherent groups are the colon (CO) and CNS, and the least coherent are the leukemia (LE) and breast (BR). Greater variation was generally observed among cell lines from different tissues of origin than among those from the same tissue, and expression levels varied more among cells for microRNAs than for mRNAs.
Clustering and distribution analyses of mRNA and microRNA expression in the NCI-60
The expression patterns and clustering of 3,032 mRNA probes selected for both high level and diverse expression across the NCI-60 are shown in Fig. 2A. Figure 3A depicts the distribution of the number of cell lines expressing each of 20,146 probes with top-quartile expression levels in at least one cell line. That distribution was bimodal. Probes with generally low expression appear as the predominantly blue vertical strips in Fig. 2A and on the left side of the histogram in Fig. 3A. Probes with generally high expression appear as the predominantly red vertical strips in Fig. 2A and on the right side of the histogram in Fig. 3A. About 10% of the probes had expression levels higher than the 75th percentile in all cell lines. Clustering of the cell lines based on patterns of mRNA expression (vertical axis in Fig. 2A) showed, for the most part, clear separation by tissue of origin.
The clustering result for the set of 365 microRNAs expressed in at least 10% of the cell lines is shown in Fig. 2B. Figure 3B shows the histogram for the more inclusive set of microRNAs with detectable expression in at least one cell line. The distribution was bimodal. Thirty percent (217 out of 723) of the microRNAs were not detectable in any of the cell lines. Probes expressed in only a small number of cell lines appear as the predominantly blue vertical strips in Fig. 2B and on the left side in Fig. 3B. Probes expressed in most of the cell lines appear as the predominantly red vertical strips in Fig. 2B and on the right side in Fig. 3B. Clustering of the cell lines based on patterns of microRNA expression (vertical axis in Fig. 2B) showed, for the most part, clear separation by tissue of origin.
Overall, the distribution of the numbers of cell lines in which either mRNA or microRNA were expressed was bimodal, i.e., most mRNAs and microRNAs were expressed either in only a few cell lines or in most cell lines (Fig. 3A and B, respectively). Notably, cell lines from the different tissues of origin tend to cluster in separate groups in terms of both mRNA and microRNA (see annotations on the right of Fig. 2A and B, respectively). The entire set of leukemia (6 out of 6 LE) and 9 out of 10 melanoma (ME) cell lines cluster together in the mRNA clustering analysis. For the microRNA, 6 out of 6 LE and 10 out of 10 ME cluster together again, indicating their relatively high degree of coherence with respect to patterns of expression. In contrast, the lung cancer (LC) and CNS cell lines tended to cluster the least both for mRNA and microRNA, indicating their relative genetic heterogeneity. The ovarian (OV) group of cells is somewhat intermediate, with 3 out of 7 cell lines (IGROV1, OVCAR-4, and OVCAR-3) clustering together for both mRNA and microRNA.
Correlation of drug activity with mRNA and microRNA expression
The CIM in Fig. 4A relates activity patterns for the previously described 1,429-drug set (35) to the expression patterns of 2,564 mRNA probes selected on the basis of high correlation to drug activities in the NCI-60. Supplementary Fig. S2 shows the same approach for the set of 121 drugs of known mechanism of action (many of them clinically used). The CIMs in Fig. 4B and Supplementary Fig. S3 relate the drug activity patterns of the same 1,429-drug and 121-drug sets to the expression patterns of the 365 microRNA set in the NCI-60. A positive correlation (red color in Fig. 4A or B and Supplementary Figs. S2 and S3) indicates that when mRNA or microRNA expression level increases, the activity level of the corresponding drug will increase and therefore the cells are more sensitive. Similarly, a negative correlation (blue color in Fig. 4A or B and Supplementary Figs. S2 and S3) indicates that when mRNA or microRNA expression levels increase, the activity level of the corresponding drug will decrease, and therefore, the cells are more resistant. The broad patterns of positive and negative correlation between mRNA or microRNA expression and drug activity reflect the existence of coherent blocks within which mRNA or microRNA levels are consistently correlated either positively or negatively to drug activities.
The histograms of the distribution of the ratios (see the Data Analysis section) for all 41,000 mRNA probes and the 365 microRNAs, in Supplementary Fig. S1A and B, respectively, support the visual observation of a predominant unidirectionality of those correlations found in Fig. 4. The mean ratio value for mRNA (in Supplementary Fig. S1A) was 0.83, indicating that, on average, 83% of the drugs significantly correlated with a given mRNA are correlated with it in the same direction. The mean ratio value for microRNA (in Supplementary Fig. S1B) was 0.86. Note that the selection of mRNA probes on the basis of high correlation in Fig. 4A biases toward the generation of patterns that might not be seen, or be as strong, if the probes were not selected in that way. However, the consistency and robustness of the dichotomous patterns seen were also supported by the histogram shown in Supplementary Fig. S1A, which was based on all mRNA probes.
The correlation analysis of mRNA and microRNA expression with drug activity shows that most drugs have a coherent relationship to substantial blocks of both transcript and microRNA expression levels, indicated by the coherent positive or negative mRNA and microRNA correlations across drugs.
Figure 5 compares the clustering results for the set of 121 drugs with defined mechanism of action purely on the basis of drug activities (Fig. 5A), on the basis of the cross-correlations of drug activities and mRNA expression levels (Fig. 5B), and on the cross-correlations of the drug activities with microRNA expression levels (Fig. 5C). In all three cases, drugs were generally clustered according to their mechanism of action classes.
The NCI-60 cell lines have been profiled more comprehensively at the DNA, RNA, protein, and pharmacological levels than any other set of cells in existence. The resulting molecular databases, therefore, constitute a uniquely valuable information resource for understanding cancer biology, assessing molecular pharmacology, and developing new approaches to analyze and interpret high-throughput molecular data. The profiling studies reported here for mRNA and microRNA expression using the recent Agilent microarrays add two important new platforms of high quality data for mRNA and microRNA expression (42).
The present data analyses show a high degree of reproducibility for both the mRNA and microRNA microarrays, as depicted by the close to 1 Pearson correlation coefficient in diagonal cells in the Fig. 1A and C CIMs. As indicated in Fig. 1A and C, and in tabular form in Fig. 1B and D, the within-tissue-of-origin correlations tend to be higher than those between tissues of different origin for both the mRNAs and the microRNAs. This is true for all of the mRNA comparisons, and all but two of the microRNA comparisons (the exceptions being the LC-PR, and OV-PR, as compared with LC-LC, and OV-OV, respectively). Those results indicate that the tissue-of-origin signature remains for both the mRNA and microRNA transcriptomes, despite adaptation of the cell lines to in vitro culture. That observation is consistent with prior results (20, 22, 23). An apparent example of disease subtyping within a tissue of origin is provided by the high correlation (r = 0.955) for mRNA between CCRF-CEM and MOLT-4 (Fig. 1A), both of which are acute lymphocytic leukemias. LOXIMVI, which is unlike the other melanoma lines by both mRNA and microRNA expression, has previously been noted to be highly differentiated and amelanotic, i.e., it lacks melanin and other melanoma signature markers (43).
CIMs of the NCI-60 based on mRNA and microRNA expression levels (Fig. 2) also indicate residual tissue-of-origin signatures that are generally consistent with those seen in prior profiling studies (20, 22, 23). Overall, there were four tissue types (i.e., leukemia, colon, renal, and melanoma) in which cell lines tend to be well clustered by both mRNA and microRNA according to their tissues of origin. The leukemias were the most clearly separated from other tissue-of-origin lines (as indicated by the height of the cluster branches in both figures). Otherwise, notable exceptions were HCC-2998 (CO) and LOXIMVI (ME) for the mRNA-based clustering, and HCT116 (CO), SN12C (RE), ACHN (RE), and LOXIMVI (ME) for the microRNA-based clustering (Fig. 2B). Those cell lines have also tended to group with different tissue-of-origin lines in prior profiling studies (20, 22, 23).
The other five tissue types were more variable. The CNS cell lines clustered well for mRNAs but poorly for microRNAs. Breast, lung, ovarian, and prostate lines formed small and/or widely distributed groups. Some cell pairs known to be similar, or to have been reported to cluster together in previous profiling studies (20, 22, 23), also clustered together here. Examples include two hormone-dependent estrogen receptor-positive breast cancer cell lines, MCF7 and T47D, OVCAR8 and its drug resistant derivative NCI/ADR-RES, and the CNS cancer cell lines U251 and SNB-19, which probably originated from the same patient (25). We identify two additional robust pairs, LC:HOP92/BR:MDA-MB-231, and OV:OVCAR3/OV:OVCAR4, which have clustered together in both the current and prior mRNA and microRNA profiling studies (20, 22, 23).
The bimodal distribution seen in Fig. 3A indicates that there are relatively large groups of mRNAs that are either expressed at high levels in a small number of cancer lines (left side of Fig. 3A and also the predominant blue strips in Fig. 2A), or expressed at high levels ubiquitously across all cell lines (right side of Fig. 3A and also the red strips in Fig. 2A). Similarly, the bimodal distribution seen in Fig. 3B indicates that there are relatively large groups of microRNAs that either have detectable expression in a small number of cell lines or have detectable expression across all cell lines. The bimodal distributions may reflect the fact that some mRNAs and microRNAs have critical housekeeping functions in cells, whereas others have tissue- or cell line-specific functions.
CIM visualizations for the correlations of drug activities with expression levels of mRNA in Fig. 4A and Supplementary Fig. S2 as well as the histogram in Supplementary Fig. S1A indicate the presence of large unidirectional blocks (i.e., with either positive or negative correlation). The 1,429 drugs formed two well-separated clusters: cluster I, which consists of 146 drugs, and cluster II, which consists of the remaining 1,283 drugs. There are also two well-organized mRNA clusters. The 871 mRNAs in cluster A are predominantly negatively correlated with the drugs from cluster I, and positively correlated with the drugs from cluster II. The 1,251 mRNAs in cluster B are predominantly positively correlated with the drugs in cluster I, and negatively correlated with the drugs in cluster II. Functional analysis using KEGG (44) indicates that mRNAs in cluster A are enriched significantly with ribosome proteins, and mRNAs in cluster B are enriched significantly with focal adhesion pathway proteins. Based on UniProtKB, the remaining mRNAs (i.e., those in cluster C) are enriched with glycoproteins. The histogram in Supplementary Fig. S1A reinforces the unidirectional preponderance of the correlations, as depicted by the increase in the number of mRNAs on the right side of the histogram.
CIMs for the correlation of drug activity with the expression level of microRNA in Fig. 4B and Supplementary Fig. S3 also indicate the presence of large unidirectional blocks. As in Fig. 4A, the 1,429 drugs formed two well-separated clusters in Fig. 4B, with 115 drugs in cluster I, and the remaining 1,314 drugs in cluster II. There were 93 drugs common to cluster I in both the mRNA and microRNA drug correlations (Fig. 4). The chemical structures and corresponding NSC numbers of some of those drugs are listed in Supplementary Fig. S4. Well-organized blocks were formed between the drugs in cluster II and the microRNAs in clusters A and B. The histogram in Supplementary Fig. S1B reinforces the unidirectional preponderance of the correlations, as depicted by the increase in the number of microRNAs on the right side of the graph.
The prior chemosensitivity study conducted by Blower and colleagues (45) provides information on three microRNAs, let-7i, mir-16, and mir-21 and their relationship to activity patterns of 8 out of the 1,429 drugs (Table 1 in ref. 45). We found that the previously described change of drug potency following either silencing or forced expression of microRNAs in the A549 cell line (45) is consistent with our correlation results (Table 1). That is, if the chemosensitivity study showed that a change in microRNA expression leads to a change in drug activity, then that same association (in the same direction) was found in our correlation analysis. For example, the correlation (in Table 1) between mir-21 expression and the −log10(GI50) value of drug NSC622700 is −0.445, which indicates that when the expression of mir-21 decreases, the drug activity increases, consistent with prior results (45).
Clustering of 121 drugs with known mechanisms of action based on: (i) the correlation between drug activity and mRNA expression (Fig. 5B and Supplementary Fig. S2) or (ii) the correlation between drug activity and microRNA expression (Fig. 5C and Supplementary Fig. S3) was largely successful in classifying the drugs by their mechanisms of action. The clustering between drug activity and mRNA was consistent with prior results (20). We find that the Top1 inhibitors and antimitotic agents generally form coherent clusters. In contrast, Top2 inhibitors form several subgroups. One consistent (Top2 inhibitor) subgroup consists of bisantrene, an anthrapyrazole derivative, and deoxydoxorubicin, and another consistent subgroup contains m-AMSA, mitoxantrone, and oxanthrazole. The majority of the N-7 alkylating agents (A7) clustered into one large group, but others appeared in several smaller groups. For DNA antimetabolites, all of the purine analogues (thioguanines and thiopurines) appeared together on small branches in all clusters. Several drug pairs consistently clustered together, including cytarabine (ara-C) and its congener cyclocytidine, mitomycin and its N-methyl derivative porfiromycin, and the two O-6 position alkylating agents CCNU and BCNU.
In summary, we have characterized the NCI-60 mRNA and microRNA data using two novel Agilent platforms, the Agilent Whole Human Genome Oligo Microarray, and the Agilent microRNA microarray Human version 2. Our analysis indicates high reproducibility for both platforms and an essential biological similarity across various cell types at a molecular level. Analyses of drug activity reveal substantial blocks of both mRNA and microRNAs whose expression levels are correlated with anticancer drug response. Future work will focus on systematic analysis of the three-way relationships of mRNA expression, microRNA expression, and drug activity by taking advantage of the high-quality profiles reported here or previously. We will also extend the approach by including new drugs recently approved by the U.S. Food and Drug Administration (FDA) to elucidate their mRNA and microRNA profiles across the NCI-60 and compare them with the “classical” drugs in the present database. Such analyses are now facilitated by free web access to our data and by the possibility of searching directly the CIM shown in Figs. 2 and 4 (46).
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
We are grateful to the many members of the National Cancer Institute's Developmental Therapeutics Program for their work on the screen and Molecular Targets Program. We particularly acknowledge the contributions of Bruce Chabner and Michael Boyd, who led development of the NCI-60, and Kenneth Paul, who pioneered the associated informatics.
Grant Support: Intramural Research Program of the National Institutes of Health, National Cancer Institute, Center for Cancer Research under contract no. NO1-CO-12400.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Disclaimer: By acceptance of this article, the publisher or recipient acknowledges the right of the United States Government to retain a nonexclusive, royalty-free license and to any copyright covering the article. The content of this publication does not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products, or organization imply endorsement by the U.S. Government.
Note: Supplementary material for this article is available at Molecular Cancer Therapeutics Online (http://mct.aacrjournals.org/).
- Received November 3, 2009.
- Revision received March 5, 2010.
- Accepted March 9, 2010.
- ©2010 American Association for Cancer Research.