An overview of PanCancer Atlas analyses on oncogenic molecular processes
Germline genome affects somatic genomic landscape in a pathway-dependent fashion
Genome mutations impact expression, signaling, and multi-omic profiles
Mutation burdens and drivers influence immune-cell composition in microenvironment
The Cancer Genome Atlas (TCGA) has catalyzed systematic characterization of diverse genomic alterations underlying human cancers. At this historic junction marking the completion of genomic characterization of over 11,000 tumors from 33 cancer types, we present our current understanding of the molecular processes governing oncogenesis. We illustrate our insights into cancer through synthesis of the findings of the TCGA PanCancer Atlas project on three facets of oncogenesis: (1) somatic driver mutations, germline pathogenic variants, and their interactions in the tumor; (2) the influence of the tumor genome and epigenome on transcriptome and proteome; and (3) the relationship between tumor and the microenvironment, including implications for drugs targeting driver events and immunotherapies. These results will anchor future characterization of rare and common tumor types, primary and relapsed tumors, and cancers across ancestry groups and will guide the deployment of clinical genomic sequencing.
In the nearly half century of the “War on Cancer,” prevention and treatment have progressed significantly, but many forms of the disease remain incurable. The advent of large-scale DNA sequencing ushered in new possibilities. Beginning with coding regions (Sjöblom et al., 2006), sequencing has sparked a revolution in cancer research. Genomic studies have identified numerous cancer driver genes (Kandoth et al., 2013, Lawrence et al., 2014) and germline variants that increase disease susceptibility (Lu et al., 2015). We increasingly understand the molecular determinants of oncogenesis, including tumor suppressor inactivation and pathway alteration. Significant progress has been made in identifying driver mutations (Porta-Pardo et al., 2017), assessing their druggability (Niu et al., 2016), disease subtyping (Waddell et al., 2015), prognosis (Cancer Genome Atlas Research Network et al., 2015), and residual disease detection (Martinez-Lopez et al., 2014).
Gene and protein expression are also key aspects. Studies have reported new fusions (Klijn et al., 2015), alternatively spliced transcripts (Oltean and Bates, 2014), expression-based stratification (Stricker et al., 2017), and implications of viral infection (Cao et al., 2016). Proteomic studies have made progress on subtyping (Lawrence et al., 2015), biomarker identification (Sogawa et al., 2016), and drug sensitivity and resistance (Ji et al., 2017). Advancements have also been made in immune response (Bieging et al., 2014), infiltrate-based subtyping (Akbani et al., 2015), associations of PD-1/PD-L1 with prognosis (Danilova et al., 2016), interactions between immune reprogramming and angiogenesis (Tian et al., 2017), and immune cytolytic activity (Rooney et al., 2015). Each area shows enormous promise.
The era of the first large genome sequences was called the “end of the beginning” of genomics. It seems fitting to call the conclusion of The Cancer Genome Atlas (TCGA) the end of the beginning of cancer genomics. TCGA has systematized large-scale genomics-based cancer research, with its projects and data on 11,000 tumors from 33 cancer types having led to enormous advancements. The TCGA PanCancer Atlas project has a special focus on the oncogenic processes governing cancer development and progression, with its ten analysis working groups (AWGs) presenting their findings. Together we synthesized findings from consensus somatic mutation calling, fusion detection, splicing events, aneuploidy, image analysis, and the immune system in oncogenesis (Figure 1). Here, we concentrate on three themes: (1) interactions between somatic drivers and germline pathogenic variants; (2) links across genomic substrates, i.e., methylome, transcriptome, and proteome; and (3) tumor microenvironment and implications for targeted and immune therapies. We begin each section with an overview from AWG results and follow with additional analyses addressing questions not explored in individual AWG papers. The results of the PanCancer Atlas project will provide a foundation for subsequent phases of deeper, broader, and more sophisticated work that holds great promise for personalized cancer care.
Previous TCGA studies often concentrated on focal copy-number alterations rather than chromosomal-level aneuploidy. The PanCancer Atlas Aneuploidy AWG systematically quantified aneuploidy (Taylor et al., 2018), correlated its degree with genomic features, such as TP53 status, mutational load, and level of lymphocytic infiltrate, and provided experimental evidence confirming some predictions.
Gene fusions, which can drive overexpression or create fusion proteins, are another important class of drivers. The Fusion AWG systematically characterized fusions (Gao et al., 2018), finding that they are recurrent and disease defining in some neoplasms (e.g., SS18/SSX1 or SSX2 fusion in synovial sarcoma). In others, fusion drivers are present in small subsets of tumors (ALK or ROS1 fusions in lung adenocarcinoma). The accompanying mutational events and how they differ among cancers provide functional insights (Gao et al., 2018).
Two other AWGs systematically characterized germline and somatic variants across 33 cancer types (Table S1) (Huang et al., 2018, Ellrott et al., 2018). They generated and analyzed 1.5 billion germline (Huang et al., 2018) and ~3.6 million somatic calls (Ellrott et al., 2018), making TCGA PanCancer Atlas the largest resource for investigating joint variant contributions to cancer. The germline group highlighted the two-hit hypothesis through loss of heterozygosity (LOH) and compound heterozygosity, rare copy-number events, and additional evidence supporting variant pathogenicity. The somatic dataset anchored a comprehensive analysis using 26 bioinformatic tools, identifying 299 driver genes and over 3,400 oncogenic mutations (Bailey et al., 2018). Similarly, the PanCancer Atlas Germline group identified >800 pathogenic or likely pathogenic germline variants in 99 predisposition genes affecting ~8% of all cases (Huang et al., 2018).
Here, we used the 299 driver and 99 predisposition genes to study interactions of germline and somatic events in 9,389 samples (STAR Methods; Table S1). Many predisposition genes play roles in genome integrity (Figure 2A, green bars; Table S2). Alterations in these genes represent a higher fraction of germline variants (63%, 490/769) versus somatic drivers (14%, 8850/75825, p value = 7e-151, Fisher’s exact test), highlighting the role of genome integrity in cancer predisposition. The remaining somatic alterations are largely from genes involved in cell cycle, epigenetic modifiers, metabolism, oncogenic signaling, and transcriptional/translational regulation. We surveyed the frequency of cases showing disruptions of genome integrity in individual cancer types. Of the eight molecular process categories examined (STAR Methods), genome integrity dominates both germline and somatic alterations in ovarian serous cystadenocarcinoma (OV) due to BRCA1 or BRCA2 predisposition variants and a high fraction of TP53 mutations. Other cancers are further skewed with respect to the percent of cases carrying mutations involved in genome integrity; e.g., 4% of samples in lung squamous cell carcinoma (LUSC) have germline compared to 89% somatic (Figure 2B; Table S3).
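The comparison of germline and somatic fractions above rests on Fisher's exact test for a 2x2 table. The following is a minimal sketch of the one-sided version of that test using only the standard library; the function name and the toy counts are hypothetical and are not taken from the study's tables.

```python
from math import comb

def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a) under the hypergeometric null with all margins fixed.
    In the setting of the text, rows could be variant class (germline,
    somatic) and columns "in a genome-integrity gene" vs. "elsewhere".
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    # Sum the hypergeometric numerators for all tables at least as extreme.
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k)
               for k in range(a, upper + 1))
    return tail / comb(n, col1)

# Hypothetical toy counts (NOT the paper's data).
p_toy = fisher_exact_greater(3, 1, 1, 3)
print(f"one-sided p = {p_toy:.4f}")
```

Exact integer arithmetic keeps the tail sum accurate even for large tables; a production analysis would more likely call an established statistics library.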
Most predisposition genes affecting genome integrity (64%, 23/36) belong to the Core DDR (DNA damage response) genes (Knijnenburg et al., 2018) (Table S2). Several show high germline variant counts, including BRCA1, BRCA2, CHEK2, ATM, BRIP1, PALB2, and PMS2. When considering germline and somatic mutations jointly, the most frequently mutated genes are BRCA1 and BRCA2, together having 854 (571 samples) somatic and 153 (152 samples) germline mutations. We grouped samples with germline mutations, somatic mutations, or no/low-impact mutations in these two genes by cancer type to test for associations with age of onset and somatic mutation load. Patients with germline BRCA1/2 mutations develop cancer at younger ages compared to wild-type samples in OV, LUSC, and BRCA (false discovery rate [FDR] 9.12e-6, 9.23e-3, and 1.15e-2, respectively, t test). Mean age of diagnosis in patients with germline mutations is 54.4 ± 13.0 years (standard deviation), compared to 62.3 ± 13.4 years when the mutation is somatic across the pan-cancer cohort (p value = 2.07e-10, 95% confidence interval [CI] = (-10.27, -5.57); Figure 3A; Table S4). As expected, germline or somatic variants associate with higher mutation load across cancer types (Figure 3B); this is observed in OV samples with germline BRCA1/2 mutations (FDR 3e-3, t test) and in BLCA and STAD samples with somatic mutations (FDR 5.6e-3 and 9.2e-6, respectively, t test).
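As a rough arithmetic check on the age-of-diagnosis comparison, the sketch below rebuilds a 95% confidence interval for the difference in means from the summary statistics quoted above. It assumes that the group sizes equal the 152 germline and 571 somatic sample counts given in the text (an assumption, since the exact groups used in the test are not stated here) and uses a normal approximation rather than the t distribution.

```python
from math import sqrt

def diff_ci(mean1, sd1, n1, mean2, sd2, n2, z=1.96):
    """Approximate 95% CI for mean1 - mean2 with unequal variances."""
    se = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)  # Welch-style standard error
    diff = mean1 - mean2
    return diff - z * se, diff + z * se

# Means and SDs are from the text; n1=152 and n2=571 are ASSUMED group sizes.
lo, hi = diff_ci(54.4, 13.0, 152, 62.3, 13.4, 571)
print(f"approx. 95% CI for the age difference: ({lo:.2f}, {hi:.2f})")
```

Under these assumptions the interval comes out at roughly (-10.2, -5.6) years, close to the reported (-10.27, -5.57); the small gap is expected given rounding of the inputs and the normal approximation.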
Many samples (250 out of 1,464) with non-synonymous somatic mutations in DNA mismatch repair (MMR) genes have high microsatellite instability (MSI) status (MSIsensor score ≥4; Figure 3C; Table S5) (Niu et al., 2014). Samples with germline pathogenic variants in MMR genes (18 out of 60) also have high MSI status. Notably, 16 of these 18 samples have both predisposition germline variants and somatic mutations in MMR genes (Table S2), representing a population with potentially higher neoantigen load and response to checkpoint-blockade therapy. Indeed, samples with MSIsensor scores ≥4 had higher expression of immune-response marker genes (GZMA, PRF1, GZMK, and GZMH) in the three cancer types with enough MSI high samples: colon adenocarcinoma and rectum adenocarcinoma (COADREAD), stomach adenocarcinoma (STAD), and uterine corpus endometrial carcinoma (UCEC) (two-sample Kolmogorov-Smirnov p < 0.01; Figure 3D). This highlights the influence of mutations in MMR genes and the MSI phenotype on the immune response against tumors. Finally, using Moonlight we found several pathways that are differentially expressed depending on whether the mutations affecting BRCA1 and/or BRCA2 are somatic or germline (Figures 3E, 3F, and S1). For example, BRCA samples with somatic mutations in BRCA1/2 downregulate genes involved in antigen processing and leukocyte cytotoxicity, whereas BRCA samples with germline BRCA1/2 mutations downregulate genes involved in mitochondrial respiratory chain complex and metabolic pathways. The impact of BRCA1/2 mutations may depend on both their somatic or germline status and the tissue of origin.
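The MSI comparison above has two steps: splitting samples at an MSIsensor score of 4, then comparing marker-gene expression between the groups with a two-sample Kolmogorov-Smirnov test. A minimal sketch of both steps follows; the sample values are invented for illustration, and only the KS statistic (not its p value) is computed.

```python
from bisect import bisect_right

MSI_HIGH_CUTOFF = 4.0  # MSIsensor threshold quoted in the text

def is_msi_high(score: float) -> bool:
    return score >= MSI_HIGH_CUTOFF

def ks_statistic(x, y):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    xs, ys = sorted(x), sorted(y)
    return max(
        abs(bisect_right(xs, v) / len(xs) - bisect_right(ys, v) / len(ys))
        for v in xs + ys
    )

# Hypothetical (MSIsensor score, marker-gene expression) pairs.
samples = [(0.5, 2.1), (1.2, 2.4), (3.9, 2.0),
           (4.0, 5.2), (12.3, 6.1), (25.0, 5.7)]
high = [expr for score, expr in samples if is_msi_high(score)]
low = [expr for score, expr in samples if not is_msi_high(score)]
d_stat = ks_statistic(high, low)
print(f"KS statistic D = {d_stat:.2f}")
```

A statistic of 1 means the two expression distributions do not overlap at all; converting the statistic to a p value requires the KS distribution, which a statistics library would normally supply.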
Interactions among somatic driver genes, ranging from sequential dynamics to interactions of pathway and synthetic lethality, hold potential for therapeutic exploitation. We used the MC3 somatic mutation (Ellrott et al., 2018) dataset and the driver gene list (Bailey et al., 2018) to identify pairs of drivers that are mutually exclusive or tend to co-occur (STAR Methods). We found an extensive network of interactions (Cochran-Mantel-Haenszel test FDR < 0.1; Figure 4A; Table S6). TP53 is the prime hub, co-occurring with IDH1, ATRX, PPP2R1A, RB1, and CDKN2A and mutually exclusive of PIK3CA, HRAS, CTNNB1, ARID1A, and FGFR3. As expected, driver genes and mutations that act via certain pathways/mechanisms show strong exclusivity, a primary example being BRAF and HRAS/NRAS/KRAS, all of which affect the Ras signaling pathway. Other examples are pairs of homologous genes, such as IDH1/IDH2 and GNAQ/GNA11, and interacting genes, such as PIK3CA and PIK3R1. These patterns held across virtually all 33 tumor types, indicating discovery of a key oncogenic relationship. We also observed exclusivity in specific tissues (Figure 4B), for example BRAF, NRAS, and HRAS in thyroid carcinoma (THCA) and GNAQ and GNA11 in uveal melanoma.
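The Cochran-Mantel-Haenszel test named above pools one 2x2 co-mutation table per stratum, here one per cancer type, so that differences in mutation rates between tissues do not create spurious associations. The sketch below is a generic textbook version without continuity correction; it is not the study's implementation, and the tables are hypothetical.

```python
from math import erfc, sqrt

def cmh_test(tables):
    """Cochran-Mantel-Haenszel test over per-stratum 2x2 tables.

    Each table is (a, b, c, d): a = both genes mutated, b = only gene 1,
    c = only gene 2, d = neither. Returns (common odds ratio, chi2, p).
    An odds ratio below 1 suggests mutual exclusivity, above 1 co-occurrence.
    """
    sum_a = sum_exp = sum_var = or_num = or_den = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        if n < 2:
            continue  # stratum too small to contribute
        sum_a += a
        sum_exp += (a + b) * (a + c) / n
        sum_var += (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
        or_num += a * d / n
        or_den += b * c / n
    if sum_var == 0:
        raise ValueError("no informative strata")
    chi2 = (sum_a - sum_exp) ** 2 / sum_var
    p_value = erfc(sqrt(chi2 / 2))  # chi-square survival function, 1 df
    odds_ratio = or_num / or_den if or_den else float("inf")
    return odds_ratio, chi2, p_value

# Two hypothetical cancer types in which the genes never co-occur.
odds_ratio, chi2, p_value = cmh_test([(0, 10, 10, 10), (0, 10, 10, 10)])
print(odds_ratio, round(chi2, 2), p_value)
```

In a pan-cancer screen this would be run for every driver pair, with the resulting p values then corrected for multiple testing, as the FDR threshold in the text implies.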
At a larger scale, some cancer types require cooperation between gene networks. For example, in UCEC, there are two mutually exclusive networks, the first consisting of TP53 and PPP2R1A (and occasionally PTEN) and the second CTNNB1, PTEN, and CTCF. This is consistent with previous descriptions of UCEC subtypes, with TP53-driven endometrial tumors having a copy-number high phenotype and PTEN-driven endometrial tumors being copy-number low or hypermutated (either via MSI and/or POLE). Finally, we observed cancer-specific somatic-somatic interactions. For instance, TP53 and KRAS are mutually exclusive in COAD, READ, and LUAD (Table S6) but significantly co-occur in PAAD (Table S6). These observations highlight the importance of investigating both at the pan-cancer level and by tissue of origin (Park and Lehner, 2015).
The tumor genome and transcriptome interact at multiple levels. For example, 1%–2% of genome mutations have detectable effects on splicing, with potential to alter the transcriptome and biochemical pathways (Wang and Cooper, 2007). Locally, cis-mutations can disrupt or activate splicing factor binding sites or splice sites. The Splicing AWG analyzed 8,656 TCGA tumors, finding that 1,964 mostly missense and synonymous mutations create novel splice junctions (Table S1) (Jayasinghe et al., 2018). They also produce neoantigens, often accompanied by an elevated immune response. Mutations in splice-governing genes result in large-scale abnormal splicing, providing potential biomarkers and therapeutic targets (Dvinge et al., 2016) and acting as proto-oncogenes or tumor suppressors (Yoshida et al., 2011). The Spliceosome Pathway AWG surveyed 33 tumor types for somatic mutations of over 400 splicing factor genes, identifying 119 genes with likely driver mutations (Seiler et al., 2018). They confirmed aberrant splicing of frequently mutated genes, suggesting that splicing de-regulation in cancer is broader than previously reported.
Integrating profiles from individual molecular platforms can provide insights into the molecular state of tumors and identify samples with shared regulation (sample clusters) across multiple assays. A recent analysis (Hoadley et al., 2018) performed clustering of individual platforms and subsequent clustering of cluster assignments (COCA) (Hoadley et al., 2014) on clusters derived from aneuploidy levels (10 clusters; 10,522 samples), mRNA (25 clusters with at least 40 samples; 10,165 samples), miRNA (microRNA) (15 clusters; 10,170 samples), DNA methylation (25; 10,814), and reverse phase protein array (RPPA) (10; 7,858). They also performed integrative molecular subtyping with the iCluster method (Shen et al., 2009) in a joint analysis of aneuploidy, DNA methylation, mRNA, and miRNA levels across 9,759 tumor samples, identifying 28 iClusters. Consistent with previous multiplatform analyses (Hoadley et al., 2014), samples cluster primarily by tissue of origin.
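The first step of a cluster-of-cluster-assignments analysis is commonly described as recoding each platform's cluster labels into binary indicator columns, which are then clustered again. A minimal sketch of that encoding step is shown below; the function name, the toy labels, and the choice to keep only samples present on every platform are illustrative assumptions, and the second-level clustering itself is not implemented.

```python
def coca_indicator_matrix(assignments):
    """Build a samples x (platform, cluster) 0/1 matrix.

    assignments: {platform: {sample_id: cluster_label}}.
    Only samples profiled on every platform are kept (a simplification).
    """
    samples = sorted(set.intersection(*(set(m) for m in assignments.values())))
    columns = sorted(
        (platform, cluster)
        for platform, mapping in assignments.items()
        for cluster in {mapping[s] for s in samples}
    )
    matrix = [
        [1 if assignments[platform][s] == cluster else 0
         for platform, cluster in columns]
        for s in samples
    ]
    return samples, columns, matrix

# Hypothetical cluster calls for three samples on two platforms.
toy = {
    "mRNA": {"s1": "m1", "s2": "m1", "s3": "m2"},
    "methylation": {"s1": "d1", "s2": "d2", "s3": "d2"},
}
samples, columns, matrix = coca_indicator_matrix(toy)
for sample, row in zip(samples, matrix):
    print(sample, row)
```

Each row contains exactly one 1 per platform, so samples that share cluster calls across platforms end up with similar rows and group together in the second round of clustering.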
We analyzed the impact of somatic mutations on the cis-expression of driver genes. We grouped samples for each gene according to whether they contained frameshift or nonsense mutations (group I), missense (group II), or no mutations (group III). This analysis shows clear upregulation of cancer driver genes affected by missense mutations and downregulation of those affected by nonsense or frameshift mutations (Figures 4C and 4D; Table S7), consistent with previous findings (Hu et al., 2017, Alvarez et al., 2016). We observed reduced expression for tumor suppressors, such as ATRX, BRCA1, NF1, and RB1, and elevated expression of oncogenes, like EGFR and KIT (FDR < 0.1; Figure 4E). We highlight the top 15 genes showing significant expression differences between at least two of the three groups in at least one cancer type (Figures 4F, 4G, and S2). In most cases, the frameshift/nonsense group had significantly lower mRNA than the others, consistent with the hypothesis that they induce nonsense-mediated decay (NMD) (Lindeboom et al., 2016). The exception is GATA3 in breast cancer, where samples with frameshift or nonsense mutations have higher mRNA levels (FDR = 4.54e-18, Welch’s test; Figure 4G), likely because GATA3 frameshift mutations can have gain-of-function, oncogenic effect (Mair et al., 2016). In cases such as CASP8, samples with missense mutations also overexpress the driver gene (FDR < 0.1; Figure 4G).
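The grouping and comparison described above can be sketched as follows. The mutation-class strings follow the common MAF `Variant_Classification` naming, and the rule that a truncating mutation takes precedence when a sample carries several mutation types is an assumption made for illustration; the expression values are invented.

```python
from statistics import mean, variance

TRUNCATING = {"Frame_Shift_Del", "Frame_Shift_Ins", "Nonsense_Mutation"}

def mutation_group(classes):
    """Assign a sample to group I, II, or III for one gene.

    classes: the mutation classes observed in that gene for that sample.
    ASSUMPTION: truncating mutations take precedence over missense ones.
    """
    if TRUNCATING & set(classes):
        return "I"    # frameshift or nonsense
    if "Missense_Mutation" in classes:
        return "II"   # missense
    return "III"      # no qualifying mutation

def welch_t(x, y):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    vx, vy = variance(x) / len(x), variance(y) / len(y)
    t_stat = (mean(x) - mean(y)) / (vx + vy) ** 0.5
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t_stat, df

# Hypothetical log-expression of one gene in group I vs. group III samples.
t_stat, df = welch_t([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
print(f"t = {t_stat:.3f}, df = {df:.1f}")
```

A negative statistic here corresponds to lower expression in the truncating group, the pattern expected under nonsense-mediated decay; the p value would come from the t distribution with the returned degrees of freedom.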
We used Moonlight to identify gene programs that are differentially expressed in each of the two mutated conditions when compared against non-mutated samples (Figure 4H; Method Details). Remarkably, several genes seem to affect different transcriptional programs, depending on the type of mutation affecting them. Following on the GATA3 mutations in BRCA, samples with frameshift/nonsense mutations associate with downregulated genes related to microtubule dynamics or organization of cytoskeleton, an effect not seen in those with missense mutations. Similar effects also happen with CDH1 in BRCA: samples with nonsense and frameshift mutations associate with upregulated genes involved in leukocyte migration but not in samples with missense CDH1 mutations. The tissue of origin seems to also influence the transcriptional effects. For example, lower grade glioma (LGG) samples with any kind of TP53 mutations associate with downregulated expression of leukocyte migration genes, but the expression of these genes remains unaltered in LIHC or BRCA samples with TP53 mutations (Figure 4H). Overall, associations of driver mutations and the transcriptome of the cancer cell seem to be affected by both the original cell type and the type of driver gene mutation.
Driver mutations often affect the expression of interacting genes and genes in the same pathway. We investigated this phenomenon by integrating protein interaction, transcriptomic, and mutation information using OncoIMPACT (Figure 5A). To reveal key deregulated oncogenic processes occurring in each cancer type, we calculated the fraction of patients for which an oncogenic process was associated with a driver mutation (Figure 5B). With few exceptions (e.g., KIRC), general tumorigenic processes, such as cell proliferation, death, signaling, and motility, are frequently deregulated across cancer types. These processes are mostly deregulated by TP53, PTEN, KRAS, and PIK3CA. Processes were more frequently deregulated in some cancers (e.g., head and neck squamous cell carcinoma [HNSC], skin cutaneous melanoma [SKCM], and breast invasive carcinoma [BRCA]). We also observed associations between oncogenic process and cancer types, e.g., Calcium signaling pathway deregulation and uveal melanoma (UVM), with frequent activating mutations in GNA11 and GNAQ that are upstream members of the Calcium signaling pathway (Moore et al., 2016) and frequent deregulation of the Notch signaling pathway in bladder urothelial carcinoma (BLCA) due to inactivating driver mutations in this pathway (Rampias et al., 2014).
We also observed known pairs of significantly mutually exclusive mutated genes such as TP53 and PIK3CA (Kandoth et al., 2013) and KRAS and BRAF (Loes et al., 2016) in cell death and MAPK signaling processes (Figure 5C; permutation test, p value < 1e-5), suggesting that a single driver suffices to perturb these processes and that mutations in multiple drivers are functionally interchangeable in certain contexts. In heterogeneous tumors, this functional redundancy might serve as an important source of drug resistance and metastatic clones.
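A permutation test for mutual exclusivity can be sketched generically: shuffle one gene's mutation labels across samples many times and count how often the shuffled overlap is as small as the observed one. The details of the test used in the study are not given in this section, so the scheme below, including the function name and toy data, is an assumption rather than a reconstruction.

```python
import random

def exclusivity_permutation_p(mut_a, mut_b, n_perm=2000, seed=0):
    """Empirical p value that the overlap of two mutation vectors is this low.

    mut_a, mut_b: equal-length sequences of 0/1 flags, one entry per sample.
    """
    rng = random.Random(seed)
    observed = sum(1 for a, b in zip(mut_a, mut_b) if a and b)
    shuffled = list(mut_b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any link between the two genes
        overlap = sum(1 for a, b in zip(mut_a, shuffled) if a and b)
        if overlap <= observed:
            hits += 1
    # Add-one correction keeps the estimate away from an impossible zero.
    return (hits + 1) / (n_perm + 1)

# Hypothetical cohort: 10 samples mutated in gene A, 10 others in gene B.
gene_a = [1] * 10 + [0] * 10
gene_b = [0] * 10 + [1] * 10
p_perm = exclusivity_permutation_p(gene_a, gene_b)
print(f"empirical p = {p_perm:.4f}")
```

The smallest p value this procedure can report is 1/(n_perm + 1), so resolving a threshold such as 1e-5 needs more than 100,000 permutations.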
Having established the connections between driver events and the transcriptome, we investigated the relationship between driver genes and the methylomic, transcriptomic, and proteomic profiles of tumors (Figure 6A). We used the clustering data from the Cell of origin AWG (Hoadley et al., 2018) to search for cluster combinations enriched in driver events (Figure 6B), identifying 40 genes associated with multiplatform clusters: TP53, KRAS, and PIK3CA mutations were enriched in ten or more multiplatform clusters, and ARID1A, BRAF, CTNNB1, KMT2D, PTEN, and APC mutations were significantly enriched in four or more clusters (Tables S8 and S9).
Interestingly, we found similar multiplatform clusters that differ in their associated genes. One notable case is comprised of LGG and glioblastoma multiforme (GBM) samples, which are predominantly covered by mRNA cluster 1 and RPPA cluster C1 but which differ markedly in their methylome profiles. IDH1-driven LGGs are in methylation cluster 1, where 330 of the 351 samples carried IDH1 mutations, while EGFR-driven LGG and GBM are in methylation cluster 16 (Figure 6C). Another example is that APC- and KRAS-driven COAD/READ tumors are strongly enriched in mRNA cluster 15 and RPPA cluster C8 but separate in methylation clusters 10 and 11. Similar circumstances are observed for PIK3CA-driven BRCA tumors, which are enriched in mRNA and proteome clusters 23 and C6, respectively, but which can belong to methylation clusters 24 or 6 (Table S9).
Notably, we also found instances where specific driver genes differentiate among cluster combinations. For example, UCEC samples belong mostly to multiplatform clusters 4/18/C3 and 23/18/C3, which again differ only in methylation profile (Table S9). The first multi-cluster is enriched in ARID1A, PTEN, CTNNB1, and PIK3CA mutations and has fewer TP53 mutations. The second cluster is conversely dominated by TP53 and PPP2R1A mutations, indicating that differences in driver prevalences can be reflected in the methylation profile (Table S9). While multiplatform clusters are largely driven by tissue of origin (Figure 6D), they may also be affected by the mutations that drive tumor growth.
A third frontier involves interactions between cancer cells and the tumor microenvironment (TME), comprising stromal cells and the immune infiltrate. Results from the Immune Response Working Group (IRWG) (Thorsson et al., 2018) indicate that the TME can be characterized as belonging to one of six immune subtypes, namely wound healing (C1), IFN-γ dominant (C2), inflammatory (C3), lymphocyte depleted (C4), immunologically quiet (C5), and TGF-β dominant (C6) (Tables S8 and S10).
While immune signatures can infer levels of lymphocytic infiltrates in tumors, they provide no information on spatial distribution of the lymphocytes. The Imaging Analysis Working Group exploited high-resolution imaging of hematoxylin and eosin (H&E) to estimate tumor-associated lymphocyte densities and infiltration patterns across all samples from 13 of the 33 TCGA tumor types (Saltz et al., 2018). These data revealed relationships between degree of lymphocytic infiltrates measured by gene expression and feature extraction from imaging data using machine learning. Further correlations were made with cancer molecular subtypes, oncogenic events, and outcome, highlighting the power of the underutilized image resources of the TCGA.
Here, we further study the relationship between specific driver events, composition of the immune infiltrate, and the signaling network among different cell types within distinct immune subtypes. The networks identified for each immune subtype (STAR Methods) might be relevant to identifying synergistic interventions between targeted drugs and immunotherapies.
BRAF-driven tumors have a higher proportion of CD8 T cells than NRAS-driven tumors (ANOVA p < 2e-5 in both cases) (Figure 7A; Table S11) in the C3 immune subtype. Elevated CD8 T cell proportion, considered an important effector of checkpoint inhibition (Ji et al., 2012), correlates with better outcomes. We also identified a signaling loop involving CD8 T cells, CD274 (PD-L1), and PDCD1 (PD-1) (Method Details) in C3, where targeting BRAF and PD-L1 might have synergistic effects. The analysis also reveals an interesting network within the C5 subtype. Samples having mutations in ATRX or TP53 have a higher proportion of macrophages and a lower proportion of CD8 T cells (ANOVA p < 2e-8 in both cases). Interestingly, these macrophages secrete HMGB1, which promotes proliferation and metastasis in glioma (Bassi et al., 2008), a prominent cancer type in C5.
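The ANOVA comparisons above test whether the mean immune-cell fraction differs between groups of tumors defined by driver mutation. A minimal one-way ANOVA sketch is given below; the group values are hypothetical, and only the F statistic and its degrees of freedom are computed, not the p value.

```python
from statistics import mean

def anova_f(groups):
    """One-way ANOVA: returns (F, df_between, df_within)."""
    values = [v for g in groups for v in g]
    grand_mean = mean(values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical CD8 T cell percentages for three driver-defined groups.
f_stat, df_b, df_w = anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F means that variation between the group means dominates variation within groups; the p value follows from the F distribution with the two degrees of freedom returned.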
Driver mutations in KRAS/NRAS/HRAS and BRAF V600 are among the most frequently predicted neoantigens in cancer (Thorsson et al., 2018) and could thus, as presented peptides, be directly steering immune response. Additionally, driver-gene mutations may impact the transcriptional regulation that guides immune response. For example, IDH1-driven gliomas associate with lower levels of STAT1, which can decrease levels of immune infiltrate by ultimately decreasing the secretion of CXCL10, a critical chemokine for T cell trafficking in brain (Kohanbash et al., 2017). Also, models of transcriptional networks (Thorsson et al., 2018) implicate Ras family members and other driver genes in transcriptional control of genes affecting TME composition.
Another way in which somatic mutations interact with the immune system is through neoantigens presented on class I or II major histocompatibility complex (MHC) proteins, which can activate immune cells. This has been studied by various PanCancer Atlas groups, describing splice-creating mutations and fusion events creating immunogenic neoantigens (Jayasinghe et al., 2018, Gao et al., 2018) and neoantigens based on the derived HLA type and their predicted binding affinity (Thorsson et al., 2018).
Using neoantigen predictions and immune infiltrate composition, we investigated associations between numbers of presented neoantigens and relative proportion of immune cells comprising immune subtypes (Table S12). These associations differ by immune subtype (Figure 7B). C2 has the greatest overall immune activity. Here, the CD8 T cell fraction increases with neoantigen load (FDR < 1e-15; Figure 7C), suggesting that CD8 T cells may respond to neoantigen burden. CD4 T cell fraction and neutrophil fraction increase in relation to neoantigen burden in C3, perhaps reflective of the overall balanced immune response and good prognosis of C3 tumors (FDR < 1e-25; Figure 7C). Macrophages have greater infiltration with neoantigen burden in C5, which contains many gliomas and for which TAMs (tumor-associated macrophages) support tumor growth (FDR < 5e-3; Figure 7C).
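One simple way to examine such associations is to compute a rank correlation between neoantigen count and an immune-cell fraction separately within each immune subtype. The section does not state which association measure was used, so the choice of Spearman correlation here is an assumption, as are the function names and the toy records.

```python
from collections import defaultdict
from math import sqrt

def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out

def spearman(x, y):
    """Spearman correlation (Pearson correlation of the ranks).

    Undefined, and raises ZeroDivisionError, if either input is constant.
    """
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

def spearman_by_subtype(records):
    """records: iterable of (subtype, neoantigen_count, cell_fraction)."""
    grouped = defaultdict(lambda: ([], []))
    for subtype, neo, frac in records:
        grouped[subtype][0].append(neo)
        grouped[subtype][1].append(frac)
    return {s: spearman(x, y) for s, (x, y) in grouped.items()}

# Hypothetical records: fraction rises with load in C2 and falls in C4.
records = [
    ("C2", 10, 0.05), ("C2", 50, 0.12), ("C2", 200, 0.20), ("C2", 400, 0.31),
    ("C4", 15, 0.09), ("C4", 80, 0.07), ("C4", 150, 0.04), ("C4", 300, 0.02),
]
rho = spearman_by_subtype(records)
print(rho)
```

Stratifying by subtype before correlating matters because, as the text notes, the direction and strength of the association differ between subtypes and would be blurred in a pooled analysis.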
This study summarizes and expands the findings of the TCGA PanCancer Atlas project investigating oncogenic processes. The germline genome has far-ranging, pathway-dependent influences on the somatic landscape, often promoting somatic mutations. Interactions between driver genes and the transcriptome are context dependent, as is the impact of driver mutations in both cis- and trans-expression. Some oncogenic processes that tend to be deregulated in few cancer types, such as cell adhesion, are more related to specific genes rather than to prominent drivers. Findings also suggest that networks involving driver mutations, cell types, and cytokines might be used as blueprints for combining two or more immunomodulatory therapies (Tian et al., 2017) in selected tumors.
In summary, this work illuminates the complex milieu of oncogenic processes by integrating an enormous corpus of data obtained over the course of TCGA into organized themes. In effect, biomedical science is now graduating from studying the tumor in isolation to assessing it within its larger environmental context. The findings described here suggest drastic changes in clinical practice and drug development. For example, molecular treatments will increasingly be developed with “multi-omics.” This strategy is being used to create small molecule inhibitors for druggable mutations (Drilon et al., 2017), mutation signatures (Davies et al., 2017), gene expression (Li et al., 2017), immunotherapeutic agents (Le et al., 2017), and vaccines (Ott et al., 2017). Bioinformatic systems will help efficiently design optimized treatment plans lurking within large combinatorial spaces with respect to dosage, efficacy, side effects, etc.
As we look to the future, there are many questions. For example, we are only beginning to realize that oncogenic mutations, such as BRAF V600E, frequently occur in healthy people (Martincorena et al., 2015). Could some somatic mutations be tolerated in normal development? If so, how does this impact our understanding of oncogenic mutations? TCGA data come mostly from primary tumors, yet patients usually succumb to metastases; can we find the alterations that drive this process? The next leaps to be taken by the Cancer Moonshot Initiative and Human Tumor Atlas Network (HTAN) will involve pre-cancer, primary, and metastatic tumors associated with treatment sensitivity or resistance and will advance the multidimensional mapping of human cancers over time for informing future cancer research and clinical decision-making.
We thank patients who contributed to this study and the NCI Office of Cancer Genomics and acknowledge NIH grants U54 HG003273, U54 HG003067, U54 HG003079, U24 CA143799, U24 CA143835, U24 CA143840, U24 CA143843, U24 CA143845, U24 CA143848, U24 CA143858, U24 CA143866, U24 CA143867, U24 CA143882, U24 CA143883, U24 CA144025, U24 CA211006, and P30 CA016672.
L.D., G.G., and D.A.W. conceived the project. L.D. supervised the project. M.C.W., A.J.L., E.P.-P., M.H.B., S.S., A.W., K.H., V.T., A.C., D.B., R.J., F.C., L.Y., and L.D. drafted the manuscript. J.M.S., G.B.M., C.M.H., J.C.Z., D.A.W., G.G., and L.D. provided scientific input. M.H.B., M.A.W., and E.P.-P. produced figures. Analysis was performed by M.H.B., E.P.-P., K.H., A.C., C.O., I.C.-C., J.K., C.T., A.W., D.B., C.S., N.N., R.J., F.C., L.Y., K.A.H., R.A., V.T., D.L.G., I.S., B.G.V., and A.J.L. All authors approved submission.
Michael Seiler, Peter G. Smith, Ping Zhu, Silvia Buonamici, and Lihua Yu are employees of H3 Biomedicine, Inc. Parts of this work are the subject of a patent application: WO2017040526 titled “Splice variants associated with neomorphic sf3b1 mutants.” Shouyoung Peng, Anant A. Agrawal, James Palacino, and Teng Teng are employees of H3 Biomedicine, Inc. Andrew D. Cherniack, Ashton C. Berger, and Galen F. Gao receive research support from Bayer Pharmaceuticals. Gordon B. Mills serves on the External Scientific Review Board of Astrazeneca. Anil Sood is on the Scientific Advisory Board for Kiyatec and is a shareholder in BioPath. Jonathan S. Serody receives funding from Merck, Inc. Kyle R. Covington is an employee of Castle Biosciences, Inc. Preethi H. Gunaratne is founder, CSO, and shareholder of NextmiRNA Therapeutics. Christina Yau is a part-time employee/consultant at NantOmics. Franz X. Schaub is an employee and shareholder of SEngine Precision Medicine, Inc. Carla Grandori is an employee, founder, and shareholder of SEngine Precision Medicine, Inc. Robert N. Eisenman is a member of the Scientific Advisory Boards and shareholder of Shenogen Pharma and Kronos Bio. Daniel J. Weisenberger is a consultant for Zymo Research Corporation. Joshua M. Stuart is the founder of Five3 Genomics and shareholder of NantOmics. Marc T. Goodman receives research support from Merck, Inc. Andrew J. Gentles is a consultant for Cibermed. Charles M. Perou is an equity stock holder, consultant, and Board of Directors member of BioClassifier and GeneCentric Diagnostics and is also listed as an inventor on patent applications on the Breast PAM50 and Lung Cancer Subtyping assays. Matthew Meyerson receives research support from Bayer Pharmaceuticals; is an equity holder in, consultant for, and Scientific Advisory Board chair for OrigiMed; and is an inventor of a patent for EGFR mutation diagnosis in lung cancer, licensed to LabCorp. 
Eduard Porta-Pardo is an inventor of a patent for domainXplorer. Han Liang is a shareholder and scientific advisor of Precision Scientific and Eagle Nebula. Da Yang is an inventor on a pending patent application describing the use of antisense oligonucleotides against specific lncRNA sequence as diagnostic and therapeutic tools. Yonghong Xiao was an employee and shareholder of TESARO, Inc. Bin Feng is an employee and shareholder of TESARO, Inc. Carter Van Waes received research funding for the study of IAP inhibitor ASTX660 through a Cooperative Agreement between NIDCD, NIH, and Astex Pharmaceuticals. Raunaq Malhotra is an employee and shareholder of Seven Bridges, Inc. Peter W. Laird serves on the Scientific Advisory Board for AnchorDx. Joel Tepper is a consultant at EMD Serono. Kenneth Wang serves on the Advisory Board for Boston Scientific, Microtech, and Olympus. Andrea Califano is a founder, shareholder, and advisory board member of DarwinHealth, Inc. and a shareholder and advisory board member of Tempus, Inc. Toni K. Choueiri serves as needed on advisory boards for Bristol-Myers Squibb, Merck, and Roche. Lawrence Kwong receives research support from Array BioPharma. Sharon E. Plon is a member of the Scientific Advisory Board for Baylor Genetics Laboratory. Beth Y. Karlan serves on the Advisory Board of Invitae.