Technologies for neoantigen discovery are critical for the development of personalized cancer therapies and neoantigen-based biomarkers. Precision neoantigen discovery entails comprehensive detection of tumor-specific genomic variants and accurate prediction of MHC presentation of epitopes originating from such variants. ImmunoID NeXT™ enables a comprehensive survey of putative neoantigens by combining highly sensitive exome-scale DNA and RNA sequencing with the NeoantigenID™ analytics engine.
The NeoantigenID analytics engine flows seamlessly from biological samples to neoantigen prediction (Figure 4), as follows:
- First, the sequencing data from the NeXT Exome and Transcriptome are run through the NeXT DNA and RNA pipeline in order to identify tumor-specific small variants and fusions.
- Highly accurate germline in silico HLA typing is performed, followed by somatic mutation and allele-specific loss of heterozygosity (LOH) detection in the HLA genes.
- Tumor-specific small variants and fusions are combined with the patient HLA types and gene expression information to predict neoantigens using our internally-developed Systematic HLA Epitope Ranking Pan Algorithm (SHERPA™).
- SHERPA has been integrated into Personalis’ NeoantigenID analytics engine along with additional secondary metrics that enable further prioritization of the predicted putative neoantigen candidates.
- Accurate neoantigen prediction with SHERPA enables the determination of candidate neoantigens for rapid development of personalized cancer therapies, as well as facilitating the generation of neoantigen burden-based composite biomarkers such as the Personalis NEOantigen Presentation Score (NEOPS) that can potentially better predict response to immunotherapies.
Figure 4: NeoantigenID Analytics Engine
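One step implied by the workflow above is turning a tumor-specific variant into candidate peptides that can then be scored against the patient's HLA alleles. The following is a minimal sketch of that enumeration for a single missense change, not the NeoantigenID implementation. The function name `mutant_peptides`, the 8 to 11 residue length range, and the toy protein sequence are assumptions made for illustration; scoring, expression filtering, and fusion handling are not shown.

```python
from typing import List, Tuple


def mutant_peptides(protein: str, pos: int, alt: str,
                    lengths: Tuple[int, ...] = (8, 9, 10, 11)) -> List[str]:
    """Enumerate peptides that span a single amino-acid substitution.

    protein: wild-type protein sequence (one-letter codes)
    pos:     0-based index of the substituted residue
    alt:     the amino acid observed in the tumor
    lengths: peptide lengths to generate (assumed range, typical for MHC class I)
    """
    if not 0 <= pos < len(protein):
        raise ValueError("pos is outside the protein sequence")
    mutated = protein[:pos] + alt + protein[pos + 1:]
    peptides = []
    for k in lengths:
        # Only keep windows of length k that cover the mutated position.
        first = max(0, pos - k + 1)
        last = min(pos, len(mutated) - k)
        for start in range(first, last + 1):
            peptides.append(mutated[start:start + k])
    return peptides


# Toy example: a 20-residue placeholder sequence with a substitution at index 10.
toy_protein = "ACDEFGHIKLMNPQRSTVWY"
candidates = mutant_peptides(toy_protein, pos=10, alt="A", lengths=(8, 9))
print(len(candidates), candidates[:3])
```

Each candidate would then be paired with every patient HLA allele and passed to a presentation predictor; that pairing is where HLA typing accuracy directly affects the result.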
The NeoantigenID reporting output consists of:
- A list of all putative neoantigens and their comprehensive characterization
- In silico HLA genotyping, as well as somatic alterations impacting HLA genes including SNVs, indels, and HLA LOH
- A NeoantigenID Summary Report containing NEOPS, neoantigen burden, and tumor mutational burden (TMB).
SHERPA™: Systematic HLA Epitope Ranking Pan Algorithm
HLA binding is currently the most well-established criterion for ranking neoantigen candidates. Recent advances in training data generated from mass spectrometry provide a larger dataset of peptide binders and non-binders for individual HLA alleles. These newer data also reflect two additional components, cleavage and transport, which are critical for assessing presentation.
We leveraged this advancement by developing our Systematic HLA Epitope Ranking Pan Algorithm (SHERPA), our pan-predictive machine learning model for predicting MHC class I presentation.
SHERPA utilizes proprietary, high-quality immunopeptidomics data, publicly available and curated mono- and multi-allelic data, as well as binding affinity data as a training set (Figure 5). Publicly available multi-allelic data from several tissue types were systematically reprocessed and deconvoluted to capture the diverse facets of antigen processing and presentation. The integration of different training data types resulted in decreased bias, increased generalizability, and improved performance of SHERPA.
Figure 5: SHERPA: Neoantigen Machine Learning Algorithm using Proprietary Engineered Cell lines & Mass Spec Data
Multiple modeling strategies were combined to accurately predict neoantigens for all known alleles. The SHERPA-Binding algorithm uses both the peptide and binding pocket information to predict a binding rank. The SHERPA-Presentation algorithm incorporates additional, critical features such as expression level of the source protein, proteasomal cleavage, and gene propensity to predict a more comprehensive presentation rank (Figure 6).
Figure 6: SHERPA-Binding and SHERPA-Presentation Prediction Models
The performance of SHERPA was evaluated on a held-out set of ~10% of the mono-allelic data, mixed with negative examples at a 1:999 ratio (a commonly assumed prevalence). The precision-recall curves show that the SHERPA models achieve consistently higher precision at all recall values than publicly available prediction algorithms (Figure 7A). Both SHERPA models also have a higher positive predictive value (PPV) than publicly available prediction tools (Figure 7B). SHERPA-Presentation has a higher PPV than SHERPA-Binding, attesting to the utility of presentation-specific features. The boxplots in Figure 7B show the distributions of PPV among the top 0.1% of predictions across alleles in the held-out mono-allelic immunopeptidomics test data, comparing SHERPA with other publicly available models.
Figure 7A and B: SHERPA Enables Superior Neoantigen Presentation Prediction
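The PPV metric described above can be sketched as follows. This is an illustrative reading of the evaluation, assuming that "PPV (top 0.1%)" means the fraction of true presented peptides among the highest-scoring 0.1% of all peptides for one allele; the exact protocol is not given in the text. The function name `ppv_at_top_fraction` and the synthetic scores are assumptions.

```python
def ppv_at_top_fraction(scores, labels, fraction=0.001):
    """Fraction of true positives among the top-scoring `fraction` of peptides.

    scores: predicted presentation scores (higher means more likely presented)
    labels: 1 for an observed (positive) peptide, 0 for a negative example
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    n_top = max(1, round(fraction * len(scores)))
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    return sum(label for _, label in ranked[:n_top]) / n_top


# Synthetic example at the 1:999 ratio: 10 positives among 10,000 peptides.
labels = [1] * 10 + [0] * 9990
scores = [1.0] * 7 + [0.0] * 3 + [0.9] * 3 + [0.1] * 9987
print(ppv_at_top_fraction(scores, labels))  # 7 of the top 10 are positives
```

With a 1:999 ratio, the top 0.1% contains exactly as many peptides as there are positives, so a perfect predictor scores 1.0 and a random one scores about 0.001.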
In addition to its status as an emerging biomarker of interest in the era of cancer immunotherapy, HLA genotyping is also an essential component of the neoantigen prediction process. Personalis’ HLA typing tool, HLA-Map, has been integrated into the NeoantigenID analytics engine, enabling highly accurate in silico typing of all HLA Class I and Class II loci, which is critical for ensuring the precision of downstream peptide-MHC binding predictions.
To confirm the accuracy of HLA-Map, we performed a comprehensive analytical validation study on a total of 15 proficiency testing samples with known but blinded HLA genotype profiles. Ten of these samples were sourced from the American Society of Histocompatibility and Immunogenetics (ASHI) and five additional samples were obtained from the College of American Pathologists (CAP). Each sample had previously been independently genotyped via various orthogonal clinical tests, and our results were compared against those reference genotypes. As the table below demonstrates, HLA-Map performed exceptionally well in accurately genotyping not only the HLA Class I loci, but also the more challenging HLA Class II loci.
Table 1: HLA-Map’s HLA genotyping performance for both HLA Class I and Class II loci.
| HLA Loci | Number of Alleles | Number of Correct Calls | HLAHM Concordance |
|---|---|---|---|
| All Class I | 90 | 90 | 100.0% |
| All Class II | 180 | 177 | 98.3% |
| All Class I + Class II | 270 | 267 | 98.9% |
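As a quick arithmetic check, the concordance figures in Table 1 follow directly from the allele and correct-call counts. The short sketch below recomputes them; the helper name `concordance_pct` is an assumption, and the counts are copied from the table.

```python
# (number of alleles, number of correct calls), copied from Table 1
results = {
    "All Class I": (90, 90),
    "All Class II": (180, 177),
}


def concordance_pct(n_alleles, n_correct):
    """Percentage of allele calls that match the reference genotype."""
    return 100.0 * n_correct / n_alleles


# The combined row is the sum of the two class-level rows.
total_alleles = sum(n for n, _ in results.values())
total_correct = sum(c for _, c in results.values())
results["All Class I + Class II"] = (total_alleles, total_correct)

for label, (n, c) in results.items():
    print(f"{label}: {c}/{n} = {concordance_pct(n, c):.1f}%")
```

The three discordant calls all fall in Class II, which is why the combined concordance (267 of 270) sits between the two class-level values.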
DASH: Deletion of Allele-Specific HLAs
The success of immune checkpoint blockade (ICB) has revolutionized cancer treatment. However, the fact that the majority of cancer patients do not respond favorably to such immunotherapies has resulted in an explosion in the breadth of research efforts to identify new biomarkers of response and/or resistance to this new class of cancer therapeutics.
Given that the mechanism of action of these therapies is contingent on the dynamic interplay between the tumor and the host’s immune system, the role of the antigen processing machinery (APM) in ensuring that tumor-specific neoantigens are successfully presented to the adaptive immune cells has garnered increasing attention in the search for more effective biomarkers. More specifically, loss of heterozygosity (LOH) impacting the HLA Class I genes has emerged as a means by which solid tumors can evade immunosurveillance by reducing the repertoire of neoantigens that can be presented to the immune system, and this phenomenon is now recognized as a key resistance mechanism to ICB (McGranahan et al., 2017; Tran et al., 2016).
In line with our goal to provide our partners with the most comprehensive cancer immunogenomics platform, Personalis has endeavored to enable the accurate assessment of HLA LOH with NeXT. Through DASH (Deletion of Allele-Specific HLAs), we have created a machine-learning-based tool to capture the unique features associated with each individual HLA Class I region which, when combined with the ACE-augmented sequencing data generated by the NeXT assays, enables us to accurately assess HLA LOH using a novel NGS-based approach.
To validate performance, we assessed the limit of detection (LOD) of DASH using three tumor-normal cell line pairs with HLA LOH in at least one locus. We sub-sampled the tumor sequencing data and mixed it with complementary normal sequencing data to achieve simulated purity levels. Next, we mixed the HLA-mapping reads across a range of ratios to simulate the potential spectrum of tumor purities and sub-clonalities. Both DASH and the existing LOHHLA algorithm have nearly perfect specificity (>99%, data not shown) across tumor purities and sub-clonalities.
For fully clonal HLA LOH events, consistent sensitivity is achieved with >25% tumor purity for both algorithms. However, DASH has significantly higher sensitivity to detect sub-clonal events than LOHHLA (Figure 8).
Figure 8: DASH has significantly better sensitivity than LOHHLA for sub-clonal HLA LOH events.
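The in silico mixing used for the limit-of-detection analysis can be illustrated with a small sketch. This is not the validation pipeline itself. It assumes the tumor cell line data are effectively 100% pure, represents reads by identifiers only, and uses a hypothetical function name `mix_reads` and a fixed random seed for reproducibility.

```python
import random


def mix_reads(tumor_reads, normal_reads, purity, total, seed=0):
    """Sub-sample tumor and normal reads to simulate a target tumor purity.

    purity: desired fraction of reads drawn from the tumor sample (0 to 1)
    total:  number of reads in the simulated mixture
    """
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must be between 0 and 1")
    n_tumor = round(purity * total)
    n_normal = total - n_tumor
    if n_tumor > len(tumor_reads) or n_normal > len(normal_reads):
        raise ValueError("not enough reads to reach the requested depth")
    rng = random.Random(seed)
    # Sample without replacement from each source, then interleave.
    mixed = rng.sample(tumor_reads, n_tumor) + rng.sample(normal_reads, n_normal)
    rng.shuffle(mixed)
    return mixed


# Placeholder read identifiers standing in for HLA-mapping reads.
tumor = [f"T{i}" for i in range(1000)]
normal = [f"N{i}" for i in range(1000)]
mixture = mix_reads(tumor, normal, purity=0.25, total=400)
print(sum(read.startswith("T") for read in mixture), len(mixture))
```

Repeating the same mixing over a grid of ratios gives a series of simulated samples on which sensitivity can be measured at each purity or sub-clonal fraction.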
Additional validation studies utilizing several novel, orthogonal methods have been completed and the results of these studies can be found here.
We use machine learning and neural networks to provide you with advanced analytics to better inform your discovery and translational research programs.
ImmunogenomicsID guides the investigation of critical immuno-oncology genes with information including expression, variant effect impact, and DNA/RNA allelic fractions. In contrast to targeted therapies, it is generally agreed that a single predictive biomarker in tumor biopsies is unlikely to be found for determining response to immunotherapies. Thus, multidimensional biomarker analysis is needed to accurately assess patient response. The Personalis Immunogenomics Engine makes it possible to look across critical areas to characterize tumor biology for focused analysis.
Rapidly evaluate the tumor biology of a sample in key areas including:
- Antigen Processing Machinery (APM)
Translational research empowers a better understanding of the pathways that tumor cells use to evade immunosurveillance. Detecting critical mutations in genes such as HLA and B2M is important for understanding potential mechanisms of acquired resistance to immunotherapies.
- Repair and Replication
Microsatellite instability-high (MSI-H) status and DNA mismatch repair deficiency have emerged as promising predictive biomarkers of response or non-response to ICB. In 2017, the FDA approved the use of pembrolizumab for any MSI-H, advanced-stage solid tumor in what was the first tumor site-agnostic, biomarker-based FDA approval for a cancer drug.
- Checkpoint Modulation
The activation of T-cells is regulated by both stimulatory (e.g. OX40, 4-1BB, etc.) and inhibitory (e.g. PD-L1, IDO1, etc.) signals. Evaluating the tumor's checkpoint ligand expression is key to understanding the likely mechanisms of tumor escape. ImmunogenomicsID provides insights into each of these ligands and their respective pathways, providing comprehensive gene and expression information for each relevant gene.
Since roughly 12% of all cancers are associated with an oncogenic viral infection, any comprehensive immunogenomics platform must be able to detect oncoviral genomes. Detection is supported for seven of the most common oncoviruses, along with their associated genotypes and/or subtypes, in both DNA and RNA derived from tumor samples: HPV, HBV, HCV, EBV, KSHV, MCV, and HTLV.
Shared antigens or TAAs are common to specific tumor lineages and can be used as targets for adoptive cell therapies and non-personalized cancer vaccines. ImmunogenomicsID includes data on critical genes associated with shared antigens including PRAME, MAGE, SSX2, MUC1, etc.
- Adaptive and Innate Immune Response
Mounting an effective immune response following the administration of immunotherapies relies on the coordinated impact of not only the adaptive immune system, but also the innate immune system. ImmunogenomicsID reports on genes associated with both types of immune response to better understand the activity profile of the immune infiltrate through DNA and RNA data on genes such as AIF1, IL2, IRF1, and VCAM1.
- Cytokines and Chemokines
ImmunogenomicsID reports on interleukins and chemokines to further elucidate the tumor microenvironment. Chemokines such as CXCL10, CXCL9, and CXCL11 stimulate cytocidal activity in the immune infiltrate. There is also a strong association between cytotoxic activity and the expression of genes involved in attracting T-helper cells to the tumor site, including interleukins and CXCL1, CXCL9, CXCL10, CXCL11, and CXCR3, among others.
Cytotoxic factors such as granzymes (GZMs), perforin 1 (PRF1) and granulysin (GNLY) are released by cells of the immune system (e.g. NK cells or cytotoxic T-cells) and are essential for their cytotoxic activity against cancer cells. Evaluating the expression of these factors can help to determine the degree of cytotoxic activity within a tumor. ImmunogenomicsID provides information on genes such as GNLY, GZMA, GZMB, and PRF1.
InfiltrateID utilizes the single-sample gene set enrichment analysis (ssGSEA) approach to compute transcriptome-based enrichment scores for eight distinct immune cell types (Table 2) from a single tumor sample, quantifying the abundance of those populations within the TME of that sample. For this purpose, we created in-house proprietary cell type-specific signatures for the eight distinct immune cell types using NeXT Transcriptome gene expression data derived from purified immune cell populations. Each signature consists of genes curated based on a strict selection criterion, requiring consistent expression and low expression variability in each of the eight immune cell types.
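To make the ssGSEA idea concrete, the following is a simplified sketch of a single-sample enrichment score in the style of the published method, not the InfiltrateID implementation. Genes are ranked by expression within one sample, signature genes contribute a rank-weighted step, and the score is the summed difference between the two running distributions. The function name `ssgsea_score`, the weighting exponent `alpha=0.25` (a commonly used default, assumed here), the placeholder gene names, and the absence of any score normalization are all assumptions.

```python
def ssgsea_score(expression, gene_set, alpha=0.25):
    """Simplified single-sample enrichment score for one gene signature.

    expression: dict mapping gene name to expression value in ONE sample
    gene_set:   set of signature gene names
    alpha:      exponent applied to gene ranks when weighting signature hits
    """
    # Rank genes from highest to lowest expression within the sample.
    genes = sorted(expression, key=expression.get, reverse=True)
    n = len(genes)
    members = [gene in gene_set for gene in genes]
    n_hit = sum(members)
    if n_hit == 0 or n_hit == n:
        raise ValueError("signature must cover some, but not all, genes")

    # The top-ranked gene gets rank n, the bottom-ranked gene gets rank 1.
    weights = [(n - i) ** alpha if hit else 0.0 for i, hit in enumerate(members)]
    total_weight = sum(weights)
    miss_step = 1.0 / (n - n_hit)

    hit_cdf = miss_cdf = score = 0.0
    for weight, hit in zip(weights, members):
        if hit:
            hit_cdf += weight / total_weight
        else:
            miss_cdf += miss_step
        # Accumulate the gap between the two running distributions.
        score += hit_cdf - miss_cdf
    return score


# Placeholder genes G0..G9 with decreasing expression in a single sample.
sample = {f"G{i}": 10.0 - i for i in range(10)}
print(ssgsea_score(sample, {"G0", "G1"}))   # signature among the top genes
print(ssgsea_score(sample, {"G8", "G9"}))   # signature among the bottom genes
```

A signature whose genes sit near the top of the ranking yields a positive score and one near the bottom yields a negative score, which is what lets a per-cell-type signature act as a proxy for that population's abundance in the sample.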
Table 2: Immune Cell Populations Profiled by InfiltrateID and Their Respective Roles and Relevance in Cancer
| Immune Cell Type | Roles and Relevance in Cancer |
|---|---|
| B-cells | B-cells are the primary drivers of the humoral immune response, generating B-cell receptors (BCRs) and antibodies which enable the host to mount an immune response against a wide range of antigens/pathogens (Sharanov et al., 2020). |
| Dendritic cells – Conventional (cDCs) | In the context of the TME, cDCs can present antigens derived from tumor cells to T-cell receptors (TCRs) to stimulate an adaptive immune response. Presence of cDCs in the TME is typically associated with better prognosis (Böttcher et al., 2018). |
| Dendritic cells – Plasmacytoid (pDCs) | pDCs are best known for their regulatory effects, including the rapid and large production of type I interferons. Functional impairment of pDCs has been implicated in creating an immunosuppressive TME in cancers (Koucky et al., 2019). |
| Macrophages | Circulating monocytes are recruited to the TME by chemotaxis and a subset of them can differentiate into tumor-associated macrophages (TAMs). TAMs play a prominent role in the formation of an immunosuppressive TME by producing chemokines and cytokines (Lin et al., 2019). |
| NK cells | NK cells are effector immune cells that are capable of direct cell-killing. The elevated presence of NK cells in the TME of solid tumors is generally considered an indication of good prognosis (Habif et al., 2019). |
| CD8+ T-cells | The pre-existence of elevated numbers of tumor-infiltrating lymphocytes (TILs) (specifically, CD8+ T-cells) has been associated with improved prognosis and has also been correlated with beneficial response to ICM therapy in many solid tumor types including melanoma, colorectal, and triple-negative breast cancer (Maimela et al., 2019). |
| CD4+ T-cells | CD4+ T-cells play a significant role in mediating the immune response via the secretion of specific cytokines and subsequent activation and expansion of CD8+ T-cells and antibody production by B-cells (Lai et al., 2011). |
| Regulatory T-cells (Tregs) | Tregs are a subset of CD4+ T-cells known for their immunosuppressive influence, mediated by mechanisms including production of immunosuppressive cytokines such as IL-10 and TGFβ and the conversion of ATP into adenosine (Li et al., 2020). Unsurprisingly, tumor infiltration of Tregs is associated with poor prognosis in many cancer types including melanoma, NSCLC, gastric, and ovarian (Kim et al., 2020). |