Erin Newburn, MS, PhD
Senior Manager, Field Applications Scientist

Sean Michael Boyle, MS, PhD
Manager, Bioinformatics Applications

Overcoming the obstacles of neoantigen identification

The promise of neoantigen-based cancer vaccines

Neoantigen-based cancer vaccines represent a new, truly personalized therapeutic approach to treating cancer. Developing a personalized vaccine relies heavily on accurate, comprehensive, and rapid characterization of the immunogenomics of a patient’s tumor, which enables identification of a target set of neoantigens for vaccine synthesis. The quality of neoantigen identification ultimately depends on every step working robustly, from sample to sequencing to neoantigen analysis.

Here we discuss key challenges in this process and the technology we have developed for addressing them.

Obtaining broad tumor immunogenomic sequencing data

Typical DNA-based diagnostic cancer panels cover 50 to 500 genes and are insufficient for identifying putative neoantigens, because neoantigens can result from mutations in any gene. Exome-scale sequencing at significant depth (≥200X) is needed to identify neoantigens in heterogeneous tumor samples. Furthermore, neoantigen prediction relies on sequencing DNA from both tumor and matched normal samples in order to eliminate germline variants, ensuring that the neoantigens detected are tumor specific. Finally, DNA sequencing alone is not sufficient. Transcriptome-scale RNA sequencing is a critical component of comprehensive tumor immunogenomic characterization: it identifies which neoantigens are actually expressed, and it reveals potential immune escape mechanisms through immune signatures, HLA status, and other biomarkers. To address these needs, our ACE ImmunoID platform combines deep augmented exome sequencing (tumor and normal) with transcriptome sequencing to enable neoantigen identification and broad tumor immunogenomic characterization.
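The tumor/normal comparison described above can be illustrated with a minimal sketch. It is not the ACE ImmunoID implementation: the `Variant` record, the `somatic_candidates` function, and the choice to apply the ≥200X figure as a per-site cutoff are all assumptions made for illustration. Production somatic callers use statistical models rather than a plain set difference.

```python
from typing import List, NamedTuple


class Variant(NamedTuple):
    """Hypothetical minimal variant record (not a real VCF parser)."""
    chrom: str
    pos: int
    ref: str
    alt: str
    depth: int  # read depth at the site


# Assumption: the >=200X figure from the text is applied per site here.
MIN_DEPTH = 200


def somatic_candidates(tumor: List[Variant], normal: List[Variant],
                       min_depth: int = MIN_DEPTH) -> List[Variant]:
    """Keep tumor variants that are absent from the matched normal
    sample and that are covered deeply enough in the tumor."""
    germline = {(v.chrom, v.pos, v.ref, v.alt) for v in normal}
    return [
        v for v in tumor
        if (v.chrom, v.pos, v.ref, v.alt) not in germline
        and v.depth >= min_depth
    ]


tumor = [
    Variant("chr1", 100, "A", "T", 250),  # tumor-only and deep: kept
    Variant("chr2", 200, "G", "C", 300),  # also in normal: germline, dropped
    Variant("chr3", 300, "C", "A", 80),   # tumor-only but shallow: dropped
]
normal = [Variant("chr2", 200, "G", "C", 60)]

result = somatic_candidates(tumor, normal)
print(result)
```

Only the first tumor variant survives: the second is shared with the normal sample, and the third falls below the depth cutoff.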

Filling NGS gaps to achieve clinical-grade sequencing

Identification of neoantigens for personalized vaccine use requires comprehensive next-generation sequencing (NGS) coverage of genes. Unfortunately, conventional whole exome sequencing (WES) assays can leave genomic regions poorly covered because of systematic biases, for example in regions of high GC content. Potentially immunogenic variants that fall in these coverage gaps can be missed. To address this and achieve clinical-grade gene coverage, our augmented sequencing technology, ACE, optimizes chemistry and capture to fill in sequencing gaps and minimize the chance of missed neoantigen targets. We have shown that this approach results in superior sensitivity for variant detection and high levels of gene finishing (Patwardhan, A et al. Genome Medicine 2015 7:71; Ashley, E et al. Nature Reviews Genetics 2016 17:507).
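As a rough illustration of why GC content matters, the sketch below scans a sequence in sliding windows and flags the GC-rich ones, which are the kind of regions where conventional capture tends to under-perform. The function names, window size, step, and the 0.65 threshold are hypothetical values chosen for the example; they are not parameters of the ACE technology.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a sequence (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def high_gc_windows(seq: str, window: int = 100, step: int = 50,
                    threshold: float = 0.65):
    """Return (start, end, gc) for each window whose GC fraction meets
    the threshold. All default values are illustrative assumptions."""
    hits = []
    last_start = max(len(seq) - window, 0)
    for start in range(0, last_start + 1, step):
        chunk = seq[start:start + window]
        gc = gc_fraction(chunk)
        if gc >= threshold:
            hits.append((start, start + len(chunk), gc))
    return hits


# Toy 200 bp sequence: an AT-rich half followed by a GC-rich half.
toy = "AT" * 50 + "GC" * 50
flagged = high_gc_windows(toy)
print(flagged)
```

In this toy sequence only the final window, which lies entirely in the GC-rich half, is flagged.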

Deriving high quality DNA/RNA from low sample amounts and FFPE

Extracting sufficient quantities of high quality DNA and RNA from challenging tumor sample types such as formalin-fixed and paraffin-embedded (FFPE) tissue is a well-documented challenge. Fixation conditions can result in varying degrees of nucleic acid degradation, ultimately impacting sequencing data quality and neoantigen identification. Our DNA/RNA dual extraction protocols have been developed to allow sufficient quantities of nucleic acid to be extracted from limited FFPE samples, enabling reliable exome/transcriptome sequencing from as little as 50 ng of material.

Predicting putative neoantigens from tumor immunogenomic data

Multiple key challenges need to be addressed in going from sequencing data to predicted, ranked neoantigens:

HLA typing

Identifying a patient’s human leukocyte antigen (HLA) haplotype is an important input into predicting which neoepitopes are likely to be immunogenic for a particular patient’s tumor. However, the complexity and diversity of the HLA genes can make accurate typing challenging.

Determining the correct peptides

Somatic mutations, including single nucleotide variants (SNVs), indels, and fusions, each produce different protein products. While SNVs result in single amino acid changes, indels and fusions can create multiple frameshift protein products through alternative splicing. Because genes also have many transcripts, a single somatic mutation can result in numerous protein products. Collectively, determining the correct peptides relies on inclusion of frameshift events, proper transcript selection, application of variant phasing, and consideration of variant expression.
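For the simplest case, a single amino acid substitution from an SNV, candidate peptides are every window of the mutant protein that contains the changed residue. The sketch below enumerates them. The function name and the default lengths of 8 to 11 residues (a range commonly used for MHC Class I) are assumptions for illustration; frameshifts, fusions, transcript selection, and phasing are deliberately out of scope.

```python
def mutant_peptides(protein: str, pos: int, alt_aa: str,
                    lengths=(8, 9, 10, 11)):
    """Enumerate peptide windows that cover a single substitution.

    protein: wild-type amino acid sequence
    pos:     0-based index of the substituted residue
    alt_aa:  the mutant amino acid
    lengths: peptide lengths to generate (assumed Class I range)
    """
    if not 0 <= pos < len(protein):
        raise ValueError("pos is outside the protein sequence")
    mutant = protein[:pos] + alt_aa + protein[pos + 1:]
    peptides = set()
    for k in lengths:
        first = max(0, pos - k + 1)       # leftmost window containing pos
        last = min(pos, len(mutant) - k)  # rightmost window that still fits
        for start in range(first, last + 1):
            peptides.add(mutant[start:start + k])
    return sorted(peptides)


# Toy 20-residue protein with M -> A at index 10, 9-mers only.
toy_protein = "ACDEFGHIKLMNPQRSTVWY"
nine_mers = mutant_peptides(toy_protein, 10, "A", lengths=(9,))
print(len(nine_mers), nine_mers[0])
```

Away from the protein ends, a substitution yields k distinct windows for each length k, so a single SNV already produces dozens of candidate peptides across lengths 8 to 11, before multiple transcripts are considered.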

MHC binding prediction

Each distinct peptide sequence has the potential to be processed by the proteasome, transported for MHC loading, bound for presentation, and ultimately recognized by the immune system. Current machine learning approaches focused on predicting neoepitope binding to MHC Class I and II are hampered by limited training data for most HLA alleles. These algorithms also tend to focus narrowly on epitope binding prediction, with relatively little emphasis on the impact of the upstream antigen presenting machinery. This can result in low sensitivity and specificity of the predictions for many HLA alleles.

Ranking neoantigens

In a tumor, 10s to 1,000s of non-synonymous mutations can result in 100s to 10,000s of potential neoepitopes. Reducing these to a smaller set of candidate neoantigens requires taking into account factors such as predicted MHC binding affinity, variant allele frequency, gene expression, self-similarity, and immunogenicity.
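One simple way to combine such factors is a single composite score used as a sort key. The sketch below is purely illustrative: the `Candidate` fields, the multiplicative formula, and the log-scaled binding term are assumptions made for this example and do not describe the ranking used in our pipeline.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical per-peptide features; field names are illustrative."""
    peptide: str
    ic50_nm: float          # predicted binding affinity; lower is stronger
    vaf: float              # variant allele frequency, 0..1
    tpm: float              # expression of the source gene
    self_similarity: float  # 0..1; higher means closer to a self peptide


def composite_score(c: Candidate) -> float:
    """Illustrative score: higher is better. Not a validated model."""
    # Map IC50 onto 0..1 on a log scale (1 nM -> 1.0, 50,000 nM -> 0.0).
    ic50 = min(max(c.ic50_nm, 1.0), 50000.0)
    binding = 1.0 - math.log(ic50, 50000)
    expression = math.log1p(c.tpm)  # unexpressed genes score zero
    return binding * c.vaf * expression * (1.0 - c.self_similarity)


def rank(candidates):
    """Sort candidates from highest to lowest composite score."""
    return sorted(candidates, key=composite_score, reverse=True)


candidates = [
    Candidate("PEPTIDEB", ic50_nm=5000, vaf=0.4, tpm=30, self_similarity=0.1),
    Candidate("PEPTIDEC", ic50_nm=50, vaf=0.4, tpm=0, self_similarity=0.1),
    Candidate("PEPTIDEA", ic50_nm=50, vaf=0.4, tpm=30, self_similarity=0.1),
]
ranked = rank(candidates)
print([c.peptide for c in ranked])
```

A multiplicative form means a candidate that fails on any one factor, such as a peptide from an unexpressed gene, drops to the bottom regardless of how well it binds. A weighted sum would behave differently, and the right choice depends on validation data.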

To address the challenges above, we developed a pipeline for neoantigen prediction (Figure 1).  This pipeline determines HLA type and predicts neoantigens using DNA and RNA sequencing data derived from our ACE ImmunoID platform. The pipeline identifies and annotates putative neoantigens based on peptide processing, MHC binding prediction, similarity to self and known antigens, and immunogenicity.

Figure 1. Neoantigen discovery process

Other features of our neoantigen discovery pipeline include Class II MHC binding predictions, detecting indel and fusion-derived peptides, and determining the phasing of neoantigens. We are also developing advanced machine learning tools and training data that can increase the sensitivity and specificity of the neoantigen predictions.

To evaluate pipeline performance, we began with a proof-of-principle experiment using established immunogenic peptides (taken from Cancer Immunity). Briefly, peptides were reverse engineered to generate the derived variants and in silico spiked into cancer samples. After analysis, 22 out of 23 peptides were accurately detected by our pipeline, resulting in 96% sensitivity for detecting known immunogenic peptides. Additional validation studies are ongoing to address specificity and expand the validation set.
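The sensitivity figure follows directly from the counts reported above. The short check below reproduces the arithmetic; the `sensitivity` helper is a hypothetical name introduced for the example.

```python
def sensitivity(detected: int, total_known: int) -> float:
    """Fraction of known positives that were detected."""
    if total_known <= 0:
        raise ValueError("total_known must be positive")
    return detected / total_known


value = sensitivity(22, 23)
print(f"{value:.1%}")  # 95.7%, which rounds to the 96% quoted in the text
```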

Identifying tumor escape mechanisms that may modulate vaccine response

A critical part of rational, personalized vaccine development involves taking into account immune modulators of response.  Tumors have many mechanisms for immune evasion and for modulating immune response through perturbation of the antigen presenting machinery, DNA repair and replication, tumor microenvironment, inflammation pathways, and checkpoint inhibitors/activators. Our ACE ImmunoProfile analytics enables these factors to be assessed to help drive rational vaccine design.

Validating assay and analytics to enable therapeutic use

In order to use these technologies as part of a vaccine development process, the assays and analytics must be rigorously validated. Sensitivity, specificity, and limits of detection for variants need to be established, especially when dealing with heterogeneous tumor samples. To achieve this, the ACE ImmunoID platform has been analytically validated with best-in-class performance for SNV, indel, and fusion detection, which are especially critical for neoantigen identification.

Achieving rapid turnaround time, faster vaccine development

Because a personalized vaccine must be turned around for a patient within a 1 to 2 month window, every step of the process needs to be as efficient and timely as possible. To achieve this, we have developed laboratory automation processes and workflow management systems that enable delivery from sample to report within 14 days, in a CLIA-validated process.


The ACE ImmunoID platform combines technological innovation from sample preparation to neoantigen identification to address some of the key challenges in achieving high quality neoantigen identification. ACE ImmunoID enables a broad understanding of the tumor immunogenomics unique to each patient’s tumor, including neoantigens, tumor microenvironment, and tumor escape mechanisms. This is critical information for helping drive rational vaccine design and other personalized immuno-oncology therapeutics.