Long-read sequencing of full-length cDNAs enables the detection of the structures of aberrant splicing isoforms in cancer cells. These isoforms are occasionally translated, presented by HLA molecules, and recognized as neoantigens. This study uses a long-read sequencer (MinION) to construct a comprehensive catalog of aberrant splicing isoforms in non-small-cell lung cancers and, from this catalog, identifies novel isoforms and potential neoantigens.
Full-length cDNA sequencing is performed using 22 cell lines, and a total of 2021 novel splicing isoforms are identified. The protein expression of some of these isoforms is then validated by proteome analysis. Ablation of a nonsense-mediated mRNA decay (NMD) factor, UPF1, or of a splicing factor, SF3B1, is found to increase the proportion of aberrant transcripts. NetMHC evaluation of the binding affinities to each type of HLA molecule reveals that some of the isoforms potentially generate neoantigen candidates. We also identify aberrant splicing isoforms in seven non-small-cell lung cancer specimens. An enzyme-linked immunosorbent spot (ELISpot) assay indicates that approximately half of the peptide candidates have the potential to activate T cell responses through their interaction with HLA molecules. Finally, we estimate the number of isoforms in The Cancer Genome Atlas (TCGA) datasets by referring to the constructed catalog and find that disruption of NMD factors is significantly correlated with the number of splicing isoforms found in the TCGA-Lung Adenocarcinoma data collection.
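The neoantigen-candidate step described above can be illustrated with a minimal sketch; it is not the pipeline used in this study. It assumes that candidate peptides are 8- to 11-mers from the translated isoform that do not occur in the reference protein, and that predicted binding affinities (for example, parsed from NetMHC output) are already available as a mapping from peptide to affinity in nM. The function names, the peptide lengths, and the 500 nM cutoff are illustrative assumptions rather than details taken from the abstract.

```python
from typing import Dict, Iterable, List, Set


def novel_peptides(
    isoform: str,
    reference: str,
    lengths: Iterable[int] = (8, 9, 10, 11),  # assumed peptide lengths
) -> Set[str]:
    """Return peptides of the given lengths that occur in the isoform
    protein sequence but nowhere in the reference protein sequence."""
    peptides: Set[str] = set()
    for k in lengths:
        # Slide a window of length k along the isoform sequence.
        for i in range(len(isoform) - k + 1):
            pep = isoform[i:i + k]
            if pep not in reference:
                peptides.add(pep)
    return peptides


def filter_binders(
    affinity_nm: Dict[str, float],
    threshold_nm: float = 500.0,  # assumed cutoff, not from the abstract
) -> List[str]:
    """Keep peptides whose predicted affinity is at or below the cutoff.

    `affinity_nm` is a hypothetical mapping of peptide -> predicted
    affinity in nM, standing in for parsed predictor output.
    """
    return sorted(p for p, a in affinity_nm.items() if a <= threshold_nm)
```

In use, `novel_peptides` would be called once per aberrant isoform, the resulting peptides scored against each HLA type by an external predictor, and `filter_binders` applied to the scores to obtain the candidate list.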
Our results indicate that long-read sequencing of full-length cDNAs is essential for the precise identification of aberrant transcript structures in cancer cells.
Aberrant splicing isoforms detected by full-length transcriptome sequencing as transcripts of potential neoantigens in non-small cell lung cancer