Developing cancer vaccine with carcinoembryonic antigen and IGF-1R as immunostimulants using immunoinformatics approach
Abstract
Purpose
Colorectal cancer (CRC) remains a significant global health burden, necessitating innovative approaches for prevention and treatment. This study proposes a multiepitope vaccine targeting carcinoembryonic antigen (CEA) and insulin-like growth factor-1 receptor (IGF-1R), two prominent biomarkers associated with CRC progression.
Methods
Sequences of CEA and IGF-1R proteins were retrieved from the NCBI database and aligned with the MEGA5 tool to identify conserved regions. Immunological and structural predictive analyses were performed, including antigenicity prediction; prediction of cytotoxic T-lymphocyte (CTL), helper T-lymphocyte (HTL), and B-cell epitopes; and prediction of the vaccine’s secondary and tertiary structures. The vaccine was evaluated to validate its physicochemical and immunological properties. To determine the binding energy and binding domain, the tertiary structure of the vaccine was docked to Toll-like receptor 4 and viewed with the PyMOL and LigPlot+ tools.
Results
CEA and IGF-1R were predicted to be highly antigenic and nonallergenic, demonstrating the capacity to elicit robust immune responses, including CTL, HTL, and B-cell activation. The predicted secondary structure closely resembled that of a native protein, with alpha helices, beta sheets, and coils, indicative of favorable interactions. Tertiary structure prediction produced five models; model 0 was selected for its highest confidence, and validation showed that 87.5% of residues were within favored regions, with a z-score of −4.03. Molecular docking predicted a strong binding complex with low binding energy.
Conclusion
Based on our analysis, the proposed multiepitope vaccine holds promise as an effective preventive measure against colorectal cancer development.
INTRODUCTION
In 2023, the World Health Organization reported that colorectal cancer ranks third among the world’s most common cancer types. Despite effective technologies for early detection and treatment aimed at lowering the mortality rate, over two million new cases are diagnosed yearly, resulting in about one million deaths annually. This places colorectal cancer as the second leading cause of cancer-associated deaths worldwide [1].
Cancer therapy has evolved tremendously over the years, from rudimentary surgeries to advanced, personalized, and target-specific therapies. Formerly, cancer was treated through methods such as surgery and radiotherapy. These methods were effective for treating localized cancer; however, they are limited in addressing complications such as metastasis, and they often harm the surrounding tissues. By the mid-20th century, advances in technology and breakthroughs in chemotherapy provided more treatment options with better success rates and solutions to these challenges [2]. Today, cancer treatments are increasingly target-specific and personalized. These strategies include biomarker testing, chemotherapy, hormone therapy, photodynamic therapy, immunotherapy (CAR T-cell therapy, cancer vaccines, cytokines, immunomodulators, immune checkpoint inhibitors, monoclonal antibodies), targeted therapy, surgery, stem cell transplant, radiation therapy, and hyperthermia [3,4].
Cancer vaccines are a state-of-the-art immunotherapeutic approach developed to prevent and treat cancer. There are two main types: preventive and therapeutic. Preventive cancer vaccines target infectious agents, such as viruses, that can lead to cancer, whereas therapeutic vaccines are designed to treat existing cancers. Therapeutic cancer vaccines generally consist of cancer-associated antigens that are introduced into the body to stimulate immune recognition, attack, and destruction of cancer cells. In addition to treating fully developed malignancies, cancer vaccines can reduce tumor size, block or slow the proliferation of cancer cells, prevent recurrence, and eradicate cancer cells that evade other therapies [5].
Carcinoembryonic antigen (CEA) is an oncofetal antigen that is overexpressed in human adenocarcinomas, especially in gastrointestinal cancer. It is a promising target for gastrointestinal cancer-specific immunotherapy [6,7]. Its antigenic properties make it a good potential target for immunotherapy, as reported by Larocca and Schlom [8] and Meraviglia-Crivelli et al. [9], who stated that CEA can induce a stronger CD8 T-cell cytotoxic response when animals are immunized with it. In another study, a recombinant vaccinia virus vaccine encoding complete or internally deleted cDNAs for human CEA induced CEA-specific autoantibodies in seven of the 32 participants, none of whom had CEA antibodies before vaccination [10].
Insulin-like growth factor-1 receptor (IGF-1R) is another biomarker implicated in the development of several human malignancies [11], including breast, pancreatic, and colorectal cancers [12]. It is reported that IGF-1R is significantly involved in cellular growth, spread, alteration, and apoptotic inhibition [13]. IGF-1R signaling causes tumor cells to be resistant to treatments such as chemotherapy and anti-hormonal therapy.
Many human cancer types have been found to overexpress IGF-1R. For instance, Hakam et al. [14] reported that cases of pancreatic cancer show overexpression of IGF-1R. This finding was also reported by Xu et al. [15], who observed an increased level of IGF-1R in blood plasma at every advanced stage of pancreatic cancer and a total absence of IGF-1R in the surrounding non-cancerous tissues.
Owing to the role of IGF-1R in cancer cell proliferation, growth, and resistance to treatment, together with its overexpression in specific malignant cells and its absence from surrounding non-malignant cells, an immunotherapeutic approach involving IGF-1R epitopes could be a better anticancer therapy than the traditional methods [13].
A growing body of data indicates that epitope vaccines can induce immunological responses against malignant cells by activating helper T cells (Th), cytotoxic T cells, and B cells. In this regard, IGF-1R has become a prominent molecular target for anticancer therapeutic approaches.
Immunological responses are critically important in fighting cancer, and adapting these responses to treat cancer may be a vital approach to combating the recent rise in cancer morbidity and mortality. To achieve this, antigenic epitopes are introduced to the host immune system through the process referred to as vaccination [16].
An epitope with antigenic properties is a unit from the pathogen or, in this case, the tumor cell that can induce an innate, humoral, or cellular immune response [16]. Combining antigenic epitopes yields a multiepitope vaccine, which is composed of a series of overlapping amino acid chains with the potential to elicit cytotoxic T-lymphocytes (CTLs), helper T cells (Th), and B cells and to induce efficient immune responses against the targeted tumor.
The multiepitope approach to cancer therapy is gaining recognition because, compared with other techniques, it is more convenient, time-saving, and cost-effective [17]. For these reasons, this research focused on designing a multiepitope vaccine that combines CEA and IGF-1R epitopes to treat and prevent colorectal cancer.
METHODS
Retrieval of sequences
Amino acid sequences for CEA (AAA51963.1, AAB59513.1, AAA51967.1, AAA62835.1) and IGF-1R (AAI43722.1, NP_001278787.1, NP_000866.1, XP_047288401.1, XP_047288400.1, XP_047288399.1, XP_047288398.1, XP_016877626.1, BAG11657.2, EAX02222.1, EAX02220.1, AAI13611.1, AAI13613.1) were downloaded from the NCBI database (https://ncbi.nlm.nih.gov/).
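As a rough illustration of the retrieval step, the sketch below builds a request URL for the NCBI E-utilities efetch service and parses FASTA text into a dictionary. It makes no network call. The helper names efetch_fasta_url and parse_fasta are hypothetical and are not part of the workflow described here, which may equally have used the NCBI web interface.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities endpoint for fetching records.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

# CEA accessions as listed in the text.
CEA_ACCESSIONS = ["AAA51963.1", "AAB59513.1", "AAA51967.1", "AAA62835.1"]


def efetch_fasta_url(accessions):
    """Build an efetch URL that requests protein records as FASTA text."""
    query = urlencode({
        "db": "protein",
        "id": ",".join(accessions),
        "rettype": "fasta",
        "retmode": "text",
    })
    return f"{EFETCH}?{query}"


def parse_fasta(text):
    """Parse FASTA text into {header: sequence}; no validation is attempted."""
    records, header = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line
    return records


print(efetch_fasta_url(CEA_ACCESSIONS))
```

The returned URL can be opened in a browser or passed to any HTTP client; the downloaded text can then be handed to parse_fasta before alignment.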
Determination of conserved regions and the assessment of their antigenic and allergenic score
Conserved regions of the two proteins (CEA and IGF-1R) were determined with the MEGA5 tool. The antigenic and allergenic properties of the selected conserved subsequences were determined using the ANTIGENpro online tool (https://scratch.proteomics.ics.uci.edu/), the AllergenFP v1.0 server (https://ddg-pharmfac.net/AllergenFP/), and the AllerCatPro v1.7 server (https://allercatpro.bii.a-star.edu.sg/).
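The idea of a conserved region can be illustrated with a minimal sketch that scans an existing multiple alignment for gap-free runs of identical columns. This is a simplification made for illustration and is not the procedure implemented in MEGA5; the function name, the min_len parameter, and the toy sequences are all hypothetical.

```python
def conserved_regions(aligned, min_len=9):
    """Return (start, end, segment) for gap-free runs where all sequences agree.

    `aligned` is a list of equal-length aligned sequences; positions are
    0-based and `end` is exclusive.
    """
    if not aligned or len({len(s) for s in aligned}) != 1:
        raise ValueError("expected aligned sequences of equal length")
    regions, start = [], None
    for i, column in enumerate(zip(*aligned)):
        identical = len(set(column)) == 1 and column[0] != "-"
        if identical and start is None:
            start = i                      # a conserved run begins
        elif not identical and start is not None:
            if i - start >= min_len:       # keep runs that are long enough
                regions.append((start, i, aligned[0][start:i]))
            start = None
    if start is not None and len(aligned[0]) - start >= min_len:
        regions.append((start, len(aligned[0]), aligned[0][start:]))
    return regions


# Toy alignment (invented sequences, not CEA or IGF-1R).
toy = ["MKTAYIAKQR-LG", "MKTAYIAKQRALG", "MKSAYIAKQR-LG"]
print(conserved_regions(toy, min_len=5))
```

A stricter or looser definition (for example, tolerating conservative substitutions) would change which segments are passed on to the antigenicity and allergenicity servers.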
Prediction of immunogenic epitopes
Stimulation of cellular immunity requires proper interaction between antigens and major histocompatibility complex (MHC) molecules. Therefore, to determine the potential epitopes that will properly bind to MHC molecules, the NetCTL (https://services.healthtech.dtu.dk/) and IEDB (http://www.iedb.org/) web servers were used to predict CTL and helper T-lymphocyte (HTL) epitopes, respectively. Epitopes that stimulate humoral immunity were determined on the BCPred web server (http://ailab-projects2.ist.psu.edu/bcpred/predict.html), which predicts B-cell epitopes.
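For orientation, predictors of this kind score overlapping fixed-length peptides: 9-mers for CTL epitopes and 15-mers for HTL epitopes, the lengths reported in the Results. The sketch below only enumerates the candidate windows; the scoring itself is left to the servers. The function name and the toy sequence are hypothetical.

```python
def peptides(sequence, k):
    """Return every overlapping k-mer with its 1-based start position."""
    return [(i + 1, sequence[i:i + k]) for i in range(len(sequence) - k + 1)]


# Toy sequence (invented; not taken from CEA or IGF-1R).
toy = "MKTAYIAKQRQISFVKSHFSRQ"

ctl_candidates = peptides(toy, 9)    # 9-mer windows for CTL prediction
htl_candidates = peptides(toy, 15)   # 15-mer windows for HTL prediction
print(len(ctl_candidates), len(htl_candidates))
```

A sequence shorter than the window length simply yields no candidates.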
Construction of primary structure of the vaccine candidate
Primary vaccine construction involves joining CTL, HTL, and B-cell epitopes with linkers; adjuvants are usually attached to vaccines to improve their antigenicity. The candidate vaccine’s primary structure was assembled by joining the predicted CTL, HTL, and B-cell epitopes with appropriate linkers. The 50S ribosomal protein L7/L12 (locus RL7_MYCTU) was attached to the N-terminus of the vaccine as an adjuvant to improve the vaccine’s immunogenicity [18].
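The assembly step can be sketched as simple string concatenation. The text does not name the linkers that were used, so the EAAAK, AAY, GPGPG, and KK linkers below are an assumption based on choices that are common in the immunoinformatics literature, and the epitope and adjuvant strings are placeholders rather than the real sequences.

```python
# ASSUMPTION: linker choices are illustrative; the study says only
# that "appropriate linkers" were used.
DEFAULT_LINKERS = {
    "adjuvant": "EAAAK",  # between adjuvant and first epitope
    "ctl": "AAY",         # between CTL epitopes
    "htl": "GPGPG",       # before and between HTL epitopes
    "bcell": "KK",        # before and between B-cell epitopes
}


def assemble_construct(adjuvant, ctl, htl, bcell, linkers=None):
    """Join adjuvant, CTL, HTL, and B-cell epitopes into one sequence."""
    linkers = linkers or DEFAULT_LINKERS
    segments = [
        adjuvant,
        linkers["adjuvant"],
        linkers["ctl"].join(ctl),
        linkers["htl"],
        linkers["htl"].join(htl),
        linkers["bcell"],
        linkers["bcell"].join(bcell),
    ]
    return "".join(segments)


# Placeholder inputs (hypothetical, far shorter than real epitopes).
construct = assemble_construct("MADJ", ["CTLA", "CTLB"], ["HTLA"], ["BCA", "BCB"])
print(construct, len(construct))
```

Keeping the linkers in a dictionary makes it easy to rebuild the construct with a different linker set and rescreen it for antigenicity and allergenicity.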
Screening for primary vaccine antigenic, allergenic, toxicity and physicochemical properties
Screening of the candidate vaccine primary construct was performed to determine its antigenicity, allergenicity, and toxicity to prospective hosts. The ANTIGENpro server was used to analyze the antigenicity of the candidate vaccine. Allergenic potential was analyzed on the AllergenFP v1.0 and AllerCatPro v1.7 servers, and toxicity on the ToxIBTL server (https://github.com/WLYLab/ToxBTL); the results are reported in Table 1 in the Results section. ProtParam was used to analyze the physicochemical properties of the vaccine.
Secondary structure prediction for candidate vaccine
Proper folding of proteins and the formation of their tertiary structure depend significantly on their secondary structure. Hence, this research involved analyzing the secondary structure features of the candidate vaccine, such as β-sheets, α-helices, and turns, using the SOPMA online tool (https://npsa-prabi.ibcp.fr/cgi-bin/npsa_automat.pl?page=/NPSA/npsa_sopma.htm).
Prediction of 3D structure
Proper folding is crucial for the functionality of a polypeptide, so predicting the tertiary structure of the candidate vaccine was essential to understanding its potential immune response. To accomplish this, the SWISS-MODEL server (https://swissmodel.expasy.org) was used as outlined in [19].
Mapping of linear and conformational (discontinuous) B-cell epitopes
B-cell epitopes can be categorized into two types, viz. linear (continuous) and conformational (discontinuous) epitopes. Linear B-cell epitopes are sequential and account for a small percentage of B-cell epitopes, whereas conformational B-cell epitopes consist of non-sequential fragments that are exposed to solvent and make up the larger percentage. Mapping both linear and conformational epitopes is vital because an antibody that recognizes linear B-cell epitopes may not necessarily recognize conformational B-cell epitopes. To map B-cell epitopes for the vaccine construct, the ElliPro suite (http://tools.iedb.org/ellipro) was employed with a minimum score threshold of 0.5 and a maximum distance of 6 Å. The suite also uses the MODELLER program and the Jmol viewer to evaluate and visualize the three-dimensional (3D) structure of B-cell epitopes within the vaccine construct [20].
Refinement of predicted tertiary structure
Refinement was done by submitting the vaccine candidate to the GalaxyRefine tool on GalaxyWEB. GalaxyRefine improves the tertiary structure of the vaccine candidate by increasing the number of residues positioned within the favored region, thereby optimizing the structure toward a more favorable conformation. The tool achieves this through molecular dynamics and rearrangement of the polypeptide side chains [21].
Model stability and validity evaluation
The refined tertiary structure of the candidate vaccine was validated on the ProSA server (https://prosa.services.came.sbg.ac.at/prosa.php) and on the PROCHECK and ERRAT servers (https://saves.mbi.ucla.edu/). The ProSA server identifies potential errors by comparing the predicted structure with experimental structures, providing a z-score for the predicted structure along with two graphical representations of the validation. The PROCHECK and ERRAT servers validate protein tertiary structures using the Ramachandran plot and a statistical evaluation of nonbonded interactions between different atoms, compared against databases [22].
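A z-score of this kind expresses how far a model's energy lies from the mean of a reference energy distribution, in units of standard deviation, so more negative values indicate lower energy relative to the reference set. The sketch below shows only that arithmetic. The energies are invented, the function name is hypothetical, and ProSA's knowledge-based potential and reference set are not reproduced.

```python
from statistics import mean, pstdev


def z_score(model_energy, reference_energies):
    """Deviation of a model's energy from a reference distribution, in SD units."""
    mu = mean(reference_energies)
    sigma = pstdev(reference_energies)
    if sigma == 0:
        raise ValueError("reference energies have zero spread")
    return (model_energy - mu) / sigma


# Invented numbers purely to exercise the formula.
reference = [0.0, 2.0, 4.0, 6.0, 8.0]
print(round(z_score(-10.0, reference), 2))
```

In practice the value is judged by whether it falls within the range observed for experimentally determined structures of similar size.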
Disulfide engineering for vaccine stability
Disulfide bonds are covalent interactions that stabilize molecular structures by conferring specific geometric conformations on biological 3D structures. Disulfide engineering of our vaccine model was done on the Disulfide by Design 2.0 server (http://cptweb.cpt.wayne.edu/DbD2/), which searches for residue pairs that can form disulfide bonds on the basis of their χ3 value, energy, and B-factor score. Two residue pairs were accepted to form the mutant, and the PyMOL visualization tool was used to view the binding site [23].
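The screening of candidate residue pairs can be sketched as a filter followed by a sort. The text does not state the cut-offs that were applied, so the χ3 range of −87° to +97° and the energy limit of 2.2 kcal/mol below are assumptions taken from values often quoted for this kind of screening; the class, function, and residue labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PairCandidate:
    res1: str
    res2: str
    chi3: float           # chi3 torsion angle, degrees
    energy: float         # predicted bond energy, kcal/mol
    sum_b_factors: float  # combined B-factor of the two residues


def select_pairs(candidates, chi3_range=(-87.0, 97.0), max_energy=2.2):
    """Keep pairs within the ASSUMED chi3/energy limits, highest B-factor first."""
    lo, hi = chi3_range
    kept = [c for c in candidates if lo <= c.chi3 <= hi and c.energy < max_energy]
    return sorted(kept, key=lambda c: c.sum_b_factors, reverse=True)


# Invented candidates, unrelated to the pairs reported in the study.
candidates = [
    PairCandidate("ALA10", "SER20", chi3=90.0, energy=1.5, sum_b_factors=30.0),
    PairCandidate("GLY33", "THR47", chi3=120.0, energy=1.0, sum_b_factors=50.0),
    PairCandidate("LEU58", "VAL61", chi3=-80.0, energy=3.0, sum_b_factors=40.0),
    PairCandidate("ASN72", "GLY90", chi3=-60.0, energy=2.0, sum_b_factors=45.0),
]
for pair in select_pairs(candidates):
    print(pair.res1, pair.res2)
```

Sorting by B-factor puts the most mobile regions first, which is where an added disulfide bond is usually expected to contribute most to stability.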
Receptor ligand molecular docking analysis
Molecular docking analysis is carried out to determine the interactions between a ligand and its receptor. The vaccine candidate, the ligand in our research, was docked against TLR4, a Toll-like receptor (TLR) reported to be overexpressed in colorectal cancer and suggested to correlate directly with patient survival [24]. Docking was performed on the PatchDock online docking tool (https://bioinfo3d.cs.tau.ac.il/PatchDock/). The bound complex was visualized in PyMOL, and the interacting residues of TLR4 and the vaccine were viewed with LigPlot+ v.2.2.5.
RESULTS
Screening for CTL, HTL, and B cells from the conserved epitopes
Seventeen 9-mer CTL epitopes with high combined scores were selected from the epitopes predicted from the conserved CEA and IGF-1R regions; these are listed in Table 2. Thirty-two 15-mer helper T-lymphocyte (HTL) epitopes were also selected from the total predicted HTL epitopes on the basis of their low percentile rank (Table 3), and 16 B-cell epitopes were chosen from the total predicted B-cell epitopes on the basis of their high scores (Table 4). The presence of conformational epitopes (Table 5) supports the availability of B-cell epitopes, since approximately 90% of B-cell epitopes are conformational. Conformational epitopes are crucial in peptide-based vaccines, so their availability is an advantage for the candidate vaccine.
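The two selection rules used here point in opposite directions: CTL and B-cell epitopes are kept when their score is high, whereas HTL epitopes are kept when their percentile rank is low. A minimal sketch of both rules follows; the function name, record layout, peptides, and numbers are hypothetical and do not come from Tables 2 to 4.

```python
def top_epitopes(predictions, n, key, lower_is_better=False):
    """Return the n best predictions according to `key`."""
    ranked = sorted(predictions, key=lambda p: p[key], reverse=not lower_is_better)
    return ranked[:n]


# Invented prediction records.
ctl_predictions = [
    {"peptide": "AAAAAAAAA", "comb": 0.91},
    {"peptide": "CCCCCCCCC", "comb": 1.42},
    {"peptide": "DDDDDDDDD", "comb": 0.77},
]
htl_predictions = [
    {"peptide": "E" * 15, "percentile_rank": 4.2},
    {"peptide": "F" * 15, "percentile_rank": 0.3},
    {"peptide": "G" * 15, "percentile_rank": 1.8},
]

best_ctl = top_epitopes(ctl_predictions, 2, key="comb")
best_htl = top_epitopes(htl_predictions, 2, key="percentile_rank", lower_is_better=True)
print([p["peptide"] for p in best_ctl], [p["peptide"] for p in best_htl])
```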
Construction of primary structure of vaccine candidate
The primary vaccine candidate comprised a total of 1,236 amino acid residues, consisting of the selected CTL, HTL, and B-cell epitopes joined by suitable linkers. An adjuvant was linked to the N-terminus of the candidate vaccine to improve its antigenicity. Fig. 1 shows a graphical representation of the primary vaccine structure.
Antigenicity, allergenicity, toxicity, and physicochemical screening of the constructed primary vaccine candidate
The vaccine candidate was predicted to be nontoxic and nonallergenic, with an antigenicity score of 0.873038. The physicochemical assessment of the primary structure showed a molecular weight of 240493.64 Da, 141 negatively charged residues (Asp+Glu), and 193 positively charged residues (Arg+Lys). The estimated half-life of the sequence is 30 hours in mammalian reticulocytes, >20 hours in yeast, and >10 hours in Escherichia coli in vivo. The aliphatic index was 66.73; the instability index was 40.35, which classifies the protein as unstable; and the grand average of hydropathicity was −0.354.
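Several of these quantities follow from simple formulas over the amino acid composition. The sketch below, a simplified stand-in rather than the ProtParam implementation, computes the charged-residue counts, the aliphatic index (mole percent of Ala, plus 2.9 times Val, plus 3.9 times Ile and Leu), and the grand average of hydropathicity from the Kyte-Doolittle scale, and applies the usual convention that an instability index above 40 is classed as unstable. Function names and test sequences are hypothetical.

```python
# Kyte-Doolittle hydropathy values, used for the GRAVY score.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def charged_counts(seq):
    """Return (Asp+Glu, Arg+Lys) residue counts."""
    negative = sum(seq.count(r) for r in "DE")
    positive = sum(seq.count(r) for r in "RK")
    return negative, positive


def aliphatic_index(seq):
    """X(Ala) + 2.9*X(Val) + 3.9*(X(Ile) + X(Leu)), with X in mole percent."""
    def pct(residue):
        return 100.0 * seq.count(residue) / len(seq)
    return pct("A") + 2.9 * pct("V") + 3.9 * (pct("I") + pct("L"))


def gravy(seq):
    """Grand average of hydropathicity: mean Kyte-Doolittle value."""
    return sum(KD[r] for r in seq) / len(seq)


def stability_class(instability_index):
    """Convention: an instability index above 40 is classed as unstable."""
    return "unstable" if instability_index > 40 else "stable"


print(stability_class(40.35))
```

A negative hydropathicity value, as reported here, indicates an overall hydrophilic sequence.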
Candidate vaccine secondary structure prediction
The SOPMA online prediction tool provided the secondary structure of the candidate vaccine, with alpha helix, extended strand, random coil, and beta turn contents of 15.80%, 27.05%, 50.82%, and 6.33%, respectively.
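Percentages like these can be recomputed from a per-residue state string, in which h, e, t, and c denote alpha helix, extended strand, beta turn, and random coil. The sketch below assumes such a string has been copied from the prediction output; the function name and the toy string are hypothetical.

```python
from collections import Counter

# One-letter states for the four classes reported in the text.
STATE_NAMES = {
    "h": "alpha helix",
    "e": "extended strand",
    "t": "beta turn",
    "c": "random coil",
}


def state_percentages(states):
    """Return the percentage of residues in each of the four states."""
    counts = Counter(states.lower())
    total = sum(counts[s] for s in STATE_NAMES)
    if total == 0:
        raise ValueError("no recognised secondary structure states")
    return {name: round(100.0 * counts[s] / total, 2) for s, name in STATE_NAMES.items()}


# Toy state string (invented).
print(state_percentages("hhhheeccct"))

# The four percentages reported in the text should sum to 100.
reported_total = round(15.80 + 27.05 + 50.82 + 6.33, 2)
print(reported_total)
```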
Prediction of 3D structure
The SWISS-MODEL server predicted 50 models through homology modeling. Model 1 (Fig. 2) was selected on the basis of the resolution of the structure (1.55 Å), a quaternary structure quality estimate (QSQE) of 0.00, a sequence similarity of 0.26, and a coverage of 0.16. For a model constructed from a given alignment and template, the QSQE is a number between 0 and 1 that represents the expected accuracy of the inter-chain contacts.
Refinement of predicted tertiary structure
The vaccine tertiary structure refinement produced five models; model 1 (Fig. 3) was selected on the basis of the criteria listed in Table 6.
Model stability and validation evaluation
Validation of the 3D structure on ProSA gave a z-score of −4.03, while ERRAT predicted an overall quality score of 63.9037%, and the plot of local model quality revealed some potential low-quality regions within the sequence. VERIFY, a 3D validation tool on PROCHECK, predicted an overall G-factor score of −0.26, and the Ramachandran plot analysis showed 87.5% of residues in favored regions, 10.9% in allowed regions, and 0.8% in disallowed regions (Fig. 4).

Fig. 4. (A) Z-score plot of the tertiary structure of the candidate vaccine; the vaccine is located in the X-ray crystallography region. (B) The local model quality plot shows the quality of the candidate vaccine based on the position and energy function of the vaccine amino acid sequence. (C) The ERRAT plot shows normal and faulty regions of the vaccine peptide sequence; these are seen as white, yellow, and red bars. (D) The Ramachandran plot indicates the percentages of amino acids in the favored, allowed, and disallowed regions as 87.5%, 10.9%, and 0.8%, respectively.
Vaccine disulfide engineering
To stabilize the structure of the candidate vaccine, disulfide engineering was performed on the Disulfide by Design v2.0 server (http://cptweb.cpt.wayne.edu/DbD2/index.php). Forty-one pairs of amino acid residues were predicted to be usable for disulfide engineering. After the residue pairs were evaluated against thresholds for energy, Chi3 value, and B-factor, only two pairs were finalized (GLY1561–GLY1701 and TYR1769–ASN1810) (Fig. 5).
DISCUSSION
In this study, we sought to design a novel multiepitope vaccine that targets colorectal cancer, using bioinformatic and immunoinformatic approaches to link CEA and IGF-1R epitopes. The prime rationale for this approach is the role of CEA and IGF-1R in resistance to targeted therapies, their involvement in the growth of malignancies, and their status as oncological biomarkers [25,26]. Multiepitope vaccines are becoming progressively accepted as a therapeutic tool because of their efficiency and nontoxic properties compared with other treatment modalities [27].
To achieve this, CEA and IGF-1R sequences were selected and retrieved from the NCBI database. The selection was based on their antigenic potential, which was predicted on the Scratch server. To develop the vaccine, conserved regions of the retrieved CEA and IGF-1R sequences were identified and selected to ensure that the vaccine would be active against a large range of variants of the cancer type and would remain active even if mutations occur [28,29]. Antigenicity analysis of the conserved sequences showed that they met the same antigenic selection criterion as the original CEA and IGF-1R sequences and therefore have potential as immune stimulants.
To determine their immune-stimulating properties, the conserved sequences were screened for CTL-, HTL-, and B-cell-inducing epitopes, since a vaccine should be capable of inducing these immune components to initiate humoral and cellular immune responses. Epitopes that can induce these immune components were joined using suitable linkers, and an adjuvant was linked to the epitopes at the N-terminus to enhance antigenic potential [18]. The resulting construct makes up the in silico primary vaccine structure.
It was also necessary to subject the vaccine’s primary structure to antigenicity and allergenicity screening, which was carried out using the Scratch online tool. The screening predicted that the vaccine’s primary structure preserved its antigenic potential while remaining nonallergenic. Prediction of physicochemical properties also showed that the vaccine properties fell within the accepted thresholds [30].
Vaccine structural stability is essential, as the vaccine should be able to withstand the biological, biochemical, and metabolic pressures of the host system. We evaluated structural stability by predicting the secondary structure of the vaccine; the results showed that the structure falls within accepted thresholds, on the basis of the alpha helix and beta sheet percentages.
The tertiary structure of a polypeptide plays a major role in its function. We predicted the vaccine tertiary structure by homology modeling. This method predicts the tertiary structure of an unknown polypeptide by comparing it with known structures in the database on the basis of sequence similarity.
To validate the predicted tertiary structure, the PDB file was subjected to statistical analysis, which gave a z-score of −4.03 and a Ramachandran plot with 87.5% of residues in the most favored region. This result is satisfactory, as it indicates a peptide that conforms to a native protein.
The vaccine peptide was docked with the TLR4 molecule to evaluate the vaccine’s ability to bind to TLR4. Binding of an antigen to TLR4 activates the inflammatory response and the innate immune system. The vaccine polypeptide in this study docked with TLR4 with a binding energy of −317.92 kcal mol−1. This low binding energy indicates a high binding affinity for human TLR4 (Fig. 6).

Fig. 6. (A) Model 1 was selected on the basis of its docking score (−317.92), the most negative value among the predictions, indicating the most favorable docking. Other parameters include a confidence score of 0.9664 and a ligand RMSD of 139.72 Å. (B) Interaction between Toll-like receptor 4 (TLR4) and the vaccine candidate viewed using LigPlot+ v.2.2.5; TLR4 amino acid residues are indicated in blue and the candidate vaccine amino acid residues in green.
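The selection rule described in the caption, where the pose with the most negative docking score is preferred, can be sketched as follows. Only the values for model 1 come from the text; the other two poses and the function name are hypothetical and serve solely to make the comparison concrete.

```python
def best_pose(poses):
    """Return the pose with the lowest (most negative) docking score."""
    return min(poses, key=lambda pose: pose["score"])


poses = [
    # Values reported in the text for the selected model.
    {"model": 1, "score": -317.92, "confidence": 0.9664, "ligand_rmsd": 139.72},
    # HYPOTHETICAL poses, invented for illustration only.
    {"model": 2, "score": -290.10, "confidence": 0.9400, "ligand_rmsd": 120.50},
    {"model": 3, "score": -275.45, "confidence": 0.9200, "ligand_rmsd": 98.30},
]

selected = best_pose(poses)
print(selected["model"], selected["score"])
```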
In conclusion, using an immunoinformatics approach, we identified epitopes that have potential as a multiepitope-based subunit vaccine against cancer. CTL, HTL, and B-cell epitopes were screened and selected for vaccine design. The final vaccine construct consists of CTL, HTL, and B-cell epitopes, an adjuvant to elicit strong immune responses, and fusion protein linkers. Conservancy analysis of the promiscuous epitopes was used to select epitopes with robust immune responses against large variants or mutants of cancer. Molecular docking revealed strong binding affinity for human TLR4. Our computational analysis suggests that the constructed vaccine is highly immunogenic, safe, and stable in nature.
Notes
CONFLICT OF INTEREST
No potential conflict of interest relevant to this article was reported.
FUNDING
None.
ACKNOWLEDGMENTS
The author wishes to thank Mr. Solomon Alile of the University of Benin for his technical support and advice.