Pcleavage is a Support Vector Machine (SVM)-based computational method for predicting constitutive proteasome and immunoproteasome cleavage sites in antigenic protein sequences.
The method predicts the cleavage positions generated during intracellular protein degradation and antigen processing, both of which are important for MHC class I presentation and T-cell epitope generation.
Pcleavage uses:
- Support Vector Machine (SVM)
- PEBLS (Parallel Exemplar-Based Learning)
- Weka machine learning algorithms
The web server was developed to provide user-friendly prediction of proteasomal cleavage sites for immunoinformatics and vaccine design applications.
Bhasin, M. and Raghava, G. P. S. (2005). Pcleavage: an SVM based method for prediction of constitutive proteasome and immunoproteasome cleavage sites in antigenic sequences. Nucleic Acids Research, 33 (Web Server Issue), W202–W207.
https://doi.org/10.1093/nar/gki587
Proteasomes are cellular protein complexes responsible for intracellular protein degradation.
They play major roles in:
- Protein turnover
- Antigen processing
- Generation of MHC class I ligands
- T-cell epitope generation
There are two major forms of proteasomes:
- Constitutive proteasome: present in normal cells and involved in general protein degradation.
- Immunoproteasome: activated by interferon-gamma and involved in generating peptides for MHC class I presentation.
Prediction of proteasome cleavage sites is important for:
- Vaccine design
- Immunoinformatics
- T-cell epitope prediction
- Antigen processing analysis
The study aimed to:
- Predict proteasome cleavage sites in protein sequences
- Develop classifiers for constitutive proteasomes
- Develop classifiers for immunoproteasomes
- Improve antigen processing prediction
- Create an accessible web server for researchers
Proteasome cleavage data were collected from:
- Yeast enolase I
- β-casein digestion studies
Cleavage residues were assigned as P1 cleavage sites.
The MHC ligand dataset was collected from:
- MHCBN database
- 1288 HLA-A and HLA-B restricted ligands
- Final processed dataset:
  - 506 ligands
  - Derived from more than 250 proteins
Natural MHC ligands were assumed to contain major cleavage sites at their C-termini.
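The C-terminus assumption above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the function name, the toy protein, and the toy ligands are hypothetical, and marking every occurrence of a repeated ligand is an assumption.

```python
def c_terminal_cleavage_sites(protein, ligands):
    """Return sorted 0-based indices of residues treated as cleavage sites.

    Each ligand is located in the protein, and the index of the ligand's
    last residue (its C-terminus) is recorded as a cleavage site.
    """
    sites = set()
    for ligand in ligands:
        if not ligand:
            continue  # skip empty strings, which would match everywhere
        start = protein.find(ligand)
        while start != -1:
            sites.add(start + len(ligand) - 1)
            # Assumption: every occurrence of a repeated ligand is marked.
            start = protein.find(ligand, start + 1)
    return sorted(sites)


# Hypothetical toy data, for illustration only.
protein = "MKTAYIAKQRQISFVKSHFSRQ"
ligands = ["AYIAKQRQI", "SFVKSHFSR"]
print(c_terminal_cleavage_sites(protein, ligands))  # [11, 20]
```

Residues that are not marked this way would then serve as candidate non-cleavage examples.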
Independent datasets were obtained from:
- Saxova et al.
The dataset included:
- SSX-2 protein
- HIV1-Nef protein
- RU1 protein
- 231 unique ligands
- Derived from 135 proteins
SVM classifiers were implemented using:
- SVM_light
Each amino acid was encoded using:
- 21-dimensional binary representation
Window sizes:
- 7 amino acids for in vitro digestion data
- 19 amino acids for MHC ligand data
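The encoding and windowing described above can be illustrated with a short sketch. Several details are assumptions not stated here: that the 21st symbol is `X` (used for padding at sequence ends and for unknown residues), that the candidate cleavage residue sits at the window centre, and all function names. The output line follows the sparse `<target> <index>:<value>` input format that SVM_light reads, with feature indices starting at 1.

```python
# 20 standard amino acids plus 'X' as the assumed 21st symbol.
ALPHABET = "ACDEFGHIKLMNPQRSTVWYX"


def window_around(seq, center, size):
    """Return the window of `size` residues centred on 0-based `center`."""
    half = size // 2
    padded = "X" * half + seq + "X" * half  # pad so edge residues get full windows
    return padded[center:center + size]


def encode_window(window):
    """Concatenate one 21-dimensional binary vector per residue."""
    vec = []
    for residue in window:
        one_hot = [0] * len(ALPHABET)
        idx = ALPHABET.find(residue)
        # Unknown symbols fall back to 'X' (assumption).
        one_hot[idx if idx != -1 else ALPHABET.index("X")] = 1
        vec.extend(one_hot)
    return vec


def to_svmlight_line(label, vec):
    """Format a +1/-1 label and a dense vector as a sparse SVM_light line."""
    feats = " ".join(f"{i + 1}:{v}" for i, v in enumerate(vec) if v != 0)
    return f"{label:+d} {feats}"


window = window_around("ACDEFGHIK", center=0, size=7)
vec = encode_window(window)
print(window, len(vec))
print(to_svmlight_line(+1, vec))
```

With this layout a 7-residue window yields 7 × 21 = 147 features and a 19-residue window yields 19 × 21 = 399, with exactly one non-zero feature per window position.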
Parallel Exemplar-Based Learning was used as a nearest-neighbor learning system for symbolic feature analysis.
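To give a feel for nearest-neighbour learning on symbolic features, the sketch below classifies a sequence window by its closest training window, using a value-difference style distance in which two symbols are close when they co-occur with the classes in similar proportions. It is a simplified stand-in rather than PEBLS itself: exemplar weighting is omitted, and the function names and the handling of unseen symbols are assumptions.

```python
from collections import Counter, defaultdict


def fit_value_stats(X, y):
    """Count, per window position, how often each symbol occurs with each class."""
    classes = sorted(set(y))
    counts = [defaultdict(Counter) for _ in X[0]]
    for window, label in zip(X, y):
        for pos, symbol in enumerate(window):
            counts[pos][symbol][label] += 1
    return classes, counts


def value_difference(counts_pos, classes, a, b):
    """Sum over classes of |P(class | a) - P(class | b)| at one position."""
    if a == b:
        return 0.0
    ca, cb = counts_pos.get(a), counts_pos.get(b)
    if not ca or not cb:
        return 2.0  # unseen symbol: maximal difference (assumption)
    na, nb = sum(ca.values()), sum(cb.values())
    return sum(abs(ca[c] / na - cb[c] / nb) for c in classes)


def predict(X, y, query):
    """Return the label of the training window nearest to `query`."""
    classes, counts = fit_value_stats(X, y)

    def dist(window):
        return sum(
            value_difference(counts[p], classes, s, q)
            for p, (s, q) in enumerate(zip(window, query))
        )

    best = min(range(len(X)), key=lambda i: dist(X[i]))
    return y[best]


# Hypothetical two-residue windows, for illustration only.
X = ["AL", "AV", "GL", "GV"]
y = [1, 1, 0, 0]
print(predict(X, y, "AI"))
```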
The following Weka algorithms were evaluated:
- Logistic Regression
- Naive Bayes
- J48.PART
Cost-sensitive classification was applied because of imbalanced datasets.
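One common way to make a classifier cost-sensitive is to choose the label with the lower expected misclassification cost instead of the more probable label. The sketch below shows that generic decision rule; the cost values and the function name are illustrative assumptions, and the costs actually used in the study are not given here.

```python
def min_cost_label(p_cleavage, cost_fn=1.0, cost_fp=1.0):
    """Return 1 (cleavage) when predicting 0 would have the higher expected cost.

    p_cleavage: estimated probability that the site is a cleavage site.
    cost_fn:    cost of missing a true cleavage site (false negative).
    cost_fp:    cost of calling a non-cleavage site (false positive).
    """
    expected_cost_if_negative = p_cleavage * cost_fn
    expected_cost_if_positive = (1.0 - p_cleavage) * cost_fp
    return 1 if expected_cost_if_negative > expected_cost_if_positive else 0


# With equal costs a site with probability 0.3 is called non-cleavage;
# tripling the false-negative cost flips the decision.
print(min_cost_label(0.3))
print(min_cost_label(0.3, cost_fn=3.0))
```

Raising the cost of errors on the rarer class shifts the decision boundary toward that class, which is the usual motivation for using this rule on imbalanced data.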
The models were evaluated using:
- Five-fold cross-validation
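As a reminder of what five-fold cross-validation involves, the sketch below splits example indices into five folds and uses each fold once for testing while training on the other four. The shuffling seed and the function name are assumptions, and no stratification by class is attempted.

```python
import random


def k_fold_indices(n, k=5, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # fixed seed for reproducibility
    folds = [idx[i::k] for i in range(k)]  # k folds of near-equal size
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test


for train, test in k_fold_indices(23):
    print(len(train), len(test))
```

Metrics are then computed on each test fold and averaged over the five folds.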
Performance metrics included:
- Sensitivity
- Specificity
- Accuracy
- Matthews correlation coefficient (MCC)
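These four metrics follow directly from the confusion-matrix counts. The sketch below uses the standard definitions; the function name and the example counts are hypothetical, and it assumes both classes are present so the sensitivity and specificity denominators are non-zero.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Compute sensitivity, specificity, accuracy and MCC from counts."""
    sensitivity = tp / (tp + fn)  # fraction of true cleavage sites recovered
    specificity = tn / (tn + fp)  # fraction of non-cleavage sites rejected
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts, for illustration only.
m = binary_metrics(tp=80, tn=70, fp=30, fn=20)
print({name: round(value, 3) for name, value in m.items()})
```

MCC ranges from -1 to 1 and stays informative when the two classes differ in size, which is why it is reported alongside accuracy.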
| Kernel | Sensitivity | Specificity | Accuracy | MCC |
|---|---|---|---|---|
| RBF | 86.4% | 50.7% | 68.6% | 0.42 |
| Polynomial | 84.6% | 55.6% | 70.0% | 0.43 |
| Kernel | Sensitivity | Specificity | Accuracy | MCC |
|---|---|---|---|---|
| RBF | 84.3% | 69.0% | 76.7% | 0.54 |
| Polynomial | 86.2% | 65.4% | 75.8% | 0.53 |
The SVM classifier outperformed:
- PEBLS
- Naive Bayes
- J48.PART
- Logistic Regression
| Metric | Value |
|---|---|
| Sensitivity | 86.9% |
| Specificity | 60.9% |
| Accuracy | 68.0% |
| MCC | 0.43 |
| Metric | Value |
|---|---|
| Sensitivity | 82.3% |
| Specificity | 45.0% |
| Accuracy | 63.9% |
| MCC | 0.29 |
Threshold-independent ROC analysis gave the following area under the curve (AUC) values:
| Method | AUC |
|---|---|
| Pcleavage | 0.790 |
| NetChop | 0.805 |
| Method | AUC |
|---|---|
| Pcleavage | 0.615 |
| NetChop | 0.609 |
The performance of Pcleavage was comparable to that of NetChop.
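The AUC values compared above summarise ranking quality across all thresholds. One way to compute such a value, sketched below with hypothetical scores, is as the probability that a randomly chosen cleavage site scores higher than a randomly chosen non-cleavage site, with ties counted as one half. The function name is an assumption, and this is not necessarily how the reported values were obtained.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    # Each correctly ordered pair counts 1, each tie counts 0.5.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical prediction scores and true labels.
print(roc_auc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0]))  # 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation.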
The Pcleavage server allows users to:
- Submit antigenic protein sequences
- Predict constitutive proteasome cleavage sites
- Predict immunoproteasome cleavage sites
- Select prediction thresholds
- Upload sequence files
- Visualize cleavage positions
Supported formats include:
- FASTA
- EMBL
- GCG
- Plain text
The server provides:
- Cleavage site positions
- Prediction scores
- Cleavage/non-cleavage state
- Graphical sequence mapping
Cleavage residues are displayed in:
- Larger red-colored letters
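The relationship between scores, the selected threshold, and the reported cleavage state can be sketched as follows. This is a plain-text illustration, not the server's code: the function names, the example scores, and the default threshold are assumptions, and upper case stands in for the larger red letters.

```python
def call_cleavage_sites(sequence, scores, threshold=0.0):
    """Return (position, residue, score, is_cleavage) rows, positions 1-based."""
    rows = []
    for pos, (residue, score) in enumerate(zip(sequence, scores), start=1):
        rows.append((pos, residue, score, score >= threshold))
    return rows


def highlight(rows):
    """Upper-case predicted cleavage residues, lower-case all others."""
    return "".join(
        residue.upper() if is_cleavage else residue.lower()
        for _, residue, _, is_cleavage in rows
    )


# Hypothetical per-residue scores, for illustration only.
rows = call_cleavage_sites("ACDE", [-0.5, 0.2, 1.1, -0.1], threshold=0.0)
for row in rows:
    print(row)
print(highlight(rows))  # aCDe
```

Raising the threshold reports fewer sites with higher confidence; lowering it reports more sites at the cost of more false positives.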
Pcleavage can be used for:
- T-cell epitope prediction
- Vaccine design
- Immunoinformatics
- Antigen processing analysis
- Proteasome cleavage analysis
- Computational immunology
Methods and tools used:
- Support Vector Machine (SVM)
- SVM_light
- PEBLS
- Weka
- Machine Learning
- Binary Sequence Encoding
- ROC Analysis
The study demonstrated that:
- SVM-based classifiers outperformed PEBLS and the Weka classifiers evaluated (Naive Bayes, J48.PART, Logistic Regression)
- Proteasome cleavage prediction is feasible using sequence patterns
- Pcleavage performs comparably to NetChop
- MHC ligand data improve immunoproteasome prediction
Pcleavage provides an efficient computational framework for predicting constitutive proteasome and immunoproteasome cleavage sites in antigenic proteins.
The SVM-based models performed comparably to NetChop in ROC analysis and provide a valuable resource for:
- Vaccine design
- Antigen processing analysis
- T-cell epitope identification
- Immunoinformatics research
Web Server:
http://www.imtech.res.in/raghava/pcleavage/
Mirror Server:
http://bioinformatics.uams.edu/mirror/pcleavage/
Email: raghava@iiitd.ac.in
Address:
Indraprastha Institute of Information Technology Delhi
This project is intended for academic and research purposes only.