Biocoder09 committed
Commit e8c0867 · verified · 1 Parent(s): ecd5b53

Delete README.md

Files changed (1)
  1. README.md +0 -108
README.md DELETED
@@ -1,108 +0,0 @@
- # 🧬 CANLoc: Protein Subcellular Localization Predictor
-
- CANLoc is a production-ready machine learning web application for predicting the **subcellular localization of proteins** directly from amino acid sequences.
- It provides accurate, fast, and interpretable predictions through a modern deep-learning-assisted pipeline and an interactive web interface.
-
- ---
-
- ## 🔬 Model Overview
-
- CANLoc combines:
-
- **ESM2 (Transformer-based protein language model)**
- Used for extracting rich sequence embeddings without alignment.
-
- **Mean pooling of residue embeddings**
- Produces fixed-length feature vectors.
-
- **XGBoost classifier**
- Trained on curated protein datasets for robust multiclass prediction.
-
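The pooling step listed above can be illustrated with a minimal sketch. It assumes the per-residue embeddings are already available as equal-length lists of floats; how CANLoc actually calls ESM2 is not shown in this README, and `mean_pool` is a hypothetical helper name used only for illustration.

```python
from typing import List, Sequence


def mean_pool(residue_embeddings: Sequence[Sequence[float]]) -> List[float]:
    """Average per-residue vectors into one fixed-length vector."""
    if not residue_embeddings:
        raise ValueError("need at least one residue embedding")
    dim = len(residue_embeddings[0])
    if any(len(vec) != dim for vec in residue_embeddings):
        raise ValueError("all residue embeddings must share one dimension")
    n = len(residue_embeddings)
    # Column-wise mean: one output value per embedding dimension.
    return [sum(vec[i] for vec in residue_embeddings) / n for i in range(dim)]


# Toy input: 3 residues with 2-dimensional embeddings
# (real ESM2 embeddings are much wider; these numbers are made up).
toy = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
pooled = mean_pool(toy)
```

The output length depends only on the embedding dimension, not on the number of residues, which is what lets proteins of different lengths feed the same classifier.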
- ### Predicted Classes
- Cytoplasm
- Nucleus
- Membrane
- Mitochondria
-
- Each prediction includes **class probabilities** and a **confidence visualization**.
-
- ---
-
- ## 📊 Features
-
- Single sequence prediction
- Batch prediction via FASTA file upload
- Probability bar chart and radar plot
- Confidence-based interpretation
- Clean, responsive bioinformatics-style UI
- Dockerized for reproducible deployment
- FastAPI backend + modern frontend
-
- ---
-
- ## 🧪 Input Formats
-
- ### Single Sequence
- Paste a raw amino acid sequence: MVKFKKYGIP...
-
- ### FASTA File
- Upload a standard FASTA file with one or more sequences:
- >sp|P25296|CANB_YEAST
- MSLIHPDTAKYPFKFEPF...
-
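For readers unfamiliar with the FASTA layout, the sketch below shows one way a multi-record file could be split into header and sequence pairs. It is an illustration only, not CANLoc's actual upload handler; `parse_fasta` is a hypothetical name and the records are toy data.

```python
from typing import Dict, Iterable, Optional


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Map each FASTA header (without '>') to its concatenated sequence."""
    records: Dict[str, str] = {}
    header: Optional[str] = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:].strip()
            if header in records:
                raise ValueError(f"duplicate header: {header}")
            records[header] = ""
        elif header is None:
            raise ValueError("sequence data found before any '>' header")
        else:
            # Sequences may wrap over several lines; join and upper-case them.
            records[header] += line.upper()
    return records


# Toy records, not real proteins.
example = [">seq1 toy record", "MVKF", "KKYG", "", ">seq2", "mslihp"]
records = parse_fasta(example)
```

Each header line starts with `>`, and every following line up to the next header belongs to that record's sequence.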
- ---
-
- ## 📈 Output Interpretation
-
- **Predicted Location**
- The most probable subcellular class.
-
- **Class Probabilities**
- Displayed as percentages for all four classes.
-
- **Confidence Levels**
- - High: ≥ 75%
- - Medium: 60% to < 75%
- - Low: < 60% (interpret with caution)
-
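The interpretation rules above can be sketched as a small function. This assumes the thresholds apply to the probability of the top-ranked class, which the README implies but does not state; `confidence_level` and `interpret` are hypothetical names, and the probabilities are made-up toy values.

```python
from typing import Dict, Tuple


def confidence_level(top_probability: float) -> str:
    """Bucket a probability in [0, 1] using the README's thresholds."""
    if not 0.0 <= top_probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if top_probability >= 0.75:
        return "High"
    if top_probability >= 0.60:
        return "Medium"
    return "Low"  # interpret with caution


def interpret(probabilities: Dict[str, float]) -> Tuple[str, str]:
    """Return (predicted location, confidence level) for one protein."""
    label = max(probabilities, key=probabilities.get)
    return label, confidence_level(probabilities[label])


# Toy class probabilities for a single sequence.
toy_probs = {
    "Cytoplasm": 0.10,
    "Nucleus": 0.70,
    "Membrane": 0.15,
    "Mitochondria": 0.05,
}
result = interpret(toy_probs)
```

Checking the higher threshold first keeps the buckets non-overlapping: exactly 75% is reported as High and exactly 60% as Medium.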
- ---
-
- ## ⚙️ Evaluation & Validation
-
- The model was evaluated using:
- Train/test split
- 10-fold stratified cross-validation
- Precision, recall, F1-score
- Sensitivity and specificity analysis
- ROC curves per class
-
- These evaluations support CANLoc's use in academic and research workflows.
-
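As an illustration of the sensitivity and specificity analysis listed above, the sketch below computes both metrics per class in a one-vs-rest fashion. The README does not say how CANLoc computed them, so this is an assumption about the general technique rather than the project's evaluation code; the function name and the labels are hypothetical toy data.

```python
from typing import Dict, Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[str], y_pred: Sequence[str], classes: Sequence[str]
) -> Dict[str, Tuple[float, float]]:
    """Per-class (sensitivity, specificity), treating each class as one-vs-rest."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    results: Dict[str, Tuple[float, float]] = {}
    pairs = list(zip(y_true, y_pred))
    for cls in classes:
        tp = sum(t == cls and p == cls for t, p in pairs)
        fn = sum(t == cls and p != cls for t, p in pairs)
        fp = sum(t != cls and p == cls for t, p in pairs)
        tn = sum(t != cls and p != cls for t, p in pairs)
        # NaN marks a metric that is undefined (no positives or no negatives).
        sens = tp / (tp + fn) if (tp + fn) else float("nan")
        spec = tn / (tn + fp) if (tn + fp) else float("nan")
        results[cls] = (sens, spec)
    return results


# Toy labels for four proteins, not real evaluation data.
y_true = ["Nucleus", "Nucleus", "Cytoplasm", "Membrane"]
y_pred = ["Nucleus", "Cytoplasm", "Cytoplasm", "Membrane"]
classes = ["Cytoplasm", "Nucleus", "Membrane", "Mitochondria"]
metrics = sensitivity_specificity(y_true, y_pred, classes)
```

Sensitivity is the fraction of a class's true members that were recovered, and specificity is the fraction of non-members that were correctly excluded.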
- ---
-
- ## 🚀 Deployment
-
- CANLoc is containerized and deployed using **Docker** and **Railway**.
-
- ## 📄 License
- This project is licensed under the Apache License 2.0.
-
- > Free for academic and commercial use
- > Includes an express patent grant
- > Permits deployment and modification, subject to the license's notice conditions
-
- See the LICENSE file for details.
-
- ## 📬 Contact
-
- For questions, bug reports, or feedback:
- majidkhan>jssmsc@gmail.com
-
- ## 📌 Citation
-
- If you use CANLoc in academic work, please cite appropriately.
-