AmishaG committed
Commit 3772714 · verified · 1 Parent(s): 9d6c259

Update README.md

Files changed (1)
  1. README.md +28 -16
README.md CHANGED
@@ -4,48 +4,60 @@ language:
  - en
  base_model:
  - facebook/esm2_t33_650M_UR50D
  ---
- # ANTICP3: Anticancer Protein Prediction using ESM2

- This repository hosts the fine-tuned **ESM2-based classifier** for **anticancer protein (ACP) prediction**. The model is built on top of [facebook/esm2_t33_650M_UR50D](https://huggingface.co/facebook/esm2_t33_650M_UR50D), and it performs binary classification to predict whether a given protein sequence is anticancer.

  ---

  ## Model Details

- - **Base Model:** `facebook/esm2_t33_650M_UR50D`
- - **Task:** Binary Sequence Classification
- - **Labels:**
-   - `0`: Non-Anticancer
-   - `1`: Anticancer
- - **Framework:** [Transformers](https://huggingface.co/docs/transformers/index)
- - **Format:** `Safetensors`

  ---

- ## 🚀 Usage

- You can load and use this model directly with the `transformers` library:

  ```python
  from transformers import AutoTokenizer, AutoModelForSequenceClassification
  import torch

- # Load tokenizer and model
  tokenizer = AutoTokenizer.from_pretrained("facebook/esm2_t33_650M_UR50D")
  model = AutoModelForSequenceClassification.from_pretrained("AmishaG/anticp3")

- # Example input sequence
  sequence = "MANCVVGYIGERCQYRDLKWWELRGGGGSGGGGSAPAFSVSPASGLSDGQSVSVSVSGAAAGETYYIAQCAPVGGQDACNPATATSFTTDASGAASFSFVVRKSYTGSTPEGTPVGSVDCATAACNLGAGNSGLDLGHVALTFGGGGGSGGGGSDHYNCVSSGGQCLYSACPIFTKIQGTCYRGKAKCCKLEHHHHHH"

- # Tokenize
  inputs = tokenizer(sequence, return_tensors="pt", truncation=True)

- # Run inference
  with torch.no_grad():
      logits = model(**inputs).logits
      probs = torch.nn.functional.softmax(logits, dim=-1)
      prediction = torch.argmax(probs, dim=1).item()

  labels = {0: "Non-Anticancer", 1: "Anticancer"}
- print("Prediction:", labels[prediction])
 
  - en
  base_model:
  - facebook/esm2_t33_650M_UR50D
+ tags:
+ - protein-classification
+ - bioinformatics
+ - anticancer
+ - esm2
+ - transformers
+ - torch
  ---

+ # ANTICP3: Anticancer Protein Prediction
+
+ This model is a fine-tuned version of [`facebook/esm2_t33_650M_UR50D`](https://huggingface.co/facebook/esm2_t33_650M_UR50D) designed for **binary classification of anticancer proteins (ACPs)** from their primary sequence.
+
+ > **Developed by**: [G. P. S. Raghava Lab, IIIT-Delhi](https://webs.iiitd.edu.in/raghava/)
+ >
+ > **Model hosted by**: [Dr. GPS Raghava's Group](https://huggingface.co/raghavagps-group/anticp3)

  ---

  ## Model Details

+ | Feature            | Description                                                  |
+ |--------------------|--------------------------------------------------------------|
+ | **Base Model**     | [`facebook/esm2_t33_650M_UR50D`](https://huggingface.co/facebook/esm2_t33_650M_UR50D) |
+ | **Fine-tuned On**  | Anticancer Protein Dataset                                   |
+ | **Model Type**     | Binary Classification                                        |
+ | **Labels**         | `0`: Non-Anticancer<br>`1`: Anticancer                       |
+ | **Framework**      | [Transformers](https://huggingface.co/docs/transformers) + PyTorch |
+ | **Format**         | `safetensors`                                                |

  ---

+ ## Usage

+ Use this model with the Hugging Face `transformers` library:

  ```python
  from transformers import AutoTokenizer, AutoModelForSequenceClassification
  import torch

+ # Load tokenizer and fine-tuned model
  tokenizer = AutoTokenizer.from_pretrained("facebook/esm2_t33_650M_UR50D")
  model = AutoModelForSequenceClassification.from_pretrained("AmishaG/anticp3")

+ # Example protein sequence
  sequence = "MANCVVGYIGERCQYRDLKWWELRGGGGSGGGGSAPAFSVSPASGLSDGQSVSVSVSGAAAGETYYIAQCAPVGGQDACNPATATSFTTDASGAASFSFVVRKSYTGSTPEGTPVGSVDCATAACNLGAGNSGLDLGHVALTFGGGGGSGGGGSDHYNCVSSGGQCLYSACPIFTKIQGTCYRGKAKCCKLEHHHHHH"

+ # Tokenize and run inference
  inputs = tokenizer(sequence, return_tensors="pt", truncation=True)

  with torch.no_grad():
      logits = model(**inputs).logits
      probs = torch.nn.functional.softmax(logits, dim=-1)
      prediction = torch.argmax(probs, dim=1).item()

  labels = {0: "Non-Anticancer", 1: "Anticancer"}
+ print("Prediction:", labels[prediction])
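
The inference step in the README snippet above turns the classifier's two logits into a label through softmax and argmax. As a minimal, dependency-free sketch of that step, the same logic can be written in plain Python. The helper names `softmax` and `classify` and the `threshold` parameter are illustrative assumptions and are not part of the model repository; only the label mapping comes from the README.

```python
import math

# Label mapping as stated in the README.
LABELS = {0: "Non-Anticancer", 1: "Anticancer"}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits, threshold=0.5):
    """Return (label, anticancer_probability) for one pair of logits.

    `threshold` is a hypothetical knob, not something the README defines.
    At 0.5 the decision matches the README's argmax, apart from exact ties.
    """
    anticancer_prob = softmax(logits)[1]
    predicted_class = 1 if anticancer_prob > threshold else 0
    return LABELS[predicted_class], anticancer_prob


if __name__ == "__main__":
    # Made-up logits for illustration; real values come from model(**inputs).logits.
    label, prob = classify([-1.0, 2.0])
    print(f"Prediction: {label} (P(anticancer) = {prob:.3f})")
```

With the real model, the input to this sketch would be `model(**inputs).logits[0].tolist()`. Returning the probability alongside the label makes it possible to apply a stricter cutoff than 0.5 when false positives are costly, though any such cutoff would need to be validated against labeled data.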