Dr. Jorge Abreu Vicente committed
Commit · f8f191b
Parent(s): 307014a
update model card README.md

README.md CHANGED
@@ -19,53 +19,23 @@ should probably proofread and complete it, then remove this comment. -->

 This model was trained from scratch on the source_data_nlp dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.
-- Accuracy Score: 0.
-- Precision: 0.
-- Recall: 0.
-- F1: 0.
+- Loss: 0.0118
+- Accuracy Score: 0.9970
+- Precision: 0.9524
+- Recall: 0.9865
+- F1: 0.9691

 ## Model description

-
+More information needed

 ## Intended uses & limitations

-
-
-It will not load with the default HuggingFace pipelines. Must be load in the following way,
-after installing the [soda-roberta](https://github.com/source-data/soda-roberta) library:
-
-```python
-from smtag.excell_roberta.modeling_excell_roberta import EXcellRobertaForTokenClassification
-from transformers import AutoTokenizer
-from datasets import load_dataset
-
-ds = load_dataset("EMBO/sd-nlp-non-tokenized","PANELIZATION")
-SENTENCE = """Figure 2A. HEK293T cells were transfected with MYC-FOXP3 and FLAG-USP44 encoding expression constructs using Polyethylenimine. 48hrs post-transfection, cells were harvested, lysed, and anti-FLAG or anti-MYC antibody coated beads were used to immunoprecipitate the given labeled protein along with its bi\nnding partner. Co-IP' ed proteins were subjected to SDS
-PAGE followed by immunoblot analysis. Antibodies recognizing FLAG or MYC tags were used to probe for USP44 and FOXP3, respectively. B. Endogenous co-IP of USP44 and FOXP3 in murine iTregs. iTregs were generated as in Fig. 1 from naïve CD4+T cells FACS isolated from pooled suspensions of the lymph node and\n spleen cells of wild type C57BL/6 mice (n = 2-3 / exp
-eriment). iTregs were lysed and key proteins were immunoprecipitated using either anti-USP44 (right panel) or anti-FOXP3 (left panel) antibody. Proteins pulled-down in this experiment were then resolved and analyzed by immunoblot using anti-FOXP3 or anti-USP44 antibodies. C. Endogenous co-IP of USP44 and FO\nXP3 in murine nTregs. nTregs (CD4+CD25high) isolated
-by FACS were activated by anti-CD3 and anti-CD28 (1 and 4 ug/ml, respectively) overnight in the presence of IL-2 (100 U/ml). The cells were lysed and proteins were immunoprecipitated using either anti-Foxp3 (left panel) or anti-Usp44 (right panel). Proteins pulled down in this experiment were then resolved a\nnd identified with the indicated antibodies. D . N
-aïve murine CD4+T cells were isolated by FACS from lymph node and spleen cell suspension of USP44fl/fl CD4Cre+ mice and that of their wild type littermates (USP44fl/fl CD4Cre-mice; n = 2-3 / group / experiment) . iTreg cells were generated from these mice as described for Fig. 1 before incubation on a microscop\ne slide pre-coated with poly-L lysine for 1h. Ad
-hered cells were then fixed by PFA for 0.5 followed by blocking with 1% BSA for 1h, then incubation with the specified antibodies. Representative confocal microscopy images (40X) were visualized for endogenous USP44 (red) and FOXP3 Baxter et al (). DAPI was used to visualize cell nuclei (blue); scale bar 50μm."""
-
-
-model = EXcellRobertaForTokenClassification.from_pretrained("EMBO/sd-panelization-v2")
-tokenizer = AutoTokenizer.from_pretrained("EMBO/sd-panelization-v2", is_pretokenized=False, add_prefix_space=True)
-
-outputs = model(**tokenizer(SENTENCE, return_tensors="pt"))
-
-logits = outputs[0].cpu() # B x L H
-proba = logits.softmax(-1) # B x L x H
-labels = logits.argmax(-1) # B x L
-
-for label, token in zip(labels[0], tokenizer(SENTENCE, return_tensors="pt")["input_ids"][0]):
-    print(f"{model.id2label.get(label.item())}\t{tokenizer.decode(token)}")
-```
+More information needed

 ## Training and evaluation data

-
+More information needed

 ## Training procedure

@@ -78,14 +48,13 @@ The following hyperparameters were used during training:
 - seed: 42
 - optimizer: Adafactor
 - lr_scheduler_type: linear
-- num_epochs:
+- num_epochs: 1.0

 ### Training results

 | Training Loss | Epoch | Step | Validation Loss | Accuracy Score | Precision | Recall | F1 |
 |:-------------:|:-----:|:----:|:---------------:|:--------------:|:---------:|:------:|:------:|
-| 0.
-| 0.0049 | 2.0 | 432 | 0.0064 | 0.9982 | 0.9689 | 0.9905 | 0.9795 |
+| 0.0078 | 1.0 | 216 | 0.0118 | 0.9970 | 0.9524 | 0.9865 | 0.9691 |


 ### Framework versions
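
The precision, recall and F1 values reported in the diff above can be sanity-checked, since F1 is the harmonic mean of precision and recall. The sketch below is an illustration and not part of the commit: the helper name `f1_from_precision_recall` is hypothetical, and the inputs are the rounded four-decimal figures from the model card, so the recomputed score may differ from the reported one in the last digit.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the two table rows in the diff.
rows = {
    "epoch 1.0 (added row)": (0.9524, 0.9865, 0.9691),
    "epoch 2.0 (removed row)": (0.9689, 0.9905, 0.9795),
}

for name, (precision, recall, reported) in rows.items():
    recomputed = f1_from_precision_recall(precision, recall)
    # Inputs are rounded, so compare with a tolerance rather than exactly.
    print(f"{name}: recomputed={recomputed:.4f} reported={reported:.4f}")
```

Both rows should agree with the reported F1 to within rounding error, which suggests the card's metrics are internally consistent for each checkpoint.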