---
language: fa
pipeline_tag: token-classification
library_name: transformers
---

# QomSSLab/Verdict_Splitter

This repository hosts an XLM-RoBERTa model with a token-classification head, trained to tag Persian (`fa`) text with the labels listed under [Labels](#labels).

## Usage

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

# Load the tokenizer and the fine-tuned model from the Hugging Face Hub.
model_id = "QomSSLab/Verdict_Splitter"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForTokenClassification.from_pretrained(model_id)

# aggregation_strategy="simple" groups adjacent tokens that share a label.
tagger = pipeline("token-classification", model=model, tokenizer=tokenizer, aggregation_strategy="simple")

text = "مثال از یک ورودی فارسی"  # "An example of a Persian input"
for entity in tagger(text):
    print(entity)
```

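
To turn the tagger output into document sections, consecutive spans that share a label can be merged. The following is a minimal sketch, not part of the released code: `merge_sections` is a hypothetical helper, and `sample_entities` is a hand-written stand-in for `tagger(text)` so the snippet runs without downloading the model. It assumes each item carries the `entity_group`, `start`, and `end` keys that the `transformers` pipeline returns when an aggregation strategy is set.

```python
from typing import Dict, List


def merge_sections(text: str, entities: List[Dict]) -> List[Dict]:
    """Merge consecutive spans that share a label into one section each."""
    sections: List[Dict] = []
    for ent in sorted(entities, key=lambda e: e["start"]):
        label = ent["entity_group"]
        if sections and sections[-1]["label"] == label:
            # Same label as the previous span: extend that section.
            sections[-1]["end"] = ent["end"]
        else:
            sections.append({"label": label, "start": ent["start"], "end": ent["end"]})
    for sec in sections:
        # Slice the original text so spacing and punctuation are preserved.
        sec["text"] = text[sec["start"]:sec["end"]]
    return sections


# Hypothetical stand-in for `tagger(text)` output (ASCII text for readability).
sample_text = "aaa bbb ccc ddd"
sample_entities = [
    {"entity_group": "مقدمه", "score": 0.99, "word": "aaa", "start": 0, "end": 3},
    {"entity_group": "مقدمه", "score": 0.98, "word": "bbb", "start": 4, "end": 7},
    {"entity_group": "تصمیم", "score": 0.97, "word": "ccc ddd", "start": 8, "end": 15},
]

for section in merge_sections(sample_text, sample_entities):
    print(section["label"], repr(section["text"]))
```

Because sections are cut from the original string by character offset, any gap between two merged spans (for example whitespace) is absorbed into the section.
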
## Labels

- `O`
- `استدلال`
- `تصمیم`
- `خارج`
- `خلع`
- `مقدمه`
- `پایانی`

## Validation Metrics

- Precision: 0.7067
- Recall: 0.8457
- F1: 0.7700
- Accuracy: 0.9730

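
The reported F1 is consistent with the harmonic mean of the reported precision and recall. A small check, where `f1_score` is a helper written for this illustration rather than a function from the training code:

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the validation metrics above.
overall_f1 = f1_score(0.7067, 0.8457)
print(f"{overall_f1:.4f}")  # prints 0.7700
```
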
### Per-label Breakdown

| Label | Precision | Recall | F1 | Support |
| --- | --- | --- | --- | --- |
| O | 0.8617 | 0.8223 | 0.8416 | 394 |
| استدلال | 0.9733 | 0.9394 | 0.9561 | 6635 |
| تصمیم | 0.9895 | 0.9700 | 0.9797 | 5361 |
| خارج | 1.0000 | 1.0000 | 1.0000 | 0 |
| خلع | 1.0000 | 1.0000 | 1.0000 | 0 |
| مقدمه | 0.9689 | 0.9981 | 0.9833 | 10871 |
| پایانی | 0.9722 | 0.9879 | 0.9800 | 1732 |

`خارج` and `خلع` have zero support in the validation set, so their scores of 1.0000 do not reflect measured performance.

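
For single-label token classification, the support-weighted average of per-label recall equals overall accuracy. Assuming the per-label rows are token-level scores (the card does not say so explicitly), the table can be checked against the reported accuracy of 0.9730 with a short sketch:

```python
# (label, recall, support) copied from the per-label table.
rows = [
    ("O", 0.8223, 394),
    ("استدلال", 0.9394, 6635),
    ("تصمیم", 0.9700, 5361),
    ("خارج", 1.0000, 0),
    ("خلع", 1.0000, 0),
    ("مقدمه", 0.9981, 10871),
    ("پایانی", 0.9879, 1732),
]

# Total number of labeled tokens across all classes.
total_support = sum(support for _, _, support in rows)

# Each label contributes recall * support correctly tagged tokens;
# labels with zero support contribute nothing.
weighted_recall = sum(recall * support for _, recall, support in rows) / total_support

print(total_support)             # prints 24993
print(f"{weighted_recall:.4f}")  # prints 0.9730
```

The result matches the reported accuracy to four decimal places, which supports reading the per-label rows as token-level figures.
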