---
language: fa
library_name: transformers
tags:
- classification
- legal
- iranian-legal
- persian
- case-type
pipeline_tag: text-classification
---
# QomSSLab/CaseTypeClassifier-fa
**QomSSLab/CaseTypeClassifier-fa** is a Persian legal text classifier that predicts whether a court ruling (رأی) belongs to a **civil (حقوقی)** or **criminal (کیفری)** category.
The model is designed for use in Iranian legal NLP pipelines, document organization, and downstream analysis of judicial data.
## 💡 Use Cases
- Automatic classification of Persian court rulings into civil or criminal categories.
- Preprocessing step for legal analytics and document retrieval systems.
- Assisting legal researchers and developers in structuring Persian legal corpora.
## 🧠 Model Details
- **Language**: Persian (Farsi)
- **Task**: Text Classification
- **Classes**: `civil` (حقوقی), `criminal` (کیفری). Predictions carry the Persian label string, as the example output below shows for `کیفری`.
- **Pipeline Tag**: `text-classification`
## 📦 Example Usage
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

model_name = "QomSSLab/CaseTypeClassifier-fa"

# Load the tokenizer and the fine-tuned classification model from the Hub
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)

# English: "In this case, the defendant has been convicted of stealing public property."
text = "در این پرونده متهم به سرقت اموال عمومی محکوم شده است."
result = classifier(text)
print(result)
```
Example output (the label `کیفری` means "criminal"):
```python
[
{'label': 'کیفری', 'score': 0.9969141483306885}
]
```
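For downstream code that expects English class names, the Persian labels can be mapped back to the names listed under Model Details. The following is a minimal sketch, not part of the released model: `LABEL_TO_ENGLISH` and `classify_rulings` are hypothetical helpers, and the mapping assumes the model emits exactly the two Persian strings `حقوقی` and `کیفری`. The `classifier` argument is any callable that behaves like the pipeline created above.

```python
from typing import Callable, Dict, List

# Assumption: the model emits exactly these two Persian label strings.
LABEL_TO_ENGLISH: Dict[str, str] = {"حقوقی": "civil", "کیفری": "criminal"}


def classify_rulings(classifier: Callable, texts: List[str]) -> List[dict]:
    """Classify a batch of rulings and attach an English class name to each."""
    # truncation=True asks the tokenizer to cut inputs that exceed the
    # model's maximum length, which matters for long court rulings.
    predictions = classifier(texts, truncation=True)

    results = []
    for text, pred in zip(texts, predictions):
        results.append(
            {
                "text": text,
                "label": pred["label"],
                # Fall back to "unknown" if an unexpected label appears.
                "label_en": LABEL_TO_ENGLISH.get(pred["label"], "unknown"),
                "score": pred["score"],
            }
        )
    return results
```

With the pipeline from the example above, `classify_rulings(classifier, [text])` would return one dictionary per input text, each containing the original label, its English name, and the score.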
## 📊 Evaluation
The model was trained and evaluated on a balanced dataset of Persian court rulings.
The table below reports the results of that evaluation.
| Metric | Value |
|:-------|:------:|
| **Training Loss** | 0.0358 |
| **Validation Loss** | 0.033996 |
| **Accuracy** | **0.9951** |
| **F1 Score** | **0.9951** |
| **Precision** | **0.9951** |
| **Recall** | **0.9951** |
**Final Performance:** The model achieved **99.51% accuracy** and an **F1 score of 0.9951** on the validation set.
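To reproduce metrics of this kind on your own labeled rulings, the four scores can be computed directly from gold and predicted labels. The sketch below is an illustration only: `evaluate_predictions` is a hypothetical helper, and it uses macro averaging over the two classes, which is an assumption because this card does not state which averaging scheme produced the reported values.

```python
from typing import Dict, Sequence


def evaluate_predictions(y_true: Sequence[str], y_pred: Sequence[str]) -> Dict[str, float]:
    """Return accuracy plus macro-averaged precision, recall, and F1."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    pairs = list(zip(y_true, y_pred))
    labels = sorted(set(y_true) | set(y_pred))

    precisions, recalls, f1s = [], [], []
    for label in labels:
        # Per-class counts of true positives, false positives, false negatives.
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)

        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0

        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(t == p for t, p in pairs) / len(pairs),
        "precision": sum(precisions) / n,  # macro average (assumption)
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }
```

On a balanced two-class dataset, macro and weighted averages coincide, so the choice matters mainly when the evaluation data is skewed toward one class.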
## ⚠️ Limitations
- Performance may degrade on highly abbreviated or informal texts.
- Designed primarily for Iranian legal language; may not generalize to non-Iranian legal contexts.
- Does not classify subtypes (e.g., family, property, or financial cases).
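
One way to work around the first two limitations is to route low-confidence predictions to manual review instead of accepting them automatically. The following is a minimal sketch under stated assumptions: `split_by_confidence` is a hypothetical helper, and the `0.90` threshold is an illustrative value that should be tuned on held-out data, not a recommendation from the model authors.

```python
from typing import Dict, List, Tuple

# Hypothetical cut-off; tune it on held-out data for your own corpus.
REVIEW_THRESHOLD = 0.90


def split_by_confidence(
    predictions: List[Dict], threshold: float = REVIEW_THRESHOLD
) -> Tuple[List[Dict], List[Dict]]:
    """Split pipeline-style predictions into accepted and needs-review lists."""
    accepted: List[Dict] = []
    needs_review: List[Dict] = []
    for pred in predictions:
        # Each prediction is a dict with "label" and "score" keys,
        # in the same shape as the example output above.
        if pred["score"] >= threshold:
            accepted.append(pred)
        else:
            needs_review.append(pred)
    return accepted, needs_review
```

A high score does not guarantee a correct label on out-of-domain text, so this filter reduces the review workload but does not replace spot checks.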