---
license: mit
language:
- en
- bn
metrics:
- f1
- accuracy
base_model:
- google-bert/bert-base-multilingual-cased
---

# EN-BN Translation Error Detection Model

This model detects translation errors in English-Bangla translations.

## Model Architecture

- Base: multilingual BERT (`google-bert/bert-base-multilingual-cased`)
- Fine-tuned for multi-label classification of translation errors
- Labels: Semantic Error, Cultural Error, Literal Translation Error, Syntactical Error, No Error

## Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the tokenizer and the fine-tuned classifier from the Hub
tokenizer = AutoTokenizer.from_pretrained("SamiaHaque/ENBNErrorDetector")
model = AutoModelForSequenceClassification.from_pretrained("SamiaHaque/ENBNErrorDetector")

# Prepare input
text = "Your source and translation text here..."
inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True)

# Get predictions
with torch.no_grad():
    outputs = model(**inputs)
    # Sigmoid yields an independent probability per label (multi-label setup)
    predictions = torch.sigmoid(outputs.logits)
```
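
The `predictions` tensor from the usage snippet holds one probability per label. Turning those into label names could look like the minimal sketch below. It rests on two assumptions not stated in this card: the label order follows the "Labels" list under Model Architecture, and a label counts as predicted at a threshold of 0.5. The helper `decode_labels` and the example probabilities are hypothetical, and the actual index-to-label mapping should be checked against `model.config.id2label`.

```python
# Assumed label order, copied from the "Labels" list in this card.
# The real index order may differ; verify with model.config.id2label.
LABELS = [
    "Semantic Error",
    "Cultural Error",
    "Literal Translation Error",
    "Syntactical Error",
    "No Error",
]


def decode_labels(probabilities, labels=LABELS, threshold=0.5):
    """Return the names of all labels whose probability reaches the threshold.

    `probabilities` is a plain list of floats, one per label.
    The 0.5 default threshold is an assumption, not a tuned value.
    """
    if len(probabilities) != len(labels):
        raise ValueError("expected one probability per label")
    return [name for name, p in zip(labels, probabilities) if p >= threshold]


# Made-up probabilities for illustration only
example = [0.91, 0.08, 0.62, 0.10, 0.03]
print(decode_labels(example))  # ['Semantic Error', 'Literal Translation Error']
```

With the real model, `predictions[0].tolist()` would supply the probabilities for the first input. Because each label is scored independently, several labels (or none) can pass the threshold for a single input.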