|
|
--- |
|
|
library_name: transformers |
|
|
license: mit |
|
|
datasets: |
|
|
- hapaxlegomenon/InferBR |
|
|
language: |
|
|
- pt |
|
|
base_model: |
|
|
- neuralmind/bert-large-portuguese-cased |
|
|
--- |
|
|
# Model Card: BERT-Large-Portuguese-Cased Fine-Tuned on InferBR NLI |
|
|
|
|
|
## Model Details |
|
|
- **Model name:** `felipesfpaula/bertimbau-large-InferBr-NLI` |
|
|
- **Base model:** `neuralmind/bert-large-portuguese-cased` |
|
|
- **Task:** Natural Language Inference (NLI) on Brazilian Portuguese |
|
|
- **Dataset:** [InferBR](https://huggingface.co/datasets/hapaxlegomenon/InferBR) (a loading sketch follows this list)
|
|
- Premise–Hypothesis pairs in Portuguese |
|
|
- Label mapping: |
|
|
- 0 – Contradiction |
|
|
- 1 – Entailment |
|
|
- 2 – Neutral |
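For quick inspection, here is a minimal sketch of loading InferBR with the `datasets` library, assuming the default configuration and the column names described under Training Data below:

```python
from datasets import load_dataset

# Load all InferBR splits (train / validation / test) from the Hub
dataset = load_dataset("hapaxlegomenon/InferBR")

# Inspect one training example: premise, hypothesis, and integer label
example = dataset["train"][0]
print(example["premise"])
print(example["hypothesis"])
print(example["label"])  # 0 = Contradiction, 1 = Entailment, 2 = Neutral
```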
|
|
|
|
|
## Intended Use |
|
|
This model is intended for research and applications requiring Portuguese NLI, such as: |
|
|
- Automated textual reasoning in Portuguese |
|
|
- Downstream tasks: question answering, summarization consistency checks, semantic search |
|
|
- Academic experiments in Portuguese natural language understanding |
|
|
|
|
|
**Not intended for:** |
|
|
- Sensitive decision-making without human oversight |
|
|
- Use on texts in languages other than Brazilian Portuguese |
|
|
|
|
|
## Training Data |
|
|
- **Training split:** InferBR “train” (premise, hypothesis, label) |
|
|
- **Validation split:** InferBR “validation” |
|
|
- **Test split:** InferBR “test” |
|
|
- **Preprocessing** (see the sketch after this list):
|
|
- Tokenized with `neuralmind/bert-large-portuguese-cased` tokenizer |
|
|
- Maximum sequence length: 128 tokens |
|
|
- Padding to max length |
|
|
- Labels cast to integer IDs `{0,1,2}` |
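A minimal sketch of this preprocessing using `datasets.map` (an illustration of the settings above, not the original training script; column names are assumed to match the InferBR splits):

```python
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("neuralmind/bert-large-portuguese-cased")
dataset = load_dataset("hapaxlegomenon/InferBR")

def preprocess(batch):
    # Tokenize premise–hypothesis pairs, truncated and padded to 128 tokens
    encoded = tokenizer(
        batch["premise"],
        batch["hypothesis"],
        max_length=128,
        truncation=True,
        padding="max_length",
    )
    # Cast labels to integer IDs {0, 1, 2}
    encoded["labels"] = [int(label) for label in batch["label"]]
    return encoded

tokenized = dataset.map(preprocess, batched=True)
```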
|
|
|
|
|
## Training Procedure |
|
|
- **Fine-tuned from:** `neuralmind/bert-large-portuguese-cased`
|
|
- **Batch size:** 32 |
|
|
- **Learning rate:** 2e-5 |
|
|
- **Optimizer:** AdamW (with default weight decay) |
|
|
- **Number of epochs:** 10 |
|
|
- **Evaluation strategy:** Evaluated on the validation split at the end of each epoch
|
|
- **Checkpointing:** Best checkpoint selected by validation accuracy (a training sketch follows this list)
|
|
- **Random seed:** 42 |
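A `Trainer`-based sketch that mirrors these hyperparameters, reusing `tokenized` from the preprocessing sketch above. This is an assumption about how training was run; the original script is not published in this card. `Trainer` uses AdamW with its default weight decay, matching the settings listed.

```python
import numpy as np
from transformers import AutoModelForSequenceClassification, Trainer, TrainingArguments

model = AutoModelForSequenceClassification.from_pretrained(
    "neuralmind/bert-large-portuguese-cased", num_labels=3
)

def compute_metrics(eval_pred):
    # Accuracy on the validation split, used to select the best checkpoint
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {"accuracy": (preds == labels).mean()}

training_args = TrainingArguments(
    output_dir="bertimbau-large-InferBr-NLI",
    per_device_train_batch_size=32,
    learning_rate=2e-5,
    num_train_epochs=10,
    evaluation_strategy="epoch",       # evaluate on validation at the end of each epoch
    save_strategy="epoch",
    load_best_model_at_end=True,       # keep the best checkpoint
    metric_for_best_model="accuracy",
    seed=42,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    compute_metrics=compute_metrics,
)
trainer.train()
```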
|
|
|
|
|
## Evaluation Results (Test Set) |
|
|
- **Test accuracy:** 0.9395 |
|
|
- **Test F₁‐macro:** 0.7596 |
|
|
- **F₁ label 0 (Contradiction):** 0.9191 |
|
|
- **F₁ label 1 (Entailment):** 0.6022 |
|
|
- **F₁ label 2 (Neutral):** 0.7575 |
|
|
|
|
|
These metrics were computed on the held-out InferBR test split; a short computation sketch follows the definitions below.
|
|
- `accuracy` = (number of correctly predicted labels) / (total number of examples) |
|
|
- `f1_macro` = unweighted average F₁ across labels {0,1,2} |
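Equivalently, with scikit-learn (a sketch of the metric computation only; the label lists below are placeholders, not test-set outputs):

```python
from sklearn.metrics import accuracy_score, f1_score

# Placeholder predictions/references; in practice these come from the test split
y_true = [0, 1, 2, 1, 0]
y_pred = [0, 1, 2, 2, 0]

accuracy = accuracy_score(y_true, y_pred)
f1_macro = f1_score(y_true, y_pred, average="macro")   # unweighted mean F1 over labels {0, 1, 2}
f1_per_label = f1_score(y_true, y_pred, average=None)  # one F1 score per label
print(accuracy, f1_macro, f1_per_label)
```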
|
|
|
|
|
## Limitations |
|
|
- **Imbalanced performance:** Label 1 (Entailment) has a markedly lower F₁ (0.6022), indicating that entailment is the class the model most often confuses with the other labels.
|
|
- **Domain specificity:** Trained on InferBR, which consists of generic NLI pairs. May not generalize to highly specialized or technical domains (e.g., legal, medical). |
|
|
- **Language restrictions:** Only supports Brazilian Portuguese. Performance on European Portuguese or code‐switched text is not guaranteed. |
|
|
- **Bias and fairness:** InferBR may contain topics or writing styles that do not cover all registers of Portuguese. Use caution if deploying in production for sensitive tasks. |
|
|
|
|
|
## How to Use |
|
|
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# 1. Load tokenizer and model from HuggingFace
tokenizer = AutoTokenizer.from_pretrained("felipesfpaula/bertimbau-large-InferBr-NLI")
model = AutoModelForSequenceClassification.from_pretrained("felipesfpaula/bertimbau-large-InferBr-NLI")

# 2. Encode a premise–hypothesis pair
premise = "O gato está sentado no sofá."
hypothesis = "O gato está deitado no sofá."
encoded = tokenizer(premise, hypothesis, return_tensors="pt", max_length=128, truncation=True, padding="max_length")

# 3. Run inference
with torch.no_grad():
    outputs = model(**encoded)
logits = outputs.logits
pred_id = torch.argmax(logits, dim=-1).item()

# 4. Map prediction to label
label_map = {0: "Contradiction", 1: "Entailment", 2: "Neutral"}
print(f"Predicted label: {label_map[pred_id]}")
```
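Alternatively, the `text-classification` pipeline accepts premise–hypothesis pairs directly. Note that if the checkpoint's config does not define `id2label`, the pipeline reports generic `LABEL_0`/`LABEL_1`/`LABEL_2` names, which correspond to the mapping above:

```python
from transformers import pipeline

nli = pipeline("text-classification", model="felipesfpaula/bertimbau-large-InferBr-NLI")

# Premise–hypothesis pairs are passed as text / text_pair
result = nli({"text": "O gato está sentado no sofá.", "text_pair": "O gato está deitado no sofá."})
print(result)  # e.g. {'label': ..., 'score': ...}
```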