---
language:
- en
license: mit
tags:
- text-classification
- multi-label-classification
- financial-nlp
- finance
- event-detection
datasets:
- sentivent
metrics:
- f1
- precision
- recall
pipeline_tag: text-classification
---
# FinDeBERTa: Multi-Label Financial Event Classifier
FinDeBERTa is a fine-tuned DeBERTa-v3-Large model for multi-label financial event classification. It predicts one or more of 18 event types from financial news headlines, reaching 0.692 macro F1 on the SENTiVENT dataset (see Performance below).
## Model Details
- **Base Model**: microsoft/deberta-v3-large
- **Task**: Multi-label text classification
- **Training**: Fine-tuned with Focal Loss and per-class threshold optimization
## Event Labels
The model classifies text into 18 financial event types:
```python
["CSR/Brand", "Deal", "Dividend", "Employment", "Expense", "Facility",
"FinancialReport", "Financing", "Investment", "Legal", "Macroeconomics",
"Merger/Acquisition", "Product/Service", "Profit/Loss", "Rating", "Revenue",
"SalesVolume", "SecurityValue"]
```
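In a multi-label setup, each headline maps to a multi-hot target vector over this fixed label order (one 0/1 slot per label, several slots may be set at once). A minimal sketch, using a hypothetical `to_multi_hot` helper:

```python
LABELS = ["CSR/Brand", "Deal", "Dividend", "Employment", "Expense", "Facility",
          "FinancialReport", "Financing", "Investment", "Legal", "Macroeconomics",
          "Merger/Acquisition", "Product/Service", "Profit/Loss", "Rating", "Revenue",
          "SalesVolume", "SecurityValue"]

def to_multi_hot(events):
    """Convert a list of event names into an 18-dim 0/1 target vector."""
    return [1 if label in events else 0 for label in LABELS]

# A headline tagged with two events sets two slots:
vec = to_multi_hot(["Merger/Acquisition", "Deal"])
# 18 slots; "Deal" (index 1) and "Merger/Acquisition" (index 11) are 1
```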
## Usage
### Optimized Usage with Per-Class Thresholds (Recommended)
For best performance, use the per-class optimized thresholds:
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
import numpy as np
from huggingface_hub import hf_hub_download

model_name = "ritessshhh/FinDeBERTa"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Download the per-class thresholds shipped with the model
thresholds_path = hf_hub_download(repo_id=model_name, filename="thresholds.npy")
thresholds = np.load(thresholds_path)

text = "Tesla to acquire a battery startup in a 400 million dollar deal."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)

with torch.no_grad():
    outputs = model(**inputs)
probs = torch.sigmoid(outputs.logits)[0].cpu().numpy()

# Keep each label whose probability clears its own threshold
predictions = [
    {"label": model.config.id2label[i], "score": float(prob)}
    for i, prob in enumerate(probs) if prob >= thresholds[i]
]

# Sort by score, highest first
predictions = sorted(predictions, key=lambda x: x["score"], reverse=True)
print(predictions)
# Output: [{"label": "Merger/Acquisition", "score": 0.98}, ...]
```
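Per-class thresholds like those in `thresholds.npy` are typically tuned on held-out validation data by picking, for each label, the cutoff that maximizes that label's F1. The exact procedure used for this model is not documented, so the following is an illustrative sketch with a hypothetical `optimize_thresholds` helper:

```python
import numpy as np

def _f1(y_true, y_pred):
    """Binary F1 for one label column."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def optimize_thresholds(val_probs, val_labels, grid=np.arange(0.05, 0.95, 0.05)):
    """For each class, grid-search the cutoff that maximizes validation F1."""
    thresholds = np.full(val_probs.shape[1], 0.5)
    for c in range(val_probs.shape[1]):
        f1s = [_f1(val_labels[:, c], (val_probs[:, c] >= t).astype(int)) for t in grid]
        thresholds[c] = grid[int(np.argmax(f1s))]
    return thresholds
```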
### Basic Usage (with default threshold)
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_name = "ritessshhh/FinDeBERTa"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

text = "Tesla to acquire a battery startup in a 400 million dollar deal."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)

with torch.no_grad():
    outputs = model(**inputs)
probs = torch.sigmoid(outputs.logits)[0].cpu().numpy()

# Using default threshold of 0.5 for every class
threshold = 0.5
predictions = [
    {"label": model.config.id2label[i], "score": float(prob)}
    for i, prob in enumerate(probs) if prob >= threshold
]
print(predictions)
```
## Training Details
- **Loss Function**: Focal Loss (gamma=2.0) with dampened class weights
- **Optimizer**: AdamW with cosine learning rate scheduling
- **Batch Size**: 8 (with gradient accumulation steps=2)
- **Epochs**: 10
- **Learning Rate**: 2e-5
- **Weight Decay**: 0.02
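A minimal sketch of the multi-label focal loss described above (gamma=2.0), applied to sigmoid logits. The "dampened class weights" would be passed via `alpha`; this is an assumed generic formulation, not the exact training code:

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0, alpha=None):
    """Multi-label focal loss: down-weights easy, well-classified examples."""
    bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p_t = torch.exp(-bce)             # probability assigned to the true class
    loss = (1 - p_t) ** gamma * bce   # focusing term shrinks easy-example loss
    if alpha is not None:             # optional per-class (dampened) weights
        loss = loss * alpha
    return loss.mean()
```

With `gamma=0` and no `alpha`, this reduces to plain binary cross-entropy.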
## Performance
| Metric | Score |
|--------|-------|
| Macro F1 | 0.692 |
| Micro F1 | 0.691 |
| Precision (Macro) | 0.738 |
| Recall (Macro) | 0.691 |
| Exact Match Ratio | 0.532 |
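Macro and micro F1 aggregate the 18 labels differently: macro averages per-class F1 equally, while micro pools all individual label decisions. A sketch of both, computed from multi-hot predictions (illustrative helper, not the evaluation code used here):

```python
import numpy as np

def f1_scores(y_true, y_pred):
    """Macro and micro F1 over multi-hot arrays of shape (n_samples, n_labels)."""
    tp = ((y_true == 1) & (y_pred == 1)).sum(axis=0)
    fp = ((y_true == 0) & (y_pred == 1)).sum(axis=0)
    fn = ((y_true == 1) & (y_pred == 0)).sum(axis=0)
    per_class = 2 * tp / np.maximum(2 * tp + fp + fn, 1)  # F1 = 2TP/(2TP+FP+FN)
    macro = per_class.mean()                  # each label counts equally
    micro = 2 * tp.sum() / max(2 * tp.sum() + fp.sum() + fn.sum(), 1)  # pooled
    return macro, micro
```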
### Per-Class Performance (top 10 labels by F1)
| Label | F1 Score | Precision | Recall |
|-------|----------|-----------|--------|
| Dividend | 1.000 | 1.000 | 1.000 |
| Employment | 0.923 | 0.857 | 1.000 |
| Merger/Acquisition | 0.892 | 0.967 | 0.829 |
| Profit/Loss | 0.833 | 0.824 | 0.843 |
| SecurityValue | 0.829 | 0.843 | 0.815 |
| Rating | 0.790 | 0.800 | 0.780 |
| Revenue | 0.780 | 0.800 | 0.762 |
| SalesVolume | 0.748 | 0.833 | 0.678 |
| Financing | 0.714 | 0.714 | 0.714 |
| Deal | 0.696 | 0.889 | 0.571 |
## Limitations
- Optimized for financial news headlines (short text)
- May not generalize well to other domains
- Performance varies by event type (rare events like "Facility" have lower F1)
## Citation
If you use this model, please cite:
```bibtex
@misc{findeberta2024,
  author       = {ritessshhh},
  title        = {FinDeBERTa: Multi-Label Financial Event Classifier},
  year         = {2024},
  publisher    = {HuggingFace},
  howpublished = {\url{https://huggingface.co/ritessshhh/FinDeBERTa}}
}
```