---

license: apache-2.0
language:
- en
library_name: transformers
tags:
- finance
- aspect-classification
- absa
- finbert
- text-classification
datasets:
- pauri32/fiqa-2018
base_model: ProsusAI/finbert
metrics:
- accuracy
- f1
pipeline_tag: text-classification
---


# ABSA-FinBERT: Aspect Classification for Financial Text

This model classifies financial headlines and tweets into four aspect categories: **Corporate**, **Economy**, **Market**, and **Stock**.

## Model Description

ABSA-FinBERT is a fine-tuned version of [ProsusAI/finbert](https://huggingface.co/ProsusAI/finbert) for Level-1 aspect classification on the FiQA dataset. The model was trained with class-weighted cross-entropy loss to address extreme class imbalance in the training data.

This work is motivated by [Yang et al. (2018)](https://arxiv.org/abs/1808.07931), "Financial Aspect-Based Sentiment Analysis using Deep Representations," which demonstrated that financial text often contains multi-dimensional information requiring aspect-level analysis.

## Intended Use

- Classifying financial news headlines by topic/aspect
- Preprocessing step for aspect-based sentiment analysis pipelines
- Financial text categorization

## Training Data

Trained on the [FiQA dataset](https://huggingface.co/datasets/pauri32/fiqa-2018) (WWW'18 Open Challenge), with Level-1 aspect labels extracted from hierarchical annotations.

| Aspect | Training Examples | Percentage |
|--------|-------------------|------------|
| Stock | 562 | 58.5% |
| Corporate | 367 | 38.2% |
| Market | 26 | 2.7% |
| Economy | 4 | 0.4% |

### Class Weights Applied
Due to extreme imbalance, inverse frequency weights were used: Corporate (0.65), Economy (59.94), Market (9.22), Stock (0.43).
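The reported weights follow the standard "balanced" inverse-frequency formula, `total / (n_classes * class_count)`, applied to the training counts in the table above. A minimal sketch reproducing them:

```python
# Per-class training counts from the table above
counts = {"Corporate": 367, "Economy": 4, "Market": 26, "Stock": 562}

total = sum(counts.values())
n_classes = len(counts)

# "Balanced" inverse-frequency weight: total / (n_classes * class_count)
weights = {label: total / (n_classes * n) for label, n in counts.items()}
print({k: round(v, 2) for k, v in weights.items()})
# {'Corporate': 0.65, 'Economy': 59.94, 'Market': 9.22, 'Stock': 0.43}
```

In PyTorch, these values would be passed as the `weight` tensor to `torch.nn.CrossEntropyLoss`, so each Economy example contributes roughly 140x more to the loss than each Stock example.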

## Performance

| Metric | Score |
|--------|-------|
| Accuracy | 88.59% |
| Macro-F1 | 0.5429 |
| Weighted-F1 | 0.8688 |

### Per-Class Results

| Aspect | Precision | Recall | F1-Score | Support |
|--------|-----------|--------|----------|---------|
| Corporate | 0.91 | 0.94 | 0.92 | 64 |
| Economy | 0.00 | 0.00 | 0.00 | 3 |
| Market | 0.50 | 0.25 | 0.33 | 8 |
| Stock | 0.89 | 0.95 | 0.92 | 74 |

**Note:** The model performs well on majority classes but fails on Economy due to having only 4 training examples. Class weighting cannot overcome severe data scarcity.
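The gap between Macro-F1 and Weighted-F1 follows directly from how the per-class scores are aggregated. Using the rounded per-class values from the table above (so the results only approximate the reported metrics):

```python
# Per-class F1 and support, rounded values from the table above
f1 = {"Corporate": 0.92, "Economy": 0.00, "Market": 0.33, "Stock": 0.92}
support = {"Corporate": 64, "Economy": 3, "Market": 8, "Stock": 74}

# Macro-F1: unweighted mean -- every class counts equally,
# so Economy's 0.00 drags the average down
macro_f1 = sum(f1.values()) / len(f1)

# Weighted-F1: support-weighted mean -- dominated by Corporate and Stock
total = sum(support.values())
weighted_f1 = sum(f1[c] * support[c] / total for c in f1)

print(round(macro_f1, 2), round(weighted_f1, 2))  # 0.54 0.87
```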

## Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("your-username/absa-finbert")
model = AutoModelForSequenceClassification.from_pretrained("your-username/absa-finbert")
model.eval()

# Label mapping
id2label = {0: "Corporate", 1: "Economy", 2: "Market", 3: "Stock"}

# Example inference
text = "How Kraft-Heinz Merger Came Together in Speedy 10 Weeks"
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
with torch.no_grad():
    outputs = model(**inputs)
prediction = torch.argmax(outputs.logits, dim=-1).item()

print(f"Aspect: {id2label[prediction]}")  # Output: Corporate
```

## Training Procedure

- **Base model:** ProsusAI/finbert
- **Learning rate:** 3e-5
- **Batch size:** 16 (effective 32 with gradient accumulation)
- **Epochs:** 10 (early stopping patience: 3)
- **Loss:** Weighted cross-entropy
- **Optimizer:** AdamW with warmup (10%)
- **Mixed precision:** FP16
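One common way to apply a weighted cross-entropy loss with the `transformers` `Trainer` is to override `compute_loss`; this is a sketch of that pattern, not necessarily the exact training code used here:

```python
import torch
from transformers import Trainer


class WeightedLossTrainer(Trainer):
    """Trainer variant that applies per-class weights in the cross-entropy loss."""

    def __init__(self, *args, class_weights=None, **kwargs):
        super().__init__(*args, **kwargs)
        # e.g. torch.tensor([0.65, 59.94, 9.22, 0.43]) for
        # (Corporate, Economy, Market, Stock)
        self.class_weights = class_weights

    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        labels = inputs.pop("labels")
        outputs = model(**inputs)
        loss_fn = torch.nn.CrossEntropyLoss(
            weight=self.class_weights.to(outputs.logits.device)
        )
        loss = loss_fn(outputs.logits, labels)
        return (loss, outputs) if return_outputs else loss
```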

## Limitations

- Economy class is effectively unlearnable with only 4 training examples
- Market class has limited representation (26 examples)
- Model is optimized for short financial headlines/tweets, not long-form text

## Citation

If you use this model, please cite:

```bibtex
@misc{absa-finbert-2025,
  title={ABSA-FinBERT: Aspect Classification for Financial Text},
  author={Cirillo, Nick and Memon, Suha and Truong, Kalen and Zhang, Bruce},
  year={2025},
  howpublished={\url{https://huggingface.co/your-username/absa-finbert}}
}
```

## References

- Yang, S., Rosenfeld, J., & Makutonin, J. (2018). Financial Aspect-Based Sentiment Analysis using Deep Representations. arXiv:1808.07931.
- Araci, D. (2019). FinBERT: Financial Sentiment Analysis with Pre-trained Language Models. arXiv:1908.10063.
- Maia, M., et al. (2018). WWW'18 Open Challenge: Financial Opinion Mining and Question Answering.