Kenpache/flame

@@ -12,6 +12,9 @@ tags:
   - sentiment-analysis
   - multilingual
   - xlm-roberta
 datasets:
   - Kenpache/multilingual-financial-sentiment
 metrics:
@@ -33,13 +36,13 @@ model-index:
             value: 0.8102
 ---
-# FLAME - Financial Language Analysis for Multilingual Economics
 **One model. Six languages. Real financial sentiment.**
-FLAME classifies financial text as **Negative**, **Neutral**, or **Positive** across English, Chinese, Japanese, German, French, and Spanish.
-Built on XLM-RoBERTa-base, domain-adapted on 35K+ financial texts, fine-tuned on ~39K multilingual financial news samples.
 ## Quick Start
@@ -48,39 +51,129 @@ from transformers import pipeline
 classifier = pipeline("text-classification", model="Kenpache/flame")
-classifier("Revenue surged 40% year-over-year, beating analyst expectations.")
-# [{'label': 'positive', 'score': 0.96}]
-classifier("La empresa reportó pérdidas significativas este trimestre.")
-# [{'label': 'negative', 'score': 0.93}]
 ```
 ## Results
 | Metric | Score |
 |---|---|
-| Accuracy | **0.8103** |
-| F1 (weighted) | **0.8102** |
-| Precision | **0.8111** |
-| Recall | **0.8103** |
-| Class | Precision | Recall | F1 |
 |---|---|---|---|
-| Negative | 0.78 | 0.83 | 0.81 |
-| Neutral | 0.83 | 0.79 | 0.81 |
-| Positive | 0.80 | 0.82 | 0.81 |
-## Languages
-EN | ZH | JA | DE | FR | ES
-## Training
-XLM-RoBERTa-base + Task-Adaptive Pre-Training (MLM) + fine-tuning with label smoothing, cosine LR schedule, and SWA checkpoint averaging.
 ## Dataset
-[Kenpache/multilingual-financial-sentiment](https://huggingface.co/datasets/Kenpache/multilingual-financial-sentiment) -- ~39K samples from CNBC, Yahoo Finance, Reuters, Nikkei, Sina Finance, and 80+ other financial news sources.
 ## License

   - sentiment-analysis
   - multilingual
   - xlm-roberta
+  - financial-nlp
+  - stock-market
+  - trading
 datasets:
   - Kenpache/multilingual-financial-sentiment
 metrics:
             value: 0.8102
 ---
+# FLAME — Financial Language Analysis for Multilingual Economics
 **One model. Six languages. Real financial sentiment.**
+FLAME classifies financial text as **Negative**, **Neutral**, or **Positive** across English, Chinese, Japanese, German, French, and Spanish — in a single model, no language detection needed.
+Built on XLM-RoBERTa with domain-adaptive pretraining on 35K+ financial texts, then fine-tuned on ~39K real financial news samples from 80+ sources worldwide.
 ## Quick Start
 classifier = pipeline("text-classification", model="Kenpache/flame")
+# English
+classifier("Apple reported record quarterly revenue of $124 billion, up 11% year over year.")
+# [{'label': 'Positive', 'score': 0.96}]
+# Chinese
+classifier("该公司季度亏损扩大至5亿美元，远超市场预期。")
+# [{'label': 'Negative', 'score': 0.94}]
+# Japanese
+classifier("トヨタ自動車の営業利益は前年同期比30%増の1兆円を突破した。")
+# [{'label': 'Positive', 'score': 0.95}]
+# German
+classifier("Die Aktie verlor nach der Gewinnwarnung deutlich an Wert.")
+# [{'label': 'Negative', 'score': 0.92}]
+# French
+classifier("Le chiffre d'affaires du groupe a progressé de 8% au premier semestre.")
+# [{'label': 'Positive', 'score': 0.93}]
+# Spanish
+classifier("Las acciones de la empresa se mantuvieron estables tras la publicación de resultados.")
+# [{'label': 'Neutral', 'score': 0.89}]
+```
+## Batch Processing
+```python
+from transformers import pipeline
+# device=0 selects the first GPU; pass device=-1 to run on CPU instead
+classifier = pipeline("text-classification", model="Kenpache/flame", device=0)
+texts = [
+    "Stocks rallied after the Fed signaled a pause in rate hikes.",
+    "The company filed for Chapter 11 bankruptcy protection.",
+    "Q3 earnings were in line with analyst expectations.",
+    "日経平均株価が3万円台を回復した。",
+    "Les marchés européens ont clôturé en forte baisse.",
+    "El beneficio neto de la compañía creció un 25% interanual.",
+]
+results = classifier(texts, batch_size=32)
+for text, result in zip(texts, results):
+    print(f"{result['label']:>8} ({result['score']:.2f})  {text[:70]}")
 ```
 ## Results
 | Metric | Score |
 |---|---|
+| **Accuracy** | **0.8103** |
+| **F1 (weighted)** | **0.8102** |
+| **Precision (weighted)** | **0.8111** |
+| **Recall (weighted)** | **0.8103** |
+### Per-Class Performance
+| Class | Precision | Recall | F1 | Support |
+|---|---|---|---|---|
+| Negative | 0.78 | 0.83 | 0.81 | 917 |
+| Neutral | 0.83 | 0.79 | 0.81 | 1,779 |
+| Positive | 0.80 | 0.82 | 0.81 | 1,225 |
+All three classes reach the same F1 of 0.81 despite imbalanced training data (Neutral 45%, Positive 31%, Negative 24%).
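As a rough consistency check, the weighted averages in the summary table can be recomputed from the per-class rows by weighting each class by its support. The sketch below uses only the numbers from the tables above. Because the per-class values are rounded to two decimals, the recomputed figures agree with the reported ones only to within rounding. The helper name `weighted_average` is illustrative and not part of the model card.

```python
# Per-class metrics and supports copied from the Per-Class Performance table.
per_class = {
    "Negative": {"precision": 0.78, "recall": 0.83, "f1": 0.81, "support": 917},
    "Neutral": {"precision": 0.83, "recall": 0.79, "f1": 0.81, "support": 1779},
    "Positive": {"precision": 0.80, "recall": 0.82, "f1": 0.81, "support": 1225},
}

total_support = sum(row["support"] for row in per_class.values())


def weighted_average(metric: str) -> float:
    """Support-weighted mean of one per-class metric."""
    return sum(row[metric] * row["support"] for row in per_class.values()) / total_support


# Share of each class in the evaluated samples.
shares = {name: row["support"] / total_support for name, row in per_class.items()}

for metric in ("precision", "recall", "f1"):
    print(f"{metric:>9}: {weighted_average(metric):.4f}")
for name, share in shares.items():
    print(f"{name:>9}: {share:.1%} of {total_support} samples")
```

The class shares computed from the supports are close to the training distribution quoted above, which suggests the evaluation split follows a similar class balance.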
+## Labels
+| Label | ID | What it captures |
+|---|---|---|
+| **Negative** | 0 | Losses, decline, bearish signals, layoffs, bankruptcy |
+| **Neutral** | 1 | Factual statements, announcements, no clear sentiment |
+| **Positive** | 2 | Growth, gains, bullish signals, record earnings, upgrades |
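When working with raw model outputs instead of the `pipeline` helper, the ID column above is what turns a logit vector into a label. A minimal sketch, assuming the checkpoint's label mapping matches this table; the names `ID2LABEL`, `softmax`, and `decode` are illustrative, and the logits are made up rather than real model output.

```python
import math

# Mapping copied from the Labels table; assumed to match the checkpoint config.
ID2LABEL = {0: "Negative", 1: "Neutral", 2: "Positive"}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits):
    """Turn three raw logits into a {'label', 'score'} dict like the pipeline output."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"label": ID2LABEL[best], "score": probs[best]}


# Made-up logits for illustration only.
print(decode([-1.2, 0.3, 2.5]))
```

Keeping the full probability vector from `softmax`, rather than only the top class, is useful when a downstream step needs the Neutral probability as well.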
+## Supported Languages
+| Language | Code | Training Samples | Key Sources |
 |---|---|---|---|
+| Japanese | JA | 8,287 | Nikkei, Nikkan Kogyo, Reuters JP |
+| Chinese | ZH | 7,930 | Sina Finance, EastMoney, 10jqka |
+| Spanish | ES | 7,125 | Expansión, Cinco Días, Bloomberg Línea |
+| English | EN | 6,887 | CNBC, Yahoo Finance, Fortune, Reuters |
+| German | DE | 5,023 | Börse.de, FAZ, NTV Börse |
+| French | FR | 3,935 | Boursorama, Tradingsat, BFM Business |
+## Use Cases
+- **News Monitoring** — classify sentiment of financial headlines across global markets in real time
+- **Trading Signals** — feed sentiment scores into quantitative trading strategies
+- **Portfolio Risk** — monitor sentiment shifts across international holdings
+- **Earnings Analysis** — analyze tone of corporate press releases and earnings calls
+- **Social Media** — track financial discussions on multilingual platforms
+- **Research** — cross-language sentiment studies in financial NLP
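For the monitoring and trading-signal use cases, one simple way to turn classifier output into a numeric feature is a signed score in [-1, 1]. This is a hypothetical post-processing step, not something the model card prescribes: the function names and the choice to map Neutral to 0 are assumptions, and the sample results are hand-written in the shape the pipeline returns.

```python
# Sign attached to each label; Neutral contributing 0 is an assumption.
LABEL_SIGN = {"negative": -1.0, "neutral": 0.0, "positive": 1.0}


def signed_sentiment(result):
    """Map one {'label', 'score'} dict to a value in [-1, 1]."""
    # Lower-casing accepts both 'Positive' and 'positive' label spellings.
    return LABEL_SIGN[result["label"].lower()] * result["score"]


def average_sentiment(results):
    """Mean signed sentiment over a batch of results; 0.0 for an empty batch."""
    if not results:
        return 0.0
    return sum(signed_sentiment(r) for r in results) / len(results)


# Hand-written results for illustration, not real model output.
sample = [
    {"label": "Positive", "score": 0.8},
    {"label": "Negative", "score": 0.4},
    {"label": "Neutral", "score": 0.9},
]
print(average_sentiment(sample))
```

Averaging over a window of headlines smooths out individual misclassifications, which matters given the roughly 81% accuracy reported above.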
+## How It Was Built
+1. **Domain Adaptation (Task-Adaptive Pre-Training, TAPT):** Masked Language Modeling on 35K+ financial texts across 6 languages, so the model learns financial vocabulary and patterns before seeing any labels.
+2. **Fine-Tuning:** Supervised classification with label smoothing (0.1), a cosine LR schedule (learning rate 2e-5), and Stochastic Weight Averaging (SWA) of the top-3 checkpoints for more robust generalization.
+| Parameter | Value |
+|---|---|
+| Base model | xlm-roberta-base (278M params) |
+| Learning rate | 2e-5 |
+| Scheduler | Cosine |
+| Label smoothing | 0.1 |
+| Effective batch size | 64 |
+| Precision | FP16 |
+| Post-processing | SWA (top-3 checkpoints) |
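The three fine-tuning ingredients in the table can be illustrated in plain Python. This is a sketch under stated assumptions, not the author's training code: it uses one common label-smoothing convention (the smoothing mass spread uniformly over all classes), a cosine decay to zero with no warmup, and an element-wise mean over toy parameter dicts standing in for real checkpoints. How the top-3 checkpoints were selected is not stated in the card, and all function names are illustrative.

```python
import math

NUM_CLASSES = 3        # Negative, Neutral, Positive
LABEL_SMOOTHING = 0.1  # from the parameter table
BASE_LR = 2e-5         # from the parameter table


def smoothed_target(label_id, num_classes=NUM_CLASSES, eps=LABEL_SMOOTHING):
    """One-hot target mixed with a uniform distribution over all classes."""
    uniform = eps / num_classes
    return [uniform + (1.0 - eps if i == label_id else 0.0) for i in range(num_classes)]


def cosine_lr(step, total_steps, base_lr=BASE_LR):
    """Cosine decay from base_lr at step 0 to 0 at total_steps (no warmup assumed)."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))


def average_checkpoints(checkpoints):
    """Element-wise mean of parameter dicts (name -> list of floats)."""
    n = len(checkpoints)
    return {
        name: [sum(values) / n for values in zip(*(ckpt[name] for ckpt in checkpoints))]
        for name in checkpoints[0]
    }


# Toy stand-ins for three saved checkpoints.
toy_checkpoints = [{"w": [1.0, 2.0]}, {"w": [2.0, 4.0]}, {"w": [3.0, 6.0]}]

print(smoothed_target(2))
print([cosine_lr(s, 100) for s in (0, 50, 100)])
print(average_checkpoints(toy_checkpoints))
```

With real checkpoints the same averaging would be applied tensor by tensor to the saved state dicts before evaluating the merged model.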
 ## Dataset
+Trained on [Kenpache/multilingual-financial-sentiment](https://huggingface.co/datasets/Kenpache/multilingual-financial-sentiment) — ~39K curated financial news samples from 80+ real sources worldwide.
+## Citation
+```bibtex
+@misc{flame2025,
+  title={FLAME: Financial Language Analysis for Multilingual Economics},
+  author={Kenpache},
+  year={2025},
+  url={https://huggingface.co/Kenpache/flame}
+}
+```
 ## License