language:
- en
base_model:
- FacebookAI/roberta-base
---

### 📘 Model Description

**FinRoBerta** is a domain‑adapted variant of **RoBERTa‑base**, trained using **Domain‑Adaptive Pretraining (DAPT)** on the **DerivedFunction/sec-filings-snippets-10K** dataset. This dataset consists of curated excerpts from SEC 10‑K filings, enabling the model to better capture the specialized vocabulary, syntax, and discourse patterns of financial regulatory documents.

Key characteristics:

- **Base model**: RoBERTa‑base (general‑purpose pretrained transformer)
- **Adaptation method**: Domain‑Adaptive Pretraining (DAPT)
- **Domain corpus**: SEC 10‑K filings (snippets)
- **Language**: English
- **License**: Apache 2.0
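
Because DAPT continues RoBERTa's masked-language-modeling objective, the checkpoint can be probed directly with fill-mask queries over financial text. A minimal sketch using the `transformers` pipeline API; the repo id `your-namespace/FinRoBerta` is a placeholder for illustration, not the published id:

```python
MODEL_ID = "your-namespace/FinRoBerta"  # placeholder repo id, substitute the real one

def top_fill_mask(sentence: str, k: int = 5):
    """Return the top-k (token, score) completions for a <mask> token."""
    # transformers is imported inside the function so the sketch only touches
    # the library (and downloads weights) when actually called.
    from transformers import pipeline

    fill = pipeline("fill-mask", model=MODEL_ID, top_k=k)
    return [(c["token_str"], c["score"]) for c in fill(sentence)]

if __name__ == "__main__":
    # Domain-flavored probe: a well-adapted model should rank financial terms highly.
    for token, score in top_fill_mask("The company recorded a net <mask> for the fiscal year."):
        print(token, round(score, 3))
```
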
### 🔍 Intended Use

- As a **foundation for downstream tasks** in financial NLP (e.g., classification, extraction, summarization)
- Research into domain adaptation techniques and their impact on language model performance
- Benchmarking DAPT workflows for financial/legal text corpora
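
For the downstream-task case, a typical pattern is to load the domain-adapted encoder with a freshly initialized task head. A sketch under stated assumptions (the repo id is a placeholder and the three-way label set is illustrative, neither is defined by this card):

```python
MODEL_ID = "your-namespace/FinRoBerta"  # placeholder repo id, substitute the real one
LABELS = ["negative", "neutral", "positive"]  # illustrative label set, not from this card

def build_classifier(num_labels: int = len(LABELS)):
    """Load the domain-adapted encoder with a newly initialized classification head."""
    # Lazy import keeps the sketch lightweight until it is actually called.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID, num_labels=num_labels)
    return tokenizer, model

# From here, train with transformers.Trainer (or a custom loop) on labeled 10-K snippets.
```
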
### ⚖️ Limitations

- Not fine‑tuned for specific tasks (classification, QA, summarization); requires further adaptation for task‑level performance
- Inherits biases from both the RoBERTa base corpus and SEC filings
- Not suitable for predictive financial advice or trading decisions