---
title: README
emoji: 💻
colorFrom: gray
colorTo: blue
sdk: static
pinned: false
---
# FinText: A repository of Financial LLMs
## Stage 1 Release
We are thrilled to introduce a specialized collection of 68 large language models (LLMs), meticulously designed for accounting and finance. The FinText models have been pre-trained on domain-specific historical data, addressing challenges such as look-ahead bias and information leakage. These models are crafted to elevate the accuracy and depth of financial research and analysis.
### 💡 Key Features
- Domain-Specific Training: FinText utilises diverse financial datasets, including news articles, regulatory filings, IP records, official speeches (ECB, Fed), and more.
- Time-Period Specific Models: Separate models are pre-trained for each year from 2007 to 2023, ensuring the utmost precision and historical relevance.
- RoBERTa Architecture: The suite includes both a base model with 125 million parameters and a smaller variant with 51 million parameters, totalling 34 pre-trained models. 🎯
- Two Distinct Pre-Training Durations: We also introduce a series of models to explore the impact of further pre-training. These models are pre-trained for an additional 5 epochs, extending the total pre-training to 10 epochs.
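As a rough sanity check on the quoted model sizes, the parameter count of a RoBERTa-base-style encoder can be estimated from its configuration. The sketch below uses the public roberta-base hyperparameters (vocab 50,265; hidden 768; 12 layers; FFN 3,072) as an assumption — the exact FinText configurations are not specified here:

```python
# Back-of-the-envelope parameter count for a RoBERTa-base-style encoder.
# Hyperparameters follow the public roberta-base config (an assumption;
# the FinText configs themselves are not given in this README).

def roberta_params(vocab=50265, hidden=768, layers=12, ffn=3072, max_pos=514):
    # Embeddings: word + position + token-type tables, plus one LayerNorm
    emb = vocab * hidden + max_pos * hidden + hidden + 2 * hidden
    # Self-attention: Q, K, V and output projections (weights + biases)
    attn = 4 * (hidden * hidden + hidden)
    # Feed-forward: two dense layers (hidden -> ffn -> hidden)
    ffn_p = hidden * ffn + ffn + ffn * hidden + hidden
    # Two LayerNorms per transformer layer (scale + shift each)
    ln = 2 * 2 * hidden
    return emb + layers * (attn + ffn_p + ln)

print(f"{roberta_params() / 1e6:.0f}M parameters")  # roughly 124M
```

This lands close to the 125 million figure quoted above for the base model; the small remainder comes from the LM head and rounding conventions, which vary between reports.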
Stay tuned for upcoming updates and new features for FinText. We expect to launch stages 2 and 3 within the next 9 months.