---
license: mit
language:
- en
tags:
- finance
- text-generation
- mixture-of-experts
- continual-learning
- financial-nlp
- custom-architecture
library_name: transformers
pipeline_tag: text-generation
---
# Meridian.AI — Finance Language Model
Meridian.AI is a custom sparse Mixture-of-Experts (MoE) language model continually trained on finance data. It is designed to run on commodity CPU hardware (including GitHub Actions free runners) and is updated automatically via scheduled training runs.
> **Not financial advice.** This is an experimental research model.
---
## Model Details
| Property | Value |
|---|---|
| Architecture | Custom SMoE + GQA + RoPE + SwiGLU + Numeracy Encoding |
| Total parameters | ~479M (tied embeddings) |
| Unique parameters | ~283M |
| Experts | 8 total, top-2 active per token |
| Tokenizer | `Qwen/Qwen2.5-0.5B` (151k vocab) |
| Context length | 2048 tokens |
| Training method | Continual learning with EWC (Elastic Weight Consolidation) |
| License | MIT |
---
## Architecture
Meridian.AI is a fully custom transformer built from scratch with the following components:
- **Sparse MoE FFN** — 8 experts per MoE layer with top-2 routing: only 2 of 8 experts activate per token, keeping compute low while retaining capacity. MoE layers replace the dense FFN in every second transformer layer.
- **Grouped Query Attention (GQA)** — 12 query heads, 4 key/value heads; reduces memory bandwidth during inference.
- **Rotary Position Embeddings (RoPE)** — `rope_theta=500,000` for length generalisation.
- **SwiGLU FFN** — activation used in both dense layers and expert FFNs.
- **RMSNorm** — replaces LayerNorm for faster normalisation.
- **Financial Numeracy Encoding** — a learned 64-dim embedding for numeric tokens, improving precision on quantitative finance tasks.
- **Elastic Weight Consolidation (EWC)** — prevents catastrophic forgetting across continual training runs.
- **Tied word embeddings** — input embeddings and `lm_head` share weights, saving ~197M parameters.
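The top-2 routing can be sketched as follows. This is a simplified, self-contained illustration of standard top-k MoE gating, not the model's actual routing code; the function name and the plain-Python softmax are ours.

```python
import math

def top2_route(gate_logits):
    """Pick the two highest-scoring experts and renormalise their
    softmax weights so the pair sums to 1 (standard top-k gating)."""
    # Numerically stable softmax over all expert logits.
    m = max(gate_logits)
    exps = [math.exp(g - m) for g in gate_logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep only the two most probable experts.
    top2 = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:2]
    weight_sum = probs[top2[0]] + probs[top2[1]]
    return [(i, probs[i] / weight_sum) for i in top2]

# With 8 experts, each token's FFN output is a weighted sum of
# just the two selected experts; the other six are never computed.
routes = top2_route([1.0, -2.0, 0.5, 2.0, -1.0, 0.0, 0.2, -0.5])
```

Because only 2 of 8 experts run per token, the per-token FFN compute is roughly a quarter of a dense model with the same total expert capacity.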
---
## How to Use
> The model weights are stored under the `checkpoint/` subfolder in this repo.
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "meridianal/FinAI"

# Weights live in the checkpoint/ subfolder of the repo.
tokenizer = AutoTokenizer.from_pretrained(repo_id, subfolder="checkpoint")
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    subfolder="checkpoint",
    trust_remote_code=True,  # custom architecture requires remote code
    torch_dtype=torch.float32,
    low_cpu_mem_usage=True,
)
model.eval()

prompt = """### Instruction:
What does a high price-to-earnings ratio indicate about a stock?
### Response:
"""

inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    out = model.generate(
        **inputs,
        max_new_tokens=200,
        do_sample=True,
        temperature=0.8,
        top_p=0.92,
        repetition_penalty=1.3,
        no_repeat_ngram_size=3,
        pad_token_id=tokenizer.pad_token_id,
        eos_token_id=tokenizer.eos_token_id,
    )
print(tokenizer.decode(out[0], skip_special_tokens=True))
```
### Prompt format
All training examples use this instruction/response format:
```
### Instruction:
<your question or task>
### Response:
<answer>
```
Classification tasks are also formatted this way with a short label-only response.
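A small helper that assembles this template (hypothetical; not part of the repo, but it matches the format above):

```python
def build_prompt(instruction: str) -> str:
    """Wrap a question or task in the instruction/response template
    the model was trained on, leaving the response empty to generate."""
    return f"### Instruction:\n{instruction}\n### Response:\n"

prompt = build_prompt("What does a high price-to-earnings ratio indicate?")
```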
### Generation tips
Continual training can introduce mild repetition. Recommended settings:
| Parameter | Range |
|---|---|
| `temperature` | 0.7 – 0.95 |
| `top_p` | 0.85 – 0.95 |
| `repetition_penalty` | 1.2 – 1.4 |
| `no_repeat_ngram_size` | 3 |
If you see repeated phrases, increase `repetition_penalty` and lower `temperature`.
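As one concrete starting point (our suggestion, drawn from the ranges in the table above), an anti-repetition preset to pass into `model.generate(**inputs, **anti_repeat, ...)`:

```python
# Conservative preset for when output starts repeating:
# higher repetition penalty, lower temperature.
anti_repeat = dict(
    temperature=0.7,
    top_p=0.9,
    repetition_penalty=1.4,
    no_repeat_ngram_size=3,
)
```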
---
## Training Data
Training streams finance datasets from the FinanceMTEB family:
- Financial sentiment analysis (FinancialPhraseBank, etc.)
- ESG and sustainability classification
- FOMC statement analysis
- Fraud and financial complaint datasets
- Financial QA pairs
- Earnings call and filing excerpts
Datasets are loaded in streaming mode with a 15MB-per-source cap to stay within GitHub Actions memory limits.
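The per-source cap behaves roughly like the sketch below: a wrapper that stops consuming a stream once its cumulative text size reaches a byte budget. This is an illustrative stand-in, not the pipeline's actual helper, and it assumes each example exposes a `"text"` field.

```python
def capped_stream(examples, max_bytes=15 * 1024 * 1024):
    """Yield examples from a (possibly infinite) stream until their
    cumulative UTF-8 text size would exceed max_bytes, then stop."""
    used = 0
    for ex in examples:
        size = len(ex["text"].encode("utf-8"))
        if used + size > max_bytes:
            break
        used += size
        yield ex

# Example with a tiny 100-byte cap: only the first two rows fit.
rows = [{"text": "a" * 40}, {"text": "b" * 40}, {"text": "c" * 40}]
kept = list(capped_stream(rows, max_bytes=100))
```

Because the stream is cut off by bytes rather than example count, each source contributes a bounded amount of data regardless of how its rows are sized.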
---
## Continual Learning
The model trains automatically via GitHub Actions on a scheduled hourly cron. Key features:
- **EWC regularisation** — a Fisher information matrix computed from recent data protects previously learned weights from being overwritten.
- **RAM-safe checkpointing** — training halts and saves before hitting memory limits (`MAX_RAM_GB=13`).
- **Optimizer-free saves** — AdaFactor optimizer state is discarded before upload to keep checkpoint size small.
- **Auto-recovery** — each run pulls the latest checkpoint from this repo before training, resuming from where the last run left off.
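The EWC objective adds a Fisher-weighted quadratic penalty that pulls each parameter toward its value after the previous run: loss = task_loss + (λ/2) · Σᵢ Fᵢ (θᵢ − θ*ᵢ)². A minimal sketch over flat parameter lists (illustrative only; the real implementation operates on tensors and a chosen λ):

```python
def ewc_loss(task_loss, params, old_params, fisher, lam=1.0):
    """task_loss + (lam / 2) * sum_i F_i * (theta_i - theta_star_i)**2.
    Parameters the Fisher matrix marks as important (large F_i) are
    penalised more for drifting from their previous values."""
    penalty = sum(
        f * (p - p_old) ** 2
        for p, p_old, f in zip(params, old_params, fisher)
    )
    return task_loss + 0.5 * lam * penalty

# A parameter with high Fisher importance (F=10.0) contributes far
# more penalty for the same drift than a low-importance one (F=0.1).
loss = ewc_loss(2.0, [1.5, 0.3], [1.0, 0.0], [10.0, 0.1], lam=1.0)
```

Important weights are thus anchored across runs, while unimportant ones remain free to adapt to new finance data.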
---
## Limitations
- Experimental model β€” outputs may be incorrect, hallucinated, or outdated.
- Not intended for production financial applications.
- Continual training without human evaluation gates means quality can regress between runs.
- Numeric reasoning is improved by the numeracy encoder but not guaranteed accurate.
---
## Source Code
Training pipeline, architecture, and CI workflows:
[github.com/MeridianAlgo/FinAI](https://github.com/MeridianAlgo/FinAI)