---
license: mit
language:
- en
tags:
- finance
- text-generation
- mixture-of-experts
- continual-learning
- financial-nlp
- custom-architecture
library_name: transformers
pipeline_tag: text-generation
---
# Meridian.AI – Finance Language Model

Meridian.AI is a custom sparse Mixture-of-Experts (MoE) language model continually trained on finance data. It is designed to run on commodity CPU hardware (including GitHub Actions free runners) and improves automatically via scheduled training runs.

> **Not financial advice.** This is an experimental research model.
## Model Details
| Property | Value |
|---|---|
| Architecture | Custom SMoE + GQA + RoPE + SwiGLU + Numeracy Encoding |
| Total parameters | ~479M (tied embeddings) |
| Unique parameters | ~283M |
| Experts | 8 total, top-2 active per token |
| Tokenizer | Qwen/Qwen2.5-0.5B (151k vocab) |
| Context length | 2048 tokens |
| Training method | Continual learning with EWC (Elastic Weight Consolidation) |
| License | MIT |
## Architecture
Meridian.AI is a fully custom transformer built from scratch with the following components:
- **Sparse MoE FFN** – 8 experts per MoE layer, top-2 routing. Only 2 of 8 experts activate per token, keeping compute low while retaining capacity. MoE layers alternate with dense layers every 2nd transformer layer.
- **Grouped Query Attention (GQA)** – 12 query heads, 4 key/value heads. Reduces memory bandwidth during inference.
- **Rotary Position Embeddings (RoPE)** – `rope_theta=500000` for length generalisation.
- **SwiGLU FFN** – activation function used in dense layers and expert FFNs.
- **RMSNorm** – replaces LayerNorm for faster normalisation.
- **Financial Numeracy Encoding** – a learned 64-dim embedding for numeric tokens that improves precision on quantitative finance tasks.
- **Elastic Weight Consolidation (EWC)** – prevents catastrophic forgetting across continual training runs.
- **Tied word embeddings** – the input embeddings and `lm_head` share weights, saving ~197M parameters.
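The top-2 routing described above can be sketched in plain PyTorch. This is an illustrative sketch only: the dimensions are placeholders and the experts are simple SiLU MLPs rather than the model's actual SwiGLU experts.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Top2MoE(nn.Module):
    """Sketch of top-2 sparse MoE routing (dims are illustrative)."""

    def __init__(self, d_model=512, d_ff=1024, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                      # x: (tokens, d_model)
        logits = self.router(x)                # (tokens, n_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)   # renormalise over the 2 chosen experts
        out = torch.zeros_like(x)
        for k in range(self.top_k):            # only the selected experts run
            for e in range(len(self.experts)):
                mask = idx[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k:k+1] * self.experts[e](x[mask])
        return out

x = torch.randn(4, 512)
y = Top2MoE()(x)
print(y.shape)  # torch.Size([4, 512])
```

Each token pays for two expert forward passes regardless of the total expert count, which is why capacity can grow without a matching growth in per-token compute.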
## How to Use

The model weights are stored under the `checkpoint/` subfolder in this repo.
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "meridianal/FinAI"

tokenizer = AutoTokenizer.from_pretrained(repo_id, subfolder="checkpoint")
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    subfolder="checkpoint",
    trust_remote_code=True,
    torch_dtype=torch.float32,
    low_cpu_mem_usage=True,
)
model.eval()

prompt = """### Instruction:
What does a high price-to-earnings ratio indicate about a stock?
### Response:
"""

inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    out = model.generate(
        **inputs,
        max_new_tokens=200,
        do_sample=True,
        temperature=0.8,
        top_p=0.92,
        repetition_penalty=1.3,
        no_repeat_ngram_size=3,
        pad_token_id=tokenizer.pad_token_id,
        eos_token_id=tokenizer.eos_token_id,
    )

print(tokenizer.decode(out[0], skip_special_tokens=True))
```
### Prompt format

All training examples use this instruction/response format:

```
### Instruction:
<your question or task>
### Response:
<answer>
```
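A small helper can build this template programmatically; the function name is illustrative, not part of the repo:

```python
def build_prompt(instruction: str) -> str:
    """Wrap a task in the model's ### Instruction / ### Response template."""
    return f"### Instruction:\n{instruction}\n### Response:\n"

print(build_prompt("Define EBITDA in one sentence."))
```

Leaving the prompt open after `### Response:` matches the training format, so the model continues with the answer rather than restating the question.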
Classification tasks are also formatted this way with a short label-only response.
### Generation tips

Continual training can introduce mild repetition. Recommended settings:

| Parameter | Range |
|---|---|
| `temperature` | 0.7 – 0.95 |
| `top_p` | 0.85 – 0.95 |
| `repetition_penalty` | 1.2 – 1.4 |
| `no_repeat_ngram_size` | 3 |
If you see repeated phrases, increase `repetition_penalty` and lower `temperature`.
## Training Data
Training streams finance datasets from the FinanceMTEB family:
- Financial sentiment analysis (FinancialPhraseBank, etc.)
- ESG and sustainability classification
- FOMC statement analysis
- Fraud and financial complaint datasets
- Financial QA pairs
- Earnings call and filing excerpts
Datasets are loaded in streaming mode with a 15 MB-per-source cap to stay within GitHub Actions memory limits.
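One way a per-source byte cap like this could be enforced is with a simple budget over each stream. This is a hedged sketch under that assumption, not the pipeline's actual code:

```python
def capped_stream(records, max_bytes=15 * 1024 * 1024):
    """Yield text records until the per-source byte budget is exhausted."""
    used = 0
    for text in records:
        size = len(text.encode("utf-8"))
        if used + size > max_bytes:
            break  # stop this source; move on to the next dataset
        used += size
        yield text

# Toy source: a 25-byte budget admits the first two 10-byte records only.
docs = ["a" * 10, "b" * 10, "c" * 10]
print(list(capped_stream(docs, max_bytes=25)))  # ['aaaaaaaaaa', 'bbbbbbbbbb']
```

Because the generator stops consuming the underlying iterator at the cap, a streaming dataset never needs to be fully downloaded or held in memory.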
## Continual Learning
The model trains automatically via GitHub Actions on a scheduled hourly cron. Key features:
- **EWC regularisation** – a Fisher information matrix computed from recent data protects previously learned weights from being overwritten.
- **RAM-safe checkpointing** – training halts and saves before hitting memory limits (`MAX_RAM_GB=13`).
- **Optimizer-free saves** – AdaFactor optimizer state is discarded before upload to keep checkpoint size small.
- **Auto-recovery** – each run pulls the latest checkpoint from this repo before training, resuming from where the last run left off.
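EWC adds a quadratic penalty, lam/2 * sum_i F_i * (theta_i - theta*_i)^2, that anchors weights the Fisher information marks as important for earlier data. A minimal sketch (the function name and toy model are illustrative, not the repo's code):

```python
import torch

def ewc_penalty(model, fisher, old_params, lam=1.0):
    """lam/2 * sum_i F_i * (theta_i - theta*_i)^2 over protected params."""
    total = torch.zeros(())
    for name, p in model.named_parameters():
        if name in fisher:  # only penalise parameters with stored Fisher values
            total = total + (fisher[name] * (p - old_params[name]) ** 2).sum()
    return 0.5 * lam * total

# Toy check: one weight set to 3.0, anchored at 0 with unit Fisher information.
m = torch.nn.Linear(1, 1, bias=False)
with torch.no_grad():
    m.weight.fill_(3.0)
fisher = {"weight": torch.ones_like(m.weight)}
anchor = {"weight": torch.zeros_like(m.weight)}
print(ewc_penalty(m, fisher, anchor, lam=2.0).item())  # 9.0
```

In training this term is simply added to the task loss, so gradient descent trades off new-data fit against drift from the anchored weights.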
## Limitations

- Experimental model – outputs may be incorrect, hallucinated, or outdated.
- Not intended for production financial applications.
- Continual training without human evaluation gates means quality can regress between runs.
- Numeric reasoning is improved by the numeracy encoder but not guaranteed to be accurate.
## Source Code

Training pipeline, architecture, and CI workflows: [github.com/MeridianAlgo/FinAI](https://github.com/MeridianAlgo/FinAI)