---
license: mit
language:
  - en
tags:
  - finance
  - text-generation
  - mixture-of-experts
  - continual-learning
  - financial-nlp
  - custom-architecture
library_name: transformers
pipeline_tag: text-generation
---

# Meridian.AI — Finance Language Model

Meridian.AI is a custom sparse Mixture-of-Experts (MoE) language model continually trained on finance data. It is designed to run on commodity CPU hardware (including GitHub Actions free runners) and improves automatically via scheduled training runs.

> **Not financial advice.** This is an experimental research model.

---

## Model Details

| Property | Value |
|---|---|
| Architecture | Custom SMoE + GQA + RoPE + SwiGLU + Numeracy Encoding |
| Total parameters | ~479M (tied embeddings) |
| Unique parameters | ~283M |
| Experts | 8 total, top-2 active per token |
| Tokenizer | `Qwen/Qwen2.5-0.5B` (151k vocab) |
| Context length | 2048 tokens |
| Training method | Continual learning with EWC (Elastic Weight Consolidation) |
| License | MIT |

---

## Architecture

Meridian.AI is a fully custom transformer built from scratch with the following components:

- **Sparse MoE FFN** — 8 experts per MoE layer with top-2 routing: only 2 of the 8 experts activate per token, keeping compute low while retaining capacity. MoE layers replace the dense FFN in every second transformer layer.
- **Grouped Query Attention (GQA)** — 12 query heads, 4 key/value heads. Reduces KV-cache size and memory bandwidth during inference.
- **Rotary Position Embeddings (RoPE)** — `rope_theta=500000` for better length generalisation.
- **SwiGLU FFN** — gated activation used in both dense layers and expert FFNs.
- **RMSNorm** — replaces LayerNorm for faster normalisation.
- **Financial Numeracy Encoding** — a learned 64-dim embedding for numeric tokens that improves precision on quantitative finance tasks.
- **Elastic Weight Consolidation (EWC)** — prevents catastrophic forgetting across continual training runs.
- **Tied word embeddings** — input embeddings and `lm_head` share weights, saving ~197M parameters.

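Top-2 routing can be sketched in a few lines. This is an illustrative standalone function, not the model's actual router module; names and shapes are assumptions:

```python
import torch
import torch.nn.functional as F

def top2_route(hidden: torch.Tensor, router_weight: torch.Tensor):
    """Score each token against every expert, keep the two best,
    and renormalise their gate probabilities to sum to 1."""
    logits = hidden @ router_weight                # [tokens, n_experts]
    probs = F.softmax(logits, dim=-1)
    gate_vals, expert_idx = probs.topk(2, dim=-1)  # top-2 experts per token
    gates = gate_vals / gate_vals.sum(dim=-1, keepdim=True)
    return gates, expert_idx                       # each [tokens, 2]
```

Each token's output is then the gate-weighted sum of the FFN outputs of its two selected experts.
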
---

## How to Use

> The model weights are stored under the `checkpoint/` subfolder in this repo.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "meridianal/FinAI"

tokenizer = AutoTokenizer.from_pretrained(repo_id, subfolder="checkpoint")
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    subfolder="checkpoint",
    trust_remote_code=True,
    torch_dtype=torch.float32,
    low_cpu_mem_usage=True,
)
model.eval()

prompt = """### Instruction:
What does a high price-to-earnings ratio indicate about a stock?

### Response:
"""

inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    out = model.generate(
        **inputs,
        max_new_tokens=200,
        do_sample=True,
        temperature=0.8,
        top_p=0.92,
        repetition_penalty=1.3,
        no_repeat_ngram_size=3,
        pad_token_id=tokenizer.pad_token_id,
        eos_token_id=tokenizer.eos_token_id,
    )

print(tokenizer.decode(out[0], skip_special_tokens=True))
```

### Prompt format

All training examples use this instruction/response format:

```
### Instruction:
<your question or task>

### Response:
<answer>
```

Classification tasks are also formatted this way with a short label-only response.
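A small helper (hypothetical, not part of this repo) makes it easy to wrap any question in this format before tokenising:

```python
def build_prompt(instruction: str) -> str:
    """Wrap a question or task in the model's instruction/response format."""
    return f"### Instruction:\n{instruction}\n\n### Response:\n"

prompt = build_prompt("What does a high price-to-earnings ratio indicate about a stock?")
```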

### Generation tips

Continual training can introduce mild repetition. Recommended settings:

| Parameter | Range |
|---|---|
| `temperature` | 0.7 – 0.95 |
| `top_p` | 0.85 – 0.95 |
| `repetition_penalty` | 1.2 – 1.4 |
| `no_repeat_ngram_size` | 3 |

If you see repeated phrases, increase `repetition_penalty` and lower `temperature`.

---

## Training Data

Training streams finance datasets from the FinanceMTEB family:

- Financial sentiment analysis (FinancialPhraseBank, etc.)
- ESG and sustainability classification
- FOMC statement analysis
- Fraud and financial complaint datasets
- Financial QA pairs
- Earnings call and filing excerpts

Datasets are loaded in streaming mode with a 15MB-per-source cap to stay within GitHub Actions memory limits.
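The per-source cap can be sketched as a generator that stops once a stream has yielded roughly 15MB of text. This is an illustrative sketch, not the repo's actual loader; the function name and `text_key` parameter are assumptions:

```python
def cap_stream(examples, max_bytes=15 * 1024 * 1024, text_key="text"):
    """Yield examples from a (possibly infinite) stream, stopping once the
    cumulative UTF-8 size of their text would exceed max_bytes."""
    total = 0
    for example in examples:
        size = len(example[text_key].encode("utf-8"))
        if total + size > max_bytes:
            break
        total += size
        yield example
```

Because the cap is enforced on the stream itself, no source is ever fully materialised in memory.
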

---

## Continual Learning

The model trains automatically via GitHub Actions on a scheduled hourly cron. Key features:

- **EWC regularisation** — a Fisher information matrix computed from recent data protects previously learned weights from being overwritten.
- **RAM-safe checkpointing** — training halts and saves before hitting memory limits (`MAX_RAM_GB=13`).
- **Optimizer-free saves** — Adafactor optimizer state is discarded before upload to keep checkpoint size small.
- **Auto-recovery** — each run pulls the latest checkpoint from this repo before training, resuming where the last run left off.
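The EWC penalty itself is a quadratic term weighted by the diagonal Fisher estimates. A minimal sketch, assuming per-tensor Fisher values and anchor weights saved after the previous run; the `lam` default is an illustrative choice, not the repo's actual setting:

```python
import torch

def ewc_penalty(params, anchor_params, fisher, lam=0.4):
    """Compute 0.5 * lam * sum_i F_i * (theta_i - theta*_i)^2 over all
    parameter tensors, penalising drift from the anchored weights in
    proportion to how important EWC judged each weight to be."""
    penalty = torch.zeros(())
    for p, p_star, f in zip(params, anchor_params, fisher):
        penalty = penalty + (f * (p - p_star) ** 2).sum()
    return 0.5 * lam * penalty
```

This term is added to the language-modelling loss on each run, so weights with high Fisher values stay close to their previous values while low-importance weights remain free to adapt.
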

---

## Limitations

- Experimental model — outputs may be incorrect, hallucinated, or outdated.
- Not intended for production financial applications.
- Continual training without human evaluation gates means quality can regress between runs.
- Numeric reasoning is improved by the numeracy encoder but is not guaranteed to be accurate.

---

## Source Code

Training pipeline, architecture, and CI workflows:  
[github.com/MeridianAlgo/FinAI](https://github.com/MeridianAlgo/FinAI)