---
license: mit
language:
  - en
tags:
  - finance
  - text-generation
  - mixture-of-experts
  - continual-learning
  - financial-nlp
  - custom-architecture
library_name: transformers
pipeline_tag: text-generation
---

# Meridian.AI — Finance Language Model

Meridian.AI is a custom sparse Mixture-of-Experts (MoE) language model continually trained on finance data. It is designed to run on commodity CPU hardware (including GitHub Actions free runners) and improves automatically via scheduled training runs.

> **Not financial advice.** This is an experimental research model.

---

## Model Details

| Property | Value |
|---|---|
| Architecture | Custom SMoE + GQA + RoPE + SwiGLU + Numeracy Encoding |
| Total parameters | ~479M (tied embeddings) |
| Unique parameters | ~283M |
| Experts | 8 total, top-2 active per token |
| Tokenizer | `Qwen/Qwen2.5-0.5B` (151k vocab) |
| Context length | 2048 tokens |
| Training method | Continual learning with EWC (Elastic Weight Consolidation) |
| License | MIT |

---

## Architecture

Meridian.AI is a fully custom transformer built from scratch with the following components:

- **Sparse MoE FFN** — 8 experts per MoE layer with top-2 routing: only 2 of the 8 experts activate per token, keeping compute low while retaining capacity. MoE layers replace the dense FFN in every second transformer layer.
- **Grouped Query Attention (GQA)** — 12 query heads, 4 key/value heads. Reduces KV-cache size and memory bandwidth during inference.
- **Rotary Position Embeddings (RoPE)** — `rope_theta=500000` for better length generalisation.
- **SwiGLU FFN** — gated activation used in both dense layers and expert FFNs.
- **RMSNorm** — replaces LayerNorm for faster normalisation.
- **Financial Numeracy Encoding** — a learned 64-dim embedding for numeric tokens that improves precision on quantitative finance tasks.
- **Elastic Weight Consolidation (EWC)** — prevents catastrophic forgetting across continual training runs.
- **Tied word embeddings** — input embeddings and `lm_head` share weights, saving ~197M parameters.

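Top-2 routing can be sketched in a few lines. This is an illustrative standalone function, not the model's actual router module; names and shapes are assumptions:

```python
import torch
import torch.nn.functional as F

def top2_route(hidden: torch.Tensor, router_weight: torch.Tensor):
    """Score each token against every expert, keep the two best,
    and renormalise their gate probabilities to sum to 1."""
    logits = hidden @ router_weight                # [tokens, n_experts]
    probs = F.softmax(logits, dim=-1)
    gate_vals, expert_idx = probs.topk(2, dim=-1)  # top-2 experts per token
    gates = gate_vals / gate_vals.sum(dim=-1, keepdim=True)
    return gates, expert_idx                       # each [tokens, 2]
```

Each token's output is then the gate-weighted sum of the FFN outputs of its two selected experts.
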
---

## How to Use

> The model weights are stored under the `checkpoint/` subfolder in this repo.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "meridianal/FinAI"

tokenizer = AutoTokenizer.from_pretrained(repo_id, subfolder="checkpoint")
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    subfolder="checkpoint",
    trust_remote_code=True,
    torch_dtype=torch.float32,
    low_cpu_mem_usage=True,
)
model.eval()

prompt = """### Instruction:
What does a high price-to-earnings ratio indicate about a stock?

### Response:
"""

inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    out = model.generate(
        **inputs,
        max_new_tokens=200,
        do_sample=True,
        temperature=0.8,
        top_p=0.92,
        repetition_penalty=1.3,
        no_repeat_ngram_size=3,
        pad_token_id=tokenizer.pad_token_id,
        eos_token_id=tokenizer.eos_token_id,
    )

print(tokenizer.decode(out[0], skip_special_tokens=True))
```

### Prompt format

All training examples use this instruction/response format:

```
### Instruction:
<your question or task>

### Response:
<answer>
```

Classification tasks are also formatted this way with a short label-only response.
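A small helper (hypothetical, not part of this repo) makes it easy to wrap any question in this format before tokenising:

```python
def build_prompt(instruction: str) -> str:
    """Wrap a question or task in the model's instruction/response format."""
    return f"### Instruction:\n{instruction}\n\n### Response:\n"

prompt = build_prompt("What does a high price-to-earnings ratio indicate about a stock?")
```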

### Generation tips

Continual training can introduce mild repetition. Recommended settings:

| Parameter | Range |
|---|---|
| `temperature` | 0.7 – 0.95 |
| `top_p` | 0.85 – 0.95 |
| `repetition_penalty` | 1.2 – 1.4 |
| `no_repeat_ngram_size` | 3 |

If you see repeated phrases, increase `repetition_penalty` and lower `temperature`.

---

## Training Data

Training streams finance datasets from the FinanceMTEB family:

- Financial sentiment analysis (FinancialPhraseBank, etc.)
- ESG and sustainability classification
- FOMC statement analysis
- Fraud and financial complaint datasets
- Financial QA pairs
- Earnings call and filing excerpts

Datasets are loaded in streaming mode with a 15MB-per-source cap to stay within GitHub Actions memory limits.
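The per-source cap can be sketched as a generator that stops once a stream has yielded roughly 15MB of text. This is an illustrative sketch, not the repo's actual loader; the function name and `text_key` parameter are assumptions:

```python
def cap_stream(examples, max_bytes=15 * 1024 * 1024, text_key="text"):
    """Yield examples from a (possibly infinite) stream, stopping once the
    cumulative UTF-8 size of their text would exceed max_bytes."""
    total = 0
    for example in examples:
        size = len(example[text_key].encode("utf-8"))
        if total + size > max_bytes:
            break
        total += size
        yield example
```

Because the cap is enforced on the stream itself, no source is ever fully materialised in memory.
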

---

## Continual Learning

The model trains automatically via GitHub Actions on a scheduled hourly cron. Key features:

- **EWC regularisation** — a Fisher information matrix computed from recent data protects previously learned weights from being overwritten.
- **RAM-safe checkpointing** — training halts and saves before hitting memory limits (`MAX_RAM_GB=13`).
- **Optimizer-free saves** — Adafactor optimizer state is discarded before upload to keep checkpoint size small.
- **Auto-recovery** — each run pulls the latest checkpoint from this repo before training, resuming where the last run left off.
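The EWC penalty itself is a quadratic term weighted by the diagonal Fisher estimates. A minimal sketch, assuming per-tensor Fisher values and anchor weights saved after the previous run; the `lam` default is an illustrative choice, not the repo's actual setting:

```python
import torch

def ewc_penalty(params, anchor_params, fisher, lam=0.4):
    """Compute 0.5 * lam * sum_i F_i * (theta_i - theta*_i)^2 over all
    parameter tensors, penalising drift from the anchored weights in
    proportion to how important EWC judged each weight to be."""
    penalty = torch.zeros(())
    for p, p_star, f in zip(params, anchor_params, fisher):
        penalty = penalty + (f * (p - p_star) ** 2).sum()
    return 0.5 * lam * penalty
```

This term is added to the language-modelling loss on each run, so weights with high Fisher values stay close to their previous values while low-importance weights remain free to adapt.
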

---

## Limitations

- Experimental model — outputs may be incorrect, hallucinated, or outdated.
- Not intended for production financial applications.
- Continual training without human evaluation gates means quality can regress between runs.
- Numeric reasoning is improved by the numeracy encoder but is not guaranteed to be accurate.

---

## Source Code

Training pipeline, architecture, and CI workflows:  
[github.com/MeridianAlgo/FinAI](https://github.com/MeridianAlgo/FinAI)