Meridian.AI — Finance Language Model

Meridian.AI is a custom sparse Mixture-of-Experts (MoE) language model continually trained on finance data. It is designed to run on commodity CPU hardware (including GitHub Actions free runners) and improves automatically via scheduled training runs.

Not financial advice. This is an experimental research model.


Model Details

| Property | Value |
| --- | --- |
| Architecture | Custom SMoE + GQA + RoPE + SwiGLU + Numeracy Encoding |
| Total parameters | ~479M (tied embeddings) |
| Unique parameters | ~283M |
| Experts | 8 total, top-2 active per token |
| Tokenizer | Qwen/Qwen2.5-0.5B (151k vocab) |
| Context length | 2048 tokens |
| Training method | Continual learning with EWC (Elastic Weight Consolidation) |
| License | MIT |

Architecture

Meridian.AI is a fully custom transformer built from scratch with the following components:

  • Sparse MoE FFN — 8 experts per MoE layer with top-2 routing: only 2 of the 8 experts activate per token, keeping compute low while retaining capacity. MoE layers occupy every second transformer layer, alternating with dense FFN layers.
  • Grouped Query Attention (GQA) — 12 query heads, 4 key/value heads. Reduces memory bandwidth during inference.
  • Rotary Position Embeddings (RoPE) — rope_theta=500,000 for length generalisation.
  • SwiGLU FFN — activation function used in dense layers and expert FFNs.
  • RMSNorm — replaces LayerNorm for faster normalisation.
  • Financial Numeracy Encoding — a learned 64-dim embedding for numeric tokens to improve precision on quantitative finance tasks.
  • Elastic Weight Consolidation (EWC) — prevents catastrophic forgetting across continual training runs.
  • Tied word embeddings — input embeddings and lm_head share weights, saving ~197M parameters.
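The top-2 routing described above can be sketched in a few lines. This is a minimal illustration of the gating mechanism, not the model's actual router code; the function name and tensor shapes are assumptions for the example:

```python
import torch
import torch.nn.functional as F

def top2_route(router_logits: torch.Tensor):
    """Keep the two highest-scoring experts per token and
    renormalise their gate weights so they sum to 1."""
    gate_vals, expert_ids = torch.topk(router_logits, k=2, dim=-1)
    gate_weights = F.softmax(gate_vals, dim=-1)
    return expert_ids, gate_weights

# one token routed over 8 experts
logits = torch.tensor([[0.2, 1.5, -0.3, 0.0, 2.1, -1.0, 0.4, 0.9]])
ids, w = top2_route(logits)
# ids selects experts 4 and 1; the two gate weights sum to 1
```

The token's output is then the weighted sum of the two selected experts' FFN outputs, so 6 of the 8 experts cost nothing for that token.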

How to Use

The model weights are stored under the checkpoint/ subfolder in this repo.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "meridianal/FinAI"

tokenizer = AutoTokenizer.from_pretrained(repo_id, subfolder="checkpoint")
# fall back to EOS padding if the tokenizer ships without a pad token
if tokenizer.pad_token_id is None:
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    subfolder="checkpoint",
    trust_remote_code=True,
    torch_dtype=torch.float32,
    low_cpu_mem_usage=True,
)
model.eval()

prompt = """### Instruction:
What does a high price-to-earnings ratio indicate about a stock?

### Response:
"""

inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    out = model.generate(
        **inputs,
        max_new_tokens=200,
        do_sample=True,
        temperature=0.8,
        top_p=0.92,
        repetition_penalty=1.3,
        no_repeat_ngram_size=3,
        pad_token_id=tokenizer.pad_token_id,
        eos_token_id=tokenizer.eos_token_id,
    )

print(tokenizer.decode(out[0], skip_special_tokens=True))

Prompt format

All training examples use this instruction/response format:

### Instruction:
<your question or task>

### Response:
<answer>

Classification tasks are also formatted this way with a short label-only response.
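A small helper keeps prompts consistent with this training template. The function is hypothetical (not part of the repo), but the template string matches the format shown above:

```python
def format_prompt(instruction: str) -> str:
    # Wrap a question or task in the training-time instruction template
    return f"### Instruction:\n{instruction}\n\n### Response:\n"

prompt = format_prompt("Classify the sentiment of: 'Revenue beat expectations.'")
```

Generation should then continue from the trailing `### Response:` block.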

Generation tips

Continual training can introduce mild repetition. Recommended settings:

| Parameter | Range |
| --- | --- |
| temperature | 0.7–0.95 |
| top_p | 0.85–0.95 |
| repetition_penalty | 1.2–1.4 |
| no_repeat_ngram_size | 3 |

If you see repeated phrases, increase repetition_penalty and lower temperature.


Training Data

Training streams finance datasets from the FinanceMTEB family:

  • Financial sentiment analysis (FinancialPhraseBank, etc.)
  • ESG and sustainability classification
  • FOMC statement analysis
  • Fraud and financial complaint datasets
  • Financial QA pairs
  • Earnings call and filing excerpts

Datasets are loaded in streaming mode with a 15MB-per-source cap to stay within GitHub Actions memory limits.
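The per-source cap can be implemented as a byte counter over a streaming iterator. The sketch below is illustrative (the 15 MB figure comes from this card; the helper name and the assumption that examples are plain-text strings are mine):

```python
def take_capped(texts, max_bytes=15 * 1024 * 1024):
    """Yield text examples until the cumulative UTF-8 size would
    exceed the per-source cap, then stop consuming the stream."""
    seen = 0
    for text in texts:
        seen += len(text.encode("utf-8"))
        if seen > max_bytes:
            break
        yield text

# e.g. wrap a Hugging Face streaming dataset:
# capped = take_capped(ex["text"] for ex in load_dataset(name, streaming=True, split="train"))
```

Because the source is streamed, nothing past the cap is ever downloaded into memory.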


Continual Learning

The model trains automatically via GitHub Actions on a scheduled hourly cron. Key features:

  • EWC regularisation — Fisher information matrix computed from recent data protects previously learned weights from being overwritten.
  • RAM-safe checkpointing — training halts and saves before hitting memory limits (MAX_RAM_GB=13).
  • Optimizer-free saves — AdaFactor optimizer state is discarded before upload to keep checkpoint size small.
  • Auto-recovery — each run pulls the latest checkpoint from this repo before training, resuming from where the last run left off.
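The EWC penalty itself has a simple form: for each protected parameter, the squared distance from its previous-task value, weighted by the diagonal Fisher information. A minimal sketch, assuming a per-parameter Fisher dict and an arbitrary example strength `lam=0.4` (not the repo's actual implementation or hyperparameter):

```python
import torch
import torch.nn as nn

def ewc_penalty(model, fisher, ref_params, lam=0.4):
    """lam/2 * sum_i F_i * (theta_i - theta_star_i)^2 over protected params."""
    loss = torch.zeros(())
    for name, p in model.named_parameters():
        if name in fisher:
            loss = loss + (fisher[name] * (p - ref_params[name]).pow(2)).sum()
    return 0.5 * lam * loss

net = nn.Linear(4, 4)
ref = {n: p.detach().clone() for n, p in net.named_parameters()}
fish = {n: torch.ones_like(p) for n, p in net.named_parameters()}
penalty = ewc_penalty(net, fish, ref)  # zero before any parameter drift
```

This term is added to the language-modeling loss during each scheduled run, so weights the Fisher marks as important for earlier data resist being overwritten.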

Limitations

  • Experimental model β€” outputs may be incorrect, hallucinated, or outdated.
  • Not intended for production financial applications.
  • Continual training without human evaluation gates means quality can regress between runs.
  • Numeric reasoning is improved by the numeracy encoder but not guaranteed accurate.

Source Code

Training pipeline, architecture, and CI workflows:
github.com/MeridianAlgo/FinAI
