---
license: mit
language:
- en
tags:
- finance
- text-generation
- mixture-of-experts
- continual-learning
- financial-nlp
- custom-architecture
library_name: transformers
pipeline_tag: text-generation
---

# Meridian.AI — Finance Language Model

Meridian.AI is a custom sparse Mixture-of-Experts (MoE) language model continually trained on finance data. It is designed to run on commodity CPU hardware (including free GitHub Actions runners) and improves automatically via scheduled training runs.

> **Not financial advice.** This is an experimental research model.

---

## Model Details

| Property | Value |
|---|---|
| Architecture | Custom SMoE + GQA + RoPE + SwiGLU + numeracy encoding |
| Total parameters | ~479M (tied embeddings) |
| Unique parameters | ~283M |
| Experts | 8 total, top-2 active per token |
| Tokenizer | `Qwen/Qwen2.5-0.5B` (151k vocab) |
| Context length | 2048 tokens |
| Training method | Continual learning with EWC (Elastic Weight Consolidation) |
| License | MIT |

---

## Architecture

Meridian.AI is a fully custom transformer built from scratch with the following components:

- **Sparse MoE FFN** — 8 experts per MoE layer with top-2 routing: only 2 of the 8 experts activate per token, keeping compute low while retaining capacity. MoE layers alternate with dense layers, appearing in every second transformer layer.
- **Grouped Query Attention (GQA)** — 12 query heads share 4 key/value heads, reducing memory bandwidth during inference.
- **Rotary Position Embeddings (RoPE)** — `rope_theta=500,000` for better length generalisation.
- **SwiGLU FFN** — the activation used in both dense layers and expert FFNs.
- **RMSNorm** — replaces LayerNorm for faster normalisation.
- **Financial numeracy encoding** — a learned 64-dimensional embedding for numeric tokens, improving precision on quantitative finance tasks.
- **Elastic Weight Consolidation (EWC)** — prevents catastrophic forgetting across continual training runs.
- **Tied word embeddings** — the input embeddings and `lm_head` share weights, saving ~197M parameters.
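The top-2 routing described above can be illustrated with a minimal sketch. This is not the repository's implementation: the class names, dimensions, and the dense per-expert loop are illustrative assumptions, chosen for clarity over efficiency.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwiGLUExpert(nn.Module):
    """Expert FFN using a SwiGLU activation, as the architecture notes describe."""
    def __init__(self, d_model, d_ff):
        super().__init__()
        self.w_gate = nn.Linear(d_model, d_ff, bias=False)
        self.w_up = nn.Linear(d_model, d_ff, bias=False)
        self.w_down = nn.Linear(d_ff, d_model, bias=False)

    def forward(self, x):
        return self.w_down(F.silu(self.w_gate(x)) * self.w_up(x))

class Top2MoE(nn.Module):
    """Sparse MoE FFN: a linear router picks 2 of 8 experts per token."""
    def __init__(self, d_model=256, d_ff=512, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(SwiGLUExpert(d_model, d_ff) for _ in range(n_experts))

    def forward(self, x):                      # x: (n_tokens, d_model)
        weights, idx = self.router(x).topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)   # renormalise over the 2 chosen experts
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            for k in range(self.top_k):
                mask = idx[:, k] == e          # tokens that routed to expert e in slot k
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * expert(x[mask])
        return out

x = torch.randn(4, 256)
y = Top2MoE()(x)
print(y.shape)  # torch.Size([4, 256])
```

Only the two selected experts run per token, which is what keeps per-token FLOPs close to a dense model a fraction of the total parameter count.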
---

## How to Use

> The model weights are stored under the `checkpoint/` subfolder in this repo.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "meridianal/FinAI"

tokenizer = AutoTokenizer.from_pretrained(repo_id, subfolder="checkpoint")
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    subfolder="checkpoint",
    trust_remote_code=True,
    torch_dtype=torch.float32,
    low_cpu_mem_usage=True,
)
model.eval()

prompt = """### Instruction:
What does a high price-to-earnings ratio indicate about a stock?

### Response:
"""

inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    out = model.generate(
        **inputs,
        max_new_tokens=200,
        do_sample=True,
        temperature=0.8,
        top_p=0.92,
        repetition_penalty=1.3,
        no_repeat_ngram_size=3,
        pad_token_id=tokenizer.pad_token_id,
        eos_token_id=tokenizer.eos_token_id,
    )
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

### Prompt format

All training examples use this instruction/response format:

```
### Instruction:

### Response:
```

Classification tasks are also formatted this way, with a short label-only response.

### Generation tips

Continual training can introduce mild repetition. Recommended settings:

| Parameter | Range |
|---|---|
| `temperature` | 0.7 – 0.95 |
| `top_p` | 0.85 – 0.95 |
| `repetition_penalty` | 1.2 – 1.4 |
| `no_repeat_ngram_size` | 3 |

If you see repeated phrases, increase `repetition_penalty` and lower `temperature`.

---

## Training Data

Training streams finance datasets from the FinanceMTEB family:

- Financial sentiment analysis (FinancialPhraseBank, etc.)
- ESG and sustainability classification
- FOMC statement analysis
- Fraud and financial complaint datasets
- Financial QA pairs
- Earnings call and filing excerpts

Datasets are loaded in streaming mode with a 15 MB-per-source cap to stay within GitHub Actions memory limits.

---

## Continual Learning

The model trains automatically via GitHub Actions on a scheduled hourly cron.
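The per-source cap described under Training Data can be enforced with a simple byte budget over a streamed iterator. The helper below is an illustrative sketch, not the repo's actual loader; `stream_capped`, the `text` key, and the UTF-8 byte-counting heuristic are all assumptions.

```python
def stream_capped(examples, max_bytes=15 * 1024 * 1024, text_key="text"):
    """Yield examples from a streaming source until ~max_bytes of text is consumed."""
    seen = 0
    for ex in examples:
        size = len(ex[text_key].encode("utf-8"))
        if seen + size > max_bytes:
            break  # stop before exceeding this source's budget
        seen += size
        yield ex

# Toy generator standing in for a streamed Hugging Face dataset split.
source = ({"text": "x" * 1024} for _ in range(100))
capped = list(stream_capped(source, max_bytes=10 * 1024))
print(len(capped))  # 10 examples of 1 KiB each fill a 10 KiB cap
```

Because the source is consumed lazily, memory stays bounded by the cap rather than by the full dataset size, which is the property the GitHub Actions runners rely on.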
Key features:

- **EWC regularisation** — a Fisher information matrix computed from recent data protects previously learned weights from being overwritten.
- **RAM-safe checkpointing** — training halts and saves before hitting memory limits (`MAX_RAM_GB=13`).
- **Optimizer-free saves** — Adafactor optimizer state is discarded before upload to keep checkpoints small.
- **Auto-recovery** — each run pulls the latest checkpoint from this repo before training, resuming where the last run left off.

---

## Limitations

- Experimental model — outputs may be incorrect, hallucinated, or outdated.
- Not intended for production financial applications.
- Continual training without human evaluation gates means quality can regress between runs.
- Numeric reasoning is improved by the numeracy encoder but is not guaranteed to be accurate.

---

## Source Code

Training pipeline, architecture, and CI workflows: [github.com/MeridianAlgo/FinAI](https://github.com/MeridianAlgo/FinAI)
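The EWC regularisation listed above adds a quadratic penalty that anchors weights deemed important (by the Fisher estimate) near the values learned in earlier runs. The sketch below is a minimal illustration of that penalty term; the diagonal Fisher values, λ, and the function name are assumptions, not the repo's code.

```python
import torch

def ewc_penalty(params, old_params, fisher, lam=0.1):
    """EWC loss term: (lam / 2) * sum_i F_i * (theta_i - theta_old_i)^2."""
    penalty = torch.zeros(())
    for name, p in params.items():
        penalty = penalty + (fisher[name] * (p - old_params[name]) ** 2).sum()
    return 0.5 * lam * penalty

# Toy example: one weight drifted from its consolidated value by 1.0.
old = {"w": torch.tensor([1.0, 2.0])}
new = {"w": torch.tensor([1.0, 3.0])}
fisher = {"w": torch.tensor([0.5, 0.5])}  # diagonal Fisher estimate

loss = ewc_penalty(new, old, fisher, lam=2.0)
print(loss.item())  # 0.5 * 2.0 * (0.5 * 1.0^2) = 0.5
```

In training, this term is added to the language-modelling loss, so gradients on high-Fisher weights are pulled back toward their previous values while low-Fisher weights remain free to adapt to new finance data.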