---
license: mit
language:
- en
tags:
- finance
- text-generation
- mixture-of-experts
- continual-learning
- financial-nlp
- custom-architecture
library_name: transformers
pipeline_tag: text-generation
---

# Meridian.AI – Finance Language Model

Meridian.AI is a custom sparse Mixture-of-Experts (MoE) language model continually trained on finance data. It is designed to run on commodity CPU hardware (including free GitHub Actions runners) and improves automatically via scheduled training runs.

> **Not financial advice.** This is an experimental research model.

---

## Model Details

| Property | Value |
|---|---|
| Architecture | Custom SMoE + GQA + RoPE + SwiGLU + Numeracy Encoding |
| Total parameters | ~479M (tied embeddings) |
| Unique parameters | ~283M |
| Experts | 8 total, top-2 active per token |
| Tokenizer | `Qwen/Qwen2.5-0.5B` (151k vocab) |
| Context length | 2048 tokens |
| Training method | Continual learning with EWC (Elastic Weight Consolidation) |
| License | MIT |


---

## Architecture

Meridian.AI is a fully custom transformer built from scratch with the following components:

- **Sparse MoE FFN** – 8 experts per MoE layer with top-2 routing: only 2 of the 8 experts activate per token, keeping compute low while retaining capacity. MoE layers occupy every second transformer layer.
- **Grouped Query Attention (GQA)** – 12 query heads, 4 key/value heads; reduces memory bandwidth during inference.
- **Rotary Position Embeddings (RoPE)** – `rope_theta=500000` for length generalisation.
- **SwiGLU FFN** – gated activation used in the dense layers and expert FFNs.
- **RMSNorm** – replaces LayerNorm for faster normalisation.
- **Financial Numeracy Encoding** – a learned 64-dimensional embedding for numeric tokens that improves precision on quantitative finance tasks.
- **Elastic Weight Consolidation (EWC)** – prevents catastrophic forgetting across continual training runs.
- **Tied word embeddings** – input embeddings and `lm_head` share weights, saving ~197M parameters.
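
The top-2 routing above can be sketched in a few lines. This is an illustrative toy, not the repo's implementation: the class names, layer sizes, and loop-based dispatch are assumptions made for readability.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwiGLU(nn.Module):
    """Gated FFN used by each expert: down(silu(gate(x)) * up(x))."""
    def __init__(self, d_model: int, d_ff: int):
        super().__init__()
        self.gate = nn.Linear(d_model, d_ff, bias=False)
        self.up = nn.Linear(d_model, d_ff, bias=False)
        self.down = nn.Linear(d_ff, d_model, bias=False)

    def forward(self, x):
        return self.down(F.silu(self.gate(x)) * self.up(x))

class Top2MoE(nn.Module):
    """Sparse MoE FFN: each token is routed to 2 of 8 SwiGLU experts."""
    def __init__(self, d_model: int = 64, d_ff: int = 128, n_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(SwiGLU(d_model, d_ff) for _ in range(n_experts))

    def forward(self, x):  # x: (tokens, d_model)
        scores, idx = self.router(x).topk(self.top_k, dim=-1)  # pick top-2 experts per token
        weights = F.softmax(scores, dim=-1)                    # renormalise over the chosen 2
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e                       # tokens whose slot chose expert e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out
```

Only the two selected experts run for each token, so per-token compute scales with `top_k` while parameter capacity scales with `n_experts`.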

---

## How to Use

> The model weights are stored under the `checkpoint/` subfolder in this repo.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "meridianal/FinAI"

tokenizer = AutoTokenizer.from_pretrained(repo_id, subfolder="checkpoint")
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    subfolder="checkpoint",
    trust_remote_code=True,
    torch_dtype=torch.float32,
    low_cpu_mem_usage=True,
)
model.eval()

prompt = """### Instruction:
What does a high price-to-earnings ratio indicate about a stock?

### Response:
"""

inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    out = model.generate(
        **inputs,
        max_new_tokens=200,
        do_sample=True,
        temperature=0.8,
        top_p=0.92,
        repetition_penalty=1.3,
        no_repeat_ngram_size=3,
        pad_token_id=tokenizer.pad_token_id or tokenizer.eos_token_id,  # fall back if no pad token is set
        eos_token_id=tokenizer.eos_token_id,
    )

print(tokenizer.decode(out[0], skip_special_tokens=True))
```

### Prompt format

All training examples use this instruction/response format:

```
### Instruction:
<your question or task>

### Response:
<answer>
```

Classification tasks use the same format, with a short label-only response.
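
A tiny helper keeps prompts aligned with this template; `build_prompt` is our own illustrative name, not part of the repo:

```python
def build_prompt(instruction: str) -> str:
    """Wrap a task in the instruction/response template the model was trained on."""
    return f"### Instruction:\n{instruction}\n\n### Response:\n"

# A classification example with a label-only expected response:
prompt = build_prompt("Classify the sentiment of: 'Shares fell 8% after earnings.'")
```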

### Generation tips

Continual training can introduce mild repetition. Recommended settings:

| Parameter | Range |
|---|---|
| `temperature` | 0.7–0.95 |
| `top_p` | 0.85–0.95 |
| `repetition_penalty` | 1.2–1.4 |
| `no_repeat_ngram_size` | 3 |


If you see repeated phrases, increase `repetition_penalty` and lower `temperature`.
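
One way to apply these tips is to keep a kwargs dict at the strict end of the ranges and splat it into `generate`; both the values and the `anti_repeat` name are our illustrative choices:

```python
# Conservative anti-repetition profile within the recommended ranges.
anti_repeat = dict(
    do_sample=True,
    temperature=0.7,         # low end of 0.7-0.95
    top_p=0.9,
    repetition_penalty=1.4,  # high end of 1.2-1.4
    no_repeat_ngram_size=3,
)
# usage: model.generate(**inputs, max_new_tokens=200, **anti_repeat)
```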

---

## Training Data

Training streams finance datasets from the FinanceMTEB family:

- Financial sentiment analysis (FinancialPhraseBank, etc.)
- ESG and sustainability classification
- FOMC statement analysis
- Fraud and financial complaint datasets
- Financial QA pairs
- Earnings call and filing excerpts

Datasets are loaded in streaming mode with a 15 MB-per-source cap to stay within GitHub Actions memory limits.
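
The byte cap can be enforced with a small generator wrapped around any streaming iterator (for example, one returned by `datasets.load_dataset(..., streaming=True)`). A sketch under that assumption; the pipeline's actual cap logic may differ:

```python
MAX_BYTES = 15 * 1024 * 1024  # 15 MB budget per source

def capped(texts, max_bytes: int = MAX_BYTES):
    """Yield texts until the cumulative UTF-8 size exceeds the budget."""
    seen = 0
    for text in texts:
        seen += len(text.encode("utf-8"))
        if seen > max_bytes:
            return  # budget spent: stop consuming this source
        yield text
```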

---

## Continual Learning

The model trains automatically via GitHub Actions on an hourly cron schedule. Key features:

- **EWC regularisation** – a Fisher information matrix computed from recent data protects previously learned weights from being overwritten.
- **RAM-safe checkpointing** – training halts and saves before hitting memory limits (`MAX_RAM_GB=13`).
- **Optimizer-free saves** – the AdaFactor optimizer state is discarded before upload to keep checkpoints small.
- **Auto-recovery** – each run pulls the latest checkpoint from this repo before training, resuming where the last run left off.
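
The EWC penalty in its standard form is `λ/2 · Σᵢ Fᵢ (θᵢ − θᵢ*)²`, where `F` is a diagonal Fisher estimate and `θ*` are the weights saved after the previous run. A minimal sketch (the function name and `lam` default are illustrative; the repo's implementation may differ):

```python
import torch

def ewc_penalty(model, fisher, anchor, lam: float = 0.4):
    """lam/2 * sum_i F_i * (theta_i - theta*_i)^2 over protected parameters.

    `fisher` and `anchor` map parameter names to tensors saved after the
    previous run: the diagonal Fisher estimate and the old weights.
    """
    loss = torch.zeros(())
    for name, p in model.named_parameters():
        if name in fisher:
            loss = loss + (fisher[name] * (p - anchor[name]) ** 2).sum()
    return 0.5 * lam * loss

# Each training step adds it to the task loss:
#   total = task_loss + ewc_penalty(model, fisher, anchor)
```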

---

## Limitations

- Experimental model – outputs may be incorrect, hallucinated, or outdated.
- Not intended for production financial applications.
- Continual training without human evaluation gates means quality can regress between runs.
- Numeric reasoning is improved by the numeracy encoder but not guaranteed accurate.


---

## Source Code

Training pipeline, architecture, and CI workflows:
[github.com/MeridianAlgo/FinAI](https://github.com/MeridianAlgo/FinAI)
|