---
language:
- en
library_name: pytorch
tags:
- metis
- lernex
- causal-lm
- base-model
- education
- reasoning
- mixture-of-recursion
- custom-code
pipeline_tag: text-generation
base_model: []
---

# Metis-1.4 Base

**The model that never quit.**

Metis-1.4 Base is a compact ~504M-parameter research language model from **Lernex**, built as a step toward the Metis line of efficient learning, reasoning, and tutoring models.

This upload replaces the earlier experimental Metis-1.4 base artifact with a corrected export. The earlier run was trained with an incorrect objective. This revision comes from the repaired pipeline, which uses standard next-token prediction, the optimized H100 dense pretraining path, and sequence-level static MoR during continued pretraining.

## What This Release Is

This is the **base checkpoint**. It is not the final chat or thinking model.

Use it as:

- a research base for continued training and post-training experiments
- a compact model for studying Lernex's Metis architecture direction
- a foundation checkpoint for the Metis-1.4 chat and thinking releases

The Chat SFT, Reasoning SFT, reward, Chat DPO, and Think DPO post-training stages belong to the full Metis-1.4 pipeline and have not been applied to this checkpoint.

## Architecture

Metis-1.4 Base uses a custom Metis MoR decoder stack:

| Field | Value |
|---|---:|
| Parameters | ~503.8M |
| Context length | 1024 tokens |
| Layers | 19 shared transformer layers |
| Hidden size | 1536 |
| Attention heads | 24 |
| KV heads | 8 |
| Head dim | 64 |
| Vocab size | 16,384 |
| Activation | SwiGLU |
| Weight dtype | BF16 export |
| MoR max depth | 3 |
| Effective max layer count | 57 |
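
The table values are related by simple arithmetic, and the sketch below checks it. The field names are illustrative and are not taken from `config.json`. It also assumes that the KV-head count implies grouped-query attention, and that the effective layer count is the shared layer count multiplied by the MoR max depth, as the table's numbers suggest. The toy loop only counts layer applications; it is not the Metis forward pass.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetisShape:
    # Values copied from the architecture table; field names are hypothetical.
    hidden_size: int = 1536
    num_heads: int = 24
    num_kv_heads: int = 8
    head_dim: int = 64
    num_shared_layers: int = 19
    mor_max_depth: int = 3
    vocab_size: int = 16_384


def derived(shape):
    """Quantities implied by the table, under the assumptions stated above."""
    return {
        # Query width: heads * head_dim should match the hidden size.
        "attn_width": shape.num_heads * shape.head_dim,
        # Assumed grouped-query attention: query heads sharing one KV head.
        "query_heads_per_kv_head": shape.num_heads // shape.num_kv_heads,
        # Width of the key (or value) projection output under that assumption.
        "kv_width": shape.num_kv_heads * shape.head_dim,
        # Shared layers reused up to mor_max_depth times.
        "effective_max_layers": shape.num_shared_layers * shape.mor_max_depth,
        # Size of a single vocab x hidden embedding matrix.
        "embedding_params": shape.vocab_size * shape.hidden_size,
    }


def count_layer_applications(shape, depth):
    """Toy loop: reuse the same shared stack `depth` times and count the calls."""
    applications = 0
    for _ in range(depth):
        for _ in range(shape.num_shared_layers):
            applications += 1
    return applications


shape = MetisShape()
info = derived(shape)
print(info)
print(count_layer_applications(shape, shape.mor_max_depth))
```

Because the layers are shared, recursion raises compute (up to 57 layer applications) without adding parameters beyond the 19 stored layers.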

## Training Notes

The current base was trained with:

- repaired **next-token prediction** objective
- optimized H100 pretraining stack
- fused dense transformer path improvements
- static dense base pretraining
- sequence-level static MoR during continued pretraining
- exported BF16 weights in `safetensors`

The final continued-pretraining (CPT) checkpoint ended with a validation loss of about `2.4341` and a perplexity of about `11.41` on the continued-pretraining validation split. These numbers are not directly comparable to instruction-following or benchmark performance; they are primarily a training-health metric for the base/CPT mixture.
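
The two reported numbers are consistent with each other if the loss is a mean per-token cross-entropy in nats, which is an assumption here (it is the usual convention for next-token training, but the card does not state it). Under that assumption, perplexity is the exponential of the loss:

```python
import math


def perplexity(mean_nll_nats):
    """Perplexity from a mean per-token negative log-likelihood in nats."""
    return math.exp(mean_nll_nats)


def loss_from_perplexity(ppl):
    """Inverse mapping: mean per-token loss in nats from a perplexity."""
    return math.log(ppl)


reported_loss = 2.4341  # validation loss quoted in this card
print(f"{perplexity(reported_loss):.2f}")  # prints 11.41
```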

## Files

- `model.safetensors` - exported base weights
- `config.json` - Metis architecture/config metadata
- `generation_config.json` - basic generation defaults
- `tokenizer.json` - tokenizer
- `tokenizer_config.json` - tokenizer metadata
- `special_tokens_map.json` - tokenizer special-token mapping

## Important Compatibility Note

Metis-1.4 uses a custom architecture: `metis_mor_transformer` / `MetisMoRLMHeadModel`.

This repository contains the weights and config, but loading the model requires the Metis runtime/model code from Lernex's training stack or an adapter that implements the same architecture. The checkpoint cannot yet be loaded as a stock Transformers architecture.
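
Even without the Metis runtime, the tensor names, dtypes, and shapes in `model.safetensors` can be inspected. The sketch below uses only the Python standard library and relies on the published safetensors layout: an 8-byte little-endian header length followed by a JSON header. It assumes the file has been downloaded to the working directory. The helper names are illustrative and are not part of any Lernex tooling.

```python
import json
import struct
from math import prod
from pathlib import Path


def read_safetensors_header(path):
    """Return the JSON header of a safetensors file without loading tensor data."""
    with open(path, "rb") as f:
        # First 8 bytes: little-endian unsigned 64-bit length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    # Optional free-form metadata is not a tensor entry.
    header.pop("__metadata__", None)
    return header


def count_parameters(header):
    """Sum the element counts of all stored tensors."""
    return sum(prod(entry["shape"]) for entry in header.values())


weights = Path("model.safetensors")  # assumed local path to the downloaded file
if weights.exists():
    header = read_safetensors_header(weights)
    print(f"{len(header)} tensors, {count_parameters(header):,} stored parameters")
    for name, entry in list(header.items())[:5]:
        print(name, entry["dtype"], entry["shape"])
```

This counts stored tensors only, so the total may differ from a headline parameter count depending on how tied or shared weights were exported.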

## Status

This is a research release from an active training run. The base is being shared early so others can inspect and experiment with the corrected model artifact while the post-training pipeline continues.

## Intended Use

Metis-1.4 Base is intended for research, evaluation, and downstream training. It is not instruction tuned and should not be treated as an aligned assistant. For interactive use, prefer the post-trained Metis-1.4 chat/think checkpoints once released.

## About Lernex

Lernex is building learning systems that adapt around the learner: tutoring, practice, explanations, memory, and model research shaped around education. Metis-1.4 is a pivotal step in the Metis research line toward a compact, efficient model stack that can be trained, inspected, deployed, and improved end to end.