---
license: mit
language:
  - en
tags:
- llama
- pytorch
- causal-lm
- gguf
- north-ml
- forge
---

## Forge 1 Mini

Forge 1 Mini is a tiny (5.2M-parameter) chat model in the Forge series. It is intended for basic chat, simple completions, rewriting, classification, routing, and short direct answers.

This repo includes:

- `model.safetensors`: corrected Hugging Face checkpoint.
- `tokenizer.model`: SentencePiece tokenizer with ChatML markers.
- `forge-1-mini-f16.gguf`: llama.cpp-compatible F16 GGUF.

### llama.cpp / llama-cpp-python

Use the embedded ChatML template and stop on `<|im_end|>`.

```python
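from llama_cpp import Llama

# Load the F16 GGUF with a small context window.
llm = Llama(model_path="forge-1-mini-f16.gguf", n_ctx=512)

# create_chat_completion formats the messages with the chat template stored in the GGUF.
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "What is 2 + 2?"}],
    max_tokens=96,
    temperature=0.0,  # greedy decoding
    stop=["<|im_end|>"],  # ChatML end-of-turn marker
)
print(out["choices"][0]["message"]["content"].strip())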
```

Expected answer:

```text
4
```
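
For runtimes that do not apply the embedded template, the prompt can be assembled by hand. The sketch below assumes the conventional ChatML layout (`<|im_start|>role`, a newline, the content, then `<|im_end|>`); the helper name `format_chatml` is hypothetical, and the template embedded in the GGUF remains the authority if it differs.

```python
def format_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts as a ChatML prompt string.

    Assumes the conventional ChatML layout; check the template embedded in
    the GGUF if the model's output looks wrong.
    """
    parts = []
    for message in messages:
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Leave the assistant turn open so the model writes the reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = format_chatml([{"role": "user", "content": "What is 2 + 2?"}])
print(prompt)
```

A string built this way would typically be passed to a raw completion call such as `llm.create_completion(prompt, stop=["<|im_end|>"])` rather than to `create_chat_completion`, which formats the messages itself.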

### Local Verification

The uploaded GGUF passed a llama.cpp smoke test run through llama-cpp-python, using its tokenization and greedy sampling. Each line shows a prompt and the model's response:

```text
Who are you? -> I am Forge-1-Mini, a tiny local assistant created by Arthur / North ML.
Hi -> Hi! I am Forge-1-Mini. How can I help?
What is 2 + 2? -> 4
Write a Python function that adds two numbers. -> def add(a, b): return a + b
Who is Jesus? -> Christians believe Jesus Christ is the eternal Son of God...
How should I treat someone I disagree with? -> Treat the person with dignity...
```
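
A check like this can be repeated locally. The following is a minimal sketch, not the script used for the test above: `run_smoke_test`, `CASES`, and `fake_generate` are hypothetical names, and the substring match is an assumed pass criterion. The stand-in generator returns replies copied from the transcript so the sketch runs without the GGUF file.

```python
def run_smoke_test(generate, cases):
    """Return (prompt, reply) pairs whose reply lacks the expected fragment."""
    failures = []
    for prompt, expected in cases:
        reply = generate(prompt)
        if expected not in reply:
            failures.append((prompt, reply))
    return failures


# Prompts and expected fragments taken from the transcript above.
CASES = [
    ("Who are you?", "Forge-1-Mini"),
    ("Hi", "Forge-1-Mini"),
    ("What is 2 + 2?", "4"),
    ("Write a Python function that adds two numbers.", "def add(a, b)"),
]


def fake_generate(prompt):
    """Stand-in for the model: returns canned replies from the transcript."""
    canned = {
        "Who are you?": "I am Forge-1-Mini, a tiny local assistant created by Arthur / North ML.",
        "Hi": "Hi! I am Forge-1-Mini. How can I help?",
        "What is 2 + 2?": "4",
        "Write a Python function that adds two numbers.": "def add(a, b): return a + b",
    }
    return canned[prompt]


failures = run_smoke_test(fake_generate, CASES)
print("failures:", failures)
```

To test the real model, replace `fake_generate` with a function that wraps `llm.create_chat_completion(...)` as in the usage example and returns the message content.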

## Model Family Notes

| Model | Parameters | Hosting | Estimated Cost per 1M Output Tokens | Ability |
|---|---:|---|---:|---|
| **Forge 1 Mini** | 5.2M | Open source; can be hosted anywhere | **$0.01-$0.05** | Basic chat, simple completions, rewriting, classification, routing, and short direct answers |
| **Forge 1** | ~40M | Open source; can be hosted anywhere | **$0.10-$0.30** | Better conversational ability, basic coding, structured responses, simple reasoning, and tool routing |
| **Forge 1 Reasoning** | ~40M | Proprietary; hosted on North servers | **$0.20-$1.00** | Reasoning-tuned checkpoint with planning, self-checking, multiple-pass generation, and priority processing |
| **Forge 1 Ultra** | ~150M | Proprietary; hosted on North servers | **$0.15-$0.80** | Strongest native Forge model; better coding, instruction following, longer responses, tool use, and software-engineering tasks |
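
As a worked example of reading the cost column, the sketch below scales the per-1M-token ranges from the table to a given output volume. The function name `estimate_cost` is hypothetical, and the figures are the table's estimates, not measured prices. For 5M output tokens on Forge 1 Mini the range is 5 × $0.01 to 5 × $0.05, that is, $0.05 to $0.25.

```python
# Estimated USD per 1M output tokens (low, high), copied from the table above.
COST_PER_MILLION = {
    "Forge 1 Mini": (0.01, 0.05),
    "Forge 1": (0.10, 0.30),
    "Forge 1 Reasoning": (0.20, 1.00),
    "Forge 1 Ultra": (0.15, 0.80),
}


def estimate_cost(model, output_tokens):
    """Return the (low, high) estimated cost in USD for a number of output tokens."""
    low, high = COST_PER_MILLION[model]
    millions = output_tokens / 1_000_000
    return low * millions, high * millions


low, high = estimate_cost("Forge 1 Mini", 5_000_000)
print(f"Forge 1 Mini, 5M output tokens: ${low:.2f} to ${high:.2f}")
```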