license: mit
language:
  - en
tags:
  - llama
  - pytorch
  - causal-lm
  - gguf
  - north-ml
  - forge

Forge 1 Mini

Forge 1 Mini is a tiny Forge-series chat model. It is intended for basic chat, simple completions, rewriting, classification, routing, and short direct answers.

This repo includes:

  • model.safetensors: corrected Hugging Face checkpoint.
  • tokenizer.model: SentencePiece tokenizer with ChatML markers.
  • forge-1-mini-f16.gguf: llama.cpp-compatible F16 GGUF.

llama.cpp / llama-cpp-python

Use the embedded ChatML template and stop on <|im_end|>.

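from llama_cpp import Llama

# Load the F16 GGUF; n_ctx sets the context window in tokens.
llm = Llama(model_path="forge-1-mini-f16.gguf", n_ctx=512)

# The chat template embedded in the GGUF formats the messages as ChatML.
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "What is 2 + 2?"}],
    max_tokens=96,
    temperature=0.0,  # greedy decoding
    stop=["<|im_end|>"],  # stop at the ChatML end-of-turn marker
)
print(out["choices"][0]["message"]["content"].strip())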

Expected answer:

4
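
For reference, the prompt layout that a ChatML template typically produces can be sketched by hand. This is a minimal illustration, not the template shipped in the GGUF: the card only names the `<|im_end|>` marker, so the `<|im_start|>` marker, the newline placement, and the helper name `format_chatml` are assumptions based on the common ChatML convention.

```python
def format_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts as a ChatML-style prompt.

    Assumption: the conventional layout
    <|im_start|>ROLE\nCONTENT<|im_end|>\n, which may differ from the
    template embedded in the GGUF.
    """
    parts = []
    for message in messages:
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Open the assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = format_chatml([{"role": "user", "content": "What is 2 + 2?"}])
print(prompt)
```

Seeing the layout spelled out makes it clear why `<|im_end|>` is the natural stop string: it marks the end of each turn, so generation should halt when the model emits it.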

Local Verification

The uploaded GGUF passed a llama.cpp smoke test using llama-cpp-python tokenization and greedy sampling:

Who are you? -> I am Forge-1-Mini, a tiny local assistant created by Arthur / North ML.
Hi -> Hi! I am Forge-1-Mini. How can I help?
What is 2 + 2? -> 4
Write a Python function that adds two numbers. -> def add(a, b): return a + b
Who is Jesus? -> Christians believe Jesus Christ is the eternal Son of God...
How should I treat someone I disagree with? -> Treat the person with dignity...
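
A check like the one above could be scripted roughly as follows. This is a sketch under assumptions, not the script that was actually used: the helper `run_smoke_test`, the substring-matching rule, and the stand-in `fake_generate` are all hypothetical, and only the first four prompts are included because their replies are shown in full.

```python
# (prompt, substring expected in the reply), taken from the transcript above.
SMOKE_CASES = [
    ("Who are you?", "Forge-1-Mini"),
    ("Hi", "Forge-1-Mini"),
    ("What is 2 + 2?", "4"),
    ("Write a Python function that adds two numbers.", "def add"),
]


def run_smoke_test(generate, cases=SMOKE_CASES):
    """Return the (prompt, reply) pairs whose reply lacks the expected substring."""
    failures = []
    for prompt, expected in cases:
        reply = generate(prompt)
        if expected not in reply:
            failures.append((prompt, reply))
    return failures


def fake_generate(prompt):
    """Hypothetical stand-in so the sketch runs without the model file."""
    canned = {
        "Who are you?": "I am Forge-1-Mini, a tiny local assistant created by Arthur / North ML.",
        "Hi": "Hi! I am Forge-1-Mini. How can I help?",
        "What is 2 + 2?": "4",
        "Write a Python function that adds two numbers.": "def add(a, b): return a + b",
    }
    return canned.get(prompt, "")


print(run_smoke_test(fake_generate))  # an empty list means every case passed
```

To run it against the real model, replace `fake_generate` with a function that wraps the `llm.create_chat_completion` call from the llama-cpp-python example and returns the stripped message content.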

Model Family Notes

| Model | Parameters | Hosting | Estimated Cost per 1M Output Tokens | Ability |
| --- | --- | --- | --- | --- |
| Forge 1 Mini | 5.2M | Open-source, can host anywhere. | $0.01-$0.05 | Basic chat, simple completions, rewriting, classification, routing, and short direct answers |
| Forge 1 | ~40M | Open-source, can host anywhere. | $0.10-$0.30 | Better conversational ability, basic coding, structured responses, simple reasoning, and tool routing |
| Forge 1 Reasoning | ~40M | Hosted on North servers, proprietary. | $0.20-$1.00 | Reasoning-tuned checkpoint with planning, self-checking, multiple-pass generation, and priority processing |
| Forge 1 Ultra | ~150M | Hosted on North servers, proprietary. | $0.15-$0.80 | Strongest native Forge model; better coding, instruction following, longer responses, tool use, and software-engineering tasks |
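
To turn the per-million estimates above into a figure for a specific workload, scale the price range by the number of output tokens. The sketch below uses the Forge 1 Mini range; the helper name `output_cost_usd` and the 250,000-token workload are hypothetical, chosen only for illustration.

```python
def output_cost_usd(output_tokens, low_per_million, high_per_million):
    """Scale a per-1M-output-token price range (USD) to a token count."""
    scale = output_tokens / 1_000_000
    return scale * low_per_million, scale * high_per_million


# Forge 1 Mini range from the table: $0.01-$0.05 per 1M output tokens.
low, high = output_cost_usd(250_000, 0.01, 0.05)
print(f"${low:.4f} to ${high:.4f}")  # prints "$0.0025 to $0.0125"
```

The table's figures are estimates, so the result is best read as a rough bound rather than a quote.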