---
license: mit
language:
- en
tags:
- llama
- pytorch
- causal-lm
- gguf
- north-ml
- forge
---

## Forge 1 Mini

Forge 1 Mini is a tiny Forge-series chat model. It is intended for basic chat, simple completions, rewriting, classification, routing, and short direct answers.

This repo includes:

- `model.safetensors`: corrected Hugging Face checkpoint.
- `tokenizer.model`: SentencePiece tokenizer with ChatML markers.
- `forge-1-mini-f16.gguf`: llama.cpp-compatible F16 GGUF.

### llama.cpp / llama-cpp-python

Use the embedded ChatML template and stop on `<|im_end|>`.

```python
from llama_cpp import Llama

# Load the F16 GGUF with a 512-token context window.
llm = Llama(model_path="forge-1-mini-f16.gguf", n_ctx=512)

# temperature=0.0 keeps the output deterministic; generation stops at the ChatML end marker.
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "What is 2 + 2?"}],
    max_tokens=96,
    temperature=0.0,
    stop=["<|im_end|>"],
)
print(out["choices"][0]["message"]["content"].strip())
```

Expected answer:

```text
4
```

### Local Verification

The uploaded GGUF passed a llama.cpp smoke test using llama-cpp-python tokenization and greedy sampling:

```text
Who are you? -> I am Forge-1-Mini, a tiny local assistant created by Arthur / North ML.
Hi -> Hi! I am Forge-1-Mini. How can I help?
What is 2 + 2? -> 4
Write a Python function that adds two numbers. -> def add(a, b): return a + b
Who is Jesus? -> Christians believe Jesus Christ is the eternal Son of God...
How should I treat someone I disagree with? -> Treat the person with dignity...
```

## Model Family Notes

| Model | Parameters | Hosting | Estimated Cost per 1M Output Tokens | Ability |
|---|---:|---|---:|---|
| **Forge 1 Mini** | 5.2M | Open-source, can host anywhere. | **$0.01-$0.05** | Basic chat, simple completions, rewriting, classification, routing, and short direct answers |
| **Forge 1** | ~40M | Open-source, can host anywhere. | **$0.10-$0.30** | Better conversational ability, basic coding, structured responses, simple reasoning, and tool routing |
| **Forge 1 Reasoning** | ~40M | Hosted on North servers, proprietary. | **$0.20-$1.00** | Reasoning-tuned checkpoint with planning, self-checking, multiple-pass generation, and priority processing |
| **Forge 1 Ultra** | ~150M | Hosted on North servers, proprietary. | **$0.15-$0.80** | Strongest native Forge model; better coding, instruction following, longer responses, tool use, and software-engineering tasks |
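
The cost column in the table above is quoted per 1M output tokens, so a rough budget for a given workload is a simple proportion. The sketch below is illustrative only: the helper name `estimate_output_cost` and the dictionary `PRICE_PER_MILLION_USD` are hypothetical, the price ranges are copied from the table and are estimates, and input tokens are not covered because the table does not price them.

```python
from typing import Tuple

# Estimated price ranges in USD per 1M output tokens, copied from the table above.
PRICE_PER_MILLION_USD = {
    "Forge 1 Mini": (0.01, 0.05),
    "Forge 1": (0.10, 0.30),
    "Forge 1 Reasoning": (0.20, 1.00),
    "Forge 1 Ultra": (0.15, 0.80),
}


def estimate_output_cost(model: str, output_tokens: int) -> Tuple[float, float]:
    """Return the (low, high) estimated cost in USD for a number of output tokens.

    Hypothetical helper: it scales the per-1M-token range linearly and
    ignores input tokens, which the table does not price.
    """
    low, high = PRICE_PER_MILLION_USD[model]
    scale = output_tokens / 1_000_000
    return (low * scale, high * scale)


if __name__ == "__main__":
    low, high = estimate_output_cost("Forge 1 Mini", 250_000)
    print(f"Forge 1 Mini, 250k output tokens: ${low:.4f} to ${high:.4f}")
```

For the open-source models, actual cost depends on where and how they are hosted, so the ranges are best read as rough planning figures.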