	---
	license: mit
	language:
	- en
	tags:
	- llama
	- pytorch
	- causal-lm
	- gguf
	- north-ml
	- forge
	---

	## Forge 1 Mini

	Forge 1 Mini is the smallest chat model in the Forge series, at 5.2M parameters. It is intended for basic chat, simple completions, rewriting, classification, routing, and short direct answers.

	This repo includes:

	- `model.safetensors`: corrected Hugging Face checkpoint.
	- `tokenizer.model`: SentencePiece tokenizer with ChatML markers.
	- `forge-1-mini-f16.gguf`: llama.cpp-compatible F16 GGUF.

	### llama.cpp / llama-cpp-python

	Use the embedded ChatML template and stop on `<|im_end|>`.

	```python
	from llama_cpp import Llama

	llm = Llama(model_path="forge-1-mini-f16.gguf", n_ctx=512)
	out = llm.create_chat_completion(
	    messages=[{"role": "user", "content": "What is 2 + 2?"}],
	    max_tokens=96,
	    temperature=0.0,  # deterministic (greedy) decoding
	    stop=["<|im_end|>"],  # ChatML end-of-turn marker
	)
	print(out["choices"][0]["message"]["content"].strip())
	```

	Expected answer:

	```text
	4
	```
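For reference, the sketch below shows roughly what a ChatML prompt looks like as plain text and how output is cut at the stop marker. It assumes the template embedded in the GGUF follows the common ChatML layout (`<|im_start|>role`, a newline, the content, then `<|im_end|>`); the exact template is not reproduced in this card, and `format_chatml` and `truncate_at_stop` are hypothetical helper names, not part of llama-cpp-python.

```python
IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def format_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts as a ChatML prompt string.

    Assumption: standard ChatML layout; the model's embedded template may differ.
    """
    parts = []
    for message in messages:
        parts.append(f"{IM_START}{message['role']}\n{message['content']}{IM_END}\n")
    if add_generation_prompt:
        # Open an assistant turn so the model continues with its reply.
        parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


def truncate_at_stop(text, stop=IM_END):
    """Keep only the text before the first stop marker, if one is present."""
    return text.split(stop, 1)[0]


prompt = format_chatml([{"role": "user", "content": "What is 2 + 2?"}])
print(prompt)
```

When calling `create_chat_completion`, this formatting is applied for you; the helper is only useful for inspecting prompts or for raw completion calls.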

	### Local Verification

	The uploaded GGUF passed a llama.cpp smoke test using llama-cpp-python tokenization and greedy sampling:

	```text
	Who are you? -> I am Forge-1-Mini, a tiny local assistant created by Arthur / North ML.
	Hi -> Hi! I am Forge-1-Mini. How can I help?
	What is 2 + 2? -> 4
	Write a Python function that adds two numbers. -> def add(a, b): return a + b
	Who is Jesus? -> Christians believe Jesus Christ is the eternal Son of God...
	How should I treat someone I disagree with? -> Treat the person with dignity...
	```
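A smoke test like the one above could be scripted along the following lines. This is a minimal sketch, not the script that was actually used: `run_smoke_test`, `make_llama_generate`, and the prefix-matching rule are assumptions for illustration, and the demonstration at the end uses a stub in place of the real model so it runs without the GGUF file.

```python
def run_smoke_test(generate, cases):
    """Return (prompt, expected_prefix, actual) tuples for every failing case.

    `generate` is any callable mapping a prompt string to a reply string.
    """
    failures = []
    for prompt, expected_prefix in cases.items():
        actual = generate(prompt).strip()
        if not actual.startswith(expected_prefix):
            failures.append((prompt, expected_prefix, actual))
    return failures


def make_llama_generate(llm):
    """Wrap a llama_cpp.Llama instance as a prompt -> reply callable.

    Uses the same settings as the usage example (temperature 0.0, ChatML stop).
    """
    def generate(prompt):
        out = llm.create_chat_completion(
            messages=[{"role": "user", "content": prompt}],
            max_tokens=96,
            temperature=0.0,
            stop=["<|im_end|>"],
        )
        return out["choices"][0]["message"]["content"]
    return generate


# Expected reply prefixes, taken from the transcript above.
CASES = {
    "What is 2 + 2?": "4",
    "Hi": "Hi! I am Forge-1-Mini.",
}

# Stub standing in for the model so the harness itself can be exercised.
canned = {"What is 2 + 2?": "4", "Hi": "Hi! I am Forge-1-Mini. How can I help?"}
failures = run_smoke_test(lambda prompt: canned[prompt], CASES)
print("failures:", failures)
```

To run it against the real checkpoint, pass `make_llama_generate(Llama(model_path="forge-1-mini-f16.gguf", n_ctx=512))` as `generate`.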

	## Model Family Notes

	| Model | Parameters | Hosting | Estimated Cost per 1M Output Tokens | Ability |
	|---|---:|---|---:|---|
	| Forge 1 Mini | 5.2M | Open-source, can host anywhere. | $0.01-$0.05 | Basic chat, simple completions, rewriting, classification, routing, and short direct answers |
	| Forge 1 | ~40M | Open-source, can host anywhere. | $0.10-$0.30 | Better conversational ability, basic coding, structured responses, simple reasoning, and tool routing |
	| Forge 1 Reasoning | ~40M | Hosted on North servers, proprietary. | $0.20-$1.00 | Reasoning-tuned checkpoint with planning, self-checking, multiple-pass generation, and priority processing |
	| Forge 1 Ultra | ~150M | Hosted on North servers, proprietary. | $0.15-$0.80 | Strongest native Forge model; better coding, instruction following, longer responses, tool use, and software-engineering tasks |
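
To turn the per-million figures in the table above into an estimate for a given workload, scale the low and high ends by the number of output tokens. The sketch below copies the ranges from the table; `estimate_output_cost` is a hypothetical helper, and the result is only as accurate as the table's estimates.

```python
# USD per 1M output tokens as (low, high), copied from the table above.
COST_PER_MILLION = {
    "Forge 1 Mini": (0.01, 0.05),
    "Forge 1": (0.10, 0.30),
    "Forge 1 Reasoning": (0.20, 1.00),
    "Forge 1 Ultra": (0.15, 0.80),
}


def estimate_output_cost(model, output_tokens):
    """Return the (low, high) estimated cost in USD for `output_tokens` tokens."""
    low, high = COST_PER_MILLION[model]
    scale = output_tokens / 1_000_000
    return low * scale, high * scale


low, high = estimate_output_cost("Forge 1 Mini", 250_000)
print(f"Forge 1 Mini, 250k output tokens: ${low:.4f} to ${high:.4f}")
```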