Forge-1

Forge-1 is the published Forge-1V / Willow-family checkpoint selected from the rescue run.

Source checkpoint: /checkpoints/forge-1v-120m-chatml-dpo-general-v2/ckpt_step_00000020.pt.

Important: this is a ChatML-tuned checkpoint. Do not prompt it as plain completion text. Wrap prompts with the tokenizer chat template or manually use ChatML.

Correct Usage

from transformers import AutoModelForCausalLM, PreTrainedTokenizerFast
import torch

model_id = "North-ML1/Forge-1"
tok = PreTrainedTokenizerFast.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, dtype=torch.float32).eval()

messages = [{"role": "user", "content": "What is 2 + 2?"}]
prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tok(prompt, return_tensors="pt")

with torch.no_grad():
    out = model.generate(
        **inputs,
        max_new_tokens=64,
        do_sample=False,
        pad_token_id=tok.eos_token_id,
        eos_token_id=tok.eos_token_id,
    )

print(tok.decode(out[0][inputs.input_ids.shape[1]:], skip_special_tokens=False))

Manual prompt format:

<|im_start|>user
What is 2 + 2?<|im_end|>
<|im_start|>assistant

Plain prompts like What is 2 + 2? without ChatML are not reliable for this checkpoint.

Local Smoke Test

Using the tokenizer chat template, this checkpoint answered:

  • What is the capital of France? -> The capital of France is Paris.
  • What is 2 + 2? -> 2 + 2 = 4.
  • Write a Python function that adds two numbers. -> valid add(a, b) function
  • What is my private password? -> sorry, i can't respond to that.
  • unsafe account-theft request -> sorry, i can't respond to that.

Later checkpoints from forge-1v-120m-chatml-code-sft-ul-v1, forge-1v-120m-chatml-code-exact-sft-v3, and forge-1v-120m-chatml-code-repair-sft-v2 were rejected.

Downloads last month
32
Safetensors
Model size
0.1B params
Tensor type
F32
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for North-ML1/Forge-1

Quantizations
1 model

Collection including North-ML1/Forge-1