# Marin-8B-Instruct SFT on TerminalCorpus

Marin-8B Instruct fine-tuned on nvidia/Nemotron-Terminal-Corpus (366K terminal agent trajectories).

## Model Details

| Parameter | Value |
|---|---|
| Base model | marin-community/marin-8b-instruct |
| Architecture | Llama 3 8B (32 layers, 4096 hidden, 32 heads, 8 KV heads) |
| Tokenizer | marin-community/marin-tokenizer |
| Training data | nvidia/Nemotron-Terminal-Corpus (366K examples, all 4 subsets) |
| Epochs | 2 |
| Training steps | 5,721 |
| Batch size | 128 |
| Sequence length | 32,768 |
| Learning rate | 2e-5 (cosine, 10% warmup) |
| Optimizer | AdamW (β=0.9/0.95), grad_clip=1.0, wd=1e-4 |
| Hardware | TPU v5p-64 |
| Final loss | 0.442 |
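
The optimizer and schedule in the table map onto standard primitives. Below is a minimal sketch in PyTorch/transformers terms; the actual run used Marin's TPU training stack on a v5p-64, so this is illustrative rather than the training code:

```python
# Sketch of the recipe above: AdamW with cosine decay and 10% warmup.
import torch
from transformers import get_cosine_schedule_with_warmup

TOTAL_STEPS = 5_721                      # ~2 epochs x 366K examples / batch 128
WARMUP_STEPS = int(0.10 * TOTAL_STEPS)   # 10% warmup

model = torch.nn.Linear(8, 8)            # stand-in; the real run fine-tuned Marin-8B

optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=2e-5,                             # peak learning rate from the table
    betas=(0.9, 0.95),
    weight_decay=1e-4,
)
scheduler = get_cosine_schedule_with_warmup(
    optimizer, num_warmup_steps=WARMUP_STEPS, num_training_steps=TOTAL_STEPS
)

for step in range(TOTAL_STEPS):
    # ... forward pass and loss.backward() on a 32,768-token batch ...
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)  # grad_clip=1.0
    optimizer.step()
    scheduler.step()
    optimizer.zero_grad()
```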

## Evaluation Results

### Terminal-Bench 2.0

| Model | TB2 Accuracy |
|---|---|
| Marin-8B Instruct (no SFT) | 0/89 = 0% |
| Marin-8B Instruct + TerminalCorpus SFT | 1/89 = 1.1% |
| NemotronTerminal-8B (Qwen3-8B, paper) | 13.0% ± 2.2 |
| Marin Qwen3-8B SFT reproduction (exp3490b) | 14/88 = 15.9% |

### TBLite Progression

| Checkpoint | TBLite |
|---|---|
| Step 1500 (26% of training) | 1/100 = 1% |
| Step 3000 (52% of training) | 5/100 = 5% |

## Training Details

Trained following the NemotronTerminal-8B paper hyperparameters. The model reaches a higher final loss than the Qwen3-8B reproduction (0.442 vs. 0.360) and scores significantly lower on terminal benchmarks, likely due to architecture and tokenizer differences between Llama 3 and Qwen3.

Tracked in: marin-community/marin#4420
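
For reference, the corpus can be pulled with the `datasets` library. A minimal sketch, assuming each of the four subsets is exposed as a dataset config with a `train` split; the config names are not listed in this card, so they are discovered at runtime:

```python
from datasets import get_dataset_config_names, load_dataset

# Discover the subset (config) names rather than hard-coding them.
configs = get_dataset_config_names("nvidia/Nemotron-Terminal-Corpus")

# Load every subset; assumes each ships a "train" split.
subsets = {
    name: load_dataset("nvidia/Nemotron-Terminal-Corpus", name, split="train")
    for name in configs
}

total = sum(len(ds) for ds in subsets.values())
print(f"{len(configs)} subsets, {total} examples")  # card reports 366K total
```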

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the fine-tuned checkpoint and its tokenizer from the Hub.
model = AutoModelForCausalLM.from_pretrained("AlienKevin/marin-8b-instruct-sft-terminalcorpus")
tokenizer = AutoTokenizer.from_pretrained("AlienKevin/marin-8b-instruct-sft-terminalcorpus")
```
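
For an end-to-end call, here is a hedged generation sketch; the dtype, device placement, and example prompt are assumptions, not documented settings for this model:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "AlienKevin/marin-8b-instruct-sft-terminalcorpus"
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Single-turn prompt; real terminal-agent use would loop, feeding command
# output back to the model as new turns.
messages = [{"role": "user", "content": "Find files under /tmp larger than 1 MB."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```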

## License

Apache 2.0
