Jupiter-N-120B

Jupiter-N-120B is a post-trained variant of NVIDIA Nemotron-3-Super-120B-A12B, developed by Locai Labs. The N denotes the Nemotron base. It adds Welsh language capability and UK cultural grounding to the fully open Nemotron base, and is also trained on agentic and coding traces chosen by entropy-based selection, using the base model's own uncertainty.

Jupiter is designed as a reproducible template for sovereign post-training: any nation can substitute its own cultural knowledge base, institutional corpora, and indigenous languages to produce a culturally grounded model from a shared open base.

Model Summary

| Property | Value |
|---|---|
| Base Model | NVIDIA Nemotron-3-Super-120B-A12B |
| Total Parameters | 120B (12B active) |
| Architecture | LatentMoE (Mamba-2 + MoE + Attention hybrid) with Multi-Token Prediction |
| Post-Training Method | LoRA (rank 16, alpha 32) with experience replay |
| Context Length | Up to 1M tokens |
| Supported Languages | English, French, German, Italian, Japanese, Spanish, Chinese + Welsh |
| Reasoning | Configurable on/off via chat template (enable_thinking=True/False) |
| License | NVIDIA Nemotron Open Model License |
| Developer | Locai Labs |
| Release Date | April 2026 |

What's New vs. Nemotron Base

  • Welsh language: Trained on professional parallel corpora from Bangor University (Senedd proceedings and UK legislation) and on LLM-translated instruction-following data produced with a custom pipeline.
  • Agentic/terminal: Uncertainty-curated terminal trajectories from NVIDIA's Nemotron-Terminal-Corpus, selecting the 30k highest-entropy samples where the base model has the most to learn.
  • UK cultural grounding: CultureBank-informed synthetic data aligned to British cultural norms and conventions.
  • Synthetic experience replay: Uses the Forget-Me-Not framework to mitigate catastrophic forgetting during post-training.
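
The uncertainty-based curation of terminal trajectories can be sketched as follows. This is a minimal illustration, not the Jupiter pipeline: it assumes each trajectory is scored by the mean per-token Shannon entropy of the base model's next-token distributions (the card does not say how token entropies are aggregated), and the toy distributions and sample names are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def token_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def mean_entropy(distributions: Sequence[Sequence[float]]) -> float:
    """Average per-token entropy over one trajectory (assumed aggregation)."""
    return sum(token_entropy(d) for d in distributions) / len(distributions)


def select_most_uncertain(scores: Dict[str, float], k: int) -> List[str]:
    """Return the ids of the k highest-entropy samples, most uncertain first."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy data over a 3-token vocabulary (hypothetical, for illustration only).
trajectories = {
    "confident": [[0.98, 0.01, 0.01], [0.97, 0.02, 0.01]],
    "uncertain": [[0.34, 0.33, 0.33], [0.40, 0.30, 0.30]],
    "mixed": [[0.98, 0.01, 0.01], [0.34, 0.33, 0.33]],
}
scores = {name: mean_entropy(dists) for name, dists in trajectories.items()}
selected = select_most_uncertain(scores, k=2)
print(selected)
```

In the setting described above, `k` would be 30,000 and the distributions would come from the unmodified base model scoring each candidate trajectory.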

Benchmarks

We evaluate Jupiter against Nemotron-3-Super-120B (base) and Locai L1-Large. Additional evaluations including Arena Hard v2, Terminal Bench 2, and AgentHarm are currently in progress and will be added shortly.

| Benchmark | Metric | Jupiter | Nemotron Base | L1-Large |
|---|---|---|---|---|
| IFEval | prompt strict | 80.96 | 79.85 | 86.51 |
| IFBench | prompt strict | 57.5 | 50.7 | 43.5 |
| GSM8K | extract. match | 93.63 | 95.91 | 94.92 |
| Welsh ARC-Easy | accuracy | 72.00 | 54.00 | 92.00 |
| Welsh MMLU-Lite | accuracy | 61.25 | 56.00 | 73.00 |

All values in %, reasoning disabled. Jupiter and Nemotron Base use temperature 1.0, top-p 0.95. L1-Large uses temperature 0.7, top-p 0.8.
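
To read the table as a change relative to the base model, the differences can be computed directly from the reported scores. The short sketch below does only that arithmetic: it shows Jupiter ahead of Nemotron Base on four of the five benchmarks and behind on GSM8K. L1-Large is left out because it was sampled with different settings, so its numbers are not a like-for-like comparison.

```python
# Scores copied from the benchmark table (all in %, reasoning disabled).
results = {
    "IFEval": {"jupiter": 80.96, "base": 79.85},
    "IFBench": {"jupiter": 57.5, "base": 50.7},
    "GSM8K": {"jupiter": 93.63, "base": 95.91},
    "Welsh ARC-Easy": {"jupiter": 72.00, "base": 54.00},
    "Welsh MMLU-Lite": {"jupiter": 61.25, "base": 56.00},
}

# Percentage-point difference, Jupiter minus Nemotron Base.
deltas = {name: round(r["jupiter"] - r["base"], 2) for name, r in results.items()}

for name, delta in deltas.items():
    print(f"{name:16s} {delta:+.2f}")
```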

Quick Start

Serving with vLLM

pip install "vllm>=0.18.1"

vllm serve locailabs/Jupiter-N-120B \
  --served-model-name locailabs/Jupiter-N-120B \
  --dtype auto \
  --kv-cache-dtype fp8 \
  --tensor-parallel-size 8 \
  --max-model-len 262144 \
  --enable-expert-parallel \
  --trust-remote-code \
  --gpu-memory-utilization 0.9 \
  --enable-chunked-prefill \
  --mamba-ssm-cache-dtype float16 \
  --reasoning-parser nemotron_v3 \
  --enable-auto-tool-choice \
  --tool-call-parser qwen3_coder

DGX Spark (2x B200): Set --tensor-parallel-size 2 and remove --enable-expert-parallel.

API Client

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
MODEL = "locailabs/Jupiter-N-120B"

# Reasoning ON (default)
response = client.chat.completions.create(
    model=MODEL,
    messages=[{"role": "user", "content": "Esboniwch hanes y Senedd yn Gymraeg."}],
    max_tokens=16000,
    temperature=1.0,
    top_p=0.95,
    extra_body={"chat_template_kwargs": {"enable_thinking": True}},
)
print(response.choices[0].message.content)

# Reasoning OFF
response = client.chat.completions.create(
    model=MODEL,
    messages=[{"role": "user", "content": "What is the capital of Wales?"}],
    max_tokens=16000,
    temperature=1.0,
    top_p=0.95,
    extra_body={"chat_template_kwargs": {"enable_thinking": False}},
)
print(response.choices[0].message.content)
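
Because the server above is started with --enable-auto-tool-choice and a tool-call parser, the same endpoint should also accept OpenAI-style tool definitions. The sketch below builds such a request. The `get_weather` tool and the helper names `build_tool_request` and `print_tool_calls` are hypothetical and exist only for illustration; tool-calling quality has not been verified here.

```python
import json


def build_tool_request(model: str, question: str) -> dict:
    """Build keyword arguments for client.chat.completions.create with one tool."""
    weather_tool = {
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool, for illustration only
            "description": "Get the current weather for a UK town or city.",
            "parameters": {
                "type": "object",
                "properties": {"location": {"type": "string"}},
                "required": ["location"],
            },
        },
    }
    return {
        "model": model,
        "messages": [{"role": "user", "content": question}],
        "tools": [weather_tool],
        "tool_choice": "auto",
        "max_tokens": 16000,
        "temperature": 1.0,
        "top_p": 0.95,
        "extra_body": {"chat_template_kwargs": {"enable_thinking": False}},
    }


def print_tool_calls(client, request: dict) -> None:
    """Send the request and print any tool calls the model emits."""
    response = client.chat.completions.create(**request)
    for call in response.choices[0].message.tool_calls or []:
        print(call.function.name, json.loads(call.function.arguments))


request = build_tool_request("locailabs/Jupiter-N-120B", "What is the weather in Cardiff?")
print(request["tools"][0]["function"]["name"], request["tool_choice"])
```

With the server running and the `client` from the example above, `print_tool_calls(client, request)` sends the request and lists any calls returned in `message.tool_calls`.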

Training

Post-Training Data

Jupiter is fine-tuned on a curated mixture of ten datasets spanning five domains:

| Dataset | Domain | Samples (N) |
|---|---|---|
| Terminal trajectories | Terminal | 30k |
| CultureBank DPO | Cultural | 1.41k |
| Self-cognition | Identity | 2k |
| Synthetic replay (reasoning) | Replay | 2.38k |
| Synthetic replay (no reasoning) | Replay | 5.82k |
| Welsh chat | Welsh IF | 20k |
| Welsh legislation | Welsh law | 17.9k |
| Senedd proceedings | Welsh parl. | 19.6k |
| Nemotron IF Chat | Instruction following | 15k |
| Extended reasoning | Reasoning | 2.06k |

All datasets are available under the locailabs HuggingFace organisation, except NVIDIA's Nemotron IF Chat which is available at its original source. The Extended reasoning dataset is derived from RamAnanth1/Nemotron3-Super-Reasoning-2000x.
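
For a sense of how the mixture is weighted, the shares can be worked out from the sample counts listed for the ten datasets. The sketch below is plain arithmetic on those rounded counts, so the results are approximate. It assumes every sample is seen once and that no dataset is up- or down-weighted, which the card does not state.

```python
# Sample counts in thousands, copied from the Post-Training Data listing.
mixture_k = {
    "Terminal trajectories": 30.0,
    "CultureBank DPO": 1.41,
    "Self-cognition": 2.0,
    "Synthetic replay (reasoning)": 2.38,
    "Synthetic replay (no reasoning)": 5.82,
    "Welsh chat": 20.0,
    "Welsh legislation": 17.9,
    "Senedd proceedings": 19.6,
    "Nemotron IF Chat": 15.0,
    "Extended reasoning": 2.06,
}

total_k = sum(mixture_k.values())
shares = {name: 100 * n / total_k for name, n in mixture_k.items()}

# Group the three Welsh datasets and the two replay datasets.
welsh_share = sum(shares[n] for n in ("Welsh chat", "Welsh legislation", "Senedd proceedings"))
replay_share = sum(s for name, s in shares.items() if name.startswith("Synthetic replay"))

print(f"total: about {total_k:.2f}k samples")
print(f"Welsh: {welsh_share:.1f}%  replay: {replay_share:.1f}%")
```

Under these assumptions the mixture totals roughly 116k samples, with the three Welsh datasets making up about half and the replay data about 7%.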

Training Configuration

| Setting | Value |
|---|---|
| Method | LoRA (rank 16, alpha 32) |
| Epochs | 1 |
| Framework | NeMo AutoModel |
| Parallelism | FSDP2 + Expert Parallelism (EP=8) |
| Hardware | 8x NVIDIA H200 GPUs |
| Batch size | 64 (global), 8 (local) |
| Sequence length | 2,048 |
| Optimiser | Adam (beta1=0.9, beta2=0.999) |
| Learning rate | 1e-5 to 1e-6 (cosine decay) |
| Excluded layers | Mamba out_proj (incompatible custom kernels) |
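
Several quantities follow from these settings. The sketch below derives them under stated assumptions: the standard LoRA scaling alpha/rank, a "local" batch that means per-GPU batch, a total of about 116,170 samples (the sum of the rounded dataset sizes under Post-Training Data), no sequence packing, and no warmup. None of these assumptions is confirmed by the card, and the code does not use NeMo AutoModel.

```python
import math

LORA_RANK, LORA_ALPHA = 16, 32
GLOBAL_BATCH, LOCAL_BATCH, NUM_GPUS = 64, 8, 8
LR_MAX, LR_MIN = 1e-5, 1e-6
TOTAL_SAMPLES = 116_170  # assumption: sum of the rounded dataset sizes

# Standard LoRA scales the low-rank update by alpha / rank.
lora_scale = LORA_ALPHA / LORA_RANK

# Gradient accumulation, assuming "local" means the per-GPU batch.
grad_accum = GLOBAL_BATCH // (LOCAL_BATCH * NUM_GPUS)

# Optimiser steps for one epoch, assuming one sample per sequence.
total_steps = math.ceil(TOTAL_SAMPLES / GLOBAL_BATCH)


def cosine_lr(step: int, total: int = total_steps) -> float:
    """Cosine decay from LR_MAX at step 0 to LR_MIN at the final step."""
    progress = min(step, total) / total
    return LR_MIN + 0.5 * (LR_MAX - LR_MIN) * (1.0 + math.cos(math.pi * progress))


print(lora_scale, grad_accum, total_steps)
for step in (0, total_steps // 2, total_steps):
    print(step, f"{cosine_lr(step):.2e}")
```

With these numbers the LoRA update is scaled by 2, no gradient accumulation is needed, and one epoch is on the order of 1,800 optimiser steps.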

Key Techniques

  • Uncertainty-based data curation: Terminal trajectories selected by Shannon entropy of the base model's predictive distribution, retaining the 30k samples where the model is most uncertain.
  • Experience replay (Forget-Me-Not): Synthetic replay data generated by the unmodified base model on UltraChat prompts, preserving existing capabilities during domain-specific fine-tuning.
  • Welsh parallel corpora: Professional translations from Senedd (Welsh Parliament) proceedings and UK legislation, processed through a three-stage pipeline (cleaning, deduplication, instruction formatting).
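
The three stages named for the Welsh parallel corpora (cleaning, deduplication, instruction formatting) can be sketched as below. This is not the Locai Labs pipeline, whose details are not given here. The whitespace-only cleaning, exact-match deduplication, and translation prompt are all assumptions chosen to keep the example small, and the sentence pairs are made up.

```python
import hashlib
import re
from typing import Dict, Iterable, List, Tuple

Pair = Tuple[str, str]  # (English, Welsh)


def clean(pair: Pair) -> Pair:
    """Stage 1: collapse runs of whitespace and trim both sides."""
    en, cy = (re.sub(r"\s+", " ", text).strip() for text in pair)
    return en, cy


def deduplicate(pairs: Iterable[Pair]) -> List[Pair]:
    """Stage 2: drop empty pairs and case-insensitive exact duplicates."""
    seen, unique = set(), []
    for en, cy in pairs:
        if not en or not cy:
            continue
        key = hashlib.sha256(f"{en.lower()}\t{cy.lower()}".encode("utf-8")).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append((en, cy))
    return unique


def to_instruction(pair: Pair) -> Dict[str, List[Dict[str, str]]]:
    """Stage 3: wrap a pair as a chat-style translation example."""
    en, cy = pair
    return {
        "messages": [
            {"role": "user", "content": f"Translate into Welsh: {en}"},  # hypothetical prompt
            {"role": "assistant", "content": cy},
        ]
    }


# Made-up sentence pairs: a noisy pair, its duplicate, and an empty source.
raw = [
    ("The  meeting is open. ", "Mae'r cyfarfod  ar agor."),
    ("The meeting is open.", "Mae'r cyfarfod ar agor."),
    ("", "Diolch."),
]
examples = [to_instruction(p) for p in deduplicate(clean(p) for p in raw)]
print(len(examples), examples[0]["messages"][1]["content"])
```

A production pipeline would likely add near-duplicate detection, language identification, and length filtering, none of which is shown here.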

Limitations

  • Welsh evaluation relies on adapted English-origin benchmarks (ARC-Easy, MMLU) rather than native Welsh NLU tasks.
  • Cultural grounding has not been validated through human evaluation.
  • Self-cognition data is teacher-generated and may not generalise to adversarial identity probing.

Ethical Considerations

Jupiter is motivated by the principle that nations and linguistic communities should be able to adapt open foundation models to their own needs without dependence on proprietary systems. Welsh language support contributes to the digital vitality of a minority language with approximately 880,000 speakers.

Model outputs in Welsh have not undergone extensive human quality review. We encourage downstream users to apply domain-appropriate human review before deployment in high-stakes domains such as legal or medical text.

Citation

@techreport{drayson2026jupiter_n_120b,
  title   = {Jupiter-N-120B Technical Report},
  author  = {George Drayson},
  year    = {2026},
  institution = {Locai Labs},
  url     = {https://huggingface.co/locailabs/Jupiter-N-120B}
}

Acknowledgements

Jupiter builds on NVIDIA Nemotron-3-Super. Welsh parallel corpora are sourced from Techiaith (Bangor University). Cultural data is informed by CultureBank. The Extended reasoning dataset is derived from RamAnanth1/Nemotron3-Super-Reasoning-2000x.
