Zenyx-v2 Instruct
Zenyx-v2 Instruct is the supervised fine-tuned, instruction-following variant of the Zenyx v2 base model. It is trained with strict assistant-only loss masking and a ChatML-style conversation template on dialogue, instruction-following, code, reasoning, and tool-oriented data. The current release is still in active training, so evaluation metrics, benchmark scores, and final qualitative results will be added later.
Model summary
- Base model: Arko007/zenyx-v2-base
- Tokenizer: Arko007/zenyx-v2-tokenizer
- Model family: Zenyx v2
- Training stage: Supervised fine-tuning
- Context length: 8,192 tokens
- Architecture: Recurrent Transformer-style block with MLAAttention, ConvSwiGLU, RMSNorm, YaRN RoPE scaling, and tied input/output embeddings
- Framework: JAX + Flax + Optax
- Target hardware: TPU v5e-8
Intended use
This checkpoint is intended for:
- chat and instruction following
- reasoning and math-heavy responses
- code generation and code repair
- tool-oriented conversational workflows
- long-context generation within the supported sequence length
Training data
The SFT pipeline uses a large mixed instruction corpus assembled from multiple public datasets, covering reasoning, math, code, tool use, safety, and conversation sources. The training script currently registers 57 dataset streams, including NVIDIA Nemotron SFT blends, OpenMath/MathInstruct-style corpora, code-reasoning datasets, agentic/tool-calling data, and conversational instruction sets. Data is normalized into a unified message format and converted to ChatML turns before packing into fixed-length blocks, as sketched below.
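The sketch below illustrates the general idea of that preprocessing, not the actual training script: the helper names are hypothetical, and it assumes standard ChatML markers (<|im_start|>, <|im_end|>) and an already-tokenized stream of conversations.

# Hypothetical preprocessing sketch; the real pipeline may differ.
def to_chatml(messages):
    """Render a normalized message list as ChatML text."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    return "".join(parts)

def pack_into_blocks(token_streams, block_size=8192):
    """Concatenate tokenized conversations and cut them into fixed-length blocks."""
    buffer, blocks = [], []
    for ids in token_streams:
        buffer.extend(ids)
        while len(buffer) >= block_size:
            blocks.append(buffer[:block_size])
            buffer = buffer[block_size:]
    return blocks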
Architecture
Zenyx-v2 Instruct uses a compact but high-capacity recurrent block design:
- token embeddings tied to the final logits layer
- MLA-style attention with separate query and key/value latent projections
- grouped key/value heads
- depthwise causal convolution inside the feed-forward path via ConvSwiGLU
- RMSNorm throughout the stack
- YaRN-scaled rotary positional embeddings for extended context stability
- repeated application of a small set of unique blocks across multiple recurrences
The script instantiates the model with a hidden width of 576, 9 attention heads, 3 KV heads, 8 unique blocks, 4 recurrences, and 3 auxiliary MTP heads.
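As a rough illustration only (the class and field names here are hypothetical, not taken from the training script), those numbers translate to a configuration like the following, with the 8 unique blocks reused across 4 recurrences:

from dataclasses import dataclass

# Hypothetical config mirroring the numbers quoted above; names are illustrative.
@dataclass
class ZenyxV2Config:
    hidden_size: int = 576
    num_heads: int = 9          # query heads
    num_kv_heads: int = 3       # grouped key/value heads
    num_unique_blocks: int = 8  # distinct parameterized blocks
    num_recurrences: int = 4    # times the block stack is reapplied
    num_mtp_heads: int = 3      # auxiliary MTP heads
    max_seq_len: int = 8192

def apply_recurrent_stack(blocks, hidden, config):
    # The same 8 unique blocks are applied 4 times, giving 32 effective layers
    # while keeping the parameter count of only 8 blocks.
    for _ in range(config.num_recurrences):
        for block in blocks:
            hidden = block(hidden)
    return hidden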
Training setup
The SFT recipe in the script uses the following hyperparameters (an Optax sketch follows the list):
- global batch size: 128 sequences
- per-core batch size: 1
- gradient accumulation: 16
- learning rate: 5e-5
- minimum LR: 1e-6
- AdamW optimizer with beta1 0.9, beta2 0.95, epsilon 1e-8, and weight decay 0.01
- warmup: 500 steps
- cosine schedule horizon: 50,000 steps
- evaluation every 500 steps
- checkpointing every 500 steps
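As an assumption about how these numbers wire together rather than a copy of the training script, they map onto Optax roughly as follows:

import optax

# Warmup + cosine decay matching the numbers above (sketch, not the actual script).
schedule = optax.warmup_cosine_decay_schedule(
    init_value=0.0,
    peak_value=5e-5,       # learning rate
    warmup_steps=500,      # warmup
    decay_steps=50_000,    # cosine schedule horizon
    end_value=1e-6,        # minimum LR
)

optimizer = optax.adamw(
    learning_rate=schedule,
    b1=0.9,
    b2=0.95,
    eps=1e-8,
    weight_decay=0.01,
)

# Gradient accumulation over 16 micro-batches
# (8 TPU cores x per-core batch 1 x 16 accumulation = 128 global batch).
optimizer = optax.MultiSteps(optimizer, every_k_schedule=16)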
The loss is computed with assistant-only masking, so user and system tokens are excluded from the optimization targets.
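The masking reduces to a per-token loss mask over the packed labels. A minimal JAX sketch, assuming a 0/1 mask that is 1 only on assistant tokens:

import jax.numpy as jnp
import optax

def masked_sft_loss(logits, labels, assistant_mask):
    """Cross-entropy averaged only over assistant tokens.

    logits: [batch, seq, vocab], labels: [batch, seq],
    assistant_mask: [batch, seq] with 1.0 on assistant tokens, 0.0 elsewhere.
    """
    per_token = optax.softmax_cross_entropy_with_integer_labels(logits, labels)
    masked = per_token * assistant_mask
    return masked.sum() / jnp.maximum(assistant_mask.sum(), 1.0)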
Training status
This model is still in training. Final metrics, validation curves, benchmark numbers, and comparative evals will be added once training completes.
How to use
Use the tokenizer’s ChatML template and the model’s expected special tokens:
messages = [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Explain backpropagation simply."},
]
For best results, format prompts as conversation turns rather than plain text completions.
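If the released tokenizer loads through the standard Hugging Face tokenizer interface and ships a chat template (an assumption until the release is finalized), rendering the turns can look like this sketch:

from transformers import AutoTokenizer

# Assumes the tokenizer repo is loadable with transformers and defines a chat template.
tokenizer = AutoTokenizer.from_pretrained("Arko007/zenyx-v2-tokenizer")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain backpropagation simply."},
]

prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
# If the template follows the usual ChatML layout, the rendered prompt looks like:
# <|im_start|>system\n...<|im_end|>\n<|im_start|>user\n...<|im_end|>\n<|im_start|>assistant\n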
Limitations
- Final benchmark quality is not yet known because training is ongoing.
- Performance may vary across math, code, safety, and tool-use tasks depending on the data mixture and training progress.
- The model was trained with a long-context setup, but practical results on very long prompts should still be validated after training completes.
Safety
This model may generate incorrect, incomplete, or overly confident responses. Treat outputs as assistance, not ground truth. Apply application-level filtering and human review where needed.
Citation
If you use this checkpoint in a publication, project, or demo, cite the Zenyx-v2 Instruct repository and include the final model card once metrics are available.
Status note
Metrics, evaluation tables, and sample generations will be appended after the current training run finishes.