Vex Amber Mini 1.2
Model Description
Vex Amber Mini 1.2 is a 0.6B-parameter decoder-only transformer focused on mathematical reasoning and code generation. Building on Vex Amber Mini 1.0, it delivers competitive performance for its size class, particularly on programming tasks and mathematical problem-solving.
- Developed by: Arioron
- Model type: Decoder-only Transformer
- Language(s): English
- License: Apache 2.0
- Finetuned from model: Arioron/Vex-Amber-Mini-1.0
Model Sources
- Base Model: Qwen/Qwen3-0.6B
- Repository: https://huggingface.co/Arioron/Vex-Amber-Mini-1.2
- Documentation: Arioron Model Docs (https://docs.arioron.com)
Performance
| Benchmark | Metric | Score |
|---|---|---|
| HumanEval | Pass@1 | 21.34% |
| MBPP | Pass@1 | 38.7% |
| GSM8K | Accuracy | 65.2% |
| MATH | Accuracy | 45.8% |
| MMLU | Accuracy | 58.3% |
Quick Start
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_name = "Arioron/Vex-Amber-Mini-1.2"

# Load the tokenizer and the model in half precision
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Code generation example
prompt = "Write a Python function to reverse a linked list:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    temperature=0.7,
    do_sample=True,
    top_p=0.9,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
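For math prompts, deterministic decoding tends to give more reproducible step-by-step answers. A minimal variation on the snippet above, reusing the same `model` and `tokenizer` (the prompt is just an illustration):

```python
# Math reasoning example with greedy (deterministic) decoding
prompt = "Solve step by step: If 3x + 7 = 22, what is x?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=False,  # greedy decoding; sampling parameters are ignored
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```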
Capabilities
Code Generation
```python
# Example: the model can generate efficient algorithms
def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quick_sort(left) + middle + quick_sort(right)
```
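For example, `quick_sort([3, 6, 1, 8, 2, 9, 4])` returns `[1, 2, 3, 4, 6, 8, 9]`, with O(n log n) average-case complexity.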
Mathematical Reasoning
```python
# Example: solve quadratic equations and explain each step
"""
Solve: x² - 5x + 6 = 0
Step 1: Factor the equation: (x - 2)(x - 3) = 0
Step 2: Set each factor to zero: x - 2 = 0 or x - 3 = 0
Step 3: Solve for x: x = 2 or x = 3
"""
```
Training Details
Training Data
The model was trained on a carefully curated mixture of the following sources (a rough sampling sketch follows the list):
- 45% Code (Python, JavaScript, Java, C++)
- 30% Mathematical content (textbooks, problems, proofs)
- 15% General reasoning tasks
- 10% Conversational data
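The exact data pipeline is not published in this card; as a rough illustration only, a mixture with these proportions can be drawn with fixed sampling weights (all corpus names below are hypothetical placeholders):

```python
import random

# Hypothetical stand-ins for the four corpora; weights mirror the list above
corpora = {
    "code": ["<code sample>"],
    "math": ["<math sample>"],
    "reasoning": ["<reasoning sample>"],
    "conversational": ["<chat sample>"],
}
weights = {"code": 0.45, "math": 0.30, "reasoning": 0.15, "conversational": 0.10}

def sample_batch(n):
    """Draw n training samples according to the mixture weights."""
    names = list(weights)
    picks = random.choices(names, weights=[weights[k] for k in names], k=n)
    return [random.choice(corpora[name]) for name in picks]

print(sample_batch(4))
```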
Technical Specifications
- Architecture: Transformer-based decoder
- Context Length: 8,192 tokens
- Precision: float16
- Training Framework: Native PyTorch
- Positional Encoding: Rotary Positional Embeddings (RoPE; sketched below)
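RoPE encodes position by rotating pairs of query/key channels through position-dependent angles. A minimal NumPy sketch of the idea (not this model's actual implementation; channel-pairing conventions vary between codebases):

```python
import numpy as np

def rope(x, base=10000.0):
    """Rotate channel pairs (2i, 2i+1) of x (seq_len, dim) by pos * base**(-2i/dim)."""
    seq_len, dim = x.shape
    freqs = base ** (-2.0 * np.arange(dim // 2) / dim)      # (dim/2,)
    angles = np.arange(seq_len)[:, None] * freqs[None, :]   # (seq_len, dim/2)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, 0::2], x[:, 1::2]                         # paired channels
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin
    out[:, 1::2] = x1 * sin + x2 * cos
    return out

q = np.random.randn(8, 64)  # 8 positions, head dimension 64
print(rope(q).shape)        # (8, 64)
```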
Intended Uses
Direct Use
- Code completion and generation
- Mathematical problem solving
- Educational assistance
- Technical documentation
- Research prototyping
Downstream Use
- Integration into IDEs and code editors
- Educational platforms
- Technical chatbots
- Research tools for mathematics and computer science
Limitations
- The 0.6B parameter count may limit performance on extremely complex, multi-step reasoning tasks
- While strong for its size, it may not match the performance of larger models (7B+) on some benchmarks
- Context window of 8K tokens may be insufficient for very long code files or documents (one truncation workaround is sketched below)
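A common workaround is to truncate or chunk long inputs before generation. A minimal sketch using the tokenizer's built-in truncation, reusing `tokenizer` and `model` from Quick Start (the file path is hypothetical):

```python
# Keep the prompt within the 8,192-token window, reserving room to generate
long_document = open("big_source_file.py").read()  # hypothetical long input
inputs = tokenizer(
    long_document,
    return_tensors="pt",
    truncation=True,
    max_length=8192 - 256,  # leave space for 256 new tokens
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```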
Ethical Considerations
The model is trained on publicly available data and is designed to be helpful, harmless, and honest. However, as with any language model:
- Outputs should be verified for accuracy in critical applications
- The model should not be used for high-stakes decisions without human oversight
- Users should be aware of potential biases in training data
Citation
If you use this model in your research, please cite:
```bibtex
@misc{vexambermini1.2,
  title        = {Vex Amber Mini 1.2: A Compact Language Model for Code and Mathematics},
  author       = {Arioron},
  year         = {2025},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/Arioron/Vex-Amber-Mini-1.2}}
}
```
Contact
- Email: inquiry@arioron.com
- Website: https://arioron.com
- Documentation: https://docs.arioron.com
Acknowledgements
Thanks to the open-source community and the Qwen team for their foundational work. Special thanks to all contributors and researchers who have advanced the field of efficient language modeling.
For technical details, training recipes, and comprehensive evaluation results, please refer to our technical documentation.