Vex Amber Mini 1.2


Model Description

Vex Amber Mini 1.2 is a 0.6B-parameter decoder-only transformer model specialized in mathematical reasoning and code generation. Building on Vex Amber Mini 1.0, it achieves state-of-the-art performance for its size class, with particular strength in programming tasks and mathematical problem solving.

  • Developed by: Arioron
  • Model type: Decoder-only Transformer
  • Language(s): English
  • License: Apache 2.0
  • Finetuned from model: Arioron/Vex-Amber-Mini-1.0

Performance

| Benchmark | Metric   | Score  |
|-----------|----------|--------|
| HumanEval | Pass@1   | 21.34% |
| MBPP      | Pass@1   | 38.7%  |
| GSM8K     | Accuracy | 65.2%  |
| MATH      | Accuracy | 45.8%  |
| MMLU      | Accuracy | 58.3%  |
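
For reference, HumanEval and MBPP Pass@k scores are conventionally computed with the unbiased estimator from Chen et al. (2021); whether this card used that exact procedure is not stated, but a minimal implementation (numpy assumed) looks like this:

import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator (Chen et al., 2021).
    n: samples generated per task, c: samples passing all tests, k: budget."""
    if n - c < k:
        return 1.0
    return 1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1))

# Example: 200 samples per task, 43 passing -> Pass@1 estimate of 0.215
print(pass_at_k(200, 43, 1))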

Quick Start

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_name = "Arioron/Vex-Amber-Mini-1.2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto"
)

# Code generation example
prompt = "Write a Python function to reverse a linked list:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    temperature=0.7,
    do_sample=True,
    top_p=0.9,
    pad_token_id=tokenizer.eos_token_id
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
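
The sampled decoding above is non-deterministic. For reproducible completions (for example, when comparing against the benchmark numbers above), greedy decoding can be used instead; a minimal variant of the same call:

# Greedy decoding: deterministic output, useful for evaluation-style runs
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=False,
    pad_token_id=tokenizer.eos_token_id
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))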

Capabilities

🎯 Code Generation

# Example: The model can generate efficient algorithms
def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quick_sort(left) + middle + quick_sort(right)
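
A quick sanity check confirms the generated function behaves as expected:

assert quick_sort([3, 6, 1, 8, 2, 9, 4]) == [1, 2, 3, 4, 6, 8, 9]
assert quick_sort([]) == []
assert quick_sort([5, 5, 5]) == [5, 5, 5]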

πŸ”’ Mathematical Reasoning

# Example: Solve quadratic equations and explain steps
"""
Solve: xΒ² - 5x + 6 = 0
Step 1: Factor the equation: (x - 2)(x - 3) = 0
Step 2: Set each factor to zero: x - 2 = 0 or x - 3 = 0
Step 3: Solve for x: x = 2 or x = 3
"""

Training Details

Training Data

The model was trained on a carefully curated mixture of the following data sources (a sampling sketch follows the list):

  • 45% Code (Python, JavaScript, Java, C++)
  • 30% Mathematical content (textbooks, problems, proofs)
  • 15% General reasoning tasks
  • 10% Conversational data
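
As a rough, hypothetical illustration of how such proportions can translate into per-example sampling (the actual training pipeline is not published), one might draw each training example from a weighted mixture:

import random

# Hypothetical illustration only: weights taken from the proportions above
MIXTURE = {
    "code": 0.45,          # Python, JavaScript, Java, C++
    "math": 0.30,          # textbooks, problems, proofs
    "reasoning": 0.15,     # general reasoning tasks
    "conversation": 0.10,  # conversational data
}

def sample_source(rng: random.Random) -> str:
    """Draw one data source per training example according to the mixture weights."""
    names = list(MIXTURE)
    return rng.choices(names, weights=[MIXTURE[n] for n in names], k=1)[0]

rng = random.Random(0)
print([sample_source(rng) for _ in range(5)])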

Technical Specifications

  • Architecture: Transformer-based decoder
  • Context Length: 8,192 tokens
  • Precision: float16
  • Training Framework: Native PyTorch
  • Positional Encoding: Rotary Positional Embeddings (RoPE; see the sketch below)
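
For readers unfamiliar with RoPE, the sketch below shows the core idea: query/key dimensions are paired and rotated by position-dependent angles. This is a generic illustration of the technique (Su et al., 2021), not necessarily this model's exact implementation:

import torch

def rope(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Apply rotary positional embeddings to x of shape (seq_len, dim)."""
    seq_len, dim = x.shape
    half = dim // 2
    # Per-pair frequencies: base^(-i / (dim/2)) for i = 0 .. dim/2 - 1
    freqs = base ** (-torch.arange(0, half, dtype=torch.float32) / half)
    angles = torch.arange(seq_len, dtype=torch.float32)[:, None] * freqs[None, :]
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[:, :half], x[:, half:]
    # Rotate each (x1, x2) pair by its position-dependent angle
    return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)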

Intended Uses

Direct Use

  • Code completion and generation
  • Mathematical problem solving
  • Educational assistance
  • Technical documentation
  • Research prototyping

Downstream Use

  • Integration into IDEs and code editors
  • Educational platforms
  • Technical chatbots
  • Research tools for mathematics and computer science

Limitations

  • The 0.6B parameter count may limit performance on extremely complex, multi-step reasoning tasks
  • While strong for its size, it may not match the performance of larger models (7B+) on some benchmarks
  • Context window of 8K tokens may be insufficient for very long code files or documents (a simple length check is sketched below)
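
When working near the 8K limit, a token-count check before generation avoids silent truncation. A minimal sketch, reusing the tokenizer and prompt from the Quick Start (the RESERVED headroom is an arbitrary illustrative value):

MAX_CONTEXT = 8192   # model context window
RESERVED = 256       # headroom for max_new_tokens

prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
if prompt_ids.shape[-1] + RESERVED > MAX_CONTEXT:
    raise ValueError(
        f"Prompt is {prompt_ids.shape[-1]} tokens; it will not fit the 8K context"
    )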

Ethical Considerations

The model is trained on publicly available data and is designed to be helpful, harmless, and honest. However, as with any language model:

  • Outputs should be verified for accuracy in critical applications
  • The model should not be used for high-stakes decisions without human oversight
  • Users should be aware of potential biases in training data

Citation

If you use this model in your research, please cite:

@misc{vexambermini1.2,
  title = {Vex Amber Mini 1.2: A Compact Language Model for Code and Mathematics},
  author = {Arioron},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/Arioron/Vex-Amber-Mini-1.2}}
}

Acknowledgements

Thanks to the open-source community and the Qwen team for their foundational work. Special thanks to all contributors and researchers who have advanced the field of efficient language modeling.


For technical details, training recipes, and comprehensive evaluation results, please refer to our technical documentation.
