---
language:
- en
license: cc-by-nc-4.0
library_name: transformers
tags:
- code
- math
- reasoning
- 0.6b
pipeline_tag: text-generation
base_model:
- Arioron/Vex-Amber-Mini-1.0
---

# Vex Amber Mini 1.2

![Vex Amber Mini](https://img.shields.io/badge/Vex-Amber_Mini_1.2-blue) ![License](https://img.shields.io/badge/License-CC_BY--NC_4.0-green) ![Parameters](https://img.shields.io/badge/Parameters-0.6B-orange) ![HumanEval](https://img.shields.io/badge/HumanEval-21.34%25-brightgreen)

## Model Description

**Vex Amber Mini 1.2** is a 0.6B-parameter decoder-only transformer model focused on mathematical reasoning and code generation. Building on Vex Amber Mini 1.0, it achieves strong performance for its size class, particularly on programming tasks and mathematical problem solving.

- **Developed by:** Arioron
- **Model type:** Decoder-only Transformer
- **Language(s):** English
- **License:** CC BY-NC 4.0
- **Finetuned from model:** [Arioron/Vex-Amber-Mini-1.0](https://huggingface.co/Arioron/Vex-Amber-Mini-1.0)

## Model Sources

- **Upstream base model:** Qwen/Qwen3-0.6B
- **Repository:** [https://huggingface.co/Arioron/Vex-Amber-Mini-1.2](https://huggingface.co/Arioron/Vex-Amber-Mini-1.2)
- **Documentation:** [Arioron Model Docs](https://docs.arioron.com)

## Performance

| Benchmark | Metric | Score |
|-----------|--------|-------|
| HumanEval | Pass@1 | 21.34% |
| MBPP | Pass@1 | 38.7% |
| GSM8K | Accuracy | 65.2% |
| MATH | Accuracy | 45.8% |
| MMLU | Accuracy | 58.3% |

## Quick Start

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_name = "Arioron/Vex-Amber-Mini-1.2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto"
)

# Code generation example
prompt = "Write a Python function to reverse a linked list:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    temperature=0.7,
    do_sample=True,
    top_p=0.9,
    pad_token_id=tokenizer.eos_token_id
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Capabilities

### 🎯 Code Generation

```python
# Example: the model can generate standard algorithms such as quicksort
def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quick_sort(left) + middle + quick_sort(right)
```

### 🔢 Mathematical Reasoning

```python
# Example: solve quadratic equations and explain the steps
"""
Solve: x² - 5x + 6 = 0

Step 1: Factor the equation: (x - 2)(x - 3) = 0
Step 2: Set each factor to zero: x - 2 = 0 or x - 3 = 0
Step 3: Solve for x: x = 2 or x = 3
"""
```

## Training Details

### Training Data

The model was trained on a carefully curated mixture of:

- 45% code (Python, JavaScript, Java, C++)
- 30% mathematical content (textbooks, problems, proofs)
- 15% general reasoning tasks
- 10% conversational data

### Technical Specifications

- Architecture: Transformer-based decoder
- Context length: 8,192 tokens
- Precision: float16
- Training framework: native PyTorch
- Positional encoding: Rotary Positional Embeddings (RoPE)

## Intended Uses

### Direct Use

- Code completion and generation
- Mathematical problem solving
- Educational assistance
- Technical documentation
- Research prototyping

### Downstream Use

- Integration into IDEs and code editors (see the sketch below)
- Educational platforms
- Technical chatbots
- Research tools for mathematics and computer science
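For editor integration, one common pattern is to wrap the model behind a small completion function that takes the text before the cursor and returns a short continuation. The following is a minimal sketch reusing the Quick Start setup; the `complete_code` helper and its sampling parameters are illustrative assumptions, not part of this repository.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_name = "Arioron/Vex-Amber-Mini-1.2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float16, device_map="auto"
)

def complete_code(prefix: str, max_new_tokens: int = 64) -> str:
    """Hypothetical helper: return a short completion for the code before the cursor."""
    inputs = tokenizer(prefix, return_tensors="pt").to(model.device)
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            temperature=0.2,  # low temperature keeps completions close to deterministic
            do_sample=True,
            top_p=0.95,
            pad_token_id=tokenizer.eos_token_id,
        )
    # Drop the prompt tokens so only the newly generated text is returned
    new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)

print(complete_code("def fibonacci(n):\n    "))
```

In a real editor plugin, this function would typically run behind a debounce timer and a length cap so that completions stay fast and inexpensive.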
## Limitations

- The 0.6B parameter count may limit performance on extremely complex, multi-step reasoning tasks
- While strong for its size, the model may not match the performance of larger models (7B+) on some benchmarks
- The 8K-token context window may be insufficient for very long code files or documents

## Ethical Considerations

The model is trained on publicly available data and is designed to be helpful, harmless, and honest. However, as with any language model:

- Outputs should be verified for accuracy in critical applications
- The model should not be used for high-stakes decisions without human oversight
- Users should be aware of potential biases in the training data

## Citation

If you use this model in your research, please cite:

```bibtex
@misc{vexambermini1.2,
  title = {Vex Amber Mini 1.2: A Compact Language Model for Code and Mathematics},
  author = {Arioron},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/Arioron/Vex-Amber-Mini-1.2}}
}
```

## Contact

- Email: inquiry@arioron.com
- Website: https://arioron.com
- Documentation: https://docs.arioron.com

## Acknowledgements

Thanks to the open-source community and the Qwen team for their foundational work. Special thanks to all contributors and researchers who have advanced the field of efficient language modeling.

---

For technical details, training recipes, and comprehensive evaluation results, please refer to our technical documentation.