---
language:
- en
license: cc-by-nc-4.0
library_name: transformers
tags:
- code
- math
- reasoning
- 0.6b
pipeline_tag: text-generation
base_model:
- Arioron/Vex-Amber-Mini-1.0
---
# Vex Amber Mini 1.2
![Vex Amber Mini](https://img.shields.io/badge/Vex-Amber_Mini_1.2-blue)
![License](https://img.shields.io/badge/License-CC_BY--NC_4.0-green)
![Parameters](https://img.shields.io/badge/Parameters-0.6B-orange)
![HumanEval](https://img.shields.io/badge/HumanEval-21.34%25-brightgreen)
## Model Description
**Vex Amber Mini 1.2** is a 0.6B-parameter decoder-only transformer focused on mathematical reasoning and code generation. Building on Vex Amber Mini 1.0, it improves on its predecessor and performs competitively for its size class, particularly on programming tasks and mathematical problem solving.
- **Developed by:** Arioron
- **Model type:** Decoder-only Transformer
- **Language(s):** English
- **License:** CC BY-NC 4.0
- **Finetuned from model:** [Arioron/Vex-Amber-Mini-1.0](https://huggingface.co/Arioron/Vex-Amber-Mini-1.0)
## Model Sources
- **Original Base Model:** [Qwen/Qwen3-0.6B](https://huggingface.co/Qwen/Qwen3-0.6B) (Vex Amber Mini 1.0 is derived from it)
- **Repository:** [https://huggingface.co/Arioron/Vex-Amber-Mini-1.2](https://huggingface.co/Arioron/Vex-Amber-Mini-1.2)
- **Documentation:** [Arioron Model Docs](https://docs.arioron.com)
## Performance
| Benchmark | Metric | Score |
|-----------|--------|-------|
| HumanEval | Pass@1 | 21.34% |
| MBPP | Pass@1 | 38.7% |
| GSM8K | Accuracy | 65.2% |
| MATH | Accuracy | 45.8% |
| MMLU | Accuracy | 58.3% |
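The HumanEval and MBPP scores above are pass@1 rates. For reference, the standard unbiased pass@k estimator from the HumanEval paper (Chen et al., 2021) is sketched below; this card does not state the exact sampling setup, so the sample counts in the usage line are illustrative only.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator (Chen et al., 2021).

    n: total completions sampled for a problem
    c: completions that pass all unit tests
    k: attempt budget being scored
    """
    if n - c < k:
        return 1.0  # every size-k draw contains at least one passing sample
    return 1.0 - comb(n - c, k) / comb(n, k)

# Illustrative numbers only: 200 samples, 43 passing -> pass@1 of ~21.5%
print(pass_at_k(n=200, c=43, k=1))
```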
## Quick Start
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_name = "Arioron/Vex-Amber-Mini-1.2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Code generation example
prompt = "Write a Python function to reverse a linked list:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    temperature=0.7,
    do_sample=True,
    top_p=0.9,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
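Because the model's lineage goes back to Qwen3-0.6B, the tokenizer may ship a chat template; if it does, instruction-style prompts can be wrapped with `apply_chat_template`. This is an assumption about the released tokenizer config rather than documented behavior, so treat the snippet as a sketch:

```python
# Sketch: assumes the tokenizer inherits a chat template from Qwen3.
messages = [
    {"role": "user", "content": "Write a Python function to reverse a linked list."}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
# Decode only the newly generated tokens.
new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```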
## Capabilities
### 🎯 Code Generation
```python
# Example: the model can generate efficient algorithms
def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quick_sort(left) + middle + quick_sort(right)
```
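Generated snippets like the one above are cheap to sanity-check before use; the assertions below are an illustrative test, not model output:

```python
assert quick_sort([3, 1, 4, 1, 5, 9, 2, 6]) == [1, 1, 2, 3, 4, 5, 6, 9]
assert quick_sort([]) == []
```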
### 🔢 Mathematical Reasoning
```python
# Example: Solve quadratic equations and explain steps
"""
Solve: x² - 5x + 6 = 0
Step 1: Factor the equation: (x - 2)(x - 3) = 0
Step 2: Set each factor to zero: x - 2 = 0 or x - 3 = 0
Step 3: Solve for x: x = 2 or x = 3
"""
```
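Worked solutions like this one can also be verified mechanically. A minimal check with `sympy` (an optional dependency assumed here, not required by the model):

```python
from sympy import symbols, solve

x = symbols("x")
roots = solve(x**2 - 5*x + 6, x)  # matches the factoring above
assert sorted(roots) == [2, 3]
print(roots)
```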
## Training Details
### Training Data
The model was trained on a carefully curated mixture of the following (see the sampling sketch after the list):
- 45% Code (Python, JavaScript, Java, C++)
- 30% Mathematical content (textbooks, problems, proofs)
- 15% General reasoning tasks
- 10% Conversational data
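Purely as an illustration of what those proportions mean in practice (the actual data pipeline is not documented here), a hypothetical weighted sampler over the four categories:

```python
import random

# Hypothetical sketch: weights mirror the documented mixture, nothing more.
MIXTURE = {
    "code": 0.45,
    "math": 0.30,
    "reasoning": 0.15,
    "conversation": 0.10,
}

def sample_category(rng: random.Random) -> str:
    """Draw a data category with probability equal to its mixture weight."""
    names = list(MIXTURE)
    return rng.choices(names, weights=[MIXTURE[n] for n in names], k=1)[0]

rng = random.Random(0)
print([sample_category(rng) for _ in range(5)])
```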
### Technical Specifications
- Architecture: Transformer-based decoder
- Context Length: 8,192 tokens
- Precision: float16
- Training Framework: Native PyTorch
- Positional Encoding: Rotary Positional Embeddings (RoPE; see the sketch below)
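For readers unfamiliar with RoPE, a minimal reference implementation of rotary position embeddings in the GPT-NeoX half-split convention is sketched below; it is the textbook formulation (Su et al., 2021), not this model's actual code.

```python
import torch

def rotary_embed(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Apply rotary position embeddings to x of shape (seq_len, dim).

    Channel i in the first half is paired with channel i + dim/2, and each
    pair is rotated by angle pos * base**(-2i/dim), so relative position
    falls out of the query-key dot product in attention.
    """
    seq_len, dim = x.shape
    half = dim // 2
    freqs = base ** (-torch.arange(half, dtype=x.dtype) * 2.0 / dim)
    angles = torch.arange(seq_len, dtype=x.dtype)[:, None] * freqs[None, :]
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[:, :half], x[:, half:]
    # Rotate each (x1, x2) pair as real and imaginary components.
    return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)

q = torch.randn(8, 64)
print(rotary_embed(q).shape)  # torch.Size([8, 64])
```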
## Intended Uses
### Direct Use
- Code completion and generation
- Mathematical problem solving
- Educational assistance
- Technical documentation
- Research prototyping
### Downstream Use
- Integration into IDEs and code editors
- Educational platforms
- Technical chatbots
- Research tools for mathematics and computer science
## Limitations
- The 0.6B parameter count may limit performance on extremely complex, multi-step reasoning tasks
- While strong for its size, it may not match the performance of larger models (7B+) on some benchmarks
- Context window of 8K tokens may be insufficient for very long code files or documents (see the truncation sketch below)
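When an input may exceed the 8K window, one common workaround is to keep only the most recent tokens. A minimal sketch (`long_prompt` is a placeholder; `truncation_side` is a standard `transformers` tokenizer option):

```python
# Keep only the last 8,192 tokens of an overlong prompt.
tokenizer.truncation_side = "left"
inputs = tokenizer(
    long_prompt,  # placeholder for your overlong input string
    return_tensors="pt",
    truncation=True,
    max_length=8192,
).to(model.device)
```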
## Ethical Considerations
The model is trained on publicly available data and is designed to be helpful, harmless, and honest. However, as with any language model:
- Outputs should be verified for accuracy in critical applications
- The model should not be used for high-stakes decisions without human oversight
- Users should be aware of potential biases in training data
## Citation
If you use this model in your research, please cite:
```bibtex
@misc{vexambermini1.2,
title = {Vex Amber Mini 1.2: A Compact Language Model for Code and Mathematics},
author = {Arioron},
year = {2025},
publisher = {Hugging Face},
howpublished = {\url{https://huggingface.co/Arioron/Vex-Amber-Mini-1.2}}
}
```
## Contact
- Email: inquiry@arioron.com
- Website: https://arioron.com
- Documentation: https://docs.arioron.com
## Acknowledgements
Thanks to the open-source community and the Qwen team for their foundational work. Special thanks to all contributors and researchers who have advanced the field of efficient language modeling.
---
For technical details, training recipes, and comprehensive evaluation results, please refer to our technical documentation.