---
language:
- en
license: cc-by-nc-4.0
library_name: transformers
tags:
- code
- math
- reasoning
- 0.6b
pipeline_tag: text-generation
base_model:
- Arioron/Vex-Amber-Mini-1.0
---

# Vex Amber Mini 1.2

![Vex Amber Mini](https://img.shields.io/badge/Vex-Amber_Mini_1.2-blue) ![License](https://img.shields.io/badge/License-CC_BY--NC_4.0-green) ![Parameters](https://img.shields.io/badge/Parameters-0.6B-orange) ![HumanEval](https://img.shields.io/badge/HumanEval-21.34%25-brightgreen)

## Model Description

**Vex Amber Mini 1.2** is a 0.6B-parameter decoder-only transformer model focused on mathematical reasoning and code generation. Building on Vex Amber Mini 1.0, it achieves strong performance for its size class, particularly on programming tasks and mathematical problem solving.

- **Developed by:** Arioron
- **Model type:** Decoder-only Transformer
- **Language(s):** English
- **License:** CC BY-NC 4.0
- **Finetuned from model:** [Arioron/Vex-Amber-Mini-1.0](https://huggingface.co/Arioron/Vex-Amber-Mini-1.0)

## Model Sources

- **Upstream base model:** Qwen/Qwen3-0.6B
- **Repository:** [https://huggingface.co/Arioron/Vex-Amber-Mini-1.2](https://huggingface.co/Arioron/Vex-Amber-Mini-1.2)
- **Documentation:** [Arioron Model Docs](https://docs.arioron.com)

## Performance

| Benchmark | Metric | Score |
|-----------|--------|-------|
| HumanEval | Pass@1 | 21.34% |
| MBPP | Pass@1 | 38.7% |
| GSM8K | Accuracy | 65.2% |
| MATH | Accuracy | 45.8% |
| MMLU | Accuracy | 58.3% |

## Quick Start

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_name = "Arioron/Vex-Amber-Mini-1.2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto"
)

# Code generation example
prompt = "Write a Python function to reverse a linked list:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    temperature=0.7,
    do_sample=True,
    top_p=0.9,
    pad_token_id=tokenizer.eos_token_id
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Capabilities

### 🎯 Code Generation

```python
# Example: the model can generate standard algorithms such as quicksort
def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quick_sort(left) + middle + quick_sort(right)
```

### 🔢 Mathematical Reasoning

```python
# Example: solve quadratic equations and explain the steps
"""
Solve: x² - 5x + 6 = 0

Step 1: Factor the equation: (x - 2)(x - 3) = 0
Step 2: Set each factor to zero: x - 2 = 0 or x - 3 = 0
Step 3: Solve for x: x = 2 or x = 3
"""
```

## Training Details

### Training Data

The model was trained on a carefully curated mixture of:

- 45% code (Python, JavaScript, Java, C++)
- 30% mathematical content (textbooks, problems, proofs)
- 15% general reasoning tasks
- 10% conversational data

### Technical Specifications

- Architecture: Transformer-based decoder
- Context length: 8,192 tokens
- Precision: float16
- Training framework: native PyTorch
- Positional encoding: Rotary Positional Embeddings (RoPE)

## Intended Uses

### Direct Use

- Code completion and generation
- Mathematical problem solving
- Educational assistance
- Technical documentation
- Research prototyping

### Downstream Use

- Integration into IDEs and code editors (see the sketch below)
- Educational platforms
- Technical chatbots
- Research tools for mathematics and computer science
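For editor integration, one common pattern is to wrap the model behind a small completion function that takes the text before the cursor and returns a short continuation. The following is a minimal sketch reusing the Quick Start setup; the `complete_code` helper and its sampling parameters are illustrative assumptions, not part of this repository.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_name = "Arioron/Vex-Amber-Mini-1.2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float16, device_map="auto"
)

def complete_code(prefix: str, max_new_tokens: int = 64) -> str:
    """Hypothetical helper: return a short completion for the code before the cursor."""
    inputs = tokenizer(prefix, return_tensors="pt").to(model.device)
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            temperature=0.2,  # low temperature keeps completions close to deterministic
            do_sample=True,
            top_p=0.95,
            pad_token_id=tokenizer.eos_token_id,
        )
    # Drop the prompt tokens so only the newly generated text is returned
    new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)

print(complete_code("def fibonacci(n):\n    "))
```

In a real editor plugin, this function would typically run behind a debounce timer and a length cap so that completions stay fast and inexpensive.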
## Limitations

- The 0.6B parameter count may limit performance on extremely complex, multi-step reasoning tasks
- While strong for its size, the model may not match the performance of larger models (7B+) on some benchmarks
- The 8K-token context window may be insufficient for very long code files or documents

## Ethical Considerations

The model is trained on publicly available data and is designed to be helpful, harmless, and honest. However, as with any language model:

- Outputs should be verified for accuracy in critical applications
- The model should not be used for high-stakes decisions without human oversight
- Users should be aware of potential biases in the training data

## Citation

If you use this model in your research, please cite:

```bibtex
@misc{vexambermini1.2,
  title = {Vex Amber Mini 1.2: A Compact Language Model for Code and Mathematics},
  author = {Arioron},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/Arioron/Vex-Amber-Mini-1.2}}
}
```

## Contact

- Email: inquiry@arioron.com
- Website: https://arioron.com
- Documentation: https://docs.arioron.com

## Acknowledgements

Thanks to the open-source community and the Qwen team for their foundational work. Special thanks to all contributors and researchers who have advanced the field of efficient language modeling.

---

For technical details, training recipes, and comprehensive evaluation results, please refer to our technical documentation.