Update README.md
README.md
base_model:
  - Qwen/Qwen3-0.6B
library_name: transformers
---

# Vex-Amber-Mini 1.1

> ⚡ **World Record Holder: Most Parameter-Efficient Sub-1B Language Model**
## Overview

**Vex-Amber-Mini 1.1** is a groundbreaking small language model (SLM) that holds the **world record for the most parameter-efficient model with fewer than 1 billion parameters**. Meticulously optimized for code generation and general-purpose text tasks, it delivers exceptional performance within a compact 0.6B-parameter framework.

- **Base Model:** [Qwen3-0.6B](https://huggingface.co/Qwen/Qwen3-0.6B)
- **Fine-tuning:** LoRA with optional full-weight fine-tuning for enhanced adaptability
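The LoRA approach named above trains only a pair of low-rank factors per weight matrix instead of the full matrix, which is where most of the parameter savings come from. A minimal NumPy sketch of the idea — illustrative only, not the repository's actual fine-tuning code, with arbitrary dimensions and rank:

```python
import numpy as np

# LoRA replaces a full-rank weight update with two low-rank factors:
# instead of training all d_out x d_in entries of W, train
# B (d_out x r) and A (r x d_in) and apply W + (alpha / r) * (B @ A).
d_in, d_out, r, alpha = 1024, 1024, 8, 16
rng = np.random.default_rng(0)

W = rng.standard_normal((d_out, d_in))      # frozen base weight
A = rng.standard_normal((r, d_in)) * 0.01   # trainable, small random init
B = np.zeros((d_out, r))                    # trainable, zero init, so the update starts as a no-op

W_merged = W + (alpha / r) * (B @ A)        # merged weight used at inference time

full_params = d_out * d_in                  # 1,048,576 entries in W
lora_params = d_out * r + r * d_in          # 16,384 trainable entries, ~1.6% of W
```

At rank 8 the trainable factors are a small fraction of the frozen matrix, which is why LoRA adapters stay cheap even when the optional full-weight fine-tuning is skipped.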
## Installation

To harness the power of **Vex-Amber-Mini 1.1**, install the required dependencies:

```bash
pip install transformers torch
```
- **Architecture:** Transformer-based, derived from [Qwen3-0.6B](https://huggingface.co/Qwen/Qwen3-0.6B)
- **Training Data:** Fine-tuned on a curated dataset optimized for code generation and versatile text tasks
- **Performance:** Achieves a HumanEval Pass@1 score of **20.12%**, setting a benchmark for sub-1B models and earning the title of the **most parameter-efficient sub-1B model**
- **Use Cases:** Ideal for code generation, text completion, and lightweight NLP applications
- **Context Length:** Supports up to 2048 tokens for efficient processing
## Performance Metrics

The following table compares the HumanEval performance of **Vex-Amber-Mini 1.1** against other code-generation models. Scores for the rival models are approximate (marked with "~"), based on publicly available benchmarks:

| Model                  | Parameters | HumanEval Pass@1 | Notes                                                          |
|------------------------|------------|------------------|----------------------------------------------------------------|
| **Vex-Amber-Mini 1.0** | 0.6B       | 20.21%           | Compact model optimized for code generation.                   |
| **Code Llama**         | 7B         | ~24%             | Developed by Meta, optimized for code tasks.                   |
| **StarCoder**          | 7B         | ~25%             | Developed by Hugging Face and ServiceNow, fine-tuned for code. |
| **CodeGen**            | 6B         | ~22%             | Developed by Salesforce, optimized for code generation.        |
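Pass@1 scores like those above are conventionally computed with the unbiased pass@k estimator introduced alongside the HumanEval benchmark. A small sketch of that formula — illustrative, and not necessarily the exact evaluation harness used to produce the numbers in the table:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: the expected probability that at least one of k
    samples, drawn from n generations of which c are correct, passes.
    Equals 1 - C(n - c, k) / C(n, k)."""
    if n - c < k:
        return 1.0  # fewer than k failures available: a pass is guaranteed
    return 1.0 - comb(n - c, k) / comb(n, k)

# With 10 generations per task and 2 of them passing, pass@1 is 0.2 (20%).
score = pass_at_k(10, 2, 1)
```

Averaging this per-task estimate over all 164 HumanEval problems yields the benchmark's reported Pass@1 percentage.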
## Contribute

We warmly welcome contributions! Please submit pull requests or issues via the [GitHub repository](https://github.com/Arioron-International/Vex-Amber-Mini-1.0) to help refine and elevate **Vex-Amber-Mini 1.1**.