tags:
- trl
---

# Aether-12b

Aether-12b is a fine-tuned large language model based on Arcanum-12b, further trained on the CleverBoi-Data-20k dataset.

## Model Details

- Developed by: AIXON Lab
- Model type: Causal Language Model
- Language(s): English (primarily); may support other languages
- License: apache-2.0
- Repository: https://huggingface.co/aixonlab/Aether-12b
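A minimal inference sketch using Hugging Face `transformers`, assuming the standard `AutoTokenizer`/`AutoModelForCausalLM` API; the sampling settings are illustrative defaults, not values published for this model:

```python
# Inference sketch for Aether-12b. The repo id comes from the model card;
# the generation settings below are illustrative defaults, not tuned values.

MODEL_ID = "aixonlab/Aether-12b"

def generation_config(max_new_tokens: int = 256) -> dict:
    # Conservative sampling defaults; adjust for your use case.
    return {
        "max_new_tokens": max_new_tokens,
        "do_sample": True,
        "temperature": 0.7,
        "top_p": 0.9,
    }

def main() -> None:
    # Heavy imports kept inside main() so the helper above stays dependency-free.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # matches the bfloat16 fine-tuning precision
        device_map="auto",
    )
    prompt = "Explain the transformer attention mechanism in two sentences."
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, **generation_config())
    print(tokenizer.decode(output[0], skip_special_tokens=True))

# main()  # uncomment to run (downloads ~24 GB of weights)
```

Loading in bfloat16 mirrors the precision used during fine-tuning; a ~12B-parameter model needs roughly 24 GB of GPU memory at that precision.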

## Model Architecture

- Base model: Arcanum-12b
- Parameter count: ~12 billion
- Architecture specifics: Transformer-based language model

## Open LLM Leaderboard Evaluation Results

Coming soon!

## Training & Fine-tuning

Aether-12b was fine-tuned on the following dataset:

- Dataset: theprint/CleverBoi-Data-20k
- Fine-tuning method: TRL SFTTrainer with the AdamW optimizer, a cosine-decay learning-rate scheduler, and bfloat16 precision
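The setup above can be sketched with `trl`; only the optimizer, scheduler, precision, dataset, and base model come from this card, while the remaining hyperparameter values are illustrative assumptions:

```python
# Fine-tuning sketch matching the recipe described above (TRL SFTTrainer,
# AdamW, cosine LR decay, bfloat16). Learning rate and epoch count are
# assumed values, not the authors' published settings.

def training_hyperparameters() -> dict:
    return {
        "optim": "adamw_torch",         # AdamW optimizer (from the card)
        "lr_scheduler_type": "cosine",  # cosine-decay LR schedule (from the card)
        "bf16": True,                   # bfloat16 precision (from the card)
        "learning_rate": 2e-5,          # assumed value
        "num_train_epochs": 1,          # assumed value
    }

def main() -> None:
    # Heavy imports kept inside main() so the helper above stays dependency-free.
    from datasets import load_dataset
    from trl import SFTConfig, SFTTrainer

    dataset = load_dataset("theprint/CleverBoi-Data-20k", split="train")
    config = SFTConfig(output_dir="aether-12b-sft", **training_hyperparameters())
    trainer = SFTTrainer(
        model="Xclbr7/Arcanum-12b",  # base model repo named in the card
        train_dataset=dataset,
        args=config,
    )
    trainer.train()

# main()  # uncomment to launch training (requires a multi-GPU setup for 12B)
```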

The CleverBoi-Data-20k dataset improved the model in the following ways:

1. Enhanced reasoning and problem-solving capabilities
2. Broader knowledge across various topics
3. Improved performance on specific tasks like writing, analysis, and problem-solving
4. Better contextual understanding and response generation

## Intended Use

Aether-12b is intended for use as an assistant or as a role-specific bot.

## Ethical Considerations

As a fine-tuned model based on Arcanum-12b, this model may inherit biases and limitations from its parent model and the fine-tuning dataset. Users should be aware of potential biases in generated content and use the model responsibly.

## Acknowledgments

We acknowledge the contributions of:

- theprint for the amazing CleverBoi-Data-20k dataset