---
license: mit
datasets:
- Open-Orca/OpenOrca
- microsoft/orca-math-word-problems-200k
- meta-math/MetaMathQA
language:
- en
tags:
- turbo
- conversational
- chicka
---

# TurboLM by Chickaboo AI

Welcome to TurboLM, a state-of-the-art language model developed by Chickaboo AI. TurboLM is designed to deliver fast, high-quality conversational reasoning at a low compute cost.

## Table of Contents

- **Model Details**
- **Training Details**
- **Benchmarks**
- **Usage**
- **License**

## Model Details

TurboLM uses a transformer-based architecture with the [Xenova/gpt-4o](https://huggingface.co/Xenova/gpt-4o) tokenizer. At 150M parameters, the model is fast and efficient enough to run on low-end devices while still delivering strong performance for its size.
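
As a rough illustration of why a 150M-parameter model fits on modest hardware, the back-of-the-envelope arithmetic below estimates weight storage alone (activations and overhead are ignored; these are assumptions, not measured figures):

```python
# Back-of-the-envelope memory estimate for a 150M-parameter model.
params = 150_000_000

# Approximate weight storage at common precisions: 2 bytes/param (fp16), 4 bytes/param (fp32).
fp16_mib = params * 2 / 1024**2
fp32_mib = params * 4 / 1024**2

print(f"fp16 weights: ~{fp16_mib:.0f} MiB")  # ~286 MiB
print(f"fp32 weights: ~{fp32_mib:.0f} MiB")  # ~572 MiB
```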

## Training Details

TurboLM was trained on the following datasets, each listed with its share of the training mix: [Open-Orca/OpenOrca](https://huggingface.co/datasets/Open-Orca/OpenOrca) (75%), [meta-math/MetaMathQA](https://huggingface.co/datasets/meta-math/MetaMathQA) (15%), and [microsoft/orca-math-word-problems-200k](https://huggingface.co/datasets/microsoft/orca-math-word-problems-200k) (10%). Training used this [Training Script]() on [Google Cloud](https://cloud.google.com/) with a T4 GPU for 2 days.
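
The 75/15/10 mix above amounts to weighted sampling over the three datasets. The snippet below is a minimal sketch of that mixing scheme, not the actual training script; the dataset names are used only as labels:

```python
import random

# Stated share of each dataset in the training mix (from the section above).
mix = {
    "Open-Orca/OpenOrca": 0.75,
    "meta-math/MetaMathQA": 0.15,
    "microsoft/orca-math-word-problems-200k": 0.10,
}

random.seed(0)  # fixed seed for reproducibility
names = list(mix)

# Draw 10,000 example labels according to the stated proportions.
draws = random.choices(names, weights=[mix[n] for n in names], k=10_000)

# The empirical fractions come out close to 0.75 / 0.15 / 0.10.
for name in names:
    print(name, round(draws.count(name) / len(draws), 3))
```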

## OpenLLM Leaderboard Results

| **Benchmark** | **TurboLM** | **Mistral-7B-Instruct-v0.2** | **Meta-Llama-3-8B** |
|---------------|-------------|------------------------------|---------------------|
| **Average** | **69.19** | 60.97 | 62.55 |
| **ARC** | **64.08** | 59.98 | 59.47 |
| **Hellaswag** | **83.96** | 83.31 | 82.09 |
| **MMLU** | 64.87 | 64.16 | **66.67** |
| **TruthfulQA** | **50.51** | 42.15 | 43.95 |
| **Winogrande** | **81.06** | 78.37 | 77.35 |
| **GSM8K** | **70.66** | 37.83 | 45.79 |
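
The "Average" column is the arithmetic mean of the six individual benchmark scores; a quick sanity check of the TurboLM row:

```python
# TurboLM scores from the table above.
scores = {
    "ARC": 64.08, "Hellaswag": 83.96, "MMLU": 64.87,
    "TruthfulQA": 50.51, "Winogrande": 81.06, "GSM8K": 70.66,
}

average = round(sum(scores.values()) / len(scores), 2)
print(average)  # 69.19, matching the reported Average
```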

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"  # the device to load the model onto

model = AutoModelForCausalLM.from_pretrained("Chickaboo/TurboLM")
tokenizer = AutoTokenizer.from_pretrained("Chickaboo/TurboLM")

messages = [
    {"role": "user", "content": "What is your favourite condiment?"},
    {"role": "assistant", "content": "Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!"},
    {"role": "user", "content": "Do you have mayonnaise recipes?"},
]

# Tokenize the conversation with the model's chat template
encodeds = tokenizer.apply_chat_template(messages, return_tensors="pt")

model_inputs = encodeds.to(device)
model.to(device)

# Sample up to 1000 new tokens and decode the result
generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=True)
decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])
```