Statuo commited on
Commit
dafead1
·
verified ·
1 Parent(s): 46fc46a

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +6 -23
README.md CHANGED
@@ -9,29 +9,12 @@ tags:
9
  - transformers
10
  ---
11
 
12
- # Finetune Mistral, Gemma, Llama 2-5x faster with 70% less memory via Unsloth!
13
 
14
- We have a Google Colab Tesla T4 notebook for Llama-3 8b here: https://colab.research.google.com/drive/135ced7oHytdxu3N2DNe1Z0kqjyYIkDXp?usp=sharing
 
15
 
16
- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/Discord%20button.png" width="200"/>](https://discord.gg/u54VK8m8tk)
17
- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/buy%20me%20a%20coffee%20button.png" width="200"/>](https://ko-fi.com/unsloth)
18
- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
19
 
20
- ## Finetune for Free
21
-
22
- All notebooks are **beginner friendly**! Add your dataset, click "Run All", and you'll get a 2x faster finetuned model which can be exported to GGUF, vLLM or uploaded to Hugging Face.
23
-
24
- | Unsloth supports | Free Notebooks | Performance | Memory use |
25
- |-----------------|--------------------------------------------------------------------------------------------------------------------------|-------------|----------|
26
- | **Llama-3 8b** | [▶️ Start on Colab](https://colab.research.google.com/drive/135ced7oHytdxu3N2DNe1Z0kqjyYIkDXp?usp=sharing) | 2.4x faster | 58% less |
27
- | **Gemma 7b** | [▶️ Start on Colab](https://colab.research.google.com/drive/10NbwlsRChbma1v55m8LAPYG15uQv6HLo?usp=sharing) | 2.4x faster | 58% less |
28
- | **Mistral 7b** | [▶️ Start on Colab](https://colab.research.google.com/drive/1Dyauq4kTZoLewQ1cApceUQVNcnnNTzg_?usp=sharing) | 2.2x faster | 62% less |
29
- | **Llama-2 7b** | [▶️ Start on Colab](https://colab.research.google.com/drive/1lBzz5KeZJKXjvivbYvmGarix9Ao6Wxe5?usp=sharing) | 2.2x faster | 43% less |
30
- | **TinyLlama** | [▶️ Start on Colab](https://colab.research.google.com/drive/1AZghoNBQaMDgWJpi4RbffGM1h6raLUj9?usp=sharing) | 3.9x faster | 74% less |
31
- | **CodeLlama 34b** A100 | [▶️ Start on Colab](https://colab.research.google.com/drive/1y7A0AxE3y8gdj4AVkl2aZX47Xu3P1wJT?usp=sharing) | 1.9x faster | 27% less |
32
- | **Mistral 7b** 1xT4 | [▶️ Start on Kaggle](https://www.kaggle.com/code/danielhanchen/kaggle-mistral-7b-unsloth-notebook) | 5x faster\* | 62% less |
33
- | **DPO - Zephyr** | [▶️ Start on Colab](https://colab.research.google.com/drive/15vttTpzzVXv_tJwEk-hIcQ0S9FcEWvwP?usp=sharing) | 1.9x faster | 19% less |
34
-
35
- - This [conversational notebook](https://colab.research.google.com/drive/1Aau3lgPzeZKQ-98h69CCu1UJcvIBLmy2?usp=sharing) is useful for ShareGPT ChatML / Vicuna templates.
36
- - This [text completion notebook](https://colab.research.google.com/drive/1ef-tab5bhkvWmBOObepl1WgJvfvSzn5Q?usp=sharing) is for raw text. This [DPO notebook](https://colab.research.google.com/drive/15vttTpzzVXv_tJwEk-hIcQ0S9FcEWvwP?usp=sharing) replicates Zephyr.
37
- - \* Kaggle has 2x T4s, but we use 1. Due to overhead, 1x T4 is 5x faster.
 
9
  - transformers
10
  ---
11
 
12
+ # I quanted this from the Unsloth upload for Mistral Nemo Instruct.
13
 
14
+ [You can find the link here](https://huggingface.co/unsloth/Mistral-Nemo-Instruct-2407)
15
+ [This is for the base Mistral Nemo Instruct Model](nvidia/Mistral-NeMo-12B-Instruct)
16
 
17
+ EXL2 quanting seemed to work. I ran a few tests on it and it seemed to have zero issues generating text up to 32k context size. I did not try higher than that, but uploading so folks can start testing this. Pleasantly surprised for a roleplay capacity as it seemed to latch onto character traits very well.
 
 
18
 
19
+ [6BPW - Coming Soon](https://huggingface.co/)
20
+ [4BPW - Coming Soon](https://huggingface.co/)