# Expert weights of [Jamba-v0.1](https://huggingface.co/ai21labs/Jamba-v0.1)
Required weights for follow-up research.
The original model, **[AI21 Labs' Jamba-v0.1](https://huggingface.co/ai21labs/Jamba-v0.1)**, requires more than **80 GB of VRAM**, which is almost never available on Google Colab or common cloud computing services. **MoE (Mixture of Experts) splitting** was therefore performed, using the following resources as a basis:
- **Original Model:** [Jamba-v0.1](https://huggingface.co/ai21labs/Jamba-v0.1)
- **MoE Layer Separation:** Consult [this script](https://github.com/TechxGenus/Jamba-utils/blob/main/dense_downcycling.py) written by [@TechxGenus](https://github.com/TechxGenus), and use [TechxGenus/Jamba-v0.1-9B](https://huggingface.co/TechxGenus/Jamba-v0.1-9B).
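The core idea of the dense-downcycling step above can be sketched as follows. This is a minimal illustration, not the linked script: the checkpoint key names (`moe.experts.N`, `router`, `mlp`) are hypothetical placeholders, and real Jamba parameter names differ — see `dense_downcycling.py` for the actual layout.

```python
def extract_expert(state_dict, expert_idx=0):
    """Keep a single expert per MoE layer, renaming its keys to a dense
    MLP layout; all other experts and the router weights are dropped.
    Key patterns here are illustrative, not Jamba's real names."""
    dense = {}
    for name, tensor in state_dict.items():
        if ".experts." in name:
            if f"experts.{expert_idx}." in name:
                # e.g. "layer.0.moe.experts.0.w1" -> "layer.0.mlp.w1"
                dense[name.replace(f"moe.experts.{expert_idx}", "mlp")] = tensor
            # other experts are discarded
        elif ".router." not in name:
            dense[name] = tensor  # non-MoE weights pass through unchanged
    return dense

# Toy checkpoint with one MoE layer holding two experts
ckpt = {
    "layer.0.attn.q": [[0.0, 0.0]],
    "layer.0.moe.router.gate": [0.5, 0.5],
    "layer.0.moe.experts.0.w1": [[1.0, 1.0]],
    "layer.0.moe.experts.1.w1": [[2.0, 2.0]],
}
dense = extract_expert(ckpt, expert_idx=0)
print(sorted(dense))  # ['layer.0.attn.q', 'layer.0.mlp.w1']
```

The resulting dense state dict can then be loaded into a single-expert model configuration, which is what makes the 9B variant fit on commodity GPUs.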