Instructions for using ISTA-DASLab/switch-large-128_qmoe with libraries and notebooks.
- Libraries
- Transformers
How to use ISTA-DASLab/switch-large-128_qmoe with Transformers (a fuller inference sketch follows after the notebook list below):

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("ISTA-DASLab/switch-large-128_qmoe")
model = AutoModelForSeq2SeqLM.from_pretrained("ISTA-DASLab/switch-large-128_qmoe")
```
- Notebooks
- Google Colab
- Kaggle
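A minimal end-to-end sketch, assuming the checkpoint loads through the standard Transformers API. Note that the card below states the weights are stored in a custom compressed QMoE format, so the QMoE repository's loading code may be required instead of plain `from_pretrained`; the prompt and `max_new_tokens` value here are illustrative.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("ISTA-DASLab/switch-large-128_qmoe")
model = AutoModelForSeq2SeqLM.from_pretrained("ISTA-DASLab/switch-large-128_qmoe")

# Switch Transformers are T5-style encoder-decoder models, so the usual
# seq2seq generate() API applies once the model is loaded.
inputs = tokenizer(
    "summarize: QMoE compresses large mixture-of-experts models to ternary precision.",
    return_tensors="pt",
)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```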
switch-large-128_qmoe
This is the google/switch-large-128 model quantized to ternary precision with the QMoE framework and stored in QMoE's custom compressed format.
Please see the QMoE repository for how to use this model.
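For intuition about what "ternary precision" means, the sketch below quantizes a weight tensor to the values {-1, 0, +1} times a per-tensor scale. This is a generic magnitude-threshold quantizer for illustration only, not QMoE's data-aware compression algorithm; `ternary_quantize` and `threshold_ratio` are hypothetical names.

```python
import torch

def ternary_quantize(w: torch.Tensor, threshold_ratio: float = 0.7):
    """Illustrative ternary quantizer: map w to {-1, 0, +1} * scale."""
    # Zero out small-magnitude weights; keep only the sign of large ones.
    delta = threshold_ratio * w.abs().mean()
    q = torch.sign(w) * (w.abs() > delta)
    # Per-tensor scale: mean magnitude of the weights that survived.
    mask = q != 0
    scale = w[mask].abs().mean() if mask.any() else w.new_zeros(())
    return q, scale

w = torch.randn(4, 4)
q, scale = ternary_quantize(w)
print(q)          # entries are only -1.0, 0.0, or +1.0
print(q * scale)  # coarse approximation of the original w
```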