Kaggle
bartowski/Llama-3.3-70B-Instruct-exl2, 2.2 bits per weight
How can I run it in a Kaggle notebook on 2x T4 GPUs?
Can you help me with the code to make it work?
Thank you.
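A minimal sketch with the exllamav2 Python API, assuming the package is installed and the quant is already downloaded; the model path, context length, and token count below are placeholders to tune. `load_autosplit()` spreads the layers across both visible T4s, and the Q4 cache plus `no_flash_attn` are there because the T4 (Turing) can't use flash-attn 2:

```python
# Sketch: load a 2.2 bpw exl2 quant across 2x T4 on Kaggle.
# MODEL_DIR is a hypothetical download path, not from the thread.
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache_Q4, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

MODEL_DIR = "/kaggle/working/Llama-3.3-70B-Instruct-exl2"  # assumed path

config = ExLlamaV2Config()
config.model_dir = MODEL_DIR
config.prepare()
config.no_flash_attn = True   # Turing GPUs can't use flash-attn 2
config.max_seq_len = 2048     # ~18 GB of weights at 2.2 bpw leaves little room for cache

model = ExLlamaV2(config)
cache = ExLlamaV2Cache_Q4(model, lazy=True)  # quantized KV cache to save VRAM
model.load_autosplit(cache)                  # split layers over both T4s automatically

tokenizer = ExLlamaV2Tokenizer(config)
generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)
settings = ExLlamaV2Sampler.Settings()

print(generator.generate_simple("Hello,", settings, num_tokens=64))
```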
How do I run bartowski/QwQ-32B-exl2 in Colab on a T4?
I want code like:
python XXXXXXX.py -m "/content/QwQ-32B-exl2-3_0" -p "hi"
similar to:
!python examples/chat.py -m ../my_model2 -mode llama -cq4 -nfa -l 64
i.e. with the Q4 cache (-cq4), a context length of 64 (-l 64), and no flash attention (-nfa), since the Colab T4 doesn't support it.
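Something like this should match that command line. It's only a sketch: the script name run_exl2.py and its defaults are my own, not part of the exllamav2 repo.

```python
# run_exl2.py -- hypothetical one-shot script matching
#   python run_exl2.py -m "/content/QwQ-32B-exl2-3_0" -p "hi"
# The name and defaults are assumptions, not from the exllamav2 repo.
import argparse
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache_Q4, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

parser = argparse.ArgumentParser()
parser.add_argument("-m", "--model_dir", required=True)
parser.add_argument("-p", "--prompt", required=True)
args = parser.parse_args()

config = ExLlamaV2Config()
config.model_dir = args.model_dir
config.prepare()
config.no_flash_attn = True   # equivalent of -nfa: T4 can't run flash-attn 2
config.max_seq_len = 256      # tiny context, in the spirit of -l 64

model = ExLlamaV2(config)
cache = ExLlamaV2Cache_Q4(model, lazy=True)  # equivalent of -cq4
model.load_autosplit(cache)

tokenizer = ExLlamaV2Tokenizer(config)
generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)
settings = ExLlamaV2Sampler.Settings()

print(generator.generate_simple(args.prompt, settings, num_tokens=128))
```

Then in Colab: !python run_exl2.py -m "/content/QwQ-32B-exl2-3_0" -p "hi"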
bartowski/QwQ-32B-exl2: can this model run on a Colab T4? https://huggingface.co/bartowski/QwQ-32B-exl2/tree/3_0
Please edit the code.
I don't think it would fit; at 3 bpw the weights alone are roughly 32B x 3 / 8 ≈ 12 GB, so 16 GB won't be enough unless you use barely any context (which would be useless on a reasoning model).
https://github.com/kim90000/suc-QwQ-32B-exl2-3_0/blob/main/suc_QwQ_32B_exl2_3_0%20(1).ipynb
Can you try this notebook? And how can I make this thread useful, especially for people who only have 16 GB of VRAM?
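One thing that might help 16 GB readers is a back-of-the-envelope weight-size estimate before downloading a quant. A small helper; the parameter counts below are approximate, and real usage adds KV cache, activations, and CUDA overhead on top:

```python
# Rough weight-footprint estimator for exl2 quants (ignores KV cache,
# activations, and CUDA overhead, which add a few more GB in practice).

def exl2_weight_gib(params_billion: float, bpw: float) -> float:
    """Approximate weight size in GiB for a model quantized to `bpw` bits/weight."""
    return params_billion * 1e9 * bpw / 8 / 1024**3

# Examples from this thread (parameter counts are approximate):
print(f"QwQ-32B @ 3.0 bpw:       {exl2_weight_gib(32.8, 3.0):.1f} GiB")  # ~11.5 GiB
print(f"Llama-3.3-70B @ 2.2 bpw: {exl2_weight_gib(70.6, 2.2):.1f} GiB")  # ~18.1 GiB
```

On a single 16 GB T4 that leaves only a few GB free after loading QwQ-32B at 3.0 bpw, which matches the comment above about having barely any room for context.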