Instructions to use google/flan-ul2 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use google/flan-ul2 with Transformers:
```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("google/flan-ul2")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-ul2")
```
(A short generation example is sketched after the list below.)
- Notebooks
- Google Colab
- Kaggle
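Once the model is loaded as above, inference goes through the standard `generate` API. A minimal sketch; the prompt text and generation settings here are illustrative, not from the original page:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("google/flan-ul2")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-ul2")

# Tokenize an example prompt and decode the generated answer
inputs = tokenizer("Translate to German: How are you?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```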
Is it possible to run this model on the CPU?
#20 · opened by vmajor
I have a CPU-only PyTorch build, and I set the following in my code:
```python
import os
# Must be set before CUDA is first initialized to hide all GPUs
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"

import torch
from transformers import AutoTokenizer, T5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("google/flan-ul2")
model = T5ForConditionalGeneration.from_pretrained("google/flan-ul2")
model = model.to("cpu")
torch.device("cpu")  # note: a bare torch.device() call is a no-op; it binds nothing
```
and I just did this in bash:

```bash
export TORCH_CUDA_ARCH_LIST=""
```
and it still tells me this, even though I am actively trying to do everything to stop any calls to CUDA:
```
anaconda3/envs/transformers/lib/python3.10/site-packages/torch/cuda/__init__.py", line 239, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled
```
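For context: the traceback shows the error coming from `torch.cuda`'s `_lazy_init`, which on a CPU-only build raises the moment anything calls into CUDA. Environment variables like `CUDA_VISIBLE_DEVICES` or `TORCH_CUDA_ARCH_LIST` cannot suppress it; some code path must still be calling `.cuda()`, `.to("cuda")`, or requesting a GPU `device_map`. A minimal guard sketch (the `model` variable is illustrative) that avoids hard-coded CUDA calls:

```python
import torch

# torch.cuda.is_available() is safe on a CPU-only build: it returns
# False without triggering the lazy CUDA initializer.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)  # replaces any hard-coded model.cuda() / model.to("cuda")
```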
I got the demo code to run on the CPU. I do not know yet why it refuses to when it is inside my actual application code, but the problem is clearly in my code, not the model, so it is up to me to fix.
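One practical aside for anyone else attempting CPU inference: flan-ul2 has roughly 20B parameters, so a full-precision load needs on the order of 80 GB of RAM, or about half that in bfloat16. A hedged loading sketch; `torch_dtype` and `low_cpu_mem_usage` are real `from_pretrained` options, but whether this fits depends on your machine:

```python
import torch
from transformers import AutoModelForSeq2SeqLM

# Load in bfloat16 to roughly halve RAM use; low_cpu_mem_usage avoids
# materializing a second full copy of the weights during loading.
model = AutoModelForSeq2SeqLM.from_pretrained(
    "google/flan-ul2",
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
)
```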
vmajor changed discussion status to closed