Doing the encoding using GPU

by Luning-Yang - opened May 1, 2023

May 1, 2023

I'm trying to encode a massive amount of data using INSTRUCTOR. Here is what I did:

import torch
from transformers import AutoTokenizer
from InstructorEmbedding import INSTRUCTOR

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

model = INSTRUCTOR('hkunlp/instructor-large').to(device)
tokenizer = AutoTokenizer.from_pretrained('hkunlp/instructor-large')

However, I don't know how to properly convert the input data into tensors in order to use the GPU for encoding. Could you elaborate on this?

multi-train

NLP Group of The University of Hong Kong, May 16, 2023

Hi, thanks a lot for your interest in the INSTRUCTOR model!

You may need to move both the model and the tokenized input texts to the GPU, so that they are on the same device.
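For what it's worth, here is a minimal sketch of how this might look in practice. It assumes that `INSTRUCTOR.encode` behaves like the `encode` method in sentence-transformers: it accepts raw `[instruction, text]` pairs, tokenizes them itself, and moves each batch to the device given by the `device` argument, so no manual tensor conversion or separate tokenizer should be needed. The helper names `build_pairs` and `encode_texts` are made up for illustration and are not part of the library.

```python
def build_pairs(instruction, texts):
    """Pair each text with the instruction, as [instruction, text] lists."""
    return [[instruction, text] for text in texts]


def encode_texts(texts, instruction, batch_size=32):
    """Encode texts with INSTRUCTOR on the GPU when one is available."""
    # Heavy imports are kept local so build_pairs works without them installed.
    import torch
    from InstructorEmbedding import INSTRUCTOR

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = INSTRUCTOR("hkunlp/instructor-large").to(device)

    # Assumption: encode() takes raw strings, tokenizes them, and moves each
    # batch to `device` internally, as in sentence-transformers.
    return model.encode(
        build_pairs(instruction, texts),
        batch_size=batch_size,
        show_progress_bar=True,
        device=device,
    )
```

Calling `encode_texts(my_texts, "Represent the document for retrieval:")` with your own list of strings should then return one embedding per text. For a very large dataset, a bigger `batch_size` usually improves GPU utilization, as long as the batches still fit in GPU memory.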

Feel free to add any questions or comments!
