Batch inference slower than single inference
#6
by gauranshsoni12 - opened
Hi, I ran the model over a sequence of prompts using the batch_size param set to 32. GPU specs: 4x NVIDIA A10G (24 GB each). Even though nvidia-smi shows heavy GPU utilization, no results are coming back, whereas a plain loop over the same prompts generates results faster.
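The comparison described above can be sketched roughly as follows. This is only a minimal timing harness, not the code that was actually run: `chunked`, `time_batched`, and `fake_model` are hypothetical names, and `fake_model` is a stand-in so the sketch runs without a GPU. It times the same prompts once with a batch size of 32 and once with a batch size of 1 (equivalent to a plain loop).

```python
import time
from typing import Callable, List, Sequence, Tuple


def chunked(items: Sequence[str], size: int) -> List[List[str]]:
    """Split items into consecutive batches of at most `size` elements."""
    if size < 1:
        raise ValueError("size must be >= 1")
    return [list(items[i:i + size]) for i in range(0, len(items), size)]


def time_batched(
    run_model: Callable[[List[str]], List[str]],
    prompts: Sequence[str],
    batch_size: int,
) -> Tuple[List[str], float]:
    """Run all prompts in batches and return (outputs, elapsed_seconds)."""
    start = time.perf_counter()
    outputs: List[str] = []
    for batch in chunked(prompts, batch_size):
        outputs.extend(run_model(batch))
    return outputs, time.perf_counter() - start


def fake_model(batch: List[str]) -> List[str]:
    # Hypothetical stand-in for the real model call (assumption, not the
    # actual TowerInstruct pipeline); it just upper-cases each prompt.
    return [prompt.upper() for prompt in batch]


prompts = [f"prompt {i}" for i in range(70)]

# Batched run (batch_size=32) versus one-at-a-time run (batch_size=1).
batched_out, batched_s = time_batched(fake_model, prompts, batch_size=32)
looped_out, looped_s = time_batched(fake_model, prompts, batch_size=1)

print(f"batched: {batched_s:.4f}s, looped: {looped_s:.4f}s")
```

To reproduce the issue, `fake_model` would be swapped for a function that sends one batch of prompts to the real model and returns one output string per prompt. Both runs should produce identical outputs, so any difference is purely in elapsed time.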