Embedding sequences
Hello,
Thanks for making this model available!
I have been trying to embed sequences of different lengths using the following code:
inputs = tokenizer(['CASSPRAGGITDTQYF', 'CASSLLQPFGTEAFF'], return_tensors="pt", padding=True)
outputs = model(**inputs, output_hidden_states=True)
embeddings = outputs.hidden_states[-1]  # last hidden state, i.e. the input to the final LM head (index 0 is the embedding-layer output)
The two example sequences have different lengths and produce different numbers of tokens, so padding is needed (padding=True).
However, I get the following error:
ValueError: Asking to pad but the tokenizer does not have a padding token.
This makes me think that padding was not used at training time, as the tokenizer does not have a padding token.
How did you concatenate proteins of different lengths to create a batch at training time without padding?
Thanks for your help.
Hi flpgrz,
I did not pad in ProtGPT2 because the sequences were truncated across groups. This is something I did not like and modified in ZymCTRL, which has a padding token.
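For readers wondering what "truncated across groups" can look like in practice, here is a minimal sketch. It assumes a preprocessing step similar to the `group_texts` function in the Hugging Face `run_clm.py` example script: all tokenized sequences are concatenated into one stream and cut into fixed-size blocks, so every training example has the same length and no padding is needed. Whether ProtGPT2 used exactly this step is an assumption, and the toy token ids are made up for illustration.

```python
from itertools import chain


def group_texts(tokenized_sequences, block_size):
    """Concatenate token-id lists and cut them into fixed-size blocks.

    Hypothetical helper modelled on the `group_texts` step of the Hugging Face
    `run_clm.py` example; not confirmed as the exact ProtGPT2 preprocessing.
    """
    # Join all sequences into one long stream of token ids.
    stream = list(chain.from_iterable(tokenized_sequences))
    # Drop the tail that does not fill a whole block.
    usable = (len(stream) // block_size) * block_size
    # Every block has the same length, so batching needs no padding.
    return [stream[i:i + block_size] for i in range(0, usable, block_size)]


# Toy token ids standing in for three tokenized proteins of different lengths.
toy = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
blocks = group_texts(toy, block_size=4)
print(blocks)  # [[1, 2, 3, 4], [5, 6, 7, 8]]
```

Note how the second protein is split across two blocks and the final token is dropped, which is the kind of truncation described above.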
In any case, I think you can add a padding token on the fly; could you try this?
if tokenizer.pad_token is None:
    tokenizer.add_special_tokens({'pad_token': '[PAD]'})
    model.resize_token_embeddings(len(tokenizer))  # the new token id needs its own embedding row
This issue could also be useful: https://github.com/huggingface/transformers/issues/3021
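A minimal sketch of batched embedding with a padding token set on the fly, under a few assumptions. Instead of adding a new `[PAD]` token, it reuses the tokenizer's end-of-text token as the pad token when one exists, which avoids resizing the embedding matrix; this is a swapped-in variant of the suggestion above, not the author's tested method. The helper names `masked_mean_pool` and `embed_batch` are hypothetical, and mean pooling over real tokens is just one possible way to get a fixed-size vector per sequence.

```python
import numpy as np


def masked_mean_pool(hidden, attention_mask):
    """Average hidden states over real tokens only (mask == 1)."""
    hidden = np.asarray(hidden, dtype=np.float32)                    # (batch, tokens, dim)
    mask = np.asarray(attention_mask, dtype=np.float32)[..., None]   # (batch, tokens, 1)
    return (hidden * mask).sum(axis=1) / mask.sum(axis=1)


def embed_batch(sequences, model_name="nferruz/ProtGPT2"):
    """Hypothetical helper: one mean-pooled vector per input sequence."""
    # Imported lazily so the pooling helper works without these packages.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    model.eval()

    if tokenizer.pad_token is None:
        if tokenizer.eos_token is not None:
            # Assumption: reusing the end-of-text token needs no new embedding row.
            tokenizer.pad_token = tokenizer.eos_token
        else:
            # Fallback: add a new token and grow the embedding matrix to match.
            tokenizer.add_special_tokens({"pad_token": "[PAD]"})
            model.resize_token_embeddings(len(tokenizer))

    inputs = tokenizer(sequences, return_tensors="pt", padding=True)
    with torch.no_grad():
        outputs = model(**inputs, output_hidden_states=True)

    last_hidden = outputs.hidden_states[-1].numpy()
    # Pad positions are excluded from the average via the attention mask.
    return masked_mean_pool(last_hidden, inputs["attention_mask"].numpy())


# Toy check of the pooling step: the padded position (value 100) is ignored.
toy_hidden = [[[1, 1], [3, 3], [100, 100]], [[2, 2], [4, 4], [6, 6]]]
toy_mask = [[1, 1, 0], [1, 1, 1]]
pooled = masked_mean_pool(toy_hidden, toy_mask)
print(pooled.tolist())  # [[2.0, 2.0], [4.0, 4.0]]
```

Calling `embed_batch(['CASSPRAGGITDTQYF', 'CASSLLQPFGTEAFF'])` should then return one vector per sequence. With right-side padding in a causal model, the padded positions come after the real tokens and are masked out, so they should not change the hidden states of the real tokens; this has not been verified on ProtGPT2 in this thread.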
I understand. Thanks for clarifying.
I might be wrong, but I think adding the padding token in the tokeniser step might not work, because the model does not know how to process it. I should try it first, though.
What I have done so far is to embed one sequence at a time and zero-pad the results afterwards to account for the different lengths.
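The workaround described above can be sketched as follows. This is an illustration under assumptions rather than the exact code used in the thread: `zero_pad_stack` and `embed_one_by_one` are hypothetical names, and the last hidden state is taken as the per-token embedding.

```python
import numpy as np


def zero_pad_stack(embeddings):
    """Stack (tokens_i, dim) arrays into (batch, max_tokens, dim), zero-padded on the right."""
    max_len = max(e.shape[0] for e in embeddings)
    dim = embeddings[0].shape[1]
    batch = np.zeros((len(embeddings), max_len, dim), dtype=np.float32)
    mask = np.zeros((len(embeddings), max_len), dtype=np.int64)
    for i, e in enumerate(embeddings):
        batch[i, : e.shape[0]] = e   # copy the real token embeddings
        mask[i, : e.shape[0]] = 1    # mark which positions are real
    return batch, mask


def embed_one_by_one(sequences, model_name="nferruz/ProtGPT2"):
    """Hypothetical helper: embed each sequence alone, then zero-pad the results."""
    # Imported lazily so zero_pad_stack works without these packages.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    model.eval()

    per_sequence = []
    with torch.no_grad():
        for seq in sequences:
            # A single sequence needs no padding, so no pad token is required.
            inputs = tokenizer(seq, return_tensors="pt")
            out = model(**inputs, output_hidden_states=True)
            per_sequence.append(out.hidden_states[-1][0].numpy())  # (tokens, dim)
    return zero_pad_stack(per_sequence)


# Toy check with made-up embeddings of 2 and 4 tokens, dimension 3.
a = np.ones((2, 3), dtype=np.float32)
b = np.ones((4, 3), dtype=np.float32)
batch, mask = zero_pad_stack([a, b])
print(batch.shape)    # (2, 4, 3)
print(mask.tolist())  # [[1, 1, 0, 0], [1, 1, 1, 1]]
```

Returning the mask alongside the padded array lets downstream code tell real positions from zero padding, at the cost of one forward pass per sequence.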
Yes, I think you are right. I believe this is discussed in the GitHub issue I linked: https://github.com/huggingface/transformers/issues/3021
But I haven't tested it myself. Let me know if it works!