Add default chat template to tokenizer_config.json
[Automated] This PR adds the default chat template to the tokenizer config, allowing the model to be used with the new conversational widget (see PR).
If the default is not appropriate for your model, please set tokenizer.chat_template to an appropriate template. See https://huggingface.co/docs/transformers/main/chat_templating for more information.
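For illustration, here is a minimal sketch of what overriding the template in the config could look like. It uses only the standard library and an in-memory dictionary as a stand-in for tokenizer_config.json. The template string and the helper name `with_chat_template` are hypothetical, chosen for this example; they are not the template this PR adds.

```python
import json

# Hypothetical Jinja template (an assumption, not the PR's default):
# it renders only the newest message and appends the EOS token.
SINGLE_TURN_TEMPLATE = "{{ messages[-1]['content'] }}{{ eos_token }}"


def with_chat_template(config: dict, template: str) -> dict:
    """Return a copy of a tokenizer config with `chat_template` set."""
    updated = dict(config)  # shallow copy, so the input stays untouched
    updated["chat_template"] = template
    return updated


# Stand-in for the parsed contents of tokenizer_config.json (assumed values).
config = {"eos_token": "<|endoftext|>"}
updated = with_chat_template(config, SINGLE_TURN_TEMPLATE)

# Serialize the result the way it would be written back to the file.
print(json.dumps(updated, indent=2))
```

In a real repository the dictionary would be loaded from and written back to tokenizer_config.json; whether a given template suits the model is for the model author to decide.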
Hi Xenova,
Thank you very much for your Pull Request and for your interest in improving my model "Conversational_Spanish_GPT". I have carefully reviewed your suggestion to add a "chat_template" to the "tokenizer_config.json" file.
While I appreciate your time and effort, I would like to inform you that my model is single-turn, which means it is not trained to maintain conversational context. For this reason, the "chat_template" you propose would not be suitable. Using it could generate undesired or meaningless results, since the model does not have the ability to interpret conversation history.
I have considered alternatives, such as creating a custom "chat_template" for single-turn models. However, at this time, I believe the best option for my model is not to add a "chat_template" at all.
I appreciate your understanding and cooperation. If you have any other suggestions or comments, please do not hesitate to let me know.
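As a rough sketch of what "single-turn" means in practice, the snippet below reduces a chat-style message list to a plain prompt built from the most recent user message only, discarding all earlier history. The function name `single_turn_prompt` and the `<|endoftext|>` end-of-sequence token are assumptions made for this example; the actual prompt format the model expects is not stated in this thread.

```python
def single_turn_prompt(messages, eos_token="<|endoftext|>"):
    """Build a prompt from the latest user message, ignoring history.

    `eos_token` defaults to an assumed value; check the real tokenizer.
    """
    user_turns = [m["content"] for m in messages if m["role"] == "user"]
    if not user_turns:
        raise ValueError("expected at least one user message")
    # Only the newest user turn is kept; earlier turns are dropped.
    return user_turns[-1] + eos_token


history = [
    {"role": "user", "content": "Hola"},
    {"role": "assistant", "content": "Hola, ¿cómo estás?"},
    {"role": "user", "content": "¿Qué tiempo hace hoy?"},
]
prompt = single_turn_prompt(history)
print(prompt)
```

A caller would pass the resulting string to the model as ordinary text input rather than as a list of chat messages.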