Instructions to use jhu-clsp/ettin-decoder-32m with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use jhu-clsp/ettin-decoder-32m with Transformers:
    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("text-generation", model="jhu-clsp/ettin-decoder-32m")

    # Load model directly
    from transformers import AutoTokenizer, AutoModelForCausalLM

    tokenizer = AutoTokenizer.from_pretrained("jhu-clsp/ettin-decoder-32m")
    model = AutoModelForCausalLM.from_pretrained("jhu-clsp/ettin-decoder-32m")
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use jhu-clsp/ettin-decoder-32m with vLLM:
Install from pip and serve model
    # Install vLLM from pip:
    pip install vllm

    # Start the vLLM server:
    vllm serve "jhu-clsp/ettin-decoder-32m"

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:8000/v1/completions" \
      -H "Content-Type: application/json" \
      --data '{
        "model": "jhu-clsp/ettin-decoder-32m",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
      }'
Use Docker
docker model run hf.co/jhu-clsp/ettin-decoder-32m
- SGLang
How to use jhu-clsp/ettin-decoder-32m with SGLang:
Install from pip and serve model
    # Install SGLang from pip:
    pip install sglang

    # Start the SGLang server:
    python3 -m sglang.launch_server \
      --model-path "jhu-clsp/ettin-decoder-32m" \
      --host 0.0.0.0 \
      --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/completions" \
      -H "Content-Type: application/json" \
      --data '{
        "model": "jhu-clsp/ettin-decoder-32m",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
      }'
Use Docker images
    docker run --gpus all \
      --shm-size 32g \
      -p 30000:30000 \
      -v ~/.cache/huggingface:/root/.cache/huggingface \
      --env "HF_TOKEN=<secret>" \
      --ipc=host \
      lmsysorg/sglang:latest \
      python3 -m sglang.launch_server \
        --model-path "jhu-clsp/ettin-decoder-32m" \
        --host 0.0.0.0 \
        --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/completions" \
      -H "Content-Type: application/json" \
      --data '{
        "model": "jhu-clsp/ettin-decoder-32m",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
      }'
- Docker Model Runner
How to use jhu-clsp/ettin-decoder-32m with Docker Model Runner:
docker model run hf.co/jhu-clsp/ettin-decoder-32m
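The curl calls shown for vLLM and SGLang can also be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_completion_request` and `complete` are hypothetical, and it assumes a server is already listening at the given base URL and returns the usual OpenAI-style completions response (`choices[0].text`).

```python
import json
import urllib.request


def build_completion_request(base_url, model, prompt, max_tokens=512, temperature=0.5):
    """Build a POST request for an OpenAI-compatible /v1/completions endpoint.

    Hypothetical helper: mirrors the JSON body used in the curl examples.
    """
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(base_url, model, prompt, **kwargs):
    """Send the request and return the first generated completion text.

    Assumes the server replies with an OpenAI-style completions body.
    """
    request = build_completion_request(base_url, model, prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=60) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

With the vLLM server from the example running, `complete("http://localhost:8000", "jhu-clsp/ettin-decoder-32m", "Once upon a time,")` would send the same request as the curl command; for SGLang, the base URL would be `http://localhost:30000`.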
Update pipeline tag and add library name for `ettin-decoder-32m`
#2
by nielsr HF Staff - opened
This PR improves the model card for the jhu-clsp/ettin-decoder-32m model by:
- Changing the `pipeline_tag` from `fill-mask` to `text-generation`, accurately reflecting its primary use case as a decoder-only model for generative tasks. This ensures the model appears under the correct filter on the Hugging Face Hub (https://huggingface.co/models?pipeline_tag=text-generation).
- Adding `library_name: transformers` to the metadata, enabling the "how to use" widget and better integration with the Hugging Face ecosystem, as the model is compatible with the library.
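To illustrate the kind of metadata change described above, here is a rough sketch that rewrites simple `key: value` pairs in a model card's YAML front matter. The helper `update_front_matter` and the sample card text are hypothetical (the actual README contents are not shown in this PR), and the parser only handles flat top-level keys, not nested YAML.

```python
def update_front_matter(card_text, updates):
    """Set flat `key: value` pairs in a model card's YAML front matter.

    Hypothetical helper for illustration; handles top-level scalar keys only.
    """
    lines = card_text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("model card has no YAML front matter")
    end = lines.index("---", 1)  # position of the closing delimiter
    remaining = dict(updates)
    new_header = []
    for line in lines[1:end]:
        key = line.split(":", 1)[0].strip()
        is_top_level = not line.startswith((" ", "-"))
        if is_top_level and key in remaining:
            # Overwrite an existing key in place
            new_header.append(f"{key}: {remaining.pop(key)}")
        else:
            new_header.append(line)
    # Append keys that were not already present
    new_header.extend(f"{key}: {value}" for key, value in remaining.items())
    return "\n".join(["---", *new_header, "---", *lines[end + 1:]])


# Hypothetical "before" card; only pipeline_tag reflects what the PR describes.
old_card = """---
license: mit
pipeline_tag: fill-mask
---
# Model card body
"""

new_card = update_front_matter(
    old_card,
    {"pipeline_tag": "text-generation", "library_name": "transformers"},
)
print(new_card)
```

In this sketch the existing `pipeline_tag` line is overwritten in place, while `library_name` is appended because it was absent from the sample header.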
orionweller changed pull request status to merged