Instructions to use stabilityai/stablelm-3b-4e1t with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use stabilityai/stablelm-3b-4e1t with Transformers:
    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("text-generation", model="stabilityai/stablelm-3b-4e1t")

    # Load model directly
    from transformers import AutoTokenizer, AutoModelForCausalLM

    tokenizer = AutoTokenizer.from_pretrained("stabilityai/stablelm-3b-4e1t")
    model = AutoModelForCausalLM.from_pretrained("stabilityai/stablelm-3b-4e1t")
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use stabilityai/stablelm-3b-4e1t with vLLM:
Install from pip and serve model
    # Install vLLM from pip:
    pip install vllm
    # Start the vLLM server:
    vllm serve "stabilityai/stablelm-3b-4e1t"
    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:8000/v1/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "stabilityai/stablelm-3b-4e1t",
            "prompt": "Once upon a time,",
            "max_tokens": 512,
            "temperature": 0.5
        }'
Use Docker
docker model run hf.co/stabilityai/stablelm-3b-4e1t
- SGLang
How to use stabilityai/stablelm-3b-4e1t with SGLang:
Install from pip and serve model
    # Install SGLang from pip:
    pip install sglang
    # Start the SGLang server:
    python3 -m sglang.launch_server \
        --model-path "stabilityai/stablelm-3b-4e1t" \
        --host 0.0.0.0 \
        --port 30000
    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "stabilityai/stablelm-3b-4e1t",
            "prompt": "Once upon a time,",
            "max_tokens": 512,
            "temperature": 0.5
        }'
Use Docker images
    docker run --gpus all \
        --shm-size 32g \
        -p 30000:30000 \
        -v ~/.cache/huggingface:/root/.cache/huggingface \
        --env "HF_TOKEN=<secret>" \
        --ipc=host \
        lmsysorg/sglang:latest \
        python3 -m sglang.launch_server \
            --model-path "stabilityai/stablelm-3b-4e1t" \
            --host 0.0.0.0 \
            --port 30000
    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "stabilityai/stablelm-3b-4e1t",
            "prompt": "Once upon a time,",
            "max_tokens": 512,
            "temperature": 0.5
        }'
- Docker Model Runner
How to use stabilityai/stablelm-3b-4e1t with Docker Model Runner:
docker model run hf.co/stabilityai/stablelm-3b-4e1t
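The curl calls shown for the vLLM server (port 8000) and the SGLang server (port 30000) can also be issued from Python. The following is a minimal sketch using only the standard library. The helper names `build_completion_request` and `complete` are hypothetical, introduced here for illustration, and the response parsing assumes the usual OpenAI-style shape with a `choices` list whose entries carry a `text` field.

```python
import json
import urllib.request

MODEL_ID = "stabilityai/stablelm-3b-4e1t"


def build_completion_request(base_url, prompt, max_tokens=512, temperature=0.5):
    """Build (without sending) a POST request for the /v1/completions endpoint.

    Hypothetical helper: mirrors the JSON body used in the curl examples.
    """
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(base_url, prompt, timeout=60.0, **kwargs):
    """Send the request and return the text of the first completion.

    Assumption: the server replies with {"choices": [{"text": ...}, ...]}.
    """
    request = build_completion_request(base_url, prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["text"]


# Example call, assuming a server is already running locally:
#   complete("http://localhost:8000", "Once upon a time,")    # vLLM
#   complete("http://localhost:30000", "Once upon a time,")   # SGLang
```

Building the request separately from sending it makes it possible to inspect the URL and JSON body without a running server.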
Commit History
fix(README): remove `trust_remote_code` usage after `transformers==4.38.0` support 77fd07d verified
update(README): lift gate 9449c7d verified
fix(modeling): use correct `base_model_prefix` name e3be657 verified
feat: add dropout support b6e4fc1
fix: make `flash_attn` optional (`trust_remote_code` breaks dynamic module check) c24bc36
fix: remove `StableLMEpochForCausalLM.transformer` refs c6554ba
fix: rename incorrect access to model 5951257
fix: init proper term a4750ac
Upload folder using huggingface_hub 846682b
Jonathan Tow committed on