Instructions to use stanford-crfm/BioMedLM with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use stanford-crfm/BioMedLM with Transformers:
    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("text-generation", model="stanford-crfm/BioMedLM")

    # Load model directly
    from transformers import AutoTokenizer, AutoModelForCausalLM

    tokenizer = AutoTokenizer.from_pretrained("stanford-crfm/BioMedLM")
    model = AutoModelForCausalLM.from_pretrained("stanford-crfm/BioMedLM")
- Notebooks
- Google Colab
- Kaggle
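The Transformers pipeline snippet under Libraries stops at loading the model. A minimal sketch of an actual generation call might look like the following. The helper names `continuation_only` and `generate`, the prompt, and the `max_new_tokens` default are illustrative assumptions, not part of the model card. By default the text-generation pipeline echoes the prompt in its output, so the sketch strips it.

```python
from typing import Dict, List

MODEL_ID = "stanford-crfm/BioMedLM"


def continuation_only(outputs: List[Dict[str, str]], prompt: str) -> List[str]:
    """Strip the echoed prompt from text-generation pipeline outputs.

    The pipeline returns a list of dicts whose "generated_text" value is,
    by default, the prompt followed by the newly generated text.
    """
    texts = []
    for item in outputs:
        text = item["generated_text"]
        if text.startswith(prompt):
            text = text[len(prompt):]
        texts.append(text.strip())
    return texts


def generate(prompt: str, max_new_tokens: int = 50) -> List[str]:
    """Greedy continuation of `prompt` (hypothetical helper).

    The first call downloads the model weights, which can take a while.
    """
    # Imported lazily so continuation_only() works without transformers installed.
    from transformers import pipeline

    pipe = pipeline("text-generation", model=MODEL_ID)
    outputs = pipe(prompt, max_new_tokens=max_new_tokens, do_sample=False)
    return continuation_only(outputs, prompt)
```

Calling `generate("Metformin is a drug used to")` would return a one-element list holding the continuation. Greedy decoding (`do_sample=False`) is chosen here only to make runs repeatable.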
- Local Apps
- vLLM
How to use stanford-crfm/BioMedLM with vLLM:
Install from pip and serve model
    # Install vLLM from pip:
    pip install vllm

    # Start the vLLM server:
    vllm serve "stanford-crfm/BioMedLM"

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:8000/v1/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "stanford-crfm/BioMedLM",
            "prompt": "Once upon a time,",
            "max_tokens": 512,
            "temperature": 0.5
        }'
Use Docker
docker model run hf.co/stanford-crfm/BioMedLM
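The curl call to the vLLM server can also be made from Python. The sketch below uses only the standard library and mirrors the request shown in the curl example (same path, headers, and JSON fields). The function names and the smaller `max_tokens` default are assumptions made for illustration, and the response parsing assumes the usual OpenAI-style completions shape with a `choices` list.

```python
import json
import urllib.request
from typing import Any, Dict

MODEL_ID = "stanford-crfm/BioMedLM"


def build_completion_request(
    prompt: str,
    base_url: str = "http://localhost:8000",
    max_tokens: int = 64,
    temperature: float = 0.5,
) -> urllib.request.Request:
    """Build a POST request for the OpenAI-compatible /v1/completions route."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def first_completion_text(response_body: Dict[str, Any]) -> str:
    """Extract the text of the first choice from a completions response."""
    return response_body["choices"][0]["text"]


def complete(prompt: str, **kwargs: Any) -> str:
    """Send the request to a running server and return the generated text."""
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=60) as response:
        return first_completion_text(json.load(response))
```

With the server running, `complete("Metformin is a drug used to")` returns the generated continuation. Passing `base_url="http://localhost:30000"` should target an SGLang server started on that port, since it serves the same `/v1/completions` path.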
- SGLang
How to use stanford-crfm/BioMedLM with SGLang:
Install from pip and serve model
    # Install SGLang from pip:
    pip install sglang

    # Start the SGLang server:
    python3 -m sglang.launch_server \
        --model-path "stanford-crfm/BioMedLM" \
        --host 0.0.0.0 \
        --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "stanford-crfm/BioMedLM",
            "prompt": "Once upon a time,",
            "max_tokens": 512,
            "temperature": 0.5
        }'
Use Docker images
    docker run --gpus all \
        --shm-size 32g \
        -p 30000:30000 \
        -v ~/.cache/huggingface:/root/.cache/huggingface \
        --env "HF_TOKEN=<secret>" \
        --ipc=host \
        lmsysorg/sglang:latest \
        python3 -m sglang.launch_server \
            --model-path "stanford-crfm/BioMedLM" \
            --host 0.0.0.0 \
            --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "stanford-crfm/BioMedLM",
            "prompt": "Once upon a time,",
            "max_tokens": 512,
            "temperature": 0.5
        }'
- Docker Model Runner
How to use stanford-crfm/BioMedLM with Docker Model Runner:
docker model run hf.co/stanford-crfm/BioMedLM
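The vLLM and SGLang servers shown earlier (ports 8000 and 30000) typically need some time to start before the curl calls succeed. A small readiness check, sketched here with the standard library only, can poll the port before the first request is sent. The function name `wait_for_port` and its timeout defaults are assumptions for illustration. An open port only shows that something is listening, not that the model has finished loading.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 300.0, interval: float = 2.0) -> bool:
    """Return True once a TCP connection to (host, port) succeeds.

    Retries every `interval` seconds and returns False if no connection
    could be made before `timeout` seconds have elapsed.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # The connection is closed again as soon as it has been opened.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, `wait_for_port("localhost", 8000)` could be called after launching `vllm serve`, and `wait_for_port("localhost", 30000)` after starting the SGLang server.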
Commit History
Update README.md 6d1e633 verified
Update README.md cad400d
Update README.md c746cb4
Update README.md 40f622e
Update README.md e598a10
Update README.md ebd1370
Update README.md d3f5bcd
Update README.md ef2caef
Update README.md 3e33edb
Update README.md 72780fa
Update README.md 39fea4e
Update README.md 68d2933
Merge branch 'main' of https://huggingface.co/stanford-crfm/pubmedgpt into main 0a0cf47
J38 committed on
tokenizer files 9ccd482
J38 committed on
Update README.md a3e5fb6
Update README.md 4d18f91
Update README.md 4ec0841
Update README.md aa24ede
Update README.md a5c5c89
Update README.md fea58cd
Create README.md 2e9d8e5
add model and config aacf045
J38 committed on
initial commit c8e1e8b
Jason committed on