Instructions to use stanford-crfm/BioMedLM with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use stanford-crfm/BioMedLM with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="stanford-crfm/BioMedLM")
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("stanford-crfm/BioMedLM")
model = AutoModelForCausalLM.from_pretrained("stanford-crfm/BioMedLM")
```
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use stanford-crfm/BioMedLM with vLLM:
Install from pip and serve model
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "stanford-crfm/BioMedLM"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "stanford-crfm/BioMedLM",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
Use Docker
```shell
docker model run hf.co/stanford-crfm/BioMedLM
```
- SGLang
How to use stanford-crfm/BioMedLM with SGLang:
Install from pip and serve model
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "stanford-crfm/BioMedLM" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "stanford-crfm/BioMedLM",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
Use Docker images
```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "stanford-crfm/BioMedLM" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "stanford-crfm/BioMedLM",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
- Docker Model Runner
How to use stanford-crfm/BioMedLM with Docker Model Runner:
```shell
docker model run hf.co/stanford-crfm/BioMedLM
```
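The curl calls to the vLLM and SGLang servers above can also be made from Python. The following is a minimal sketch using only the standard library. The helper names `build_completion_request` and `complete` are hypothetical, and the response shape (`choices[0].text`) is an assumption based on the OpenAI-compatible completions format the servers advertise; the prompt is borrowed from the widget text in this pull request.

```python
import json
import urllib.request

MODEL_ID = "stanford-crfm/BioMedLM"


def build_completion_request(base_url, prompt, max_tokens=512, temperature=0.5):
    """Build a POST request mirroring the curl payload shown above."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(base_url, prompt, **kwargs):
    """Send the request and return the generated text.

    Assumes an OpenAI-style response body with a `choices` list;
    requires a server that is already running at `base_url`.
    """
    request = build_completion_request(base_url, prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

With a vLLM server started as shown above, `complete("http://localhost:8000", "Photosynthesis is")` would target the same endpoint as the curl example; for SGLang, the base URL would use port 30000 instead.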
Add medical tag
#6
by davanstrien HF Staff - opened
README.md CHANGED

```diff
@@ -2,8 +2,10 @@
 license: bigscience-bloom-rail-1.0
 datasets:
 - pubmed
-widget:
-- text:
+widget:
+- text: Photosynthesis is
+tags:
+- medical
 ---
 
 # Model Card for BioMedLM 2.7B
@@ -145,4 +147,4 @@ BioMedLM 2.7B is a standard GPT-2 implementation (trained with Flash Attention)
 
 ## Compute Infrastructure
 
-The model was trained on [MosaicML Cloud](https://www.mosaicml.com/cloud), a platform designed for large workloads like LLMs. Using the [Composer](https://github.com/mosaicml/composer) training library and [PyTorch FSDP](https://pytorch.org/docs/stable/fsdp.html), it was easy to enable multi-node training across 128 A100-40GB GPUs, and the total run was completed in ~6.25 days.
+The model was trained on [MosaicML Cloud](https://www.mosaicml.com/cloud), a platform designed for large workloads like LLMs. Using the [Composer](https://github.com/mosaicml/composer) training library and [PyTorch FSDP](https://pytorch.org/docs/stable/fsdp.html), it was easy to enable multi-node training across 128 A100-40GB GPUs, and the total run was completed in ~6.25 days.
```
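The metadata change in this diff (a `tags` list containing `medical`) could be checked with a small script. This is a minimal sketch, not a full YAML parser: `parse_front_matter` is a hypothetical helper that handles only flat `key: value` pairs and top-level `- item` lists, and the sample text assumes the README opens with a `---` line, which falls just outside the hunk shown in the diff.

```python
def parse_front_matter(text):
    """Parse a flat YAML-like front matter block into a dict.

    Supports `key: value` pairs and `- item` lists under a bare `key:`.
    List items are kept as plain strings, so a nested entry such as
    `- text: ...` is stored verbatim rather than as a mapping.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no front matter block present
    meta = {}
    current_key = None
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":
            break  # closing delimiter of the front matter
        if not stripped:
            continue
        if stripped.startswith("- "):
            # Attach the item to the most recent list-valued key.
            if current_key is not None and isinstance(meta[current_key], list):
                meta[current_key].append(stripped[2:].strip())
            continue
        key, _, value = stripped.partition(":")
        current_key = key.strip()
        value = value.strip()
        meta[current_key] = value if value else []
    return meta


# Sample reconstructed from the "+" side of the diff; the opening
# `---` line is an assumption, since the hunk starts at line 2.
README_HEAD = """---
license: bigscience-bloom-rail-1.0
datasets:
- pubmed
widget:
- text: Photosynthesis is
tags:
- medical
---

# Model Card for BioMedLM 2.7B
"""

meta = parse_front_matter(README_HEAD)
print("medical" in meta.get("tags", []))
```

For real model cards, a proper YAML parser is the safer choice, since front matter can contain nested mappings and quoted strings that this sketch does not handle.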