Update README.md
README.md
@@ -42,7 +42,7 @@ Using open source instruction tuning datasets are composed of three main parts:
By combining the above three parts, we form a large-scale, high-quality, medical-specific instruction tuning dataset, consisting of 202M tokens. We further tune Medico-mistral on this dataset, resulting in sft_medico-mistral.

## Training Details
-
+Our model is based on Mixtral-8x7B-v0.1-Instruct, a generic English LLM with 13 billion parameters. Training was performed in parallel on 8 A100-80G GPUs. We first inject knowledge into the base model, Mistral, by optimizing the autoregressive loss. During training, we set the maximum context length to 4096 and the batch size to 1024. The model was trained using the AdamW optimizer (Loshchilov and Hutter, 2017) with a learning rate of 2e-5. We employed a fully sharded data parallel (FSDP) acceleration strategy, the bf16 (brain floating-point) data format, and gradient checkpointing (Chen et al., 2016). The model was trained using 8 A100 GPUs for 1 epoch of knowledge injection. Afterwards, we used 7 A100 GPUs to perform 5 epochs of healthcare-specific instruction tuning in the SFT phase with a batch size of 896. During the instruction tuning phase, all sequences are processed in each epoch.
### Training Data

The training data combines diverse datasets from medical consultations, rationale QA, and knowledge graphs to ensure comprehensive medical knowledge coverage and reasoning ability.
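The batch sizes and GPU counts in the added Training Details paragraph can be cross-checked with a little arithmetic. The sketch below is illustrative only: the `PhaseConfig` class and its field names are hypothetical and not part of the repository, and it assumes the stated batch sizes are global (summed across all GPUs), which the README does not say explicitly. It also does not model how each GPU's share is split into micro-batches or gradient accumulation steps, since that is not stated either.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhaseConfig:
    """Hyperparameters for one training phase, copied from the README text.

    The class and field names are hypothetical; only the numbers come
    from the Training Details paragraph.
    """

    name: str
    num_gpus: int
    global_batch_size: int  # assumed to be summed over all GPUs
    epochs: int

    @property
    def per_gpu_batch_size(self) -> int:
        # Sequences each GPU contributes per optimizer step, assuming an
        # even split of the global batch across GPUs.
        if self.global_batch_size % self.num_gpus != 0:
            raise ValueError(
                f"{self.name}: batch size {self.global_batch_size} does not "
                f"divide evenly across {self.num_gpus} GPUs"
            )
        return self.global_batch_size // self.num_gpus


# Phase 1: knowledge injection (8 GPUs, batch size 1024, 1 epoch).
KNOWLEDGE_INJECTION = PhaseConfig(
    name="knowledge_injection", num_gpus=8, global_batch_size=1024, epochs=1
)

# Phase 2: instruction tuning / SFT (7 GPUs, batch size 896, 5 epochs).
SFT = PhaseConfig(name="sft", num_gpus=7, global_batch_size=896, epochs=5)

if __name__ == "__main__":
    for phase in (KNOWLEDGE_INJECTION, SFT):
        print(f"{phase.name}: {phase.per_gpu_batch_size} sequences per GPU per step")
```

Under that assumption, both phases work out to 128 sequences per GPU per step, which would explain why the batch size drops from 1024 to 896 when the GPU count drops from 8 to 7.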