Instructions for using prithivMLmods/Qwen-UMLS-7B-Instruct with libraries, inference providers, notebooks, and local apps.

Libraries

How to use prithivMLmods/Qwen-UMLS-7B-Instruct with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="prithivMLmods/Qwen-UMLS-7B-Instruct")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("prithivMLmods/Qwen-UMLS-7B-Instruct")
model = AutoModelForCausalLM.from_pretrained("prithivMLmods/Qwen-UMLS-7B-Instruct")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

vLLM

How to use prithivMLmods/Qwen-UMLS-7B-Instruct with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "prithivMLmods/Qwen-UMLS-7B-Instruct"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "prithivMLmods/Qwen-UMLS-7B-Instruct",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
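
The curl call above can also be issued from Python. The following is a minimal sketch using only the standard library; it assumes the vLLM server is listening on its default port 8000 as in the command above, and the helper names `build_chat_request` and `chat` are hypothetical, chosen for illustration only.

```python
import json
import urllib.request

# Assumption: the vLLM server started above is reachable on localhost:8000.
BASE_URL = "http://localhost:8000/v1/chat/completions"
MODEL_ID = "prithivMLmods/Qwen-UMLS-7B-Instruct"


def build_chat_request(prompt: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request mirroring the curl example."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        base_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt: str, base_url: str = BASE_URL) -> str:
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(prompt, base_url)) as response:
        body = json.load(response)
    # OpenAI-compatible responses carry the reply in choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

Once the server is running, `chat("What is the capital of France?")` should return the reply as a string. For the SGLang server, pass a `base_url` that uses port 30000 instead.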

Use Docker

docker model run hf.co/prithivMLmods/Qwen-UMLS-7B-Instruct

SGLang

How to use prithivMLmods/Qwen-UMLS-7B-Instruct with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "prithivMLmods/Qwen-UMLS-7B-Instruct" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "prithivMLmods/Qwen-UMLS-7B-Instruct",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "prithivMLmods/Qwen-UMLS-7B-Instruct" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "prithivMLmods/Qwen-UMLS-7B-Instruct",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use prithivMLmods/Qwen-UMLS-7B-Instruct with Docker Model Runner:
```
docker model run hf.co/prithivMLmods/Qwen-UMLS-7B-Instruct
```

Improve language tag

by lbourdois - opened Apr 28, 2025

base: refs/heads/main ← from: refs/pr/2

Files changed (1)

README.md +133 -121

@@ -1,122 +1,134 @@
----
-license: creativeml-openrail-m
-datasets:
-- avaliev/umls
-language:
-- en
-base_model:
-- Qwen/Qwen2.5-7B-Instruct
-pipeline_tag: text-generation
-library_name: transformers
-tags:
-- safetensors
-- Unified Medical Language System
-- Qwen2.5
-- 7B
-- Instruct
-- Medical
-- text-generation-inference
-- National Library of Medicine
-- umls
----
-### Qwen-UMLS-7B-Instruct `[ Unified Medical Language System ]`
-The **Qwen-UMLS-7B-Instruct** model is a specialized, instruction-tuned language model designed for medical and healthcare-related tasks. It is fine-tuned on the **Qwen2.5-7B-Instruct** base model using the **UMLS (Unified Medical Language System)** dataset, making it an invaluable tool for medical professionals, researchers, and developers building healthcare applications.
-| **File Name**                          | **Size**       | **Description**                                  | **Upload Status**  |
-|-----------------------------------------|----------------|-------------------------------------------------|--------------------|
-| `.gitattributes`                        | 1.57 kB        | File to specify LFS rules for large file tracking. | Uploaded           |
-| `README.md`                             | 323 Bytes      | Basic project information file.                 | Updated            |
-| `added_tokens.json`                     | 657 Bytes      | Contains additional tokens for the tokenizer.   | Uploaded           |
-| `config.json`                           | 860 Bytes      | Configuration file for the model.               | Uploaded           |
-| `generation_config.json`                | 281 Bytes      | Configuration file for generation settings.     | Uploaded           |
-| `merges.txt`                            | 1.82 MB        | Byte-pair encoding merge rules for tokenization.| Uploaded           |
-| `pytorch_model-00001-of-00004.bin`      | 4.88 GB        | First part of the model's PyTorch checkpoint.   | Uploaded (LFS)     |
-| `pytorch_model-00002-of-00004.bin`      | 4.93 GB        | Second part of the model's PyTorch checkpoint.  | Uploaded (LFS)     |
-| `pytorch_model-00003-of-00004.bin`      | 4.33 GB        | Third part of the model's PyTorch checkpoint.   | Uploaded (LFS)     |
-| `pytorch_model-00004-of-00004.bin`      | 1.09 GB        | Fourth part of the model's PyTorch checkpoint.  | Uploaded (LFS)     |
-| `pytorch_model.bin.index.json`          | 28.1 kB        | Index file mapping layers to checkpoint shards. | Uploaded           |
-| `special_tokens_map.json`               | 644 Bytes      | Maps special tokens like `[CLS]`, `[SEP]`, etc. | Uploaded           |
-| `tokenizer.json`                        | 11.4 MB        | Tokenizer definition and configuration.         | Uploaded (LFS)     |
-| `tokenizer_config.json`                 | 7.73 kB        | Configuration file for the tokenizer.           | Uploaded           |
-| `vocab.json`                            | 2.78 MB        | Vocabulary file for tokenization.               | Uploaded           |
-### **Key Features:**
-1. **Medical Expertise:**
-   - Trained on the UMLS dataset, ensuring deep domain knowledge in medical terminology, diagnostics, and treatment plans.
-2. **Instruction-Following:**
-   - Designed to handle complex queries with clarity and precision, suitable for diagnostic support, patient education, and research.
-3. **High-Parameter Model:**
-   - Leverages 7 billion parameters to deliver detailed, contextually accurate responses.
----
-### **Training Details:**
-- **Base Model:** [Qwen2.5-7B-Instruct](#)
-- **Dataset:** [avaliev/UMLS](#)
-  - Comprehensive dataset of medical terminologies, relationships, and use cases with 99.1k samples.
----
-### **Capabilities:**
-1. **Clinical Text Analysis:**
-   - Interpret medical notes, prescriptions, and research articles.
-2. **Question-Answering:**
-   - Answer medical queries, provide explanations for symptoms, and suggest treatments based on user prompts.
-3. **Educational Support:**
-   - Assist in learning medical terminologies and understanding complex concepts.
-4. **Healthcare Applications:**
-   - Integrate into clinical decision-support systems or patient care applications.
----
-### **Usage Instructions:**
-1. **Setup:**
-   Download all files and ensure compatibility with the Hugging Face Transformers library.
-2. **Loading the Model:**
-   ```python
-   from transformers import AutoModelForCausalLM, AutoTokenizer
-   model_name = "prithivMLmods/Qwen-UMLS-7B-Instruct"
-   tokenizer = AutoTokenizer.from_pretrained(model_name)
-   model = AutoModelForCausalLM.from_pretrained(model_name)
-   ```
-3. **Generate Medical Text:**
-   ```python
-   input_text = "What are the symptoms and treatments for diabetes?"
-   inputs = tokenizer(input_text, return_tensors="pt")
-   outputs = model.generate(**inputs, max_length=200, temperature=0.7)
-   print(tokenizer.decode(outputs[0], skip_special_tokens=True))
-   ```
-4. **Customizing Outputs:**
-   Modify `generation_config.json` to optimize output style:
-   - `temperature` for creativity vs. determinism.
-   - `max_length` for concise or extended responses.
----
-### **Applications:**
-1. **Clinical Support:**
-   - Assist healthcare providers with quick, accurate information retrieval.
-2. **Patient Education:**
-   - Provide patients with understandable explanations of medical conditions.
-3. **Medical Research:**
-   - Summarize or analyze complex medical research papers.
-4. **AI-Driven Diagnostics:**
-   - Integrate with diagnostic systems for preliminary assessments.
 ---

+---
+license: creativeml-openrail-m
+datasets:
+- avaliev/umls
+language:
+- zho
+- eng
+- fra
+- spa
+- por
+- deu
+- ita
+- rus
+- jpn
+- kor
+- vie
+- tha
+- ara
+base_model:
+- Qwen/Qwen2.5-7B-Instruct
+pipeline_tag: text-generation
+library_name: transformers
+tags:
+- safetensors
+- Unified Medical Language System
+- Qwen2.5
+- 7B
+- Instruct
+- Medical
+- text-generation-inference
+- National Library of Medicine
+- umls
+---
+### Qwen-UMLS-7B-Instruct `[ Unified Medical Language System ]`
+The **Qwen-UMLS-7B-Instruct** model is a specialized, instruction-tuned language model designed for medical and healthcare-related tasks. It is fine-tuned on the **Qwen2.5-7B-Instruct** base model using the **UMLS (Unified Medical Language System)** dataset, making it an invaluable tool for medical professionals, researchers, and developers building healthcare applications.
+| **File Name**                          | **Size**       | **Description**                                  | **Upload Status**  |
+|-----------------------------------------|----------------|-------------------------------------------------|--------------------|
+| `.gitattributes`                        | 1.57 kB        | File to specify LFS rules for large file tracking. | Uploaded           |
+| `README.md`                             | 323 Bytes      | Basic project information file.                 | Updated            |
+| `added_tokens.json`                     | 657 Bytes      | Contains additional tokens for the tokenizer.   | Uploaded           |
+| `config.json`                           | 860 Bytes      | Configuration file for the model.               | Uploaded           |
+| `generation_config.json`                | 281 Bytes      | Configuration file for generation settings.     | Uploaded           |
+| `merges.txt`                            | 1.82 MB        | Byte-pair encoding merge rules for tokenization.| Uploaded           |
+| `pytorch_model-00001-of-00004.bin`      | 4.88 GB        | First part of the model's PyTorch checkpoint.   | Uploaded (LFS)     |
+| `pytorch_model-00002-of-00004.bin`      | 4.93 GB        | Second part of the model's PyTorch checkpoint.  | Uploaded (LFS)     |
+| `pytorch_model-00003-of-00004.bin`      | 4.33 GB        | Third part of the model's PyTorch checkpoint.   | Uploaded (LFS)     |
+| `pytorch_model-00004-of-00004.bin`      | 1.09 GB        | Fourth part of the model's PyTorch checkpoint.  | Uploaded (LFS)     |
+| `pytorch_model.bin.index.json`          | 28.1 kB        | Index file mapping layers to checkpoint shards. | Uploaded           |
+| `special_tokens_map.json`               | 644 Bytes      | Maps special tokens like `[CLS]`, `[SEP]`, etc. | Uploaded           |
+| `tokenizer.json`                        | 11.4 MB        | Tokenizer definition and configuration.         | Uploaded (LFS)     |
+| `tokenizer_config.json`                 | 7.73 kB        | Configuration file for the tokenizer.           | Uploaded           |
+| `vocab.json`                            | 2.78 MB        | Vocabulary file for tokenization.               | Uploaded           |
+### **Key Features:**
+1. **Medical Expertise:**
+   - Trained on the UMLS dataset, ensuring deep domain knowledge in medical terminology, diagnostics, and treatment plans.
+2. **Instruction-Following:**
+   - Designed to handle complex queries with clarity and precision, suitable for diagnostic support, patient education, and research.
+3. **High-Parameter Model:**
+   - Leverages 7 billion parameters to deliver detailed, contextually accurate responses.
+---
+### **Training Details:**
+- **Base Model:** [Qwen2.5-7B-Instruct](#)
+- **Dataset:** [avaliev/UMLS](#)
+  - Comprehensive dataset of medical terminologies, relationships, and use cases with 99.1k samples.
+---
+### **Capabilities:**
+1. **Clinical Text Analysis:**
+   - Interpret medical notes, prescriptions, and research articles.
+2. **Question-Answering:**
+   - Answer medical queries, provide explanations for symptoms, and suggest treatments based on user prompts.
+3. **Educational Support:**
+   - Assist in learning medical terminologies and understanding complex concepts.
+4. **Healthcare Applications:**
+   - Integrate into clinical decision-support systems or patient care applications.
+---
+### **Usage Instructions:**
+1. **Setup:**
+   Download all files and ensure compatibility with the Hugging Face Transformers library.
+2. **Loading the Model:**
+   ```python
+   from transformers import AutoModelForCausalLM, AutoTokenizer
+   model_name = "prithivMLmods/Qwen-UMLS-7B-Instruct"
+   tokenizer = AutoTokenizer.from_pretrained(model_name)
+   model = AutoModelForCausalLM.from_pretrained(model_name)
+   ```
+3. **Generate Medical Text:**
+   ```python
+   input_text = "What are the symptoms and treatments for diabetes?"
+   inputs = tokenizer(input_text, return_tensors="pt")
+   outputs = model.generate(**inputs, max_length=200, temperature=0.7)
+   print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+   ```
+4. **Customizing Outputs:**
+   Modify `generation_config.json` to optimize output style:
+   - `temperature` for creativity vs. determinism.
+   - `max_length` for concise or extended responses.
+---
+### **Applications:**
+1. **Clinical Support:**
+   - Assist healthcare providers with quick, accurate information retrieval.
+2. **Patient Education:**
+   - Provide patients with understandable explanations of medical conditions.
+3. **Medical Research:**
+   - Summarize or analyze complex medical research papers.
+4. **AI-Driven Diagnostics:**
+   - Integrate with diagnostic systems for preliminary assessments.
 ---
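
The only difference between the removed and added sides of this diff is the `language` list in the YAML front matter: the single `en` entry is replaced by thirteen three-letter codes. A minimal sketch for reading that list out of a README follows. It assumes a flat front matter like the one shown in the diff; `extract_languages` and `LANGUAGE_NAMES` are hypothetical names, and the name table is filled in from ISO 639-3, not from the PR.

```python
from typing import List

# English names for the three-letter codes added in the diff (ISO 639-3).
# This table is an illustration and is not part of the PR itself.
LANGUAGE_NAMES = {
    "zho": "Chinese", "eng": "English", "fra": "French", "spa": "Spanish",
    "por": "Portuguese", "deu": "German", "ita": "Italian", "rus": "Russian",
    "jpn": "Japanese", "kor": "Korean", "vie": "Vietnamese", "tha": "Thai",
    "ara": "Arabic",
}


def extract_languages(readme_text: str) -> List[str]:
    """Return the entries of the top-level `language:` list in the front matter."""
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return []  # no front matter at all
    codes = []
    in_language = False
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":  # closing delimiter of the front matter
            break
        if stripped == "language:":
            in_language = True
        elif in_language and stripped.startswith("- "):
            codes.append(stripped[2:].strip())
        elif in_language:
            in_language = False  # another key started, so the list has ended
    return codes


# Shortened front matter in the same shape as the added side of the diff.
SAMPLE = """---
license: creativeml-openrail-m
language:
- zho
- eng
base_model:
- Qwen/Qwen2.5-7B-Instruct
---
### Qwen-UMLS-7B-Instruct
"""

for code in extract_languages(SAMPLE):
    print(code, LANGUAGE_NAMES.get(code, "unknown"))
```

A hand-rolled parser is used here only to keep the sketch dependency-free; a real YAML parser would be the more robust choice for front matter with nesting or inline lists.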