Medical Generation Model
Overview
This repository contains a T5 model fine-tuned to generate medical diagnoses and treatment recommendations. It was trained on clinical scenarios and takes a free-text case description as its input prompt, with the aim of producing contextually relevant output.
Model Details
- Model Type: T5
- Tokenizer: T5 tokenizer
- Training Data: Clinical scenarios and medical texts
Installation
To use this model, install the required libraries with pip:
pip install transformers
pip install tensorflow
pip install sentencepiece
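Before loading the model, it can help to confirm that the dependencies are importable. The following is a minimal sketch, not part of the original instructions: the helper name `missing_packages` and the `REQUIRED` list are illustrative assumptions. `sentencepiece` is included because the slow `T5Tokenizer` class depends on it.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the names in `names` that cannot be found in this environment."""
    # find_spec returns None when a top-level package is not installed.
    return [name for name in names if find_spec(name) is None]


# Hypothetical list of packages the usage example relies on.
REQUIRED = ["transformers", "tensorflow", "sentencepiece"]

missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

If anything is reported as missing, install it with pip and run the check again.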
Usage
# Load the fine-tuned model and tokenizer
from transformers import T5Tokenizer, TFT5ForConditionalGeneration
model_id = "Ra-Is/medical-gen-small"  # Hub ID of this repository
model = TFT5ForConditionalGeneration.from_pretrained(model_id)
tokenizer = T5Tokenizer.from_pretrained(model_id)
# Prepare a sample input prompt
input_prompt = (
    "A 35-year-old female presents with a 2-week history of "
    "persistent cough, shortness of breath, and fatigue. She has "
    "a history of asthma and has recently been exposed to a sick "
    "family member with a respiratory infection. Chest X-ray shows "
    "bilateral infiltrates. What is the likely diagnosis, and what "
    "should be the treatment?"
)
# Tokenize the input
input_ids = tokenizer(input_prompt, return_tensors="tf").input_ids
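# Generate the output (diagnosis)
outputs = model.generate(
    input_ids,
    max_length=512,       # Upper bound on the generated sequence length
    num_beams=5,          # Keep 5 candidate sequences at each step
    temperature=1,
    top_k=50,
    top_p=0.9,
    do_sample=True,       # Enable sampling
    early_stopping=True,  # Stop once enough beams have finished
)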
# Decode and print the output
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_text)
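The tokenize, generate, and decode steps above can be wrapped in one reusable function. This is a minimal sketch rather than part of the original model card: the name `generate_answer` is hypothetical, and it assumes the tokenizer output can be indexed by `"input_ids"` and `"attention_mask"`, as the `transformers` tokenizer output can.

```python
def generate_answer(model, tokenizer, prompt, **generate_kwargs):
    """Tokenize `prompt`, run `model.generate`, and return the decoded text.

    `model` and `tokenizer` are expected to behave like the
    TFT5ForConditionalGeneration and T5Tokenizer objects loaded above.
    Extra keyword arguments are forwarded unchanged to `model.generate`.
    """
    # Tokenize to TensorFlow tensors, matching the TF model class used above.
    inputs = tokenizer(prompt, return_tensors="tf")
    # Pass the attention mask explicitly so padded positions are ignored.
    outputs = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        **generate_kwargs,
    )
    # Decode the first returned sequence and drop tokens such as </s>.
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

With the model and tokenizer loaded as shown earlier, `generate_answer(model, tokenizer, input_prompt, max_length=512, num_beams=5, do_sample=False)` runs plain beam search, which returns the same text on every call. Passing `do_sample=True` with `top_k` and `top_p` instead gives the sampled behaviour of the example above.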