Instructions to use SURESHBEEKHANI/Deep-seek-R1-Medical-reasoning-SFT with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- llama-cpp-python
How to use SURESHBEEKHANI/Deep-seek-R1-Medical-reasoning-SFT with llama-cpp-python:
# !pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="SURESHBEEKHANI/Deep-seek-R1-Medical-reasoning-SFT",
    filename="unsloth.Q4_K_M.gguf",
)
llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ]
)
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- llama.cpp
How to use SURESHBEEKHANI/Deep-seek-R1-Medical-reasoning-SFT with llama.cpp:
Install from brew
brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf SURESHBEEKHANI/Deep-seek-R1-Medical-reasoning-SFT:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf SURESHBEEKHANI/Deep-seek-R1-Medical-reasoning-SFT:Q4_K_M
Install from WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf SURESHBEEKHANI/Deep-seek-R1-Medical-reasoning-SFT:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf SURESHBEEKHANI/Deep-seek-R1-Medical-reasoning-SFT:Q4_K_M
Use pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf SURESHBEEKHANI/Deep-seek-R1-Medical-reasoning-SFT:Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf SURESHBEEKHANI/Deep-seek-R1-Medical-reasoning-SFT:Q4_K_M
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf SURESHBEEKHANI/Deep-seek-R1-Medical-reasoning-SFT:Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf SURESHBEEKHANI/Deep-seek-R1-Medical-reasoning-SFT:Q4_K_M
Use Docker
docker model run hf.co/SURESHBEEKHANI/Deep-seek-R1-Medical-reasoning-SFT:Q4_K_M
- LM Studio
- Jan
- vLLM
How to use SURESHBEEKHANI/Deep-seek-R1-Medical-reasoning-SFT with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "SURESHBEEKHANI/Deep-seek-R1-Medical-reasoning-SFT"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "SURESHBEEKHANI/Deep-seek-R1-Medical-reasoning-SFT",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
Use Docker
docker model run hf.co/SURESHBEEKHANI/Deep-seek-R1-Medical-reasoning-SFT:Q4_K_M
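The curl call to the vLLM server can also be made from Python. The following is a minimal sketch using only the standard library; it assumes the server was started with `vllm serve` on the default `localhost:8000` as in the commands above. The helper names `build_chat_request` and `ask` are illustrative and not part of vLLM.

```python
import json
import urllib.request

# Endpoint and model name are copied from the curl example; adjust if your
# server runs on a different host or port.
VLLM_URL = "http://localhost:8000/v1/chat/completions"
MODEL_ID = "SURESHBEEKHANI/Deep-seek-R1-Medical-reasoning-SFT"


def build_chat_request(question: str, url: str = VLLM_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request for the chat completions endpoint."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": question}],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str) -> str:
    """Send the request and return the text of the first choice."""
    with urllib.request.urlopen(build_chat_request(question)) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Example (needs a running server):
#   print(ask("What is the capital of France?"))
```

Separating request construction from sending makes the payload easy to inspect without a running server.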
- Ollama
How to use SURESHBEEKHANI/Deep-seek-R1-Medical-reasoning-SFT with Ollama:
ollama run hf.co/SURESHBEEKHANI/Deep-seek-R1-Medical-reasoning-SFT:Q4_K_M
- Unsloth Studio
How to use SURESHBEEKHANI/Deep-seek-R1-Medical-reasoning-SFT with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for SURESHBEEKHANI/Deep-seek-R1-Medical-reasoning-SFT to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for SURESHBEEKHANI/Deep-seek-R1-Medical-reasoning-SFT to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for SURESHBEEKHANI/Deep-seek-R1-Medical-reasoning-SFT to start chatting
- Docker Model Runner
How to use SURESHBEEKHANI/Deep-seek-R1-Medical-reasoning-SFT with Docker Model Runner:
docker model run hf.co/SURESHBEEKHANI/Deep-seek-R1-Medical-reasoning-SFT:Q4_K_M
- Lemonade
How to use SURESHBEEKHANI/Deep-seek-R1-Medical-reasoning-SFT with Lemonade:
Pull the model
# Download Lemonade from https://lemonade-server.ai/
lemonade pull SURESHBEEKHANI/Deep-seek-R1-Medical-reasoning-SFT:Q4_K_M
Run and chat with the model
lemonade run user.Deep-seek-R1-Medical-reasoning-SFT-Q4_K_M
List all available models
lemonade list
DeepSeek-R1-Distill-Llama-8B - Fine-Tuned for Medical Chain-of-Thought Reasoning
Model Overview
The DeepSeek-R1-Distill-Llama-8B model has been fine-tuned for medical chain-of-thought (CoT) reasoning. The fine-tuning aims to improve the model's ability to generate structured, concise, and accurate medical reasoning outputs. The model was trained on a 500-sample subset of the medical-o1-reasoning-SFT dataset, using 4-bit quantization and LoRA adapters to reduce memory usage and training cost.
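To give a rough sense of why 4-bit quantization reduces memory usage, the sketch below estimates the memory needed for the weights alone. It assumes a round figure of 8 billion parameters and ignores quantization metadata (scales, zero points), activations, and optimizer state, so real usage will be somewhat higher.

```python
def weight_memory_gib(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory needed to hold the weights alone, in GiB."""
    return num_params * bits_per_weight / 8 / 2**30


NUM_PARAMS = 8e9  # assumed round figure for an "8B" model

fp16 = weight_memory_gib(NUM_PARAMS, 16)
int4 = weight_memory_gib(NUM_PARAMS, 4)

print(f"16-bit weights: ~{fp16:.1f} GiB")  # ~14.9 GiB
print(f"4-bit weights:  ~{int4:.1f} GiB")  # ~3.7 GiB
```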
Key Features
- Base Model: unsloth/DeepSeek-R1-Distill-Llama-8B
- Fine-Tuning Objective: Adaptation for structured, step-by-step medical reasoning tasks.
- Training Dataset: 500 samples from medical-o1-reasoning-SFT dataset.
- Tools Used:
  - Unsloth: Accelerates training by 2x.
  - 4-bit Quantization: Reduces model memory usage.
  - LoRA Adapters: Enable parameter-efficient fine-tuning.
- Training Time: 44 minutes.
Performance Improvements
- Response Length: Reduced from an average of 450 words to 150 words, improving conciseness.
- Reasoning Style: Shifted from verbose explanations to more focused, structured reasoning.
- Answer Format: Transitioned from bulleted lists to paragraph-style answers for clarity.
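The response-length figures above can be expressed as a relative reduction. The snippet below is a small sketch of that arithmetic; `word_count` uses whitespace splitting, which is an assumption about how length was measured and is not stated in the source.

```python
def word_count(text: str) -> int:
    """Count whitespace-separated words in a response."""
    return len(text.split())


def relative_reduction(before: float, after: float) -> float:
    """Fraction by which a quantity shrank, relative to its starting value."""
    return (before - after) / before


# Average word counts reported above: 450 before fine-tuning, 150 after.
reduction = relative_reduction(450, 150)
print(f"{reduction:.1%}")  # 66.7%
```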
Intended Use
This model is designed for use by:
- Medical professionals requiring structured diagnostic reasoning.
- Researchers seeking assistance in medical knowledge extraction.
- Developers integrating the model for medical CoT tasks in clinical settings, treatment planning, and education.
Typical use cases include:
- Clinical diagnostics
- Treatment planning
- Medical education and training
- Research assistance
Training Details
Key Components:
- Model: unsloth/DeepSeek-R1-Distill-Llama-8B
- Dataset: medical-o1-reasoning-SFT (500 samples)
- Training Tools:
  - Unsloth: Optimized training for faster results (2x speedup).
  - 4-bit Quantization: Optimized memory usage for efficient training.
  - LoRA Adapters: Enable lightweight fine-tuning with reduced computational costs.
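To illustrate why LoRA adapters make fine-tuning lightweight, the sketch below counts trainable parameters for a single weight matrix. LoRA replaces the update to a `d_out x d_in` matrix with two low-rank factors of shapes `d_out x r` and `r x d_in`. The rank `r = 16` and the 4096 x 4096 layer size are illustrative assumptions; the source does not state the rank or the target modules used.

```python
def full_params(d_in: int, d_out: int) -> int:
    """Parameters in a dense weight matrix of shape (d_out, d_in)."""
    return d_in * d_out


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters in LoRA factors A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


D = 4096   # assumed layer width, for illustration only
RANK = 16  # hypothetical LoRA rank; not given in the source

dense = full_params(D, D)
adapter = lora_params(D, D, RANK)
print(f"dense: {dense:,}  lora: {adapter:,}  ratio: {adapter / dense:.2%}")
# dense: 16,777,216  lora: 131,072  ratio: 0.78%
```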
Fine-Tuning Process:
1. Install Required Packages: Installed necessary libraries, including unsloth and kaggle.
2. Authentication: Authenticated with Hugging Face Hub and Weights & Biases for tracking experiments and versioning.
3. Model Initialization: Initialized the base model with 4-bit quantization and a sequence length of up to 2048 tokens.
4. Pre-Fine-Tuning Inference: Conducted an initial inference to establish the model’s baseline performance on a medical question.
5. Dataset Preparation: Structured and formatted the training data using a custom template tailored to medical CoT reasoning tasks.
6. Application of LoRA Adapters: Incorporated LoRA adapters for efficient parameter tuning during fine-tuning.
7. Supervised Fine-Tuning: Utilized SFTTrainer to fine-tune the model with optimized hyperparameters for 44 minutes.
8. Post-Fine-Tuning Inference: Evaluated the model’s improved performance by testing it on the same medical question after fine-tuning.
9. Saving and Loading: Stored the fine-tuned model, including LoRA adapters, for easy future use and deployment.
10. Model Deployment: Pushed the fine-tuned model to Hugging Face Hub in GGUF format with 4-bit quantization enabled for efficient use.
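The dataset preparation step can be sketched as follows. This is an illustration only: the actual template is in the notebook and is not reproduced here. The template text, the `<think>` tags, the column names `Question`, `Complex_CoT`, and `Response`, and the `EOS_TOKEN` placeholder are all assumptions, and the sample row is made up for demonstration.

```python
# Hypothetical prompt template; the author's real template may differ.
PROMPT_TEMPLATE = """Below is a medical question. Reason through it step by step, then give a final answer.

### Question:
{question}

### Response:
<think>
{reasoning}
</think>
{answer}"""

# Placeholder end-of-sequence marker; in practice use tokenizer.eos_token.
EOS_TOKEN = "</s>"


def format_example(row: dict) -> str:
    """Turn one dataset row into a single training string."""
    return PROMPT_TEMPLATE.format(
        question=row["Question"].strip(),
        reasoning=row["Complex_CoT"].strip(),
        answer=row["Response"].strip(),
    ) + EOS_TOKEN


# Made-up row, shaped like the assumed dataset columns.
sample = {
    "Question": "Which vitamin deficiency causes scurvy?",
    "Complex_CoT": "Scurvy results from impaired collagen synthesis, which depends on vitamin C.",
    "Response": "Vitamin C (ascorbic acid) deficiency.",
}

text = format_example(sample)
print(text)
```

Appending the end-of-sequence token to each example is a common choice so that the model learns where an answer stops.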
Notebook
Access the implementation notebook for this model here. This notebook provides detailed steps for fine-tuning and deploying the model.
Model tree for SURESHBEEKHANI/Deep-seek-R1-Medical-reasoning-SFT
Base model
deepseek-ai/DeepSeek-R1-Distill-Llama-8B