FM-1976/gemma-2b-docjoybot-lora-F16-GGUF

This LoRA adapter was converted to GGUF format from Joy10/gemma-2b-docjoybot-lora via ggml.ai's GGUF-my-lora space. Refer to the original adapter repository for more details.

Model Description

🩺 A Medical Reasoning Chatbot Based on Gemma-2B + LoRA

This is a fine-tuned version of google/gemma-2-2b-it enhanced with LoRA adapters. It specializes in medical question answering and clinical reasoning, using structured, step-by-step thought processes.
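If you prefer the original (non-GGUF) adapter, a minimal loading sketch with transformers + PEFT might look like the following (repository ids are taken from this card; the question and generation settings are illustrative):

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("google/gemma-2-2b-it", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b-it")

# Attach the LoRA adapter on top of the base model
model = PeftModel.from_pretrained(base, "Joy10/gemma-2b-docjoybot-lora")

prompt = "### Question:\nWhat causes iron-deficiency anemia?\n\n### Response:\n<think>\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(out[0], skip_special_tokens=True))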

📌 Key Features

  • 🧠 Chain-of-Thought (CoT) Reasoning for complex medical queries
  • 🧪 Fine-tuned on 25,000 samples from FreedomIntelligence/medical-o1-reasoning-SFT
  • 🧬 LoRA-based parameter-efficient tuning using Hugging Face PEFT + TRL
  • 💡 Prompt template includes structured <think> tags to enhance reasoning clarity
  • ⚑ Lightweight adapter (~10MB) for efficient deployment with the base model

πŸ” Intended Use

This model is intended for educational, research, and prototyping purposes in the healthcare and AI domains. It performs best on medical diagnostic and reasoning tasks where step-by-step logical thinking is required.

⚠️ Disclaimer: This model is not intended for real-world clinical use without expert validation. It is a research-grade assistant only.

πŸ—οΈ How It Was Trained

  • Base Model: google/gemma-2-2b-it
  • LoRA Config: r=8, alpha=16, dropout=0.05
  • Frameworks: transformers, PEFT, TRL (SFTTrainer)
  • Quantization: 4-bit nf4 via bitsandbytes for memory-efficient fine-tuning and inference
  • Hardware: Trained on Kaggle GPU (T4), optimized for low-resource fine-tuning
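The exact training script is not published here; the following is a minimal QLoRA-style sketch matching the configuration above (the dataset's "en" config, its field names, the 25k head slice, and the trainer arguments are assumptions, and TRL argument names vary slightly across versions):

import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

BASE = "google/gemma-2-2b-it"

# 4-bit nf4 quantization via bitsandbytes, as listed above
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(BASE, quantization_config=bnb, device_map="auto")

# LoRA hyperparameters from the card: r=8, alpha=16, dropout=0.05
peft_config = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05, task_type="CAUSAL_LM")

# Assumed: the "en" config and a simple head slice for the 25,000 samples
data = load_dataset("FreedomIntelligence/medical-o1-reasoning-SFT", "en", split="train[:25000]")

def to_text(ex):
    # Render each sample into the card's prompt format (see "Prompt Format" below)
    return {
        "text": (
            "You are a helpful and knowledgeable AI medical assistant.\n\n"
            f"### Question:\n{ex['Question']}\n\n"
            f"### Response:\n<think>\n{ex['Complex_CoT']}\n</think>\n{ex['Response']}"
        )
    }

data = data.map(to_text)

trainer = SFTTrainer(
    model=model,
    train_dataset=data,
    peft_config=peft_config,
    args=SFTConfig(output_dir="docjoybot-lora", dataset_text_field="text", max_seq_length=1024),
)
trainer.train()
trainer.save_model("docjoybot-lora")  # writes only the LoRA adapter weights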

💬 Prompt Format

You are a helpful and knowledgeable AI medical assistant.

### Question:
{medical_question_here}

### Response:
<think>
{step-by-step_reasoning}
</think>
{final_answer}
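
For programmatic use, a small helper along these lines (function names are illustrative, not part of the model) builds the prompt and separates the reasoning from the final answer:

import re

PROMPT_TEMPLATE = (
    "You are a helpful and knowledgeable AI medical assistant.\n\n"
    "### Question:\n{question}\n\n"
    "### Response:\n<think>\n"
)

def build_prompt(question: str) -> str:
    # The model is expected to continue generating inside the <think> block
    return PROMPT_TEMPLATE.format(question=question)

def split_output(completion: str) -> tuple[str, str]:
    # Split a completion into (reasoning, final_answer) on the closing </think> tag
    match = re.search(r"(.*?)</think>\s*(.*)", completion, flags=re.DOTALL)
    if match:
        return match.group(1).strip(), match.group(2).strip()
    return "", completion.strip()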

Use with llama.cpp

# with cli
llama-cli -m base_model.gguf --lora gemma-2b-docjoybot-lora-f16.gguf (...other args)

# with server
llama-server -m base_model.gguf --lora gemma-2b-docjoybot-lora-f16.gguf (...other args)

To learn more about LoRA usage with llama.cpp server, refer to the llama.cpp server documentation.
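
As a concrete illustration (file paths and the question are examples; llama-server listens on port 8080 by default), the prompt format above can be sent to the server's /completion endpoint:

# start the server with the base model plus this adapter
llama-server -m gemma-2-2b-it.gguf --lora gemma-2b-docjoybot-lora-f16.gguf

# query it using the card's prompt format
curl http://localhost:8080/completion -d '{
  "prompt": "You are a helpful and knowledgeable AI medical assistant.\n\n### Question:\nWhat are common causes of chest pain?\n\n### Response:\n<think>\n",
  "n_predict": 512
}'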
