FM-1976/gemma-2b-docjoybot-lora-F16-GGUF

This LoRA adapter was converted to GGUF format from Joy10/gemma-2b-docjoybot-lora via ggml.ai's GGUF-my-lora space. Refer to the original adapter repository for more details.

Model Description

🩺 A Medical Reasoning Chatbot Based on Gemma-2B + LoRA

This is a fine-tuned version of google/gemma-2-2b-it enhanced with LoRA adapters. It specializes in medical question answering and clinical reasoning, using structured, step-by-step thought processes.
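If you prefer the original (non-GGUF) adapter, a minimal loading sketch with transformers + PEFT might look like the following (repository ids are taken from this card; the question and generation settings are illustrative):

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("google/gemma-2-2b-it", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b-it")

# Attach the LoRA adapter on top of the base model
model = PeftModel.from_pretrained(base, "Joy10/gemma-2b-docjoybot-lora")

prompt = "### Question:\nWhat causes iron-deficiency anemia?\n\n### Response:\n<think>\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(out[0], skip_special_tokens=True))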

📌 Key Features

  • 🧠 Chain-of-Thought (CoT) Reasoning for complex medical queries
  • 🧪 Fine-tuned on 25,000 samples from FreedomIntelligence/medical-o1-reasoning-SFT
  • 🧬 LoRA-based parameter-efficient tuning using Hugging Face PEFT + TRL
  • 💡 Prompt template includes structured <think> tags to enhance reasoning clarity
  • ⚑ Lightweight adapter (~10MB) for efficient deployment with the base model

πŸ” Intended Use

This model is intended for educational, research, and prototyping purposes in the healthcare and AI domains. It performs best on medical diagnostic and reasoning tasks where step-by-step logical thinking is required.

⚠️ Disclaimer: This model is not intended for real-world clinical use without expert validation. It is a research-grade assistant only.

πŸ—οΈ How It Was Trained

  • Base Model: google/gemma-2-2b-it
  • LoRA Config: r=8, alpha=16, dropout=0.05
  • Frameworks: transformers, PEFT, TRL (SFTTrainer)
  • Quantization: 4-bit nf4 via bitsandbytes for memory-efficient fine-tuning and inference
  • Hardware: Trained on Kaggle GPU (T4), optimized for low-resource fine-tuning
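The exact training script is not published here; the following is a minimal QLoRA-style sketch matching the configuration above (the dataset's "en" config, its field names, the 25k head slice, and the trainer arguments are assumptions, and TRL argument names vary slightly across versions):

import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

BASE = "google/gemma-2-2b-it"

# 4-bit nf4 quantization via bitsandbytes, as listed above
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(BASE, quantization_config=bnb, device_map="auto")

# LoRA hyperparameters from the card: r=8, alpha=16, dropout=0.05
peft_config = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05, task_type="CAUSAL_LM")

# Assumed: the "en" config and a simple head slice for the 25,000 samples
data = load_dataset("FreedomIntelligence/medical-o1-reasoning-SFT", "en", split="train[:25000]")

def to_text(ex):
    # Render each sample into the card's prompt format (see "Prompt Format" below)
    return {
        "text": (
            "You are a helpful and knowledgeable AI medical assistant.\n\n"
            f"### Question:\n{ex['Question']}\n\n"
            f"### Response:\n<think>\n{ex['Complex_CoT']}\n</think>\n{ex['Response']}"
        )
    }

data = data.map(to_text)

trainer = SFTTrainer(
    model=model,
    train_dataset=data,
    peft_config=peft_config,
    args=SFTConfig(output_dir="docjoybot-lora", dataset_text_field="text", max_seq_length=1024),
)
trainer.train()
trainer.save_model("docjoybot-lora")  # writes only the LoRA adapter weights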

💬 Prompt Format

You are a helpful and knowledgeable AI medical assistant.

### Question:
{medical_question_here}

### Response:
<think>
{step-by-step_reasoning}
</think>
{final_answer}
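
For programmatic use, a small helper along these lines (function names are illustrative, not part of the model) builds the prompt and separates the reasoning from the final answer:

import re

PROMPT_TEMPLATE = (
    "You are a helpful and knowledgeable AI medical assistant.\n\n"
    "### Question:\n{question}\n\n"
    "### Response:\n<think>\n"
)

def build_prompt(question: str) -> str:
    # The model is expected to continue generating inside the <think> block
    return PROMPT_TEMPLATE.format(question=question)

def split_output(completion: str) -> tuple[str, str]:
    # Split a completion into (reasoning, final_answer) on the closing </think> tag
    match = re.search(r"(.*?)</think>\s*(.*)", completion, flags=re.DOTALL)
    if match:
        return match.group(1).strip(), match.group(2).strip()
    return "", completion.strip()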

Use with llama.cpp

# with cli
llama-cli -m base_model.gguf --lora gemma-2b-docjoybot-lora-f16.gguf (...other args)

# with server
llama-server -m base_model.gguf --lora gemma-2b-docjoybot-lora-f16.gguf (...other args)

To learn more about LoRA usage with llama.cpp server, refer to the llama.cpp server documentation.
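
As a concrete illustration (file paths and the question are examples; llama-server listens on port 8080 by default), the prompt format above can be sent to the server's /completion endpoint:

# start the server with the base model plus this adapter
llama-server -m gemma-2-2b-it.gguf --lora gemma-2b-docjoybot-lora-f16.gguf

# query it using the card's prompt format
curl http://localhost:8080/completion -d '{
  "prompt": "You are a helpful and knowledgeable AI medical assistant.\n\n### Question:\nWhat are common causes of chest pain?\n\n### Response:\n<think>\n",
  "n_predict": 512
}'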
