license: apache-2.0
base_model: openai/gpt-oss-20b
tags:
  - medical
  - healthcare
  - clinical
  - text-generation
  - conversational
  - gpt-oss
  - lora
  - sft
language:
  - en
  - es
  - fr
  - de
  - zh
  - ja
pipeline_tag: text-generation
library_name: transformers

Jivi-MedCounsel: Advanced Medical Language Model



Model Overview

Jivi-MedCounsel is a medical language model built on the GPT-OSS-20B architecture and fine-tuned by Jivi AI for healthcare applications. It has been optimized specifically for OpenAI's HealthBench evaluation, where it achieves an overall score of 0.63, surpassing the base model by over 48%.

Jivi-MedCounsel is designed to serve as an intelligent medical assistant that provides safe, accurate, and context-aware health guidance — clarifying symptoms, identifying red flags, and offering evidence-based next steps with empathy, without replacing professional medical care.


🎯 Purpose-Built for Healthcare

Jivi-MedCounsel excels at:

  • Clinical Reasoning: Analyzing patient symptoms and medical histories with accuracy
  • Safety-First Approach: Identifying red flags and directing users to emergency care when needed
  • Evidence-Based Guidance: Providing recommendations grounded in medical consensus and guidelines
  • Empathetic Communication: Delivering health information with clarity and compassion
  • Context-Aware Responses: Adapting advice based on patient demographics, comorbidities, and resource availability

📊 HealthBench Performance

Understanding HealthBench Framework

HealthBench is OpenAI's comprehensive healthcare AI evaluation framework that assesses models across seven critical themes and five evaluation axes. Each theme represents real-world medical scenarios that AI systems encounter in healthcare settings.

The Seven HealthBench Themes:

  1. Response Under Uncertainty - How well the model expresses caution and manages ambiguity when medical evidence is limited
  2. Context Seeking - The model's ability to identify missing information and request essential details for accurate responses
  3. Health Data Tasks - Accuracy and safety in handling structured health data, medical documentation, and clinical decision support
  4. Global Health - Adaptability to diverse healthcare contexts, regional variations, and resource-constrained settings
  5. Emergency Referrals - Recognition of urgent medical situations and appropriate guidance toward immediate care
  6. Expertise-Tailored Communication - Adjusting communication style and terminology based on the user's medical knowledge level
  7. Response Depth - Providing appropriate levels of detail to enable informed health decisions

The Five Evaluation Axes:

  • Accuracy: Factually correct and evidence-based information
  • Completeness: Addressing all relevant aspects including necessary follow-up actions
  • Communication Quality: Clear, structured, and appropriately tailored responses
  • Instruction Following: Adherence to specific user requirements and formatting
  • Context Awareness: Considering user role, resources, and seeking clarification only when necessary
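
HealthBench is rubric-based: each conversation comes with criteria that carry positive or negative point values, and a grader decides which criteria a response meets. The sketch below illustrates that arithmetic as it is generally described for HealthBench: points earned divided by the maximum achievable positive points, with the mean across examples clipped to the range 0 to 1. The `Criterion` class, the rubric items, and the point values are hypothetical; the official evaluation code remains the reference.

```python
from dataclasses import dataclass


@dataclass
class Criterion:
    description: str
    points: int   # positive = desirable behavior, negative = penalized behavior
    met: bool     # grader's judgment for one response


def example_score(criteria):
    """Points earned divided by the maximum achievable (positive) points."""
    max_points = sum(c.points for c in criteria if c.points > 0)
    if max_points == 0:
        raise ValueError("rubric needs at least one positive criterion")
    earned = sum(c.points for c in criteria if c.met)
    return earned / max_points


def overall_score(per_example_scores):
    """Mean of the per-example scores, clipped to the range [0, 1]."""
    mean = sum(per_example_scores) / len(per_example_scores)
    return min(1.0, max(0.0, mean))


# Hypothetical rubric for a single response
rubric = [
    Criterion("Advises emergency care for chest pain radiating to the arm", 10, True),
    Criterion("Asks about symptom duration", 5, False),
    Criterion("Recommends a specific prescription dose", -8, False),
]
score = example_score(rubric)  # 10 earned out of 15 achievable
print(f"example score: {score:.3f}")
```

Because negative criteria subtract points, a single response can score below zero; the clipping is applied only to the aggregate.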

Jivi-MedCounsel's Superior Performance

Overall Score: 0.630 - Achieving the highest score among leading AI models

Jivi-MedCounsel scores higher than OpenAI o3 (0.598), Grok 3 (0.543), Gemini 2.5 Pro (0.520), and GPT-4.1 (0.479). The highlights below summarize its behavior on each of the seven HealthBench themes.

🎯 Key Performance Highlights

1. Response Under Uncertainty (Exceptional Performance)

  • Jivi-MedCounsel excels at expressing appropriate caution when medical evidence is ambiguous or limited
  • The model demonstrates superior judgment in qualifying statements, acknowledging knowledge boundaries, and recommending professional consultation when needed
  • This is critical for patient safety, as overconfident responses in uncertain scenarios can lead to harmful outcomes

2. Context Seeking (Industry-Leading)

  • Outstanding ability to identify when critical patient information is missing (medical history, symptom duration, severity indicators, etc.)
  • Proactively requests relevant details before providing guidance, ensuring responses are tailored to specific patient contexts
  • Demonstrates sophisticated understanding of which contextual factors matter most for different medical queries

3. Emergency Referrals (Consistently Strong)

  • Highly reliable at recognizing medical red flags and urgent warning signs
  • Appropriately escalates serious conditions requiring immediate medical attention
  • Balances reassurance with necessary urgency, avoiding both under- and over-triage

4. Health Data Tasks (Above Benchmark)

  • Demonstrates high accuracy in interpreting medical data, lab results, and clinical metrics
  • Maintains safety standards when discussing medical documentation and clinical decision support
  • Handles structured health information with precision and clinical relevance

5. Global Health (Strong Adaptability)

  • Shows awareness of healthcare resource variations across different regions
  • Adapts recommendations based on clinical practice variations and regional disease patterns
  • Considers socioeconomic factors and healthcare accessibility in guidance

6. Expertise-Tailored Communication (Exceptional)

  • Effectively adjusts medical terminology and explanation depth based on the user's background
  • Communicates complex medical concepts in accessible language for patients while maintaining clinical precision for healthcare professionals
  • Demonstrates empathy and clarity without oversimplifying critical health information

7. Response Depth (Well-Calibrated)

  • Provides comprehensive yet concise responses with appropriate detail levels
  • Balances thoroughness with accessibility, avoiding information overload
  • Includes actionable next steps and evidence-based recommendations

Why Jivi-MedCounsel Leads the Benchmark

The 48% improvement over the base GPT-OSS-20B model and the higher scores relative to much larger models are attributed to:

  1. Specialized Medical Fine-Tuning: 20,000 curated doctor-patient conversations covering diverse clinical scenarios
  2. Safety-First Training: Emphasis on clinical reasoning, red flag identification, and appropriate escalation
  3. Context-Aware Optimization: Training on cases requiring careful information gathering and uncertainty management
  4. Evidence-Based Methodology: Grounding in medical consensus, clinical guidelines, and real-world healthcare workflows
  5. Balanced Communication: Training on both patient-facing and professional medical communication styles

Jivi-MedCounsel's consistent strength across all seven HealthBench themes demonstrates a well-rounded, production-ready medical AI assistant capable of handling the complex, nuanced challenges of real-world healthcare interactions.
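
The scores quoted in this section can be put side by side with a few lines of arithmetic. The sketch below computes absolute and relative margins from the figures above. It also assumes the "48%" figure is a relative gain, in which case it implies an upper bound on the base model's score, which is not stated in this card. The function names are illustrative.

```python
# Scores quoted in this section
JIVI_SCORE = 0.630
COMPETITORS = {
    "OpenAI o3": 0.598,
    "Grok 3": 0.543,
    "Gemini 2.5 Pro": 0.520,
    "GPT-4.1": 0.479,
}


def margins(reference, others):
    """Return {name: (absolute margin, relative margin in percent)}."""
    return {
        name: (round(reference - s, 3), round(100 * (reference - s) / s, 1))
        for name, s in others.items()
    }


def implied_base_score(score, relative_gain):
    """Base score b such that score = b * (1 + relative_gain)."""
    return score / (1 + relative_gain)


for name, (abs_gap, rel_gap) in margins(JIVI_SCORE, COMPETITORS).items():
    print(f"{name}: +{abs_gap:.3f} absolute, +{rel_gap:.1f}% relative")

# Assumption: "48%" is a relative gain over the base model's score.
print(f"implied base score: {implied_base_score(JIVI_SCORE, 0.48):.3f}")
```

Under that assumption the implied base score is about 0.426, and since the gain is described as "over 48%", the actual base score would be at or slightly below that value.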


🔧 Training Process

Base Architecture

Built on GPT-OSS-20B, an open-weight language model of roughly 20 billion parameters released by OpenAI under the Apache 2.0 license and designed for efficient fine-tuning and deployment.

Fine-Tuning Methodology

Jivi-MedCounsel has been refined using Supervised Fine-Tuning (SFT) with LoRA (Low-Rank Adaptation) for efficient parameter updates while preserving the base model's capabilities. This approach enables targeted improvements in medical reasoning and clinical communication without requiring full model retraining.
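
To make the efficiency argument concrete: LoRA freezes a weight matrix of shape d_out × d_in and learns a low-rank update B·A, so only r × (d_in + d_out) parameters are trained per adapted matrix. The sketch below compares the two counts. The dimensions and rank are illustrative assumptions, not the configuration used to train Jivi-MedCounsel.

```python
def full_params(d_in, d_out):
    """Parameters in a dense weight matrix of shape d_out x d_in."""
    return d_in * d_out


def lora_params(d_in, d_out, rank):
    """Trainable parameters in a LoRA adapter: A is rank x d_in, B is d_out x rank."""
    return rank * (d_in + d_out)


def trainable_fraction(d_in, d_out, rank):
    """Share of the matrix's parameter count that LoRA actually trains."""
    return lora_params(d_in, d_out, rank) / full_params(d_in, d_out)


# Hypothetical square projection and rank, for illustration only
d_in = d_out = 4096
rank = 16

print(f"full matrix:  {full_params(d_in, d_out):,} parameters")
print(f"LoRA adapter: {lora_params(d_in, d_out, rank):,} parameters")
print(f"fraction:     {trainable_fraction(d_in, d_out, rank):.4%}")
```

For these example values the adapter trains under 1% of the parameters of the matrix it modifies, which is why the base weights can stay frozen.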

Optimization & Efficiency

  • Quantization: MXFP4 quantization using NVIDIA TensorRT Model Optimizer for efficient inference and deployment
  • Distributed Training: Leverages advanced optimization techniques for scalable training across multiple GPUs
  • Memory Optimization: Employs gradient checkpointing and mixed-precision training for optimal resource utilization
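
As background on the quantization format: MXFP4 stores weights as 4-bit E2M1 values in blocks of 32 that share one 8-bit power-of-two scale, which works out to 4.25 bits per weight. The toy quantizer below illustrates the idea in plain Python. It is a simplified sketch, not the NVIDIA TensorRT Model Optimizer implementation, and its handling of rounding ties is deliberately naive.

```python
import math

# Magnitudes representable by a 4-bit E2M1 element (the sign is a separate bit)
E2M1_VALUES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
BLOCK_SIZE = 32  # elements that share one scale
E2M1_EMAX = 2    # exponent of the largest E2M1 power of two (4.0 == 2**2)


def quantize_block(block):
    """Quantize one block to (power-of-two scale, list of signed E2M1 values)."""
    max_abs = max(abs(x) for x in block)
    if max_abs == 0.0:
        return 1.0, [0.0] * len(block)
    scale = 2.0 ** (math.floor(math.log2(max_abs)) - E2M1_EMAX)
    quantized = []
    for x in block:
        target = min(abs(x) / scale, E2M1_VALUES[-1])  # clamp to the largest magnitude
        nearest = min(E2M1_VALUES, key=lambda v: abs(v - target))
        quantized.append(math.copysign(nearest, x))
    return scale, quantized


def dequantize_block(scale, quantized):
    """Recover approximate weights from the shared scale and element values."""
    return [scale * q for q in quantized]


def bits_per_weight(block_size=BLOCK_SIZE):
    """4 bits per element plus one 8-bit scale amortized over the block."""
    return 4 + 8 / block_size


scale, q = quantize_block([1.6, -0.25, 0.0, 0.7])
print(scale, q, dequantize_block(scale, q))
print(bits_per_weight())
```

The round trip is lossy: values are snapped to the nearest representable magnitude, and anything above 6 × scale is clamped.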

📚 Data Preparation

Jivi-MedCounsel has been trained on a carefully curated dataset of 20,000 doctor-patient conversations:

  • Real-World Data: 15,000 authentic clinical interactions covering diverse medical scenarios
  • Synthetic Data: 5,000 high-quality generated conversations to augment edge cases and rare conditions
  • Data Sources: Clinical consultations, symptom assessments, treatment discussions, and follow-up care
  • Quality Assurance: All data validated for medical accuracy and safety

The dataset encompasses:

  • Primary care consultations
  • Specialist referrals
  • Symptom clarification
  • Treatment explanations
  • Medication guidance
  • Emergency triage scenarios
  • Follow-up care instructions
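
The card does not publish the dataset schema, so the following is only a sketch of how a conversation record for supervised fine-tuning might be represented and sanity-checked. The `source` field, the record layout, and the validation rules are all hypothetical; only the 15,000 / 5,000 split comes from the description above.

```python
VALID_ROLES = {"system", "user", "assistant"}


def validate_conversation(messages):
    """Return a list of problems found in one chat-format conversation."""
    if not messages:
        return ["conversation is empty"]
    problems = []
    for i, msg in enumerate(messages):
        if msg.get("role") not in VALID_ROLES:
            problems.append(f"message {i}: unknown role {msg.get('role')!r}")
        if not str(msg.get("content", "")).strip():
            problems.append(f"message {i}: empty content")
    turns = [m["role"] for m in messages if m.get("role") in ("user", "assistant")]
    if turns and turns[0] != "user":
        problems.append("first turn should come from the user")
    if any(a == b for a, b in zip(turns, turns[1:])):
        problems.append("user and assistant turns should alternate")
    if turns and turns[-1] != "assistant":
        problems.append("conversation should end with an assistant reply")
    return problems


def mixture(counts):
    """Fraction of the dataset contributed by each source."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


# Hypothetical record layout
record = {
    "source": "synthetic",
    "messages": [
        {"role": "user", "content": "I've had a headache for three days."},
        {"role": "assistant", "content": "How severe is it, and have you noticed any vision changes?"},
    ],
}

print(validate_conversation(record["messages"]))
print(mixture({"real": 15_000, "synthetic": 5_000}))
```

Structural checks like these catch formatting errors only; they are not a substitute for the medical accuracy and safety review described above.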

💻 How to Use

Installation

pip install transformers torch accelerate

Basic Usage with Transformers Pipeline

import torch
from transformers import pipeline

# Initialize the text generation pipeline
model_id = "jiviai/medcounsel"

pipe = pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

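# Example medical query, sent as a chat message so the model's chat template is applied
messages = [
    {
        "role": "user",
        "content": (
            "Patient presents with persistent dry cough for 2 weeks, mild fever (100.5°F), "
            "and fatigue. No shortness of breath. What are the possible causes and next steps?"
        ),
    }
]

# Generate response
response = pipe(
    messages,
    max_new_tokens=8192,
    do_sample=True,
    temperature=0.9,
    top_p=1.0,
)

# For chat input, generated_text holds the whole conversation; the last entry is the reply
print(response[0]["generated_text"][-1]["content"])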

Advanced Usage with AutoModel

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "jiviai/medcounsel"

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Prepare messages
messages = [
    {"role": "system", "content": "You are an AI medical assistant that provides safe, accurate, and context-aware health guidance — clarifying symptoms, identifying red flags, and offering evidence-based next steps with empathy, without replacing professional medical care."},
    {"role": "user", "content": "What should I do if I have chest pain that radiates to my left arm?"}
]

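# Apply chat template; return_dict=True also returns the attention mask
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
    return_dict=True,
).to(model.device)

# Generate response
outputs = model.generate(
    **inputs,
    max_new_tokens=8192,
    do_sample=True,
    temperature=0.9,
    top_p=1.0,
    pad_token_id=tokenizer.eos_token_id,
)

# Decode only the newly generated tokens, not the echoed prompt
prompt_length = inputs["input_ids"].shape[-1]
response = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
print(response)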
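
Because the model may ask follow-up questions before giving guidance, a single-turn call is often not enough. The helper below is a minimal sketch of how earlier turns could be carried into the next request. `build_messages` is a hypothetical name, not part of transformers; it only assembles the standard chat-format message list.

```python
def build_messages(system_prompt, history, user_message):
    """Assemble a chat-format message list for the next model call.

    history is a list of (user_text, assistant_text) pairs from earlier turns.
    """
    messages = [{"role": "system", "content": system_prompt}]
    for user_text, assistant_text in history:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    messages.append({"role": "user", "content": user_message})
    return messages


history = [
    ("I have a cough.", "How long have you had it, and do you have a fever?"),
]
messages = build_messages(
    "You are an AI medical assistant.",
    history,
    "About two weeks, with a mild fever.",
)
for message in messages:
    print(message["role"], "->", message["content"])
```

The returned list has the same shape as the `messages` variable in the example above, so it can be passed to `tokenizer.apply_chat_template` unchanged.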

Requirements

transformers>=4.55.0
torch>=2.0.0
accelerate>=0.20.0
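
A quick way to confirm that an environment meets minimum versions like these is to compare them against the installed package metadata. The sketch below uses only the standard library. `check_requirements` and its helpers are hypothetical, and the version parsing is deliberately simple (it ignores pre-release and local suffixes), so treat it as a convenience check, not a replacement for pip's resolver.

```python
import re
from importlib import metadata


def parse_version(text):
    """Turn '2.1.0+cu121' into (2, 1, 0); non-numeric suffixes are ignored."""
    parts = []
    for piece in text.split(".")[:3]:
        match = re.match(r"\d+", piece)
        if not match:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets(installed, minimum):
    """True if the installed version is at least the minimum version."""
    return parse_version(installed) >= parse_version(minimum)


def check_requirements(lines):
    """Check 'name>=version' lines against the installed packages."""
    report = {}
    for line in lines:
        name, minimum = line.strip().split(">=")
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
            continue
        report[name] = "ok" if meets(installed, minimum) else f"{installed} < {minimum}"
    return report


print(check_requirements(["torch>=2.0.0", "accelerate>=0.20.0"]))
```

Any line in `name>=version` form can be added to the list passed to `check_requirements`.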

🌍 Supported Languages

Jivi-MedCounsel supports the following 15 languages:

  • Arabic
  • Bengali
  • Chinese
  • English
  • French
  • German
  • Hindi
  • Indonesian
  • Italian
  • Japanese
  • Korean
  • Portuguese
  • Spanish
  • Swahili
  • Yoruba

Note: Performance is optimized for English-language medical queries.


🎯 Intended Use Cases

Jivi-MedCounsel is designed for:

  • Clinical Decision Support: Assisting healthcare professionals with differential diagnoses and treatment options
  • Patient Education: Explaining medical conditions, procedures, and treatments in accessible language
  • Symptom Assessment: Helping users understand their symptoms and when to seek care
  • Medical Research: Supporting literature review and medical knowledge extraction
  • Health Chatbots: Powering conversational AI for healthcare applications
  • Triage Support: Identifying urgent cases requiring immediate medical attention
  • Medical Training: Educational tool for medical students and trainees


⚠️ Limitations & Disclaimer

Important Safety Notice

This model is NOT intended for:

  • ❌ Direct clinical diagnosis without physician oversight
  • ❌ Prescribing medications
  • ❌ Replacing professional medical advice, diagnosis, or treatment
  • ❌ Emergency medical situations (always call emergency services)
  • ❌ Definitive medical decision-making

Disclaimer

The data, code, and model checkpoints are intended solely for research and educational purposes. They should NOT be used in clinical care or for any clinical decision-making purposes without appropriate medical professional oversight.

Users must:

  • Consult with qualified healthcare professionals for all medical concerns
  • Verify all medical information with licensed practitioners
  • Seek immediate emergency care for serious or life-threatening conditions
  • Understand that AI outputs may contain errors or outdated information

Model Limitations

  • Responses are based on training data and may not reflect the most current medical guidelines
  • The model may not have information on very recent medical developments
  • Performance may vary across different medical specialties and rare conditions
  • The model cannot perform physical examinations or order diagnostic tests
  • Cultural and regional medical practice variations may not be fully captured

📄 License

This model is released under the Apache License 2.0. See the LICENSE file for full details.



📞 Contact & Feedback

For questions, feedback, or issues with the model:

  • Community Discussions: Use the Hugging Face community section
  • Bug Reports: Please provide detailed information about the issue
  • Research Collaborations: Contact Jivi AI through official channels

🙏 Acknowledgments

  • OpenAI for developing and open-sourcing the GPT-OSS-20B base model
  • The Hugging Face team for their transformers library and model hosting platform
  • The medical community for providing invaluable domain expertise
  • All contributors to the healthcare AI research community

📊 Citation

If you use Jivi-MedCounsel in your research, please cite:

@misc{jiviai2025medcounsel,
  title={Jivi-MedCounsel: Advanced Medical Language Model},
  author={Jivi AI},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/jiviai/medcounsel}
}

Built with ❤️ by Jivi AI

Making healthcare accessible, accurate, and empathetic through AI