MedCounsel / README.md

ShasshankJivi

Updated Usage

add3aa2 verified 30 days ago

metadata

license: apache-2.0
base_model: openai/gpt-oss-20b
tags:
  - medical
  - healthcare
  - clinical
  - text-generation
  - conversational
  - gpt-oss
  - lora
  - sft
language:
  - en
  - es
  - fr
  - de
  - zh
  - ja
pipeline_tag: text-generation
library_name: transformers

Jivi-MedCounsel: Advanced Medical Language Model

Model Overview

Jivi-MedCounsel is a medical language model built on the GPT-OSS-20B architecture and fine-tuned by Jivi AI for healthcare applications. It has been optimized specifically for OpenAI's HealthBench evaluation, where it achieves an overall score of 0.63, more than 48% above the base model.

Jivi-MedCounsel is designed to serve as an intelligent medical assistant that provides safe, accurate, and context-aware health guidance — clarifying symptoms, identifying red flags, and offering evidence-based next steps with empathy, without replacing professional medical care.

🎯 Purpose-Built for Healthcare

Jivi-MedCounsel excels at:

Clinical Reasoning: Analyzing patient symptoms and medical histories with accuracy
Safety-First Approach: Identifying red flags and directing users to emergency care when needed
Evidence-Based Guidance: Providing recommendations grounded in medical consensus and guidelines
Empathetic Communication: Delivering health information with clarity and compassion
Context-Aware Responses: Adapting advice based on patient demographics, comorbidities, and resource availability

📊 HealthBench Performance

Understanding HealthBench Framework

HealthBench is OpenAI's comprehensive healthcare AI evaluation framework that assesses models across seven critical themes and five evaluation axes. Each theme represents real-world medical scenarios that AI systems encounter in healthcare settings.

The Seven HealthBench Themes:

Response Under Uncertainty - How well the model expresses caution and manages ambiguity when medical evidence is limited
Context Seeking - The model's ability to identify missing information and request essential details for accurate responses
Health Data Tasks - Accuracy and safety in handling structured health data, medical documentation, and clinical decision support
Global Health - Adaptability to diverse healthcare contexts, regional variations, and resource-constrained settings
Emergency Referrals - Recognition of urgent medical situations and appropriate guidance toward immediate care
Expertise-Tailored Communication - Adjusting communication style and terminology based on the user's medical knowledge level
Response Depth - Providing appropriate levels of detail to enable informed health decisions

The Five Evaluation Axes:

Accuracy: Factually correct and evidence-based information
Completeness: Addressing all relevant aspects including necessary follow-up actions
Communication Quality: Clear, structured, and appropriately tailored responses
Instruction Following: Adherence to specific user requirements and formatting
Context Awareness: Considering user role, resources, and seeking clarification only when necessary
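
To make the framework concrete, here is a minimal sketch of rubric-based scoring in the style HealthBench is publicly described as using: each response is graded against criteria that carry positive or negative points, the per-example score is the points earned divided by the maximum achievable positive points, and the overall score is the mean over examples clipped to the range 0 to 1. The `Criterion` class, the function names, and the point values are illustrative assumptions, not the official grader or real benchmark data.

```python
from dataclasses import dataclass

@dataclass
class Criterion:
    points: int   # positive = desirable behavior, negative = undesirable
    met: bool     # whether the grader judged the response to meet it

def example_score(criteria):
    """Points earned divided by the maximum achievable positive points."""
    max_points = sum(c.points for c in criteria if c.points > 0)
    if max_points == 0:
        raise ValueError("an example needs at least one positive criterion")
    earned = sum(c.points for c in criteria if c.met)
    return earned / max_points

def overall_score(examples):
    """Mean of per-example scores, clipped to [0, 1]."""
    mean = sum(example_score(e) for e in examples) / len(examples)
    return min(1.0, max(0.0, mean))

# Hypothetical rubric results for two responses (made-up values).
examples = [
    [Criterion(5, True), Criterion(3, False), Criterion(-2, True)],
    [Criterion(4, True), Criterion(4, True)],
]
print(example_score(examples[0]))  # (5 - 2) / 8 = 0.375
print(overall_score(examples))     # (0.375 + 1.0) / 2 = 0.6875
```

Meeting a negative criterion subtracts points, so a response can lose credit for unsafe content even when it covers every desirable item.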

Jivi-MedCounsel's Superior Performance

Overall Score: 0.630, the highest among the models compared below

Jivi-MedCounsel outperforms major competitors including OpenAI o3 (0.598), Grok 3 (0.543), Gemini 2.5 Pro (0.520), and GPT-4.1 (0.479), demonstrating excellence across all healthcare evaluation dimensions:

🎯 Key Performance Highlights

1. Response Under Uncertainty (Exceptional Performance)

Jivi-MedCounsel excels at expressing appropriate caution when medical evidence is ambiguous or limited
The model demonstrates superior judgment in qualifying statements, acknowledging knowledge boundaries, and recommending professional consultation when needed
This is critical for patient safety, as overconfident responses in uncertain scenarios can lead to harmful outcomes

2. Context Seeking (Industry-Leading)

Outstanding ability to identify when critical patient information is missing (medical history, symptom duration, severity indicators, etc.)
Proactively requests relevant details before providing guidance, ensuring responses are tailored to specific patient contexts
Demonstrates sophisticated understanding of which contextual factors matter most for different medical queries

3. Emergency Referrals (Consistently Strong)

Highly reliable at recognizing medical red flags and urgent warning signs
Appropriately escalates serious conditions requiring immediate medical attention
Balances reassurance with necessary urgency, avoiding both under- and over-triage

4. Health Data Tasks (Above Benchmark)

Demonstrates high accuracy in interpreting medical data, lab results, and clinical metrics
Maintains safety standards when discussing medical documentation and clinical decision support
Handles structured health information with precision and clinical relevance

5. Global Health (Strong Adaptability)

Shows awareness of healthcare resource variations across different regions
Adapts recommendations based on clinical practice variations and regional disease patterns
Considers socioeconomic factors and healthcare accessibility in guidance

6. Expertise-Tailored Communication (Exceptional)

Effectively adjusts medical terminology and explanation depth based on the user's background
Communicates complex medical concepts in accessible language for patients while maintaining clinical precision for healthcare professionals
Demonstrates empathy and clarity without oversimplifying critical health information

7. Response Depth (Well-Calibrated)

Provides comprehensive yet concise responses with appropriate detail levels
Balances thoroughness with accessibility, avoiding information overload
Includes actionable next steps and evidence-based recommendations

Why Jivi-MedCounsel Leads the Benchmark

The 48% improvement over the base GPT-OSS-20B model and the stronger results relative to much larger models are attributed to:

Specialized Medical Fine-Tuning: 20,000 curated doctor-patient conversations covering diverse clinical scenarios
Safety-First Training: Emphasis on clinical reasoning, red flag identification, and appropriate escalation
Context-Aware Optimization: Training on cases requiring careful information gathering and uncertainty management
Evidence-Based Methodology: Grounding in medical consensus, clinical guidelines, and real-world healthcare workflows
Balanced Communication: Training on both patient-facing and professional medical communication styles

Jivi-MedCounsel's consistent strength across all seven HealthBench themes demonstrates a well-rounded, production-ready medical AI assistant capable of handling the complex, nuanced challenges of real-world healthcare interactions.
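
The comparisons quoted in this section reduce to simple arithmetic. The sketch below uses only the scores stated in this card; the implied base-model ceiling is derived from the "over 48%" claim and is an inference, not a reported result.

```python
# Scores exactly as quoted in this model card.
scores = {
    "Jivi-MedCounsel": 0.630,
    "OpenAI o3": 0.598,
    "Grok 3": 0.543,
    "Gemini 2.5 Pro": 0.520,
    "GPT-4.1": 0.479,
}

ours = scores["Jivi-MedCounsel"]

def margin(other):
    """Absolute lead and relative lead (in percent) over another score."""
    return round(ours - other, 3), round((ours / other - 1) * 100, 1)

for name, score in scores.items():
    if name != "Jivi-MedCounsel":
        absolute, relative = margin(score)
        print(f"{name}: +{absolute:.3f} absolute, +{relative}% relative")

# A gain of "over 48%" implies the base model scored below ours / 1.48.
implied_base_ceiling = round(ours / 1.48, 3)
print(implied_base_ceiling)
```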

🔧 Training Process

Base Architecture

Built on GPT-OSS-20B, a 20-billion-parameter open-source language model developed by OpenAI and designed for efficient fine-tuning and deployment.

Fine-Tuning Methodology

Jivi-MedCounsel has been refined using Supervised Fine-Tuning (SFT) with LoRA (Low-Rank Adaptation) for efficient parameter updates while preserving the base model's capabilities. This approach enables targeted improvements in medical reasoning and clinical communication without requiring full model retraining.
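
As a rough illustration of why LoRA is parameter-efficient: for a frozen weight matrix of shape d x k, LoRA trains two low-rank factors, B (d x r) and A (r x k), and adds their scaled product to the frozen weight. The sketch below compares trainable parameter counts. The dimensions and rank are hypothetical; this card does not state the LoRA hyperparameters used for Jivi-MedCounsel.

```python
def lora_trainable_params(d, k, r):
    """Parameters in the LoRA factors B (d x r) and A (r x k)."""
    return r * (d + k)

def full_params(d, k):
    """Parameters updated when fine-tuning the full d x k matrix."""
    return d * k

# Hypothetical projection size and rank, chosen only for illustration.
d, k, r = 4096, 4096, 16
lora = lora_trainable_params(d, k, r)
full = full_params(d, k)
print(lora, full, f"{lora / full:.2%}")
```

With these example numbers the adapter trains under 1% of the parameters in that matrix, which is why the base weights can stay frozen.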

Optimization & Efficiency

Quantization: MXFP4 quantization using NVIDIA TensorRT Model Optimizer for efficient inference and deployment
Distributed Training: Leverages advanced optimization techniques for scalable training across multiple GPUs
Memory Optimization: Employs gradient checkpointing and mixed-precision training for optimal resource utilization
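
For readers unfamiliar with MXFP4, the sketch below shows the idea behind the format as defined in the OCP Microscaling specification: values are grouped into blocks (32 elements in the spec) that share one power-of-two scale, and each element is stored as a 4-bit float (E2M1) whose representable magnitudes are 0, 0.5, 1, 1.5, 2, 3, 4, and 6. This is a plain-Python illustration of the number format only, not the TensorRT Model Optimizer implementation; the function names are made up for this example, and the tie-breaking rule is a simplification.

```python
import math

# Magnitudes representable by a 4-bit E2M1 float (sign handled separately).
E2M1_VALUES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
E2M1_MAX_EXPONENT = 2  # largest E2M1 value is 1.5 * 2**2 = 6

def quantize_block(block):
    """Return (shared_scale, elements) for one block of floats."""
    max_abs = max(abs(v) for v in block)
    if max_abs == 0.0:
        return 1.0, [0.0] * len(block)
    # Shared power-of-two scale chosen from the block's largest magnitude.
    scale = 2.0 ** (math.floor(math.log2(max_abs)) - E2M1_MAX_EXPONENT)
    elements = []
    for v in block:
        scaled = min(abs(v) / scale, E2M1_VALUES[-1])  # saturate at 6
        # Nearest representable magnitude; ties go to the smaller value here.
        nearest = min(E2M1_VALUES, key=lambda q: abs(q - scaled))
        elements.append(math.copysign(nearest, v))
    return scale, elements

def dequantize_block(scale, elements):
    """Recover approximate floats from the shared scale and 4-bit elements."""
    return [scale * e for e in elements]

scale, elems = quantize_block([0.25, -0.5, 0.3, 0.0])
print(scale, elems, dequantize_block(scale, elems))
```

With 32-element blocks and an 8-bit shared scale, storage works out to 4 + 8/32 = 4.25 bits per value.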

📚 Data Preparation

Jivi-MedCounsel has been trained on a carefully curated dataset of 20,000 doctor-patient conversations:

Real-World Data: 15,000 authentic clinical interactions covering diverse medical scenarios
Synthetic Data: 5,000 high-quality generated conversations to augment edge cases and rare conditions
Data Sources: Clinical consultations, symptom assessments, treatment discussions, and follow-up care
Quality Assurance: All data validated for medical accuracy and safety

The dataset encompasses:

Primary care consultations
Specialist referrals
Symptom clarification
Treatment explanations
Medication guidance
Emergency triage scenarios
Follow-up care instructions

💻 How to Use

Installation

pip install transformers torch accelerate

Basic Usage with Transformers Pipeline

import torch
from transformers import pipeline

# Initialize the text generation pipeline
model_id = "jiviai/medcounsel"

pipe = pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

# Example medical query
prompt = """Patient presents with persistent dry cough for 2 weeks, mild fever (100.5°F), 
and fatigue. No shortness of breath. What are the possible causes and next steps?"""

# Generate response
response = pipe(
    prompt,
    max_new_tokens=8192,
    do_sample=True,
    temperature=0.9,
    top_p=1,
)

print(response[0]['generated_text'])
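
The pipeline can also be given a list of role/content messages, in which case transformers applies the model's chat template before generation. The sketch below is one way to do this; `build_messages` and `ask` are hypothetical helper names, the system prompt is a shortened stand-in, and the `max_new_tokens` default is an arbitrary choice for illustration.

```python
def build_messages(user_text, system_text=None):
    """Assemble a chat-style message list for the pipeline."""
    messages = []
    if system_text:
        messages.append({"role": "system", "content": system_text})
    messages.append({"role": "user", "content": user_text})
    return messages

def ask(user_text, system_text=None, max_new_tokens=512):
    """Load the pipeline and return the assistant's reply text."""
    # Imported here so build_messages can be used without loading the model.
    import torch
    from transformers import pipeline

    pipe = pipeline(
        "text-generation",
        model="jiviai/medcounsel",
        model_kwargs={"torch_dtype": torch.bfloat16},
        device_map="auto",
    )
    out = pipe(build_messages(user_text, system_text), max_new_tokens=max_new_tokens)
    # With chat input, generated_text is the message list including the new reply.
    return out[0]["generated_text"][-1]["content"]

messages = build_messages(
    "I have had a dry cough and mild fever for two weeks. What should I do?",
    system_text="You are an AI medical assistant that provides safe, accurate health guidance.",
)
print(messages)
```

Calling `ask(...)` downloads and loads the model, so it is kept separate from the message-building step.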

Advanced Usage with AutoModel

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "jiviai/medcounsel"

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Prepare messages
messages = [
    {"role": "system", "content": "You are an AI medical assistant that provides safe, accurate, and context-aware health guidance — clarifying symptoms, identifying red flags, and offering evidence-based next steps with empathy, without replacing professional medical care."},
    {"role": "user", "content": "What should I do if I have chest pain that radiates to my left arm?"}
]

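# Apply the chat template and return a dict of tensors (input_ids, attention_mask)
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
    return_dict=True,
).to(model.device)

# Generate response
outputs = model.generate(
    **inputs,
    max_new_tokens=8192,
    do_sample=True,
    temperature=0.9,
    top_p=1,
    pad_token_id=tokenizer.eos_token_id,
)

# Decode only the newly generated tokens, not the echoed prompt
prompt_length = inputs["input_ids"].shape[-1]
response = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
print(response)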

Requirements

transformers>=4.55.0
torch>=2.0.0
accelerate>=0.20.0
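
A quick way to confirm that an environment meets these minimums is to compare installed versions using the standard library. The helper below is a sketch: `parse_version` and `meets_minimum` are hypothetical names, and the version parsing is deliberately naive (leading numeric components only).

```python
import re
from importlib import metadata

def parse_version(text):
    """Turn '2.1.0+cu121' into (2, 1, 0); ignores non-numeric suffixes."""
    parts = []
    for piece in text.split(".")[:3]:
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)

def meets_minimum(requirement):
    """Check a 'name>=x.y.z' line; returns 'ok', 'too old', or 'missing'."""
    name, minimum = requirement.split(">=")
    try:
        installed = metadata.version(name.strip())
    except metadata.PackageNotFoundError:
        return "missing"
    return "ok" if parse_version(installed) >= parse_version(minimum) else "too old"

for requirement in ["torch>=2.0.0", "accelerate>=0.20.0"]:
    print(requirement, "->", meets_minimum(requirement))
```

The transformers line from the list above can be added to the loop in the same way.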

🌍 Supported Languages

Jivi-MedCounsel supports the following 15 languages:

Arabic
Bengali
Chinese
English
French
German
Hindi
Indonesian
Italian
Japanese
Korean
Portuguese
Spanish
Swahili
Yoruba

Note: Performance is optimized for English-language medical queries.

🎯 Intended Use Cases

Jivi-MedCounsel is designed for:

✅ Clinical Decision Support: Assisting healthcare professionals with differential diagnoses and treatment options
✅ Patient Education: Explaining medical conditions, procedures, and treatments in accessible language
✅ Symptom Assessment: Helping users understand their symptoms and when to seek care
✅ Medical Research: Supporting literature review and medical knowledge extraction
✅ Health Chatbots: Powering conversational AI for healthcare applications
✅ Triage Support: Identifying urgent cases requiring immediate medical attention
✅ Medical Training: Educational tool for medical students and trainees

⚠️ Limitations & Disclaimer

Important Safety Notice

This model is NOT intended for:

❌ Direct clinical diagnosis without physician oversight
❌ Prescribing medications
❌ Replacing professional medical advice, diagnosis, or treatment
❌ Emergency medical situations (always call emergency services)
❌ Definitive medical decision-making

Disclaimer

The data, code, and model checkpoints are intended solely for research and educational purposes. They should NOT be used in clinical care or for any clinical decision-making purposes without appropriate medical professional oversight.

Users must:

Consult with qualified healthcare professionals for all medical concerns
Verify all medical information with licensed practitioners
Seek immediate emergency care for serious or life-threatening conditions
Understand that AI outputs may contain errors or outdated information

Model Limitations

Responses are based on training data and may not reflect the most current medical guidelines
The model may not have information on very recent medical developments
Performance may vary across different medical specialties and rare conditions
The model cannot perform physical examinations or order diagnostic tests
Cultural and regional medical practice variations may not be fully captured

📄 License

This model is released under the Apache License 2.0. See the LICENSE file for full details.

🔗 References & Resources

Base Model: GPT-OSS-20B
Model Card: GPT-OSS Model Card (PDF)
Jivi AI Website: https://jivi.ai
Hugging Face: jiviai/medcounsel

📞 Contact & Feedback

For questions, feedback, or issues with the model:

Community Discussions: Use the Hugging Face community section
Bug Reports: Please provide detailed information about the issue
Research Collaborations: Contact Jivi AI through official channels

🙏 Acknowledgments

OpenAI for developing and open-sourcing the GPT-OSS-20B base model
The Hugging Face team for their transformers library and model hosting platform
The medical community for providing invaluable domain expertise
All contributors to the healthcare AI research community

📊 Citation

If you use Jivi-MedCounsel in your research, please cite:

@misc{jiviai2025medcounsel,
  title={Jivi-MedCounsel: Advanced Medical Language Model},
  author={Jivi AI},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/jiviai/medcounsel}
}

Built with ❤️ by Jivi AI

Making healthcare accessible, accurate, and empathetic through AI