---
language: en
tags:
- audio
- health
- cough
- tuberculosis
- hear
- medgemma
- medical
license: apache-2.0
---
# HeAR-TB Domain-Aware Dual Heads
**Domain-Aware Adaptation of Google HeAR for TB Cough Screening**

This model adapts Google's **HeAR** (Health Acoustic Representations) foundation model for **tuberculosis cough screening** using a novel domain-aware dual-head approach.
## Model Description
- **Base Model**: Google HeAR (frozen)
- **Method**: Domain-Aware Dual Heads
  - One XGBoost head specialized on **Passive (natural) coughs**
  - One XGBoost head specialized on **Forced (voluntary) coughs**
  - Final patient score = average of both domain predictions (patient-level aggregation)
- **Embedding Weighting**: Embedding norm used as quality weight per cough
- **Dataset**: Nairobi TBscreen (Sharma et al., Science Advances 2024)
  - 27,343 clean coughs (Passive + Forced)
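
The weighting and aggregation described above could be sketched roughly as follows. This is an illustration and not the released training code: the function name `patient_score` is hypothetical, the use of the L2 norm as the per-cough weight is an assumption (the card only says "embedding norm"), and it assumes each head scores the coughs of its own domain before the two domain scores are averaged with equal weight.

```python
import numpy as np


def patient_score(probs_passive, embs_passive, probs_forced, embs_forced):
    """Combine per-cough probabilities into one patient-level score.

    Assumptions (not confirmed by the model card):
    - each cough is weighted by the L2 norm of its embedding,
    - the two domain scores are averaged with equal weight.
    """

    def weighted_mean(probs, embs):
        probs = np.asarray(probs, dtype=float)
        # One weight per cough: the L2 norm of that cough's embedding vector.
        weights = np.linalg.norm(np.asarray(embs, dtype=float), axis=1)
        return float(np.average(probs, weights=weights))

    score_passive = weighted_mean(probs_passive, embs_passive)
    score_forced = weighted_mean(probs_forced, embs_forced)
    return 0.5 * (score_passive + score_forced)
```

With this scheme, a cough whose embedding has a larger norm pulls its domain score more strongly toward its own probability, and a patient needs at least one cough with a non-zero embedding in each domain.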
## Performance (Patient-Level, Repeated 5-Fold CV)
| Metric | Value |
|-------------------------|------------------------|
| Patient AUC | 0.7476 ± 0.0932 |
| Sensitivity | ~0.78–0.83 (depends on decision threshold) |
| Accuracy | 0.728 ± 0.060 |
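
Sensitivity is reported as a range because it presumably depends on the decision threshold applied to the patient score. The sketch below shows how that trade-off could be computed from patient-level labels and scores. The helper name `sensitivity_specificity` and the toy numbers are illustrative only and are not taken from the evaluation.

```python
import numpy as np


def sensitivity_specificity(y_true, scores, threshold):
    """Return (sensitivity, specificity) when scores >= threshold count as TB-positive."""
    y_true = np.asarray(y_true).astype(bool)
    pred = np.asarray(scores, dtype=float) >= threshold
    tp = np.sum(pred & y_true)      # TB patients flagged
    fn = np.sum(~pred & y_true)     # TB patients missed
    tn = np.sum(~pred & ~y_true)    # non-TB patients cleared
    fp = np.sum(pred & ~y_true)     # non-TB patients flagged
    return tp / (tp + fn), tn / (tn + fp)


# Made-up patient-level labels and scores, for illustration only.
labels = [1, 1, 1, 0, 0]
scores = [0.9, 0.6, 0.3, 0.4, 0.1]
for thr in (0.5, 0.25):
    sens, spec = sensitivity_specificity(labels, scores, thr)
    print(f"threshold={thr:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

Lowering the threshold raises sensitivity at the cost of specificity, which is the usual preference for a screening setting.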
## Intended Use
- Research and demonstration of acoustic TB screening using foundation models
- Part of the **MedGemma Impact Challenge** (HAI-DEF model usage)
- Educational / exploratory use on cough audio
## Limitations & Important Disclaimer
**This is NOT a medical device or diagnostic tool.**
- Trained on a small number of patients (123)
- Performance is research-level only
- Must not be used for clinical decision-making
- Requires proper clinical validation before any real-world use
- May contain biases from the training data (hospital population in Nairobi)

**Always consult a qualified healthcare professional for medical advice.**
## Citation
@misc{hear-tb-2026,
  author = {Sachiv.C},
  title = {HeAR-TB: Domain-Aware Dual Heads for Tuberculosis Cough Screening},
  year = {2026},
  dataset = {https://zenodo.org/records/10431329},
  howpublished = {\url{https://huggingface.co/sach3v/Domain_aware_dual_head_HEar}},
  note = {Entry for MedGemma Impact Challenge}
}
## How to Use
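
Recordings are loaded at 16 kHz and brought to a 2-second clip before the HeAR embedding is extracted. One way to do that step is sketched here. The helper name `pad_or_trim` is hypothetical, and padding with trailing zeros and keeping the first 2 seconds of longer audio are assumptions. The original preprocessing may differ (for example, centering the cough in the clip).

```python
import numpy as np

SAMPLE_RATE = 16000   # matches sr=16000 used when loading audio
CLIP_SECONDS = 2.0    # clip length used before embedding extraction


def pad_or_trim(y, sr=SAMPLE_RATE, seconds=CLIP_SECONDS):
    """Return a clip of exactly `seconds` length.

    Assumption: short audio is zero-padded at the end and long audio is
    truncated to its first `seconds` seconds.
    """
    target = int(round(sr * seconds))
    y = np.asarray(y, dtype=np.float32)
    if len(y) >= target:
        return y[:target]
    return np.pad(y, (0, target - len(y)))
```

The result can be passed to whatever HeAR embedding function you use. The loading and prediction code follows.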
```python
import joblib
import numpy as np
import librosa
from huggingface_hub import hf_hub_download

# Download and load the packaged heads and scalers
package = joblib.load(
    hf_hub_download(
        "sach3v/hear-tb-domain-aware-dualheads",
        "hear_tb_prize_domain_aware.joblib",
    )
)
model_p = package["model_p"]    # passive-cough head
scaler_p = package["scaler_p"]
model_f = package["model_f"]    # forced-cough head
scaler_f = package["scaler_f"]


def predict_cough(audio_path):
    """Return the averaged TB probability for a single cough recording."""
    y, sr = librosa.load(audio_path, sr=16000)
    # ... (pad to 2s and get HeAR embedding)
    emb = get_hear_embedding(y)  # your HeAR extraction function (not defined here)
    # The scalers expect a 2-D array of shape (n_samples, n_features)
    emb = np.asarray(emb).reshape(1, -1)
    p_passive = model_p.predict_proba(scaler_p.transform(emb))[0, 1]
    p_forced = model_f.predict_proba(scaler_f.transform(emb))[0, 1]
    return (p_passive + p_forced) / 2.0
```