PHC Multi-Label Hierarchical Classifier (v4)
XLM-RoBERTa-base fine-tuned on Arabic healthcare patient complaints. Predicts a 4-level PHC taxonomy code (multi-label), with a confidence score at each level.
Output heads are sized to the full PHC taxonomy (117 codes). Of these, 90 have training examples; the remaining 27 have none and can only be predicted zero-shot, from the taxonomy structure alone.
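Since some output codes never appear in the training data, it can be useful to flag predictions that land on them. The following is a minimal sketch, assuming you have the full taxonomy list and the list of codes seen in training. The function name, the hyphen-separated code format, and the example codes are all hypothetical and not taken from this model's files.

```python
def split_zero_shot(taxonomy_codes, trained_codes):
    """Split taxonomy codes into (seen_in_training, zero_shot), both sorted."""
    taxonomy = set(taxonomy_codes)
    trained = set(trained_codes)
    # A trained code missing from the taxonomy points to a data problem.
    unknown = trained - taxonomy
    if unknown:
        raise ValueError(f"codes not in taxonomy: {sorted(unknown)}")
    return sorted(taxonomy & trained), sorted(taxonomy - trained)


# Hypothetical codes, for illustration only.
taxonomy_codes = [
    "PHC-EMD-WAI-001",
    "PHC-EMD-WAI-002",
    "PHC-LAB-QUE-001",
    "PHC-PHA-PRE-001",
]
trained_codes = ["PHC-EMD-WAI-001", "PHC-LAB-QUE-001"]

seen, zero_shot = split_zero_shot(taxonomy_codes, trained_codes)
```

Applied to the real taxonomy, the two lists would be expected to have 90 and 27 entries respectively.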
Taxonomy Structure
PHC -> L2 (service_area, 7) -> L3 (category, 21+) -> L4 (001/002/003)
| Level | Values |
|---|---|
| L2: Service Area | EMD, IPS, LAB, OPC, PHA, RAD, REC |
| L3: Category | ALT, APN, CDR, COM, DAV, DIC, EMS, ENV, EPS, EQU, FAC, HSK, INS, MAC, MAS, MBR, PCC, PED, PPD, PRE, QOI, QUE, REG, SAF, SCH, SRT, SYS, TRA, TRI, TRT, TTI, VOI, WAI |
| Full Code | 117 taxonomy codes |
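The level values listed above can be used to sanity-check a predicted code before it is passed downstream. This is a rough sketch only: the card does not state how a full code is written out, so the hyphen-separated form `PHC-<L2>-<L3>-<L4>` is an assumption, and `parse_phc_code` and `PHCCode` are hypothetical names.

```python
from typing import NamedTuple

# Level values copied from the taxonomy table in this card.
SERVICE_AREAS = {"EMD", "IPS", "LAB", "OPC", "PHA", "RAD", "REC"}
CATEGORIES = {
    "ALT", "APN", "CDR", "COM", "DAV", "DIC", "EMS", "ENV", "EPS", "EQU",
    "FAC", "HSK", "INS", "MAC", "MAS", "MBR", "PCC", "PED", "PPD", "PRE",
    "QOI", "QUE", "REG", "SAF", "SCH", "SRT", "SYS", "TRA", "TRI", "TRT",
    "TTI", "VOI", "WAI",
}
SUBCODES = {"001", "002", "003"}


class PHCCode(NamedTuple):
    service_area: str  # L2
    category: str      # L3
    subcode: str       # L4


def parse_phc_code(code: str, sep: str = "-") -> PHCCode:
    """Parse an assumed 'PHC-<L2>-<L3>-<L4>' string and validate each level."""
    parts = code.strip().upper().split(sep)
    if len(parts) != 4 or parts[0] != "PHC":
        raise ValueError(f"expected PHC{sep}L2{sep}L3{sep}L4, got {code!r}")
    _, l2, l3, l4 = parts
    if l2 not in SERVICE_AREAS:
        raise ValueError(f"unknown service area: {l2!r}")
    if l3 not in CATEGORIES:
        raise ValueError(f"unknown category: {l3!r}")
    if l4 not in SUBCODES:
        raise ValueError(f"unknown L4 value: {l4!r}")
    return PHCCode(l2, l3, l4)


parsed = parse_phc_code("PHC-EMD-WAI-001")
```

Note that this checks each level independently. It does not confirm that the combination is one of the 117 valid full codes, which would require the taxonomy file itself.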
Test Performance
| Level | Exact Match Acc | F1 (macro) |
|---|---|---|
| L2 | 97.5% | 0.9762 |
| L3 | 92.0% | 0.5487 |
| L4 | 95.0% | 0.6097 |
| Full Code | 88.7% | 0.2483 |
Best validation full-code exact-match accuracy during training: 90.4%
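The gap between exact-match accuracy and macro F1 in the table is typical of heavily imbalanced label sets: macro F1 weights every class equally, so rare classes that are never predicted correctly pull the average down even when most samples are right. The card does not say how its metrics were computed, so the toy example below is only an illustration of that effect using plain Python and made-up labels, not a reproduction of the evaluation.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class is absent.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy data: one dominant class and two rare classes that are always missed.
y_true = ["A"] * 8 + ["B", "C"]
y_pred = ["A"] * 10

acc = accuracy(y_true, y_pred)   # 8 of 10 correct
f1 = macro_f1(y_true, y_pred)    # classes B and C each contribute 0
```

Here accuracy is 0.8 while macro F1 is about 0.30, because two of the three classes score zero.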
Model tree for perfectPresentation/xlmr-phc-job
Base model
FacebookAI/xlm-roberta-base