perfectPresentation/2p-phc-classifier

Text Classification


mohamedabdelmotagally

Upload folder using huggingface_hub

659b2f3 verified 1 day ago


	---
	language: ar
	tags:
	- text-classification
	- xlm-roberta
	- arabic
	- healthcare
	- hierarchical
	- multi-label
	base_model: FacebookAI/xlm-roberta-base
	datasets:
	- perfectPresentation/phc-dataset
	---

	# PHC Multi-Label Hierarchical Classifier (v4)

	XLM-RoBERTa-base fine-tuned on Arabic healthcare patient complaints.
	It predicts a 4-level PHC taxonomy code (multi-label) and reports a confidence score at each level.

	> Output heads are sized to the full PHC taxonomy (117 codes). Of these, 90 have training examples; the remaining 27 have none and are covered zero-shot, from the taxonomy structure alone.
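
As an illustration of "confidence at each level", here is a minimal sketch of one way per-level scores could be turned into a prediction. It assumes each level yields a probability per label and that the levels are decoded independently by taking the top label; the function name `decode_levels`, the input layout, and the example probabilities are all hypothetical, and the card does not state how the model actually decodes its heads.

```python
from typing import Dict, Tuple

LevelProbs = Dict[str, Dict[str, float]]


def decode_levels(level_probs: LevelProbs) -> Tuple[Dict[str, Tuple[str, float]], float]:
    """Pick the most probable label at each level.

    Returns the per-level (label, confidence) pairs and a joint confidence,
    taken here as the product of the per-level confidences (an assumption).
    """
    picks: Dict[str, Tuple[str, float]] = {}
    joint = 1.0
    for level, probs in level_probs.items():
        label = max(probs, key=probs.get)  # top-scoring label at this level
        picks[level] = (label, probs[label])
        joint *= probs[label]
    return picks, joint


# Made-up probabilities, for illustration only.
example = {
    "L2": {"OPC": 0.90, "LAB": 0.10},
    "L3": {"WAI": 0.80, "SCH": 0.20},
    "L4": {"001": 0.50, "002": 0.25, "003": 0.25},
}
picks, joint = decode_levels(example)
print(picks)            # per-level label and confidence
print(round(joint, 2))  # joint confidence, about 0.36 for these numbers
```

Independent per-level decoding can produce a combination that is not one of the 117 valid codes, so a real pipeline would likely restrict the choice to valid paths in the taxonomy.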

	## Taxonomy Structure

	```
	PHC -> L2 (service_area, 7) -> L3 (category, 33) -> L4 (001/002/003)
	```

	| Level | Codes |
	|-------|-------|
	| L2: Service Area | EMD, IPS, LAB, OPC, PHA, RAD, REC |
	| L3: Category | ALT, APN, CDR, COM, DAV, DIC, EMS, ENV, EPS, EQU, FAC, HSK, INS, MAC, MAS, MBR, PCC, PED, PPD, PRE, QOI, QUE, REG, SAF, SCH, SRT, SYS, TRA, TRI, TRT, TTI, VOI, WAI |
	| Full Code | 117 taxonomy codes |
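
The structure above can be checked mechanically. The sketch below splits a full code into its levels and validates each part against the lists in the table. The hyphen-separated string form (`PHC-OPC-WAI-001`) is an assumption, since the card does not show how full codes are written, and `PhcCode` and `parse_phc_code` are hypothetical names. Passing these checks does not mean a code is one of the 117 valid taxonomy codes, only that each level is individually known.

```python
from typing import NamedTuple

# Level vocabularies copied from the taxonomy table.
SERVICE_AREAS = {"EMD", "IPS", "LAB", "OPC", "PHA", "RAD", "REC"}
CATEGORIES = {
    "ALT", "APN", "CDR", "COM", "DAV", "DIC", "EMS", "ENV", "EPS", "EQU", "FAC",
    "HSK", "INS", "MAC", "MAS", "MBR", "PCC", "PED", "PPD", "PRE", "QOI", "QUE",
    "REG", "SAF", "SCH", "SRT", "SYS", "TRA", "TRI", "TRT", "TTI", "VOI", "WAI",
}
L4_CODES = {"001", "002", "003"}


class PhcCode(NamedTuple):
    service_area: str  # L2
    category: str      # L3
    leaf: str          # L4


def parse_phc_code(code: str, sep: str = "-") -> PhcCode:
    """Split a full code into levels; `sep` is an assumed separator."""
    parts = code.strip().upper().split(sep)
    if len(parts) != 4 or parts[0] != "PHC":
        raise ValueError(f"expected PHC{sep}L2{sep}L3{sep}L4, got {code!r}")
    _, l2, l3, l4 = parts
    if l2 not in SERVICE_AREAS:
        raise ValueError(f"unknown service area: {l2}")
    if l3 not in CATEGORIES:
        raise ValueError(f"unknown category: {l3}")
    if l4 not in L4_CODES:
        raise ValueError(f"unknown L4 code: {l4}")
    return PhcCode(l2, l3, l4)


# Hypothetical code string, used only to exercise the parser.
print(parse_phc_code("PHC-OPC-WAI-001"))
```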

	## Test Performance

	| Level | Exact Match Acc | F1 (macro) |
	|-------|-----------------|------------|
	| L2 | 96.0% | 0.9779 |
	| L3 | 87.2% | 0.5307 |
	| L4 | 93.6% | 0.6026 |
	| Full Code | 82.7% | 0.2438 |

	> Best validation full-code exact-match accuracy during training: 85.2%
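
The gap between exact-match accuracy and macro F1 in the table is typical when many classes are rare: macro F1 averages per-class F1 with equal weight, so classes that are never predicted pull it down even if few examples are affected. The sketch below shows the effect on toy single-label data. The data and helper names are made up, and the card does not say how its own metrics were computed.

```python
from typing import List, Sequence


def exact_match_accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of examples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores: List[float] = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy data: one frequent class and two rare classes that are never predicted.
y_true = ["A"] * 8 + ["B", "C"]
y_pred = ["A"] * 10

print(f"accuracy = {exact_match_accuracy(y_true, y_pred):.3f}")  # 0.800
print(f"macro F1 = {macro_f1(y_true, y_pred):.3f}")              # 0.296
```

Here 8 of 10 predictions are correct, but classes B and C each score an F1 of 0, so the macro average drops to about 0.30.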