icf-domains / README.md

update README.md

ae60ce5 over 4 years ago

4.19 kB

	---
	language: nl
	license: mit
	pipeline_tag: text-classification
	inference: false
	---

	# A-PROOF ICF-domains Classification

	## Description

	A fine-tuned multi-label classification model that detects 9 [WHO-ICF](https://www.who.int/standards/classifications/international-classification-of-functioning-disability-and-health) domains in clinical text in Dutch. The model is based on a pre-trained Dutch medical language model ([link to be added]()), a RoBERTa model, trained from scratch on clinical notes of the Amsterdam UMC.

	## ICF domains
	The model can detect 9 domains, which were chosen due to their relevance to recovery from COVID-19:

	ICF code \| Domain \| name in repo
	---\|---\|---
	b440 \| Respiration functions \| ADM
	b140 \| Attention functions \| ATT
	d840-d859 \| Work and employment \| BER
	b1300 \| Energy level \| ENR
	d550 \| Eating \| ETN
	d450 \| Walking \| FAC
	b455 \| Exercise tolerance functions \| INS
	b530 \| Weight maintenance functions \| MBW
	b152 \| Emotional functions \| STM

	## Intended uses and limitations
	- The model was fine-tuned (trained, validated and tested) on medical records from the Amsterdam UMC (the two academic medical centers of Amsterdam). It might perform differently on text from a different hospital or text from non-hospital sources (e.g. GP records).
	- The model was fine-tuned with the [Simple Transformers](https://simpletransformers.ai/) library. This library is based on Transformers but the model cannot be used directly with Transformers `pipeline` and classes; doing so would generate incorrect outputs. For this reason, the API on this page is disabled.

	## How to use
	To generate predictions with the model, use the [Simple Transformers](https://simpletransformers.ai/) library:
	```
	from simpletransformers.classification import MultiLabelClassificationModel

	model = MultiLabelClassificationModel(
	'roberta',
	'CLTL/icf-domains',
	use_cuda=False,
	)

	example = 'Nu sinds 5-6 dagen progressieve benauwdheidsklachten (bij korte stukken lopen al kortademig), terwijl dit eerder niet zo was.'
	predictions, raw_outputs = model.predict([example])
	```
	The predictions look like this:
	```
	[[1, 0, 0, 0, 0, 1, 1, 0, 0]]
	```
	The indices of the multi-label stand for:
	```
	[ADM, ATT, BER, ENR, ETN, FAC, INS, MBW, STM]
	```
	In other words, the above prediction corresponds to assigning the labels ADM, FAC and INS to the example sentence.

	The raw outputs look like this:
	```
	[[0.51907885 0.00268032 0.0030862 0.03066113 0.00616694 0.64720929
	0.67348498 0.0118863 0.0046311 ]]
	```
	For this model, the threshold at which the prediction for a label flips from 0 to 1 is 0.5.

	## Training data
	- The training data consists of clinical notes from medical records (in Dutch) of the Amsterdam UMC. Due to privacy constraints, the data cannot be released.
	- The annotation guidelines used for the project can be found [here](https://github.com/cltl/a-proof-zonmw/tree/main/resources/annotation_guidelines).

	## Training procedure
	The default training parameters of Simple Transformers were used, including:
	- Optimizer: AdamW
	- Learning rate: 4e-5
	- Num train epochs: 1
	- Train batch size: 8
	- Threshold: 0.5

	## Evaluation results
	The evaluation is done on a sentence-level (the classification unit) and on a note-level (the aggregated unit which is meaningful for the healthcare professionals).

	### Sentence-level
	\| \| ADM \| ATT \| BER \| ENR \| ETN \| FAC \| INS \| MBW \| STM
	\|---\|---\|---\|---\|---\|---\|---\|---\|---\|---
	precision \| 0.98 \| 0.98 \| 0.56 \| 0.96 \| 0.92 \| 0.84 \| 0.89 \| 0.79 \| 0.70
	recall \| 0.49 \| 0.41 \| 0.29 \| 0.57 \| 0.49 \| 0.71 \| 0.26 \| 0.62 \| 0.75
	F1-score \| 0.66 \| 0.58 \| 0.35 \| 0.72 \| 0.63 \| 0.76 \| 0.41 \| 0.70 \| 0.72
	support \| 775 \| 39 \| 54 \| 160 \| 382 \| 253 \| 287 \| 125 \| 181

	### Note-level
	\| \| ADM \| ATT \| BER \| ENR \| ETN \| FAC \| INS \| MBW \| STM
	\|---\|---\|---\|---\|---\|---\|---\|---\|---\|---
	precision \| 1.0 \| 1.0 \| 0.66 \| 0.96 \| 0.95 \| 0.84 \| 0.95 \| 0.87 \| 0.80
	recall \| 0.89 \| 0.56 \| 0.44 \| 0.70 \| 0.72 \| 0.89 \| 0.46 \| 0.87 \| 0.87
	F1-score \| 0.94 \| 0.71 \| 0.50 \| 0.81 \| 0.82 \| 0.86 \| 0.61 \| 0.87 \| 0.84
	support \| 231 \| 27 \| 34 \| 92 \| 165 \| 95 \| 116 \| 64 \| 94

	## Authors and references
	### Authors
	Jenia Kim, Piek Vossen

	### References
	TBD