metadata
license: mit
language:
  - en
metrics:
  - f1
base_model:
  - microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract-fulltext
pipeline_tag: token-classification
tags:
  - clinical
  - MIMIC-III
  - Segmentation

Model Details

Model Description

This model performs sentence segmentation of MIMIC-III notes. It takes clinical text as input and predicts BIO tags, where B marks the beginning of a sentence, I marks a token inside a sentence, and O marks a token outside any sentence. More details are in the paper Automatic sentence segmentation of clinical record narratives in real-world data. Sample code for using this model is available on GitHub.

Our segmentation model is based on microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract-fulltext and was trained on MIMIC-III notes as a sequence labeling (token classification) task.
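
To illustrate how BIO tags translate into sentences, here is a minimal sketch of a decoding step. It is not the authors' code: the helper name `bio_to_sentences`, the literal tag strings `"B"`, `"I"`, `"O"`, and the hand-written tokens and offsets are assumptions for illustration. In practice the character offsets would come from the tokenizer and the tags from the model's predictions, and the model's actual label names may differ.

```python
from typing import List, Tuple


def bio_to_sentences(
    text: str, offsets: List[Tuple[int, int]], tags: List[str]
) -> List[str]:
    """Group token character spans into sentences using per-token BIO tags.

    Hypothetical helper: assumes tags are the plain strings "B", "I", "O".
    """
    spans = []
    start = end = None
    for (tok_start, tok_end), tag in zip(offsets, tags):
        if tag == "B":
            # A new sentence begins; close the previous one if still open.
            if start is not None:
                spans.append((start, end))
            start, end = tok_start, tok_end
        elif tag == "I":
            # Extend the open sentence (tolerate an I with no preceding B).
            if start is None:
                start = tok_start
            end = tok_end
        else:
            # "O": the token belongs to no sentence; close any open sentence.
            if start is not None:
                spans.append((start, end))
                start = end = None
    if start is not None:
        spans.append((start, end))
    return [text[s:e] for s, e in spans]


# Toy example with hand-written offsets and tags (not real model output).
text = "Pt stable. ---- No acute distress."
offsets = [(0, 2), (3, 9), (9, 10), (11, 15), (16, 18), (19, 24), (25, 33), (33, 34)]
tags = ["B", "I", "I", "O", "B", "I", "I", "I"]
sentences = bio_to_sentences(text, offsets, tags)
print(sentences)
```

In this toy input the separator `----` is tagged O, so it is excluded from both sentences. That is the case the O tag covers and that plain boundary detection would not.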

Citation

Dongfang Xu, Davy Weissenbacher, Karen O’Connor, Siddharth Rawal, and Graciela Gonzalez Hernandez. 2024. Automatic sentence segmentation of clinical record narratives in real-world data. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 20780–20793, Miami, Florida, USA. Association for Computational Linguistics.