Update README.md
README.md
|
@@ -13,6 +13,32 @@ tags:
|
|
| 13 |
- Segmentation
|
| 14 |
---
|
| 15 |
|
| 16 |
-
# Model Card for
|
| 17 |
|
| 18 |
|
| 13 |
- Segmentation
|
| 14 |
---
|
| 15 |
|
| 16 |
+
# Model Card for SentenceSegmenter-MIMIC
|
| 17 |
|
| 18 |
+
<!-- Provide a quick summary of what the model is/does. [Optional] -->
|
| 19 |
+
This model performs sentence segmentation of MIMIC-III notes. It takes clinical text as input and predicts BIO tags, where B indicates the beginning of a sentence, I indicates the inside of a sentence, and O indicates text outside any sentence. More details about this model are in the paper "Automatic sentence segmentation of clinical record narratives in real-world data". Sample code for using this model is at "https://github.com/dongfang91/sentence_segmenter/tree/main/baseline".
|
| 20 |
|
| 21 |
+
Our segmentation model is based on "microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract-fulltext", which we trained on MIMIC-III notes for a sequence labeling (token classification) task.
|
| 22 |
+
|
| 23 |
+
|
| 24 |
+
# Model Details
|
| 25 |
+
|
| 26 |
+
## Model Description
|
| 27 |
+
|
| 28 |
+
<!-- Provide a longer summary of what this model is/does. -->
|
| 29 |
+
This model performs sentence segmentation of MIMIC-III notes. It takes clinical text as input and predicts BIO tags, where B indicates the beginning of a sentence, I indicates the inside of a sentence, and O indicates text outside any sentence. More details about this model are in the paper [Automatic sentence segmentation of clinical record narratives in real-world data](https://aclanthology.org/2024.emnlp-main.1156/). Sample code for using this model is on [GitHub](https://github.com/dongfang91/sentence_segmenter/tree/main/baseline).
|
| 30 |
+
|
| 31 |
+
Our segmentation model is based on [microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract-fulltext](https://huggingface.co/microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract-fulltext), which we trained on MIMIC-III notes for a sequence labeling (token classification) task.
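
The BIO scheme described above can be turned back into sentence spans with a small amount of post-processing. The following is a minimal sketch, not the authors' implementation: the function names `whitespace_offsets` and `bio_to_sentence_spans` are hypothetical, whitespace tokenization stands in for the model's actual subword tokenization, and treating a stray `I` tag as the start of a sentence is an assumption made here for robustness.

```python
import re
from typing import List, Tuple

Span = Tuple[int, int]


def whitespace_offsets(text: str) -> List[Span]:
    """Return (start, end) character offsets of whitespace-separated tokens.

    Illustrative only: the real model operates on subword tokens.
    """
    return [(m.start(), m.end()) for m in re.finditer(r"\S+", text)]


def bio_to_sentence_spans(offsets: List[Span], tags: List[str]) -> List[Span]:
    """Group per-token BIO tags into sentence-level character spans."""
    if len(offsets) != len(tags):
        raise ValueError("offsets and tags must have the same length")

    spans: List[Span] = []
    start = end = None
    for (tok_start, tok_end), tag in zip(offsets, tags):
        if tag == "B":
            # A new sentence begins; close the previous one if still open.
            if start is not None:
                spans.append((start, end))
            start, end = tok_start, tok_end
        elif tag == "I":
            # Assumption: an I tag with no open sentence starts a new one.
            if start is None:
                start = tok_start
            end = tok_end
        elif tag == "O":
            # Token is outside any sentence; close the open sentence.
            if start is not None:
                spans.append((start, end))
                start = end = None
        else:
            raise ValueError(f"unexpected tag: {tag!r}")
    if start is not None:
        spans.append((start, end))
    return spans


# Hand-written tags for illustration; in practice they come from the model.
text = "Pt stable. ---- Denies pain."
tags = ["B", "I", "O", "B", "I"]
spans = bio_to_sentence_spans(whitespace_offsets(text), tags)
sentences = [text[s:e] for s, e in spans]
print(sentences)  # ['Pt stable.', 'Denies pain.']
```

Working with character offsets rather than joined token strings keeps the original spacing of the note intact, which matters when segmented sentences need to be mapped back to the source text.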
|
| 32 |
+
|
| 33 |
+
|
| 34 |
+
- **Model type:** token classification model
|
| 35 |
+
- **Language(s) (NLP):** en
|
| 36 |
+
- **Parent Model:** [microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract-fulltext](https://huggingface.co/microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract-fulltext)
|
| 37 |
+
- **Resources for more information:** the paper listed under Citation and the GitHub repository below
|
| 38 |
+
[GitHub Repo](https://github.com/dongfang91/sentence_segmenter/tree/main/baseline)
|
| 39 |
+
|
| 40 |
+
|
| 41 |
+
# Citation
|
| 42 |
+
|
| 43 |
+
<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
|
| 44 |
+
Dongfang Xu, Davy Weissenbacher, Karen O’Connor, Siddharth Rawal, and Graciela Gonzalez Hernandez. 2024. [Automatic sentence segmentation of clinical record narratives in real-world data](https://aclanthology.org/2024.emnlp-main.1156/). In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 20780–20793, Miami, Florida, USA. Association for Computational Linguistics.
|