dongfangxu committed
Commit 89ef54c · verified · 1 Parent(s): 4d9bf68

Update README.md

  1. README.md +27 -1
README.md CHANGED
@@ -13,6 +13,32 @@ tags:
  - Segmentation
  ---
 
- # Model Card for {{ SentenceSegmenter-MIMIC | default("Model ID", true) }}
+ # Model Card for SentenceSegmenter-MIMIC
 
+ <!-- Provide a quick summary of what the model is/does. [Optional] -->
+ This model performs sentence segmentation of MIMIC-III notes. It takes clinical text as input and predicts a BIO tag for each token, where B marks the beginning of a sentence, I marks a token inside a sentence, and O marks a token outside any sentence. More details are in the paper "Automatic sentence segmentation of clinical record narratives in real-world data". Sample code for using this model is at "https://github.com/dongfang91/sentence_segmenter/tree/main/baseline".
 
+ Our segmentation model is based on "microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract-fulltext" and trained on MIMIC-III notes for a sequence labeling (token classification) task.
+
+
+ # Model Details
+
+ ## Model Description
+
+ <!-- Provide a longer summary of what this model is/does. -->
+ This model performs sentence segmentation of MIMIC-III notes. It takes clinical text as input and predicts a BIO tag for each token, where B marks the beginning of a sentence, I marks a token inside a sentence, and O marks a token outside any sentence. More details are in the paper [Automatic sentence segmentation of clinical record narratives in real-world data](https://aclanthology.org/2024.emnlp-main.1156/). Sample code for using this model is on [GitHub](https://github.com/dongfang91/sentence_segmenter/tree/main/baseline).
+
+ Our segmentation model is based on [microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract-fulltext](https://huggingface.co/microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract-fulltext) and trained on MIMIC-III notes for a sequence labeling (token classification) task.
+
+
+ - **Model type:** token classification model
+ - **Language(s) (NLP):** en
+ - **Parent Model:** [microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract-fulltext](https://huggingface.co/microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract-fulltext)
+ - **Resources for more information:**
+   [GitHub Repo](https://github.com/dongfang91/sentence_segmenter/tree/main/baseline)
+
+
+ # Citation
+
+ <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+ Dongfang Xu, Davy Weissenbacher, Karen O’Connor, Siddharth Rawal, and Graciela Gonzalez Hernandez. 2024. [Automatic sentence segmentation of clinical record narratives in real-world data](https://aclanthology.org/2024.emnlp-main.1156/). In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 20780–20793, Miami, Florida, USA. Association for Computational Linguistics.
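
The BIO scheme described in the model card (B begins a sentence, I continues it, O falls outside any sentence) can be turned back into sentences with a small decoding step. The sketch below is not taken from the linked repository. It assumes the tags arrive as the plain strings `"B"`, `"I"`, and `"O"`, one per token (the model's real label names live in its configuration and may differ), that an O token closes the current sentence and is dropped, and that an I with no preceding B starts a new sentence. The function name `bio_to_sentences` is hypothetical, and merging word pieces back into words is not handled.

```python
from typing import List


def bio_to_sentences(tokens: List[str], tags: List[str]) -> List[List[str]]:
    """Group tokens into sentences from per-token BIO tags.

    Assumed tag strings: "B" starts a sentence, "I" continues it,
    "O" marks a token outside any sentence.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    sentences: List[List[str]] = []
    current: List[str] = []
    for token, tag in zip(tokens, tags):
        if tag == "B":
            # Close the running sentence, if any, and open a new one.
            if current:
                sentences.append(current)
            current = [token]
        elif tag == "I":
            # Assumption: an I with no open sentence starts a new one.
            current.append(token)
        elif tag == "O":
            # Assumption: O tokens are dropped and end the current sentence.
            if current:
                sentences.append(current)
            current = []
        else:
            raise ValueError(f"unexpected tag: {tag!r}")
    if current:
        sentences.append(current)
    return sentences


# Made-up tokens and tags for illustration, not real MIMIC-III text.
example_tokens = ["Pt", "stable", ".", "___", "Denies", "pain", "."]
example_tags = ["B", "I", "I", "O", "B", "I", "I"]
for sentence in bio_to_sentences(example_tokens, example_tags):
    print(" ".join(sentence))
```

With the example tags, the separator token tagged O is discarded and the remaining tokens are grouped into two sentences.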