Commit · 0d64fa1
1 Parent(s): c7998b4
add brackets
README.md
CHANGED
|
@@ -22,9 +22,9 @@ license: mit
|
|
| 22 |
|
| 23 |
# Model Description
|
| 24 |
|
| 25 |
-
* A RoBERTa [Liu et al., 2019](https://arxiv.org/pdf/1907.11692.pdf) model fine-tuned for de-identification of medical notes.
|
| 26 |
* Sequence Labeling (token classification): The model was trained to predict protected health information (PHI/PII) entities (spans). A list of protected health information categories is given by [HIPAA](https://www.hhs.gov/hipaa/for-professionals/privacy/laws-regulations/index.html).
|
| 27 |
-
* A token can either be classified as non-PHI or as one of the 11 PHI types. Token predictions
|
| 28 |
* The PHI labels that were used for training and other details can be found here: [Annotation Guidelines](https://github.com/obi-ml-public/ehr_deidentification/blob/master/AnnotationGuidelines.md)
|
| 29 |
* More details on how to use this model, the format of data and other useful information is present in the GitHub repo: [Robust DeID](https://github.com/obi-ml-public/ehr_deidentification).
|
| 30 |
|
|
@@ -41,7 +41,7 @@ license: mit
|
|
| 41 |
|
| 42 |
# Dataset
|
| 43 |
|
| 44 |
-
* The I2B2 2014 [Stubbs and Uzuner, 2015](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4978170/) dataset was used to train this model.
|
| 45 |
|
| 46 |
| | I2B2 | | I2B2 | |
|
| 47 |
| --------- | --------------------- | ---------- | -------------------- | ---------- |
|
|
@@ -81,3 +81,7 @@ license: mit
|
|
| 81 |
|
| 82 |
|
| 83 |
## Results
|
| 22 |
|
| 23 |
# Model Description
|
| 24 |
|
| 25 |
+
* A RoBERTa [[Liu et al., 2019]](https://arxiv.org/pdf/1907.11692.pdf) model fine-tuned for de-identification of medical notes.
|
| 26 |
* Sequence Labeling (token classification): The model was trained to predict protected health information (PHI/PII) entities (spans). A list of protected health information categories is given by [HIPAA](https://www.hhs.gov/hipaa/for-professionals/privacy/laws-regulations/index.html).
|
| 27 |
+
* A token can either be classified as non-PHI or as one of the 11 PHI types. Token predictions are aggregated to spans by making use of BILOU tagging.
|
| 28 |
* The PHI labels that were used for training and other details can be found here: [Annotation Guidelines](https://github.com/obi-ml-public/ehr_deidentification/blob/master/AnnotationGuidelines.md)
|
| 29 |
* More details on how to use this model, the format of data and other useful information is present in the GitHub repo: [Robust DeID](https://github.com/obi-ml-public/ehr_deidentification).
|
| 30 |
|
|
|
|
| 41 |
|
| 42 |
# Dataset
|
| 43 |
|
| 44 |
+
* The I2B2 2014 [[Stubbs and Uzuner, 2015]](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4978170/) dataset was used to train this model.
|
| 45 |
|
| 46 |
| | I2B2 | | I2B2 | |
|
| 47 |
| --------- | --------------------- | ---------- | -------------------- | ---------- |
|
|
|
|
| 81 |
|
| 82 |
|
| 83 |
## Results
|
| 84 |
+
|
| 85 |
+
# Questions?
|
| 86 |
+
|
| 87 |
+
Post a GitHub issue on the repo: [Robust DeID](https://github.com/obi-ml-public/ehr_deidentification).
|
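
The updated Model Description says token predictions are aggregated to spans using BILOU tagging. As a rough illustration of what that aggregation could look like, here is a minimal sketch. It is not the repository's implementation: the function name `bilou_to_spans`, the label names in the usage example, and the strict handling of malformed sequences (an unfinished span is simply dropped) are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple


def bilou_to_spans(tags: List[str]) -> List[Tuple[int, int, str]]:
    """Collapse per-token BILOU tags into (start, end_exclusive, label) spans.

    Hypothetical helper: tags look like "B-X", "I-X", "L-X", "U-X" or "O".
    """
    spans: List[Tuple[int, int, str]] = []
    start: Optional[int] = None  # index where the currently open span began
    label: Optional[str] = None  # label of the currently open span

    for i, tag in enumerate(tags):
        prefix, _, kind = tag.partition("-")
        if prefix == "U":
            # Unit-length span: a single token is a complete entity.
            spans.append((i, i + 1, kind))
            start, label = None, None
        elif prefix == "B":
            # Begin a new multi-token span.
            start, label = i, kind
        elif prefix == "I" and start is not None and kind == label:
            # Inside the open span: nothing to emit yet.
            continue
        elif prefix == "L" and start is not None and kind == label:
            # Last token of the open span: emit it.
            spans.append((start, i + 1, kind))
            start, label = None, None
        else:
            # "O" or an inconsistent tag: discard any unfinished span
            # (an assumption; other decoders may repair instead).
            start, label = None, None
    return spans


# Usage with illustrative (assumed) label names:
example_tags = ["O", "B-PATIENT", "L-PATIENT", "O", "U-DATE",
                "B-HOSP", "I-HOSP", "L-HOSP"]
example_spans = bilou_to_spans(example_tags)
print(example_spans)
```

Token indices here refer to positions in the tag list; mapping them back to character offsets in the note would depend on the tokenizer and is left out of this sketch.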