obi/deid_roberta_i2b2

Token Classification

deidentification


prajwal967 committed on Feb 16, 2022

Commit 0d64fa1 · 1 Parent(s): c7998b4

add brackets

Files changed (1)

README.md +7 -3


@@ -22,9 +22,9 @@ license: mit
 # Model Description
-* A RoBERTa [Liu et al., 2019](https://arxiv.org/pdf/1907.11692.pdf) model fine-tuned for de-identification of medical notes.
+* A RoBERTa [[Liu et al., 2019]](https://arxiv.org/pdf/1907.11692.pdf) model fine-tuned for de-identification of medical notes.
 * Sequence Labeling (token classification): The model was trained to predict protected health information (PHI/PII) entities (spans). A list of protected health information categories is given by [HIPAA](https://www.hhs.gov/hipaa/for-professionals/privacy/laws-regulations/index.html).
-* A token can either be classified as non-PHI or as one of the 11 PHI types. Token predictions can be aggregated to span (e.g., making use of BILOU tagging).
+* A token can either be classified as non-PHI or as one of the 11 PHI types. Token predictions are aggregated to spans by making use of BILOU tagging.
 * The PHI labels that were used for training and other details can be found here: [Annotation Guidelines](https://github.com/obi-ml-public/ehr_deidentification/blob/master/AnnotationGuidelines.md)
 * More details on how to use this model, the format of data and other useful information is present in the GitHub repo: [Robust DeID](https://github.com/obi-ml-public/ehr_deidentification).
@@ -41,7 +41,7 @@ license: mit
 # Dataset
-* The I2B2 2014 [Stubbs and Uzuner, 2015](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4978170/) dataset was used to train this model.
+* The I2B2 2014 [[Stubbs and Uzuner, 2015]](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4978170/) dataset was used to train this model.
 |           | I2B2                  |            |  I2B2                |            |
 | --------- | --------------------- | ---------- | -------------------- | ---------- |
@@ -81,3 +81,7 @@ license: mit
 ## Results
+# Questions?
+Post a Github issue on the repo: [Robust DeID](https://github.com/obi-ml-public/ehr_deidentification).
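
The revised model description says that token predictions are aggregated to spans by making use of BILOU tagging. As a rough illustration of what that aggregation can look like, here is a minimal sketch in plain Python. It is not the code from the Robust DeID repo: the function name `bilou_to_spans`, the tag strings (`B-PATIENT`, `U-AGE`, and so on), and the choice to silently drop ill-formed tag sequences are all assumptions made for this example.

```python
from typing import List, Optional, Tuple

# (start token index, end token index inclusive, PHI type)
Span = Tuple[int, int, str]


def bilou_to_spans(tags: List[str]) -> List[Span]:
    """Aggregate per-token BILOU tags into spans (hypothetical helper).

    B = beginning, I = inside, L = last, O = outside, U = unit (single token).
    Assumption for this sketch: a span that is opened but never closed by a
    matching L tag is dropped rather than repaired.
    """
    spans: List[Span] = []
    start: Optional[int] = None
    current: Optional[str] = None
    for i, tag in enumerate(tags):
        prefix, _, label = tag.partition("-")
        if prefix == "U":
            # Single-token entity; any unfinished span is discarded.
            spans.append((i, i, label))
            start, current = None, None
        elif prefix == "B":
            # Open a new multi-token span.
            start, current = i, label
        elif prefix == "I" and current == label:
            # Still inside the open span; nothing to record yet.
            continue
        elif prefix == "L" and current == label and start is not None:
            # Close the open span.
            spans.append((start, i, label))
            start, current = None, None
        else:
            # "O", or a tag that does not continue the open span.
            start, current = None, None
    return spans


if __name__ == "__main__":
    example = ["O", "B-PATIENT", "L-PATIENT", "O", "U-AGE"]
    print(bilou_to_spans(example))
```

The returned indices are token positions, so mapping them back to character offsets in a note would still need the tokenizer's offset information, which this sketch leaves out.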