HCSCRheuma/Occupations

HCSCRheuma committed on Aug 1, 2023

Commit 4c2b47c, 1 parent: f82a166

Update README.md


README.md +227 -1


@@ -2,4 +2,230 @@
 license: cc-by-4.0
 language:
 - es
----

+pipeline_tag: token-classification
+---
+# Model Card for HCSCRheuma/Occupations
+<!-- Provide a quick summary of what the model is/does. -->
+This model performs named entity recognition (NER) of occupation mentions in Spanish clinical notes and identifies to whom each occupation belongs.
+## Model Details
+<style type="text/css">
+.tg  {border-collapse:collapse;border-spacing:0;}
+.tg td{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px;
+  overflow:hidden;padding:10px 5px;word-break:normal;}
+.tg th{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px;
+  font-weight:normal;overflow:hidden;padding:10px 5px;word-break:normal;}
+.tg .tg-c3ow{border-color:inherit;text-align:center;vertical-align:top}
+</style>
+<table class="tg">
+<thead>
+  <tr>
+    <th class="tg-c3ow">PLM Model</th>
+    <th class="tg-c3ow">Learning<br>rate</th>
+    <th class="tg-c3ow">Batch size</th>
+    <th class="tg-c3ow">Epochs</th>
+    <th class="tg-c3ow">Max<br>length</th>
+    <th class="tg-c3ow">Optimizer</th>
+    <th class="tg-c3ow">Max clip<br>grad norm</th>
+    <th class="tg-c3ow">Epsilon</th>
+  </tr>
+</thead>
+<tbody>
+  <tr>
+    <td class="tg-c3ow">PlanTL-GOB-ES/<br>roberta-base-biomedical-es<br></td>
+    <td class="tg-c3ow">2e-05</td>
+    <td class="tg-c3ow">8</td>
+    <td class="tg-c3ow">10</td>
+    <td class="tg-c3ow">510</td>
+    <td class="tg-c3ow">AdamW</td>
+    <td class="tg-c3ow">1</td>
+    <td class="tg-c3ow">1e-08</td>
+  </tr>
+</tbody>
+</table>
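The table above lists the fine-tuning hyperparameters but not the training code. A minimal sketch of how those values could map onto a plain PyTorch training step follows. The names `FineTuneConfig`, `build_optimizer` and `training_step` are hypothetical, and the card does not say whether the authors used a hand-written loop like this or a higher-level trainer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    """Values copied from the hyperparameter table in this card."""

    base_model: str = "PlanTL-GOB-ES/roberta-base-biomedical-es"
    learning_rate: float = 2e-5
    batch_size: int = 8
    epochs: int = 10
    max_length: int = 510
    adam_epsilon: float = 1e-8
    max_grad_norm: float = 1.0


def build_optimizer(model, cfg: FineTuneConfig):
    """Create AdamW with the learning rate and epsilon from the table."""
    import torch  # lazy import: the config above is usable without PyTorch

    return torch.optim.AdamW(
        model.parameters(), lr=cfg.learning_rate, eps=cfg.adam_epsilon
    )


def training_step(model, batch, optimizer, cfg: FineTuneConfig) -> float:
    """Run one update with gradient-norm clipping.

    Assumes `model` is a Hugging Face token-classification model and
    `batch` is a dict of tensors that includes `labels`.
    """
    import torch

    model.train()
    optimizer.zero_grad()
    loss = model(**batch).loss  # the model returns a loss when labels are passed
    loss.backward()
    # Rescale gradients so their global norm does not exceed max_grad_norm.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=cfg.max_grad_norm)
    optimizer.step()
    return loss.item()
```

Under these assumptions, a full run would call `training_step` on batches of `cfg.batch_size` examples for `cfg.epochs` passes over the training split, with inputs truncated to `cfg.max_length` tokens.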
+### Model Description
+The PlanTL-GOB-ES/roberta-base-biomedical-es model was fine-tuned on the MEDDOPROF corpus (Lima-López, S., Farré-Maduell, E., Miranda-Escalada, A., Briva-Iglesias, V., & Krallinger, M. (2022). MEDDOPROF corpus: complete gold standard annotations for occupation detection in medical documents in Spanish [Data set]. Zenodo. https://doi.org/10.5281/zenodo.7116201).
+Two models were built: one for occupation recognition and one to detect to whom the profession belongs.
+More details can be found in the MEDDOPROF shared task overview:
+Lima-López, S., Farré-Maduell, E., Miranda-Escalada, A., Briva-Iglesias, V., & Krallinger, M. (2021). NLP applied to occupational health: MEDDOPROF shared task at IberLEF 2021 on automatic recognition, classification and normalization of professions and occupations from medical texts. Procesamiento del Lenguaje Natural, 67, 243-256.
+- **Developed by:** Alfredo Madrid
+- **Language(s) (NLP):** Spanish
+- **License:** CC BY 4.0
+- **Finetuned from model [optional]:** PlanTL-GOB-ES/roberta-base-biomedical-es
+### Model Sources
+<!-- Provide the basic links for the model. -->
+- **Repository:** https://huggingface.co/HCSCRheuma/Occupations
+- **Paper [optional]:** Madrid García, A. (2023). Recognition of professions in medical documentation.
+## Uses
+<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+[More Information Needed]
+### Downstream Use [optional]
+<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+[More Information Needed]
+### Out-of-Scope Use
+<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+[More Information Needed]
+## Bias, Risks, and Limitations
+<!-- This section is meant to convey both technical and sociotechnical limitations. -->
+[More Information Needed]
+### Recommendations
+<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+## How to Get Started with the Model
+Use the code below to get started with the model.
+[More Information Needed]
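The getting-started section is still a placeholder in this commit. A minimal sketch of how the model might be loaded with the Hugging Face `transformers` token-classification pipeline follows. It assumes that the repository `HCSCRheuma/Occupations` hosts weights and a fast tokenizer in the standard `transformers` layout. The card does not state which of the two models is published there or what the label names are, so the helper names and the example sentence are hypothetical.

```python
from typing import Any

MODEL_ID = "HCSCRheuma/Occupations"  # repository named in this card


def load_ner_pipeline(model_id: str = MODEL_ID):
    """Build a token-classification pipeline (downloads weights on first use)."""
    # Imported lazily so format_entities works without transformers installed.
    from transformers import pipeline

    # aggregation_strategy="simple" merges sub-word pieces into entity spans.
    return pipeline(
        "token-classification", model=model_id, aggregation_strategy="simple"
    )


def format_entities(
    text: str, entities: list[dict[str, Any]]
) -> list[tuple[str, str, float]]:
    """Turn aggregated pipeline output into (surface text, label, score) tuples."""
    return [
        (text[ent["start"]:ent["end"]], ent["entity_group"], round(float(ent["score"]), 3))
        for ent in entities
    ]


def main() -> None:
    """Tag one sentence and print the detected spans (requires network access)."""
    ner = load_ner_pipeline()
    sentence = "Paciente de 54 años, trabaja como albañil."  # hypothetical example
    for span, label, score in format_entities(sentence, ner(sentence)):
        print(f"{label}\t{score}\t{span}")
```

Calling `main()` downloads the weights and prints one line per detected span. The spans are sliced from the input by character offsets rather than taken from the pipeline's `word` field, so the printed text matches the original note exactly.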
+## Training Details
+### Training Data
+<!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+[More Information Needed]
+### Training Procedure
+<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+#### Preprocessing [optional]
+[More Information Needed]
+#### Training Hyperparameters
+- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+#### Speeds, Sizes, Times [optional]
+<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+[More Information Needed]
+## Evaluation
+<!-- This section describes the evaluation protocols and provides the results. -->
+### Testing Data, Factors & Metrics
+#### Testing Data
+<!-- This should link to a Data Card if possible. -->
+[More Information Needed]
+#### Factors
+<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+[More Information Needed]
+#### Metrics
+<!-- These are the evaluation metrics being used, ideally with a description of why. -->
+[More Information Needed]
+### Results
+[More Information Needed]
+#### Summary
+## Model Examination [optional]
+<!-- Relevant interpretability work for the model goes here -->
+[More Information Needed]
+## Environmental Impact
+<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
+Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+- **Hardware Type:** [More Information Needed]
+- **Hours used:** [More Information Needed]
+- **Cloud Provider:** [More Information Needed]
+- **Compute Region:** [More Information Needed]
+- **Carbon Emitted:** [More Information Needed]
+## Technical Specifications [optional]
+### Model Architecture and Objective
+[More Information Needed]
+### Compute Infrastructure
+[More Information Needed]
+#### Hardware
+[More Information Needed]
+#### Software
+[More Information Needed]
+## Citation [optional]
+<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+**BibTeX:**
+[More Information Needed]
+**APA:**
+[More Information Needed]
+## Glossary [optional]
+<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
+[More Information Needed]
+## More Information [optional]
+[More Information Needed]
+## Model Card Authors [optional]
+[More Information Needed]
+## Model Card Contact
+[More Information Needed]