| | --- |
| | license: apache-2.0 |
| | language: |
| | - en |
| | tags: |
| | - Token Classification |
| | widget: |
| | - text: "Monitored Natural Attenuation and, if necessary as a contingency, In Situ Chemical Oxidation to address the injection of a strong chemical oxidant to chemically treat the before the contingency can be implemented at the spill site." |
| | example_title: "example 1" |
| | - text: "Site was identified as a potential source of groundwater contamination after the City performed Assessments were investigated further for potential contamination." |
| | example_title: "example 2" |
| | - text: "Chromium releases from the UST is probably a major contributor to groundwater contamination in this area." |
| | example_title: "example 3" |
| | --- |
| | ## About the Model |
| | An environmental Named Entity Recognition (NER) model, trained on a USEPA dataset, that recognizes environmental due diligence entities (7 entity types) in a given text corpus (remediation reports, records of decision, 5-year records, etc.). The model is built on top of distilbert-base-uncased. |
| |
|
| | - Dataset: https://data.mendeley.com/datasets/tx6vmd4g9p/4 |
| | - Dataset Research Paper: https://doi.org/10.1016/j.dib.2022.108579 |
| |
|
| | ## Usage |
| | The easiest way to try the model is through the Hugging Face Inference API; the second method is the pipeline object offered by the transformers library. |
| | ```python |
| | |
| | # Use a pipeline as a high-level helper |
| | from transformers import pipeline |
| | pipe = pipeline("token-classification", model="d4data/EnviDueDiligence_NER") |
| | |
| | # Load model directly |
| | from transformers import AutoTokenizer, AutoModelForTokenClassification |
| | tokenizer = AutoTokenizer.from_pretrained("d4data/EnviDueDiligence_NER") |
| | model = AutoModelForTokenClassification.from_pretrained("d4data/EnviDueDiligence_NER") |
| | |
| | ``` |
| |
|
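Once the pipeline is loaded, a typical next step is to run it on a sentence and keep only the confident predictions. The sketch below is one possible way to do that, not the authors' reference code. It assumes the pipeline is created with `aggregation_strategy="simple"`, so that each prediction is a dict with `entity_group`, `score`, and `word` keys. The helper names `filter_entities` and `run_example` and the `0.5` threshold are hypothetical choices made for illustration.

```python
from typing import Dict, List, Tuple


def filter_entities(predictions: List[Dict], min_score: float = 0.5) -> List[Tuple[str, str]]:
    """Return (label, text) pairs whose confidence is at least min_score.

    `predictions` is expected to look like the output of a transformers
    token-classification pipeline using aggregation_strategy="simple".
    """
    return [
        (p["entity_group"], p["word"])
        for p in predictions
        if p["score"] >= min_score
    ]


def run_example(text: str, min_score: float = 0.5) -> List[Tuple[str, str]]:
    """Download the model (network access required) and tag one piece of text."""
    # Imported lazily so that filter_entities stays usable without transformers.
    from transformers import pipeline

    pipe = pipeline(
        "token-classification",
        model="d4data/EnviDueDiligence_NER",
        aggregation_strategy="simple",  # merge word pieces into whole entity spans
    )
    return filter_entities(pipe(text), min_score=min_score)
```

Calling `run_example("Chromium releases from the UST is probably a major contributor to groundwater contamination in this area.")` would return a list of `(label, text)` pairs; the label names come from the model's own configuration and are not listed here.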
| | ## Author |
| | This model is part of the research topic "Environmental Due Diligence" conducted by Deepak John Reji and Afreen Aman. If you use this work (code, model, or dataset), please cite: |
| | > Aman, A. and Reji, D.J., 2022. EnvBert: An NLP model for Environmental Due Diligence data classification. Software Impacts, 14, p.100427. |
| |
|
| | ## You can support me here :) |
| | <a href="https://www.buymeacoffee.com/deepakjohnreji" target="_blank"><img src="https://cdn.buymeacoffee.com/buttons/v2/default-yellow.png" alt="Buy Me A Coffee" style="height: 60px !important;width: 217px !important;" ></a> |
| |
|
| |
|