---
title: README
emoji: 🐠
colorFrom: pink
colorTo: purple
sdk: static
pinned: false
---

# OTAR3088 NLP-model collection

## Work Package 1 - Knowledge Extraction (NLP)

_**Background**_

Within this working group of the wider _OTAR3088, 'Automating Knowledge Management'_ project, we aim to modernise and extend the current named entity recognition workflows of Europe PMC / Open Targets to cover an array of entity types relevant to drug discovery (such as variants, biomarkers, tissues/cell types, adverse events, and assay conditions). These new entities will provide higher confidence in the relevance of a target-disease association.

Since NLP models are constantly updated and fine-tuned, we have created a modular, flexible framework that facilitates the creation of new NLP models.

_**OTAR3088 HuggingFace**_

This organisation space documents the project's data development and model generation. Data is grouped by the broad entity type the group is studying at a given time, and the sources of the data are described in the data cards. Output models are also shared here.

_**Learn more about our project and resources:**_

* [OTAR3088 - The project](https://home.opentargets.org/OTAR3088)
* [Our flexible NLP-model production pipeline](https://github.com/ML4LitS/OTAR3088)
* [Published Papers](https://www.tandfonline.com/doi/full/10.1080/17460441.2025.2490835)