---
tags:
- biogpt
- boolean-query
- biomedical
- systematic-review
- pubmed
license: unknown
model-index:
- name: BioGPT-BQF-TMK-Small
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: CLEF TAR
      type: biomedical
    metrics:
    - name: Precision @100
      type: precision
      value: 0.1340
    - name: Recall @1000
      type: recall
      value: 0.2125
---

# **BioGPT-BQF-TMK-Small**

A fine-tuned **BioGPT** model for **Boolean query formalization in biomedical systematic reviews**, incorporating **Titles, MeSH Terms, and Keywords** to improve **PubMed search query generation**.

## **Model Overview**

- **Base Model**: [BioGPT](https://huggingface.co/microsoft/BioGPT)
- **Fine-tuned on**: Semi-synthetically generated data
- **Task**: Boolean query generation for PubMed searches
- **Inputs**: Research topic title, MeSH terms, and keywords
- **Outputs**: Optimized PubMed Boolean search query

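The inputs and outputs listed above can be wired together with two small helpers. This is a minimal sketch, not part of the released model code: `build_prompt` and `extract_query` are hypothetical names, and the prompt layout (`Title: ..., MeSH: ..., Keywords: ..., Query: `) is assumed from the single usage example in this card rather than from a documented specification.

```python
from typing import Iterable


def build_prompt(title: str, mesh_terms: Iterable[str], keywords: Iterable[str]) -> str:
    """Assemble a prompt in the layout assumed from this card's usage example."""
    mesh = ", ".join(t.strip() for t in mesh_terms if t.strip())
    kws = ", ".join(k.strip() for k in keywords if k.strip())
    # The trailing "Query: " cues the model to continue with the Boolean query.
    return f"Title: {title.strip()}, MeSH: {mesh}, Keywords: {kws}, Query: "


def extract_query(generated_text: str) -> str:
    """Return the text after the first "Query:" marker, or the full text if absent."""
    _, marker, tail = generated_text.partition("Query:")
    return tail.strip() if marker else generated_text.strip()


prompt = build_prompt(
    "Heterogeneity in Lung Cancer",
    ["Biomarkers, Tumor", "Genetic Heterogeneity"],
    ["Biomarkers"],
)
print(prompt)
```

Because the decoded output of a causal language model usually repeats the prompt, passing it through `extract_query` should leave only the generated Boolean query. If the tokenizer alters the spacing around the marker, the helper falls back to returning the full text.
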
## **Usage**

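```python
from transformers import BioGptForCausalLM, BioGptTokenizer

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub.
model_name = "AI4BSLR/BioGPT-BQF-TMK-Small"
model = BioGptForCausalLM.from_pretrained(model_name)
tokenizer = BioGptTokenizer.from_pretrained(model_name)

# The prompt lists the title, MeSH terms, and keywords, and ends with "Query: "
# so that the model continues with the Boolean query.
input_text = "Title: Heterogeneity in Lung Cancer, MeSH: Biomarkers, Tumor, Genetic Heterogeneity, Keywords: Biomarkers, Query: "
inputs = tokenizer(input_text, return_tensors="pt")

# max_new_tokens is an assumed value, not taken from the original card; without a
# limit, generate() may fall back to a short default length. Adjust as needed.
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))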