---
tags:
- biogpt
- boolean-query
- biomedical
- systematic-review
- pubmed
license: unknown
model-index:
- name: BioGPT-BQF-TMK-Small
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: CLEF TAR
      type: biomedical
    metrics:
      - name: Precision @100
        type: precision
        value: 0.1340
      - name: Recall @1000
        type: recall
        value: 0.2125
---

# **BioGPT-BQF-TMK-Small**
A fine-tuned **BioGPT** model for **Boolean query formalization in biomedical systematic reviews**, incorporating **Titles, MeSH Terms, and Keywords** to improve **PubMed search query generation**.

## **Model Overview**
- **Base Model**: [BioGPT](https://huggingface.co/microsoft/BioGPT)
- **Fine-tuned on**: Semi-synthetically generated data
- **Task**: Boolean Query Generation for PubMed searches
- **Inputs**: Research topic title, MeSH terms, and Keywords
- **Outputs**: Optimized PubMed Boolean search query
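
The three inputs are passed to the model as a single prompt string. The helper below is a minimal sketch of how such a prompt could be assembled; it assumes the `Title: ..., MeSH: ..., Keywords: ..., Query: ` layout seen in the Usage section, and the function name `build_prompt` is hypothetical rather than part of the released model code.

```python
from typing import Sequence


def build_prompt(title: str, mesh_terms: Sequence[str], keywords: Sequence[str]) -> str:
    """Assemble a prompt string (hypothetical helper, not part of the model repo).

    Assumes the layout of the Usage example: comma-separated fields followed
    by an empty "Query: " slot that the model is expected to complete.
    """
    mesh = ", ".join(mesh_terms)
    kws = ", ".join(keywords)
    return f"Title: {title}, MeSH: {mesh}, Keywords: {kws}, Query: "


prompt = build_prompt(
    "Heterogeneity in Lung Cancer",
    ["Biomarkers, Tumor", "Genetic Heterogeneity"],
    ["Biomarkers"],
)
print(prompt)
```

The resulting string can be passed to the tokenizer in place of a hand-written `input_text`.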

## **Usage**
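```python
from transformers import BioGptForCausalLM, BioGptTokenizer

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub.
model_name = "AI4BSLR/BioGPT-BQF-TMK-Small"
model = BioGptForCausalLM.from_pretrained(model_name)
tokenizer = BioGptTokenizer.from_pretrained(model_name)

# Prompt layout: title, MeSH terms, keywords, then an empty "Query: " slot to complete.
input_text = "Title: Heterogeneity in Lung Cancer, MeSH: Biomarkers, Tumor, Genetic Heterogeneity, Keywords: Biomarkers, Query: "
inputs = tokenizer(input_text, return_tensors="pt")

# max_new_tokens=128 is an assumed budget, not a documented setting; the default
# generation length can be too short to fit a full Boolean query after the prompt.
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```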