You need to agree to share your contact information to access this model

This repository is publicly accessible, but you have to accept the conditions to access its files and content.

Access is granted for non-commercial research purposes only. Users must agree to cite the IMB dataset paper in any publication or derived work using this model.

Log in or Sign Up to review the conditions and access this model content.

🧠 Gemma-2-9B-it β€” IMB Orthopedics Fine-Tuned Model

This model is a fine-tuned version of unsloth/gemma-2-9b-it, optimized for Italian medical question answering, with a specific focus on orthopedics.

The fine-tuning was performed using a subset of the IMB (Italian Medical Benchmark) dataset, specifically:

  • Orthopedics category only
  • ~10,000 training samples

The training was performed using the Unsloth library with LoRA fine-tuning, and the adapter weights were later merged into the base model to provide a standalone checkpoint.

This model relies on data from the IMB dataset. If you use this model in research or applications, you must cite the IMB paper (see Citation section below).


πŸ“š Training Dataset β€” IMB (Italian Medical Benchmark)

IMB is an Italian benchmark for medical question answering, designed to evaluate and improve LLM performance in clinical-domain Italian language understanding and reasoning.

The full dataset includes:

  • IMB-QA: 782,644 doctor-patient conversations collected from Italian online medical forums
  • IMB-MCQA: 25,862 multiple-choice questions derived from Italian medical specialization exams

⚠️ Important:
This model was trained only on the Orthopedics subset (~10k samples) of IMB, not on the full dataset.

Dataset repository:
πŸ‘‰ https://github.com/PRAISELab-PicusLab/IMB


πŸ§ͺ Usage Example

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("praiselab-picuslab/gemma-2-9b-it-FT-IMB")
tokenizer = AutoTokenizer.from_pretrained("praiselab-picuslab/gemma-2-9b-it-FT-IMB")

prompt = "Quali sono i sintomi iniziali dell'artrosi del ginocchio?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=150)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))

⚠️ Usage Restrictions

  • Allowed use: Non-commercial research only
  • Redistribution: Not allowed without explicit authorization
  • Mandatory citation: The IMB dataset paper must be cited in any publication or derived work

Access to this model may be revoked in case of license violation.


πŸ“„ Citation

If you use this model, the IMB dataset, or derived outputs in research, please cite:

@inproceedings{DBLP:conf/clic-it/RomanoRBPM25,
  author       = {Antonio Romano and
                  Giuseppe Riccio and
                  Mariano Barone and
                  Marco Postiglione and
                  Vincenzo Moscato},
  editor       = {Cristina Bosco and
                  Elisabetta Jezek and
                  Marco Polignano and
                  Manuela Sanguinetti},
  title        = {{IMB:} An Italian Medical Benchmark for Question Answering},
  booktitle    = {Proceedings of the Eleventh Italian Conference on Computational Linguistics
                  (CLiC-it 2025), Cagliari, Italy, September 24-26, 2025},
  series       = {{CEUR} Workshop Proceedings},
  volume       = {4112},
  publisher    = {CEUR-WS.org},
  year         = {2025},
  url          = {https://ceur-ws.org/Vol-4112/92_main_long.pdf}
}

πŸ— Training Details

  • Base model: unsloth/gemma-2-9b-it
  • Fine-tuning method: LoRA (Unsloth)
  • Adapter merging: Yes (Full merged model)
  • Language: Italian
  • Domain: Medical β€” Orthopedics
  • Training size: ~10K samples

πŸ“œ License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License.

CC BY-NC-ND 4.0


🀝 Acknowledgements

πŸ‘¨β€πŸ’» This project was developed by Antonio Romano, Giuseppe Riccio, Mariano Barone, Marco Postiglione, and Vincenzo Moscato at University of Naples, Federico II

Downloads last month
-
Safetensors
Model size
9B params
Tensor type
F16
Β·
Inference Providers NEW
This model isn't deployed by any Inference Provider. πŸ™‹ Ask for provider support

Model tree for praiselab-picuslab/gemma-2-9b-it-FT-IMB

Adapter
(222)
this model

Dataset used to train praiselab-picuslab/gemma-2-9b-it-FT-IMB