
Model Card: Gemma Sprint 2024 - Brain Neural Activation Simulation

Model Overview

This model is a fine-tuned version of google/gemma-2-2b-it, optimized to simulate brain neural activations and provide answers to neuroscience-related questions. The model was fine-tuned on the PubMedQA dataset using LoRA (Low-Rank Adaptation) to improve performance on brain-related question-answering tasks. This project focuses on simulating brain circuit activation patterns and generating relevant answers in the domain of neuroscience.

Model Architecture

  • Base Model: google/gemma-2-2b-it
  • Fine-tuning Method: LoRA (Low-Rank Adaptation)

Configurations:

  • LoRA Rank: 16
  • LoRA Alpha: 32
  • Dropout: 0.1
  • Target Modules: q_proj, k_proj, v_proj, o_proj
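These hyperparameters map directly onto a PEFT `LoraConfig`. A minimal sketch (the `task_type` value is an assumption for causal-LM fine-tuning; everything else comes from the list above):

```python
from peft import LoraConfig

# LoRA configuration matching the values listed above
lora_config = LoraConfig(
    r=16,                     # LoRA rank
    lora_alpha=32,            # scaling factor
    lora_dropout=0.1,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",    # assumption: causal-LM fine-tuning
)
```

This config is then passed to `peft.get_peft_model(base_model, lora_config)` to wrap the base Gemma model before training.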

Dataset

The model was trained on the PubMedQA dataset, which contains biomedical questions and detailed long-form answers based on PubMed abstracts. PubMedQA is specifically designed for building models that can handle complex, long-answer question-answering tasks in the biomedical domain, making it suitable for neuroscience-related queries as well.

Data Preprocessing

For training, each question and its corresponding long answer from the PubMedQA dataset were preprocessed into input and label formats. The inputs were tokenized with padding and truncation at 512 tokens to fit the model's requirements.
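A minimal sketch of this preprocessing step. The PubMedQA field names (`question`, `long_answer`) and the tokenizer-as-argument structure are assumptions; adjust them to the split and tooling actually used:

```python
MAX_LEN = 512  # padding/truncation length used in training


def preprocess(example, tokenizer):
    """Turn one PubMedQA record into model inputs and labels.

    `example` is a dict with "question" and "long_answer" keys
    (field names assumed from the PubMedQA release); `tokenizer`
    is any Hugging Face-style callable tokenizer.
    """
    inputs = tokenizer(
        example["question"],
        padding="max_length",
        truncation=True,
        max_length=MAX_LEN,
    )
    labels = tokenizer(
        example["long_answer"],
        padding="max_length",
        truncation=True,
        max_length=MAX_LEN,
    )
    # The tokenized long answer serves as the training target.
    inputs["labels"] = labels["input_ids"]
    return inputs


# Usage (requires `transformers` and access to the gated Gemma checkpoint):
# from transformers import AutoTokenizer
# tok = AutoTokenizer.from_pretrained("google/gemma-2-2b-it")
# batch = preprocess({"question": "...", "long_answer": "..."}, tok)
```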

Model Performance

The model's performance was evaluated using BLEU and ROUGE scores:

  • BLEU Score: Measures the n-gram precision of generated answers against reference answers.
  • ROUGE Score: Measures the n-gram overlap (recall-oriented) between generated and reference answers.

These metrics were computed on the PubMedQA test set. Performance on out-of-domain data may vary.
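To illustrate what ROUGE-N measures (in practice a library such as `evaluate` or `rouge_score` would be used rather than this hand-rolled version), here is a dependency-free sketch of n-gram recall:

```python
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def rouge_n_recall(reference, candidate, n=1):
    """ROUGE-N recall: fraction of reference n-grams found in the candidate."""
    ref = Counter(ngrams(reference.split(), n))
    cand = Counter(ngrams(candidate.split(), n))
    overlap = sum(min(count, cand[g]) for g, count in ref.items())
    total = sum(ref.values())
    return overlap / total if total else 0.0


# Two of the three reference unigrams appear in the candidate:
rouge_n_recall("the cat sat", "the cat ran", n=1)  # ≈ 0.667
```

BLEU is the precision-oriented counterpart: it counts candidate n-grams matched in the reference (with a brevity penalty), which is why the two metrics are usually reported together.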

Limitations

  • This model was trained on the PubMedQA dataset, so it may underperform on out-of-domain data.
  • Since no Korean data was used in training, the model may not perform well in Korean question-answering tasks.

License

This model follows the license of google/gemma-2-2b-it. Please refer to the original license for any usage restrictions.

How to Use

A complete, runnable example of loading and querying the fine-tuned Gemma Sprint 2024 model is available in this Kaggle notebook: https://www.kaggle.com/code/calispohwang/gemma-sprint-brain
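A minimal loading sketch, assuming the fine-tuned weights are published as a LoRA adapter (the adapter repo id below is a placeholder) and that you have access to the gated Gemma base checkpoint:

```python
# Requires: torch, transformers, peft, and Hugging Face authentication
# for the gated google/gemma-2-2b-it weights.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "google/gemma-2-2b-it"
adapter_id = "<your-username>/gemma-sprint-brain"  # placeholder adapter repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
model = PeftModel.from_pretrained(base, adapter_id)  # attach the LoRA adapter

prompt = "What role does the hippocampus play in memory consolidation?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

If the adapter was merged into the base weights before upload, `AutoModelForCausalLM.from_pretrained` on the merged repo alone is sufficient and the `peft` step can be skipped.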


Model Details

  • Format: Safetensors
  • Model size: 3B params
  • Tensor type: BF16