---
base_model: unsloth/gemma-3-1b-it
library_name: transformers
tags:
- gemma-3
- fine-tuning
- sft
- unsloth
- academic-title-generation
- lora
- 4bit
- chat-template
model_name: gemma3_1b_title_generator
---
<center>
<h1><b>Gemma 3 — 1B Academic Title Generator</b></h1>
<img src="https://www.geeky-gadgets.com/wp-content/uploads/2025/03/google-gemma-3-advanced-ai-models.webp" width="600"/>
</center>
---
## Overview
**gemma3_1b_title_generator** is a fine-tuned version of `unsloth/gemma-3-1b-it`, optimized specifically for generating **academic paper titles** from scientific abstracts.
Training adapts Gemma-3's chat-format behavior to a single focused task: generating titles. Because of hardware limitations, the model was fine-tuned with a **multi-batch training pipeline**, leveraging Unsloth’s efficient 4-bit loading and LoRA adapters.
The result is a lightweight, fast, and domain-specialized model that produces concise, coherent titles in an academic register.
---
## Dataset & Preprocessing
Training data consists of scientific **abstract → title** pairs.
Because of memory constraints, the dataset was processed in **sequential batches**, each integrated into the model through incremental checkpoints. This incremental batch-training approach was made possible by **Unsloth’s lightweight fine-tuning tools**.
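The card does not include the data-loading code; as a rough sketch, sequential batching with the `datasets` library could look like the following (the file name and shard count are illustrative assumptions, not the actual values used):

```python
from datasets import load_dataset

# Hypothetical file of abstract -> title pairs; the real data source is not listed.
dataset = load_dataset("json", data_files="abstract_title_pairs.json", split="train")

# Split into sequential shards so each training pass fits the available GPU memory.
num_shards = 4  # illustrative
shards = [dataset.shard(num_shards=num_shards, index=i) for i in range(num_shards)]
```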
Each data sample was converted into a **Gemma-3 style chat conversation**, allowing the model to learn the title as the model's response:
```python
def format_dataset_for_chat(example):
    # Wrap each abstract -> title pair in a Gemma-3 chat conversation.
    messages = [
        {"role": "user", "content": "Generate a title for the following abstract:\n" + example["abstract"]},
        {"role": "model", "content": example["title"]}
    ]
    # Render with the Gemma-3 chat template; the leading <bos> is stripped
    # because the tokenizer adds it again at training time.
    example["text"] = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=False
    ).removeprefix("<bos>")
    return example
```
## Chat Format
Gemma-3 uses a structured multi-turn dialog format.
Each training example is converted into a conversation where:
- The **user** provides the abstract.
- The **model** outputs the title.
The structure follows the Gemma-3 chat template:
```
<bos><start_of_turn>user
... user content ...
<end_of_turn>
<start_of_turn>model
... model content ...
<end_of_turn>
```
This formatting is produced automatically by `tokenizer.apply_chat_template()`, as shown in the `format_dataset_for_chat` function above.
## Training Configuration
Fine-tuning was performed using the SFTTrainer from TRL, combined with Unsloth’s
efficient 4-bit loading and LoRA adaptation layers. The training process followed
a multi-batch strategy due to hardware limitations, with incremental checkpoint
loading supported by Unsloth.
### Key Training Settings
- Model: unsloth/gemma-3-1b-it
- Precision: 4-bit (QLoRA)
- Method: Supervised Fine-Tuning (SFT)
- LoRA: Enabled for attention and MLP modules
- Sequence length: 2048 tokens
- Optimizer: AdamW (8-bit)
- Scheduler: cosine
- Strategy: multi-batch training with checkpoint continuation
- Tokenizer: Gemma-3 chat template applied through Unsloth
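The card does not include the full training script; the sketch below shows how these settings typically combine with Unsloth and TRL. The LoRA rank, batch size, learning rate, and the `formatted_dataset` variable are illustrative assumptions, not values taken from the actual run:

```python
from unsloth import FastLanguageModel
from trl import SFTTrainer, SFTConfig

# Load the base model in 4-bit (QLoRA-style) with Unsloth.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/gemma-3-1b-it",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters to the attention and MLP projection modules.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,           # LoRA rank (illustrative)
    lora_alpha=16,  # illustrative
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=formatted_dataset,  # output of format_dataset_for_chat
    args=SFTConfig(
        dataset_text_field="text",
        per_device_train_batch_size=2,   # illustrative
        gradient_accumulation_steps=4,   # illustrative
        learning_rate=2e-4,              # illustrative
        lr_scheduler_type="cosine",
        optim="adamw_8bit",
        output_dir="outputs",
    ),
)
```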
### Response-Only Learning
To ensure the model learns **only the title** (the model output) and does not
memorize the user prompt (the abstract), response-only loss masking was applied:
```python
from unsloth.chat_templates import train_on_responses_only

trainer = train_on_responses_only(
    trainer,
    instruction_part="<start_of_turn>user\n",  # user turn containing the abstract
    response_part="<start_of_turn>model\n",    # model turn containing the title
)
```
This enforces that gradients flow exclusively through the model's output portion
of the chat sequence, improving instruction-following consistency and ensuring
that the LoRA adapters specialize in generating high-quality academic titles
instead of learning or reproducing the user prompt.
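To confirm the mask behaves as intended, one can decode a processed training example and compare inputs against labels (an illustrative check in the style of the Unsloth notebooks, not part of the original card):

```python
sample = trainer.train_dataset[0]

# The input contains the full conversation: user abstract + model title.
print(tokenizer.decode(sample["input_ids"]))

# In the labels, every non-response token is masked with -100, so only the
# title survives; replace the mask so the result can be decoded and inspected.
visible = [tok if tok != -100 else tokenizer.pad_token_id for tok in sample["labels"]]
print(tokenizer.decode(visible))
```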
### Training Behavior
- LoRA significantly reduces VRAM usage while maintaining strong output quality.
- Unsloth manages efficient 4-bit quantization, chat-template formatting, and
checkpoint handling.
- Multi-batch training allows large datasets to be processed even with limited
  hardware resources (see the checkpoint-continuation sketch below).
- Validation steps are used to monitor loss and adjust training dynamics.
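A sketch of how checkpoint continuation might be wired up (the `shards` list comes from the sharding sketch above; `resume_from_checkpoint` is a standard `transformers` Trainer argument, and the loop structure and output paths are illustrative assumptions):

```python
# Fine-tune one shard at a time, resuming from the latest checkpoint
# in output_dir after the first round.
for i, shard in enumerate(shards):
    trainer.train_dataset = shard
    trainer.train(resume_from_checkpoint=(i > 0))  # first shard starts fresh
    trainer.save_model(f"outputs/shard-{i}")       # illustrative output path
```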
## 🚀 Quick Usage Example
Before running inference, make sure all required libraries are installed:
```bash
pip install -q transformers accelerate torch
pip install -q -U bitsandbytes
# Only if your setup or model requires Unsloth for loading:
pip install -q unsloth
```
Below is a clean and ready-to-run example demonstrating how to generate an
academic title using the Gemma-3 chat template:
```python
from transformers import pipeline
import torch

# Load the fine-tuned model as a text-generation pipeline.
pipe = pipeline(
    "text-generation",
    model="beta3/gemma3_1b_title_generator",
    dtype=torch.bfloat16
)

# Example abstract for title generation
abstract = """
Transformer-based architectures have demonstrated strong performance in tasks
involving reasoning, scientific understanding, and text generation. Producing
concise academic titles from long abstracts, however, remains a non-trivial task.
"""

# Construct the Gemma-3 chat-format prompt manually
chat_template_prompt = (
    "<bos>"
    "<start_of_turn>user\n"
    "Generate a simple title for the following abstract:\n"
    f"{abstract}\n"
    "<end_of_turn>\n"
    "<start_of_turn>model\n"
)

# Generate the title
result = pipe(
    chat_template_prompt,
    max_new_tokens=32,       # number of tokens to generate
    do_sample=True,          # enable sampling for more varied outputs
    temperature=0.7,         # controls generation randomness
    top_p=0.9,               # nucleus sampling
    return_full_text=False   # return only the newly generated text
)[0]["generated_text"]

print("Generated title:", result)
```
This example reproduces the Gemma-3 chat format used during fine-tuning and
should yield clean, publication-style academic titles.
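Alternatively, recent versions of `transformers` let the pipeline apply the chat template for you when given a list of messages (a minimal sketch reusing `pipe` and `abstract` from above; the output indexing assumes the chat-style return format of current `transformers`):

```python
messages = [
    {"role": "user",
     "content": "Generate a simple title for the following abstract:\n" + abstract}
]

# The pipeline applies the Gemma-3 chat template internally.
output = pipe(
    messages,
    max_new_tokens=32,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)[0]["generated_text"]

# With chat input, generated_text is the conversation including the reply;
# the model's answer is the content of the last message.
print("Generated title:", output[-1]["content"])
```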
## Capabilities & Limitations
### Capabilities
- Generates concise, publication-ready academic titles from scientific abstracts.
- Learns to identify the core idea of long, complex abstracts.
- Follows structured, instruction-based prompts using the Gemma-3 chat format.
- Efficient inference thanks to 4-bit quantization and LoRA adaptation.
- Generalizes across the scientific domains represented in the training data.
### Limitations
- Output quality depends heavily on the clarity and structure of the abstract; vague inputs may produce generic titles.
- The model does not verify factual accuracy or scientific correctness.
- Performance may vary for highly domain-specific or expert-level fields requiring specialized terminology.
- This model is only **1B parameters**, significantly smaller than larger Gemma or Llama variants, which means it may not always capture deep semantic details or produce titles as accurate as bigger models.
- The model is optimized for academic summarization and may not generalize well to creative or conversational tasks.
## Credits
This project was made possible thanks to several key open-source tools,
frameworks, and community contributors:
- **Unsloth** — for enabling efficient 4-bit training, LoRA integration,
memory-optimized model loading, and the Gemma-3 chat template utilities.
Their tooling was essential for making multi-batch fine-tuning feasible
under limited hardware conditions.
- **Hugging Face TRL** — for providing the SFTTrainer and the
response-only training workflow, allowing the model to focus exclusively
on generating high-quality titles.
- **Google DeepMind** — for releasing the Gemma-3 family of models,
offering a powerful instruction-tuned foundation suitable for scientific
summarization and academic tasks.
- **Hugging Face Transformers / Datasets** — for model loading,
tokenization pipelines, and large-scale dataset management.
- **Google Colab** — for generously providing free access to high-performance
GPUs to the community. Their platform makes it possible for independent
researchers, students, and developers to experiment with advanced
large-language-model training workflows without requiring specialized
hardware.
Special appreciation goes to the broader open-source community for maintaining
the tools, documentation, and shared knowledge that make projects like this
possible.
## License
This model follows the licensing terms of its upstream foundation models and
tooling:
- **Base Model License:** Inherits the license of
`unsloth/gemma-3-1b-it`, which itself is based on Google’s *Gemma 3*
licensing terms.
- **Gemma 3 License:** Usage must comply with the Gemma family license
provided by Google DeepMind. For details, refer to the official documentation
and license terms published by Google.
- **Training Frameworks:**
- Unsloth (training optimizations, LoRA, 4-bit loading)
- Hugging Face TRL (SFTTrainer)
- Hugging Face Transformers & Datasets
All these tools are used under their respective open-source licenses.
**Important:**
This fine-tuned model is provided *as-is* with no additional warranties. Users
are responsible for ensuring compliance with applicable licenses and usage
restrictions when deploying or redistributing the model.
For complete details, please consult:
- Google Gemma License
- Unsloth Documentation & License
- Hugging Face Transformers License
## Intended Use
This model is intended for generating concise academic titles from research
abstracts. It is **not** designed for general conversation, creative writing,
or factual verification.
## Safety
The model may reflect biases present in academic text sources. Outputs should
be reviewed by humans before publication.