---
base_model: unsloth/gemma-3-1b-it
library_name: transformers
tags:
- gemma-3
- fine-tuning
- sft
- unsloth
- academic-title-generation
- lora
- 4bit
- chat-template
model_name: gemma3_1b_title_generator
---
# **Gemma 3 — 1B Academic Title Generator**
---

## Overview

**gemma3_1b_title_generator** is a fine-tuned version of `unsloth/gemma-3-1b-it`, optimized specifically for generating **academic paper titles** from scientific abstracts. The training process adapts Gemma-3's chat-format behavior to highly focused title generation.

Because of hardware limitations, the model was fine-tuned with a **multi-batch training pipeline**, leveraging Unsloth's efficient 4-bit loading and LoRA adapters. The result is a lightweight, fast, domain-specialized model that produces concise, coherent, and academically accurate titles.

---

## Dataset & Preprocessing

Training data consists of scientific **abstract → title** pairs. Because of memory constraints, the dataset was processed in **sequential batches**, each integrated into the model through incremental checkpoints. This batch-training approach was made possible by **Unsloth's lightweight fine-tuning tools**.

Each data sample was converted into a **Gemma-3 style chat conversation**, so the model learns the title as its response:

```python
def format_dataset_for_chat(example):
    messages = [
        {"role": "user", "content": "Generate a title for the following abstract:\n" + example["abstract"]},
        {"role": "model", "content": example["title"]},
    ]
    # Strip the leading <bos> token; the trainer adds it again during tokenization
    example["text"] = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=False,
    ).removeprefix("<bos>")
    return example
```

## Chat Format

Gemma-3 uses a structured multi-turn dialog format. Each training example is converted into a conversation where:

- The **user** provides the abstract.
- The **model** outputs the title.

The structure follows the Gemma-3 chat template:

```
<start_of_turn>user
... user content ...
<end_of_turn>
<start_of_turn>model
... model content ...
<end_of_turn>
```

This formatting is created automatically by `tokenizer.apply_chat_template()`, as shown in the `format_dataset_for_chat` function above.

## Training Configuration

Fine-tuning was performed with the SFTTrainer from TRL, combined with Unsloth's efficient 4-bit loading and LoRA adaptation layers. The training process followed a multi-batch strategy due to hardware limitations, with incremental checkpoint loading supported by Unsloth.
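For reference, the sketch below reconstructs this setup end to end. It is a minimal illustration, not the published training script: `dataset` is assumed to be a Hugging Face dataset already mapped with `format_dataset_for_chat`, and the LoRA rank, batch sizes, and learning rate are placeholder values rather than the exact ones used for this model.

```python
from unsloth import FastLanguageModel
from trl import SFTConfig, SFTTrainer

# Load the base model in 4-bit via Unsloth
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/gemma-3-1b-it",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters to the attention and MLP modules
model = FastLanguageModel.get_peft_model(
    model,
    r=16,            # illustrative LoRA rank
    lora_alpha=16,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",  # attention
        "gate_proj", "up_proj", "down_proj",     # MLP
    ],
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,  # assumed: dataset.map(format_dataset_for_chat)
    args=SFTConfig(
        dataset_text_field="text",
        per_device_train_batch_size=2,   # illustrative
        gradient_accumulation_steps=4,   # illustrative
        learning_rate=2e-4,              # illustrative
        optim="adamw_8bit",
        lr_scheduler_type="cosine",
        output_dir="outputs",
    ),
)

# Multi-batch continuation: after the first data batch, each run resumes
# from the checkpoint written by the previous one (the first run calls
# trainer.train() without the flag)
trainer.train(resume_from_checkpoint=True)
```

Passing `resume_from_checkpoint=True` is one way to implement the checkpoint continuation described above. The `train_on_responses_only` wrapper shown under *Response-Only Learning* below is applied to this trainer before training starts.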
### Key Training Settings

- Model: `unsloth/gemma-3-1b-it`
- Precision: 4-bit (QLoRA)
- Method: Supervised Fine-Tuning (SFT)
- LoRA: enabled for attention and MLP modules
- Sequence length: 2048 tokens
- Optimizer: AdamW (8-bit)
- Scheduler: cosine
- Strategy: multi-batch training with checkpoint continuation
- Tokenizer: Gemma-3 chat template applied through Unsloth

### Response-Only Learning

To ensure the model learns **only the title** (the model output) and does not memorize the user prompt (the abstract), response-only loss masking was applied:

```python
trainer = train_on_responses_only(
    trainer,
    instruction_part = "<start_of_turn>user\n",  # user turn containing the abstract
    response_part = "<start_of_turn>model\n",    # model turn containing the title
)
```

This ensures that gradients flow exclusively through the model's portion of the chat sequence, improving instruction-following consistency and letting the LoRA adapters specialize in generating high-quality academic titles rather than reproducing the user prompt.

### Training Behavior

- LoRA significantly reduces VRAM usage while maintaining strong output quality.
- Unsloth manages efficient 4-bit quantization, chat-template formatting, and checkpoint handling.
- Multi-batch training allows large datasets to be processed even on limited hardware.
- Validation steps are used to monitor loss and adjust training dynamics.

## 🚀 Quick Usage Example

Before running inference, make sure all required libraries are installed:

```bash
pip install -q transformers accelerate torch
pip install -q -U bitsandbytes
# Only if your setup requires Unsloth for loading:
pip install -q unsloth
```

Below is a ready-to-run example demonstrating how to generate an academic title using the Gemma-3 chat template:

```python
from transformers import pipeline
import torch

pipe = pipeline(
    "text-generation",
    model="beta3/gemma3_1b_title_generator",
    dtype=torch.bfloat16,
)

# Example abstract for title generation
abstract = """
Transformer-based architectures have demonstrated strong performance in tasks
involving reasoning, scientific understanding, and text generation. Producing
concise academic titles from long abstracts, however, remains a non-trivial task.
"""

# Construct the Gemma-3 chat-format prompt manually
chat_template_prompt = (
    "<bos><start_of_turn>user\n"
    "Generate a simple title for the following abstract:\n"
    f"{abstract}\n"
    "<end_of_turn>\n"
    "<start_of_turn>model\n"
)

# Generate the title
result = pipe(
    chat_template_prompt,
    max_new_tokens=32,     # number of tokens to generate
    do_sample=True,        # enable sampling for more varied outputs
    temperature=0.7,       # controls generation randomness
    top_p=0.9,             # nucleus sampling
    return_full_text=False,
)[0]["generated_text"]

print("Generated title:", result)
```

This example reproduces the Gemma-3 chat format used during training and produces clean, publication-ready titles.
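If you prefer not to write the special tokens by hand, the tokenizer can build the same prompt for you. The sketch below is an alternative to the pipeline example, assuming the fine-tuned checkpoint ships the standard Gemma-3 chat template; the abstract is a shortened stand-in for the one above.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "beta3/gemma3_1b_title_generator"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, dtype=torch.bfloat16)

abstract = (
    "Transformer-based architectures have demonstrated strong performance in "
    "reasoning and text generation, yet producing concise academic titles "
    "from long abstracts remains a non-trivial task."
)

# Same message format as used during training
messages = [
    {"role": "user", "content": "Generate a title for the following abstract:\n" + abstract},
]

# apply_chat_template emits the <bos>/<start_of_turn> scaffolding shown earlier;
# add_generation_prompt=True appends the trailing "<start_of_turn>model\n"
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
)

output_ids = model.generate(
    input_ids,
    max_new_tokens=32,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)

# Decode only the newly generated tokens (the title)
title = tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True)
print("Generated title:", title)
```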
## Capabilities & Limitations

### Capabilities

- Generates concise, publication-ready academic titles from scientific abstracts.
- Identifies the core idea of long, complex abstracts.
- Follows structured, instruction-based prompts using the Gemma-3 chat format.
- Offers efficient inference thanks to 4-bit quantization and LoRA adaptation.
- Performs reliably across a wide variety of scientific domains.

### Limitations

- Output quality depends heavily on the clarity and structure of the abstract; vague inputs may produce generic titles.
- The model does not verify factual accuracy or scientific correctness.
- Performance may vary in highly domain-specific or expert-level fields that require specialized terminology.
- At only **1B parameters**, the model is significantly smaller than larger Gemma or Llama variants and may not always capture deep semantic detail or produce titles as accurate as bigger models.
- The model is optimized for academic summarization and may not generalize well to creative or conversational tasks.

## Credits

This project was made possible by several key open-source tools, frameworks, and community contributors:

- **Unsloth** — for enabling efficient 4-bit training, LoRA integration, memory-optimized model loading, and the Gemma-3 chat-template utilities. Their tooling was essential for making multi-batch fine-tuning feasible under limited hardware conditions.
- **Hugging Face TRL** — for providing the SFTTrainer and the response-only training workflow, allowing the model to focus exclusively on generating high-quality titles.
- **Google DeepMind** — for releasing the Gemma-3 family of models, offering a powerful instruction-tuned foundation suitable for scientific summarization and academic tasks.
- **Hugging Face Transformers / Datasets** — for model loading, tokenization pipelines, and large-scale dataset management.
- **Google Colab** — for providing the community with free access to high-performance GPUs, making it possible for independent researchers, students, and developers to experiment with large-language-model training without specialized hardware.

Special appreciation goes to the broader open-source community for maintaining the tools, documentation, and shared knowledge that make projects like this possible.

## License

This model follows the licensing terms of its upstream foundation models and tooling:

- **Base Model License:** Inherits the license of `unsloth/gemma-3-1b-it`, which is itself based on Google's *Gemma 3* licensing terms.
- **Gemma 3 License:** Usage must comply with the Gemma family license provided by Google DeepMind. For details, refer to the official documentation and license terms published by Google.
- **Training Frameworks:**
  - Unsloth (training optimizations, LoRA, 4-bit loading)
  - Hugging Face TRL (SFTTrainer)
  - Hugging Face Transformers & Datasets

All of these tools are used under their respective open-source licenses.

**Important:** This fine-tuned model is provided *as-is*, with no additional warranties. Users are responsible for ensuring compliance with applicable licenses and usage restrictions when deploying or redistributing the model.

For complete details, please consult:

- Google Gemma License
- Unsloth Documentation & License
- Hugging Face Transformers License

## Intended Use

This model is intended for generating concise academic titles from research abstracts. It is **not** designed for general conversation, creative writing, or factual verification.

## Safety

The model may reflect biases present in academic text sources. Outputs should be reviewed by humans before publication.