Spaces:
Sleeping
Sleeping
| title: Gemma Code Generator | |
| emoji: π€ | |
| colorFrom: blue | |
| colorTo: purple | |
| sdk: gradio | |
| sdk_version: 4.44.0 | |
| app_file: app.py | |
| pinned: false | |
| license: gemma | |
| tags: | |
| - code-generation | |
| - gemma | |
| - fine-tuned | |
| - python | |
| - qlora | |
| models: | |
| - nvhuynh16/gemma-2b-code-alpaca | |
| # π€ Gemma Code Generator | |
| Fine-tuned Gemma-2B model for Python code generation using QLoRA (Quantized Low-Rank Adaptation). | |
| ## π― Project Overview | |
| This demo showcases a fine-tuned Gemma-2B model trained on the CodeAlpaca dataset to generate Python code from natural language descriptions. | |
| ### Key Features | |
| - β‘ **Fast Training**: 4-6 hours on free Google Colab T4 GPU | |
| - π° **Cost**: $0 (using free Colab tier) | |
| - π **Performance**: Expected 75-85% syntax correctness (vs 61% baseline) | |
| - π§ **Method**: QLoRA (4-bit quantization + LoRA adapters) | |
| - π¦ **Efficient**: Only 0.12% of parameters trained (3.2M / 2.6B) | |
| ## π Model Performance | |
| | Metric | Baseline (Pretrained) | Fine-Tuned (Expected) | Improvement | | |
| |--------|----------------------|----------------------|-------------| | |
| | **Syntax Correctness** | 61.0% | 75-85% | +14-24% | | |
| | **BLEU Score** | 16.10 | 25-35 | +9-19 | | |
| | **Trainable Parameters** | N/A | 0.12% | 100x fewer | | |
| ## π οΈ Technical Details | |
| - **Base Model**: `google/gemma-2-2b-it` (2.5B parameters) | |
| - **Dataset**: CodeAlpaca-20k (3,600 training examples, 20% subset) | |
| - **Fine-tuning Method**: QLoRA | |
| - LoRA rank (r): 16 | |
| - LoRA alpha: 32 | |
| - Quantization: 4-bit NF4 | |
| - Target modules: q_proj, v_proj | |
| - **Training**: | |
| - Epochs: 2 | |
| - Batch size: 8 (2 per device Γ 4 accumulation) | |
| - Learning rate: 2e-4 | |
| - Optimizer: paged_adamw_8bit | |
| - GPU: T4 (15GB VRAM, used ~4GB) | |
| - **Framework**: PyTorch + HuggingFace Transformers + PEFT | |
| ## π» Usage | |
| ### Quick Demo | |
| Try the live demo above! Just enter a code instruction like: | |
| - "Write a function to check if a number is prime" | |
| - "Create a function to reverse a string" | |
| - "Implement binary search on a sorted list" | |
| ### Python Code | |
| ```python | |
| from huggingface_hub import InferenceClient | |
| client = InferenceClient() | |
| prompt = """### Instruction: | |
| Write a function to check if a number is prime | |
| ### Input: | |
| ### Response: | |
| """ | |
| response = client.text_generation( | |
| "nvhuynh16/gemma-2b-code-alpaca", | |
| prompt=prompt, | |
| max_new_tokens=256, | |
| temperature=0.7, | |
| ) | |
| print(response) | |
| ``` | |
| ### Load Model Directly (Requires GPU + bitsandbytes) | |
| ```python | |
| from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig | |
| from peft import PeftModel | |
| import torch | |
| # Load base model with 4-bit quantization | |
| bnb_config = BitsAndBytesConfig( | |
| load_in_4bit=True, | |
| bnb_4bit_quant_type="nf4", | |
| bnb_4bit_compute_dtype=torch.bfloat16, | |
| ) | |
| base_model = AutoModelForCausalLM.from_pretrained( | |
| "google/gemma-2-2b-it", | |
| quantization_config=bnb_config, | |
| device_map="auto", | |
| ) | |
| tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b-it") | |
| # Load fine-tuned adapters | |
| model = PeftModel.from_pretrained(base_model, "nvhuynh16/gemma-2b-code-alpaca") | |
| # Generate code | |
| prompt = """### Instruction: | |
| Write a function to check if a number is prime | |
| ### Input: | |
| ### Response: | |
| """ | |
| inputs = tokenizer(prompt, return_tensors="pt").to(model.device) | |
| outputs = model.generate(**inputs, max_new_tokens=256) | |
| print(tokenizer.decode(outputs[0], skip_special_tokens=True)) | |
| ``` | |
| ## π Use Cases | |
| - **Learning Programming**: Get code examples for educational purposes | |
| - **Prototyping**: Quickly generate boilerplate code | |
| - **Interview Preparation**: Practice coding questions | |
| - **Code Completion**: Assistance for simple functions | |
| - **Algorithm Reference**: Implementation examples | |
| ## π Training Methodology | |
| ### Dataset Preparation | |
| 1. Loaded CodeAlpaca-20k dataset | |
| 2. Filtered invalid examples | |
| 3. Formatted in Alpaca instruction style | |
| 4. Split: 90% train, 5% validation, 5% test | |
| 5. Used 20% subset (3,600 examples) for memory efficiency | |
| ### Fine-Tuning Process | |
| 1. Loaded Gemma-2B with 4-bit quantization (reduced VRAM from 10GB β 4GB) | |
| 2. Applied LoRA adapters to attention layers only | |
| 3. Trained for 2 epochs (~900 steps) | |
| 4. Automatic checkpoint upload to HuggingFace Hub | |
| 5. Total training time: 4-6 hours on free Colab T4 | |
| ### Memory Optimizations | |
| - 4-bit quantization (BitsAndBytes NF4) | |
| - LoRA adapters (0.12% trainable parameters) | |
| - Gradient checkpointing | |
| - 8-bit AdamW optimizer | |
| - Reduced sequence length (256 tokens) | |
| - Reduced batch size (2 per device) | |
| ## π Repository Structure | |
| ``` | |
| βββ notebooks/ | |
| β βββ 02_fine_tuning_with_eval.ipynb # Complete training + evaluation | |
| β βββ 03_merge_adapters.ipynb # Merge adapters (optional) | |
| βββ spaces/ | |
| β βββ app.py # This Gradio demo | |
| β βββ requirements.txt # Dependencies | |
| β βββ README.md # This file | |
| βββ scripts/ | |
| β βββ colab_quick_eval.py # Evaluation script | |
| β βββ train_local.py # Local training | |
| βββ results/ | |
| βββ baseline_100.json # Baseline evaluation | |
| ``` | |
| ## π Links | |
| - **Model**: [nvhuynh16/gemma-2b-code-alpaca](https://huggingface.co/nvhuynh16/gemma-2b-code-alpaca) | |
| - **Base Model**: [google/gemma-2-2b-it](https://huggingface.co/google/gemma-2-2b-it) | |
| - **Dataset**: [CodeAlpaca-20k](https://github.com/sahil280114/codealpaca) | |
| - **GitHub**: [Project Repository](#) | |
| - **Portfolio**: [Nam Huynh](#) | |
| ## β οΈ Limitations | |
| - Primarily trained on Python code | |
| - May generate verbose explanations alongside code | |
| - Best for simple-to-moderate complexity functions | |
| - Not suitable for production without human review | |
| - Limited to patterns seen in training data | |
| ## π License | |
| This model is based on Gemma-2B-it and inherits its license. The fine-tuning adapters and this demo are provided for educational and demonstration purposes. | |
| ## π Acknowledgments | |
| - **Google**: For the Gemma model family | |
| - **Sahil Chaudhary**: For the CodeAlpaca dataset | |
| - **HuggingFace**: For Transformers, PEFT, and inference infrastructure | |
| - **Colab**: For free GPU access | |
| --- | |
| **Built for portfolio demonstration** β’ Targeting AI/ML Applied Scientist roles β’ Relevant to SAP ABAP Foundation Model team | |
| *This demo uses HuggingFace Inference API for serverless, cost-free inference* | |