Gemma 2B Fine-tuned with LoRA

This is a fine-tuned version of unsloth/gemma-2b-it-bnb-4bit using LoRA (Low-Rank Adaptation).

Model Details

  • Base Model: unsloth/gemma-2b-it-bnb-4bit
  • Method: LoRA fine-tuning with PEFT
  • Quantization: 4-bit (via bitsandbytes)
  • Framework: Transformers + PEFT
  • Model Type: Causal Language Model
  • Language: English
  • License: Apache 2.0

Usage

Installation

pip install transformers peft torch accelerate bitsandbytes

Basic Inference

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load tokenizer and base model
tokenizer = AutoTokenizer.from_pretrained("venkat1701/gemma-2b-finetuned")
base_model = AutoModelForCausalLM.from_pretrained(
    "unsloth/gemma-2b-it-bnb-4bit",
    device_map="auto",
    trust_remote_code=True
)

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "venkat1701/gemma-2b-finetuned")
model.eval()

# Generate text
prompt = "Explain quantum computing in simple terms:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_length=256,
        temperature=0.7,
        top_p=0.9,
        do_sample=True
    )

generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_text)

Using Hugging Face Inference API

import requests

API_URL = "https://api-inference.huggingface.co/models/venkat1701/gemma-2b-finetuned"
headers = {"Authorization": "Bearer YOUR_HF_TOKEN"}

def query(payload):
    response = requests.post(API_URL, headers=headers, json=payload)
    return response.json()

output = query({
    "inputs": "Explain quantum computing:",
    "parameters": {"max_length": 256, "temperature": 0.7}
})
print(output)

Deployment

The model includes a FastAPI server and Docker support for easy deployment. See the repository for deployment scripts.

Training Details

Training Data

[More Information Needed]

Training Procedure

Preprocessing [optional]

[More Information Needed]

Training Hyperparameters

  • Training regime: [More Information Needed]

Speeds, Sizes, Times [optional]

[More Information Needed]

Evaluation

Testing Data, Factors & Metrics

Testing Data

[More Information Needed]

Factors

[More Information Needed]

Metrics

[More Information Needed]

Results

[More Information Needed]

Summary

Model Examination [optional]

[More Information Needed]

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

  • Hardware Type: [More Information Needed]
  • Hours used: [More Information Needed]
  • Cloud Provider: [More Information Needed]
  • Compute Region: [More Information Needed]
  • Carbon Emitted: [More Information Needed]

Technical Specifications [optional]

Model Architecture and Objective

[More Information Needed]

Compute Infrastructure

[More Information Needed]

Hardware

[More Information Needed]

Software

[More Information Needed]

Citation [optional]

BibTeX:

[More Information Needed]

APA:

[More Information Needed]

Glossary [optional]

[More Information Needed]

More Information [optional]

[More Information Needed]

Model Card Authors [optional]

[More Information Needed]

Model Card Contact

[More Information Needed]

Framework versions

  • PEFT 0.18.0
Downloads last month
2
Safetensors
Model size
2B params
Tensor type
F32
·
F16
·
U8
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for venkat1701/gemma-2b-finetuned

Adapter
(13)
this model

Paper for venkat1701/gemma-2b-finetuned