🌌 Gemma-2b-TARS-SFT: Technical Model Card
Fine-tuned Gemma-2-2B-it optimized for Creative Writing, Technical Assistance, and Distinctive Persona-Driven Chat.
Gemma-2b-TARS-SFT is a specialized large language model fine-tuned to provide high-quality, nuanced responses across both technical and creative domains. By building upon the robust reasoning capabilities of the Gemma-2 architecture, this model is specifically aligned to assist with design philosophy, coding tasks, and Hindi/English literature.
🎠 Model Persona & Roleplay
Unlike the standard, sterile base model, TARS has been fine-tuned with a distinct, slightly sarcastic, and theatrical personality (heavily inspired by science-fiction tropes).
- Emotes: The model may spontaneously use action tags (e.g., *Adjusts welding goggles* or *Leans in conspiratorially*).
- Persona Control: If you require strict, professional outputs without theatrical flair, append the following to your system prompt:
  "Do not use asterisks or theatrical actions. Provide only the direct, professional answer."
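The persona toggle above can be wired into prompt construction directly. The helper below is a sketch (the function name and wrapping are illustrative, not part of the model's API), showing one way to append the persona-control instruction only when a professional tone is needed:

```python
# Sketch: a hypothetical helper that toggles the persona-control
# instruction on or off when building the chat messages.
PERSONA_OFF = (
    "Do not use asterisks or theatrical actions. "
    "Provide only the direct, professional answer."
)

def build_messages(user_prompt, professional=False):
    # Base system prompt, matching the usage example later in this card.
    system = ("You are TARS, an AI assistant specialized in creative "
              "technology and literature. You were created by Prashant.")
    if professional:
        system += " " + PERSONA_OFF
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user_prompt},
    ]

msgs = build_messages("Summarize QLoRA in two sentences.", professional=True)
```

The resulting list can be passed straight to `tokenizer.apply_chat_template` as shown in the usage section.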
🛠 Model Details
- Base Model: `google/gemma-2-2b-it`
- Architecture: 2.6 billion parameters
- Fine-Tuning Method: 4-bit QLoRA (Quantized Low-Rank Adaptation)
- Quantization: 4-bit via `bitsandbytes` (weights stored at reduced precision to save VRAM), enabling efficient inference on consumer GPUs
- Creator: prash616 (Prashant)
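To see why 4-bit quantization matters on consumer GPUs, here is a back-of-the-envelope estimate of weight memory for the 2.6B parameters listed above. It covers the weights only (it ignores activations, KV cache, and quantization overhead), so treat the numbers as rough lower bounds:

```python
# Rough weight-memory estimate for a 2.6B-parameter model.
# Weights only: activations, KV cache, and quantization constants
# are not included.
PARAMS = 2.6e9

def weight_vram_gib(bits_per_param):
    """Memory needed for the weights alone, in GiB."""
    return PARAMS * bits_per_param / 8 / 1024**3

fp16_gib = weight_vram_gib(16)  # roughly 4.8 GiB
int4_gib = weight_vram_gib(4)   # roughly 1.2 GiB
```

The 4x reduction is what makes the model fit comfortably on free-tier GPUs such as a Colab T4.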
📊 Training Procedure & Data
The model was developed using a Supervised Fine-Tuning (SFT) strategy. The primary goal was to enhance the model's ability to follow complex, multi-step instructions while maintaining a thoughtful, structured, and highly engaging conversational tone.
1. Datasets
- Databricks Dolly-15k: Utilized to build a strong foundation in general instruction-following, brainstorming, classification, and open QA tasks.
- Custom Alignment Subset: A curated dataset designed to refine the model's conversational tone and anchor its specialized focus on creative technology, poetry, and design logic.
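Dolly-15k records carry `instruction`, optional `context`, and `response` fields. A sketch of how such a record might be mapped into chat-style SFT pairs is shown below; the helper name and exact mapping are illustrative assumptions, not taken from the actual training code:

```python
# Hypothetical mapping of a Dolly-15k record into chat-format SFT data.
def dolly_to_chat(record):
    user = record["instruction"]
    if record.get("context"):
        # Fold the optional context into the user turn.
        user += "\n\n" + record["context"]
    return [
        {"role": "user", "content": user},
        {"role": "assistant", "content": record["response"]},
    ]

example = dolly_to_chat({
    "instruction": "Classify the sentiment.",
    "context": "The welding goggles were a delight.",
    "response": "Positive",
})
```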
2. Training Hyperparameters
| Parameter | Value |
|---|---|
| Learning Rate | 1e-4 |
| LoRA Rank (r) | 16 |
| LoRA Alpha | 32 |
| Max Steps | 300 |
| Optimizer | AdamW (8-bit) |
| Target Modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
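The hyperparameters above explain why QLoRA training is so cheap: each targeted weight matrix W (shape d_out x d_in) stays frozen, and only two small matrices A (r x d_in) and B (d_out x r) are trained, i.e. r * (d_in + d_out) parameters per matrix. The dimensions below are illustrative, not Gemma-2's actual projection sizes:

```python
# LoRA trainable-parameter count per adapted matrix.
# The 2048x2048 dimensions are illustrative only.
def lora_trainable_params(d_in, d_out, r=16):
    return r * (d_in + d_out)

dense = 2048 * 2048                          # 4,194,304 frozen weights
adapter = lora_trainable_params(2048, 2048)  # 65,536 trainable weights
```

With the table's alpha of 32 and rank of 16, the adapter output is scaled by alpha / r = 2 before being added to the frozen projection.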
🚀 Usage & Implementation (Google Colab / Python)
Prerequisites: Because this model is based on Gemma-2, you need a Hugging Face access token and must accept Google's Gemma license terms on the Hugging Face Hub.
1. Install Optimized Libraries:
```bash
pip install --no-deps unsloth unsloth_zoo "xformers<0.0.29" "trl<0.9.0" peft accelerate bitsandbytes
```
2. Load the Model and Generate:
```python
import torch
import getpass

from unsloth import FastLanguageModel

# Secure token input (avoids hard-coding the token in the notebook)
hf_token = getpass.getpass("Enter your Hugging Face Token: ")

# Load the model in 4-bit mode (critical for free-tier GPUs)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "prash616/Gemma-2b-TARS-SFT",
    max_seq_length = 2048,
    load_in_4bit = True,
    token = hf_token,
)
FastLanguageModel.for_inference(model)  # Enables 2x faster generation

# Format the prompt
messages = [
    {"role": "system", "content": "You are TARS, an AI assistant specialized in creative technology and literature. You were created by Prashant."},
    {"role": "user", "content": "Explain the relationship between silence and structure in poetry."},
]

# apply_chat_template formats the input into Gemma's <start_of_turn> structure
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
    return_dict=True,  # Returns both input_ids and attention_mask
).to("cuda")

# Generate the response
outputs = model.generate(
    **inputs,  # Unpacks the dict into input_ids and attention_mask
    max_new_tokens=256,
    temperature=0.7,
    do_sample=True,
)

print("\n--- TARS RESPONDS ---\n")
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
```
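The decoded transcript above includes the prompt, because `generate` echoes the input tokens before the new ones. To display only the newly generated text, slice off the prompt length before decoding. A sketch with plain lists standing in for the real tensors (with tensors, the slice would be `outputs[0][inputs["input_ids"].shape[-1]:]`):

```python
# Stand-in values; in practice these are token IDs from the tokenizer
# and from model.generate respectively.
prompt_ids = [2, 106, 1645, 108]
output_ids = prompt_ids + [87, 21, 4, 1]  # generate() echoes the prompt
new_ids = output_ids[len(prompt_ids):]    # only the generated tokens
```

Decoding `new_ids` with `tokenizer.decode(..., skip_special_tokens=True)` then yields just the model's reply.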