rahul7star committed on
Commit 9985acb · verified · 1 Parent(s): 2dca78b

Update app_gpu.py

Files changed (1)
  1. app_gpu.py +141 -162
app_gpu.py CHANGED
@@ -252,182 +252,161 @@ def generate_long_prompt_cpu(base_model, lora_repo, short_prompt, max_length=200
 import gradio as gr

 def run_ui():
-    with gr.Blocks() as demo:
-        gr.Markdown("# 🌐 Universal Dynamic LoRA Trainer & Inference")
-
-        # ---------------- Train LoRA Tab ----------------
-        with gr.Tab("Train LoRA"):
-            with gr.Row():
-                base_model = gr.Textbox(label="Base model", value="google/gemma-3-4b-it")
-                dataset = gr.Textbox(label="Dataset folder or HF repo", value="rahul7star/prompt-enhancer-dataset-01")
-                csvname = gr.Textbox(label="CSV/Parquet file", value="train-00000-of-00001.csv")
-                short_col = gr.Textbox(label="Short prompt column", value="short_prompt")
-                long_col = gr.Textbox(label="Long prompt column", value="long_prompt")
-                repo = gr.Textbox(label="HF repo to upload LoRA", value="rahul7star/gemma-3-270m-ccebc0")
-
-            with gr.Row():
-                batch_size = gr.Number(value=1, label="Batch size")
-                num_workers = gr.Number(value=0, label="DataLoader num_workers")
-                r = gr.Number(value=8, label="LoRA rank")
-                a = gr.Number(value=16, label="LoRA alpha")
-                ep = gr.Number(value=1, label="Epochs")
-                lr = gr.Number(value=1e-4, label="Learning rate")
-                max_records = gr.Number(value=1000, label="Max training records")
-
-            logs = gr.Textbox(label="Logs (streaming)", lines=25)
-
-            def launch_train(bm, ds, csv, sc, lc, batch, num_w, r_, a_, ep_, lr_, max_rec, repo_):
-                gen = train_lora_stream(
-                    bm, ds, csv, [sc, lc],
-                    epochs=int(ep_), lr=float(lr_), r=int(r_), alpha=int(a_),
-                    batch_size=int(batch), num_workers=int(num_w),
-                    max_train_records=int(max_rec), hf_repo_id=repo_
                 )
-                for item in gen:
-                    yield item
-
-            btn = gr.Button("🚀 Start Training")
-            btn.click(
-                fn=launch_train,
-                inputs=[
-                    base_model, dataset, csvname, short_col, long_col,
-                    batch_size, num_workers, r, a, ep, lr, max_records, repo
-                ],
-                outputs=[logs],
-                queue=True
-            )
-
-        # ---------------- Inference (CPU) Tab ----------------
-        with gr.Tab("Inference (CPU)"):
-            inf_base_model = gr.Textbox(label="Base model", value="google/gemma-3-4b-it")
-            inf_lora_repo = gr.Textbox(label="LoRA HF repo", value="rahul7star/gemma-3-270m-ccebc0")
-            short_prompt = gr.Textbox(label="Short prompt")
-            long_prompt_out = gr.Textbox(label="Generated long prompt", lines=5)
-
-            inf_btn = gr.Button("📝 Generate Long Prompt")
-            inf_btn.click(
-                fn=generate_long_prompt_cpu,
-                inputs=[inf_base_model, inf_lora_repo, short_prompt],
-                outputs=[long_prompt_out]
-            )
-
-        # ---------------- Code Explain Tab ----------------
-        with gr.Tab("Code Explain"):
-            explain_md = gr.Markdown("""
-            ### 🧩 Universal Dynamic LoRA Trainer & Inference — Code Explanation
-
-            This project provides an **end-to-end LoRA fine-tuning and inference system** for language models like **Gemma**, built with **Gradio**, **PEFT**, and **Accelerate**.
-            It supports both **training new LoRAs** and **generating text** with existing ones — all in a single interface.
-
-            ---
-
-            #### **1️⃣ Imports Overview**
-            - **Core libs:** `os`, `torch`, `gradio`, `numpy`, `pandas`
-            - **Training libs:** `peft` (`LoraConfig`, `get_peft_model`), `accelerate` (`Accelerator`)
-            - **Modeling:** `transformers` (for the Gemma base model)
-            - **Hub integration:** `huggingface_hub` (for uploading adapters)
-            - **Spaces:** `spaces` — for execution within Hugging Face Spaces
-
-            ---
-
-            #### **2️⃣ Dataset Loading**
-            - Uses a lightweight **MediaTextDataset** class to load:
-              - CSV / Parquet files
-              - or directly from a Hugging Face dataset repo
-            - Expects two columns:
-              - `short_prompt` → input text
-              - `long_prompt` → target expanded text
-            - Supports batching, missing-column checks, and a configurable max record limit.
-
-            ---
-
-            #### **3️⃣ Model Loading & Preparation**
-            - Loads the **Gemma model and tokenizer** via `AutoModelForCausalLM` and `AutoTokenizer`.
-            - Automatically detects **target modules** (e.g. `q_proj`, `v_proj`) for LoRA injection.
-            - Supports `float16` or `bfloat16` precision with `Accelerator` for optimal memory usage.
-
-            ---
 
-            #### **4️⃣ LoRA Training Logic**
-            - Core formula:
-              \[
-              W_{eff} = W + \alpha \times (B @ A)
-              \]
-            - Only the **A** and **B** matrices are trainable; the base model weights remain frozen.
-            - Configurable parameters:
-              `r` (rank), `alpha` (scaling), `epochs`, `lr`, `batch_size`
-            - Training logs stream live in the UI, showing step-by-step loss values.
-            - After training, the adapter is **saved locally** and **uploaded to the Hugging Face Hub**.
-
-            ---
-
-            #### **5️⃣ CPU Inference Mode**
-            - Runs entirely on **CPU**; no GPU required.
-            - Loads the base Gemma model + trained LoRA weights (`PeftModel.from_pretrained`).
-            - Optionally merges the LoRA into the base model.
-            - Expands the short prompt → long descriptive text using standard generation parameters (e.g., top-p / top-k sampling).
-
-            ---
 
-            #### **6️⃣ LoRA Internals Explained**
-            - LoRA injects low-rank matrices (A, B) into **attention Linear layers**.
-            - Example:
-              \[
-              Q_{new} = Q + \alpha \times (B @ A)
-              \]
-            - Significantly reduces training cost:
-              - Memory: ~1–2% of the full model
-              - Compute: trains faster with minimal GPU load
-            - Scalable to large models like Gemma 3B / 4B with rank ≤ 16.
 
             ---
 
-            #### **7️⃣ Gradio UI Structure**
-            - **Train LoRA Tab:**
-              Configure model, dataset, LoRA parameters, and upload target.
-              Press **🚀 Start Training** to stream training logs live.
 
-            - **Inference (CPU) Tab:**
-              Type a short prompt → generates an expanded long-form version via the trained LoRA.
 
-            - **Code Explain Tab:**
-              Detailed breakdown of the logic + simulated console output below.
-
-            ---
 
-            ### 🧾 Example Log Simulation
 
             ```python
-            print(f"[INFO] Loading base model: {base_model}")
-            # -> Loads Gemma base model (fp16) on CUDA
-            # [INFO] Base model google/gemma-3-4b-it loaded successfully
-
-            print(f"[INFO] Preparing dataset from: {dataset_path}")
-            # -> Loads dataset or CSV file
-            # [DATA] 980 samples loaded, columns: short_prompt, long_prompt
-
-            print("[INFO] Initializing LoRA configuration...")
-            # -> Creates LoraConfig(r=8, alpha=16, target_modules=['q_proj', 'v_proj'])
-            # [CONFIG] LoRA applied to 96 attention layers
-
-            print("[INFO] Starting training loop...")
-            # [TRAIN] Step 1 | Loss: 2.31
-            # [TRAIN] Step 50 | Loss: 1.42
-            # [TRAIN] Step 100 | Loss: 0.91
-            # [TRAIN] Epoch 1 complete (avg loss: 1.21)
-
-            print("[INFO] Saving LoRA adapter...")
-            # -> Saves safetensors and config locally
-
-            print(f"[UPLOAD] Pushing adapter to {hf_repo_id}")
-            # -> Uploads model to Hugging Face Hub
-            # [UPLOAD] adapter_model.safetensors (67.7 MB)
-            # [SUCCESS] LoRA uploaded successfully 🚀
-            """)
 
     return demo
 
-
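The removed explanation's core formula, `W_eff = W + alpha * (B @ A)`, is easy to verify in isolation. Below is a minimal PyTorch sketch of that update; the shapes, values, and variable names are illustrative only and do not come from app_gpu.py (note that `peft` itself scales the update by `alpha / r` rather than by raw `alpha`):

```python
import torch

# Toy illustration of the LoRA update described above: W_eff = W + alpha * (B @ A).
# Hypothetical shapes; in real training only A and B would receive gradients.
d_out, d_in, rank, alpha = 64, 64, 8, 16

W = torch.randn(d_out, d_in)        # frozen base weight
A = torch.randn(rank, d_in) * 0.01  # trainable low-rank "down" factor
B = torch.zeros(d_out, rank)        # trainable "up" factor, initialized to zero

W_eff = W + alpha * (B @ A)         # rank-8 additive update on the frozen weight

# B starts at zero, so the adapted layer initially behaves exactly like the base layer.
x = torch.randn(d_in)
assert torch.allclose(W_eff @ x, W @ x)

# The factors are tiny next to the full matrix, which is where the
# "~1-2% of the full model" memory figure above comes from.
print("trainable:", A.numel() + B.numel(), "full:", W.numel())
```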
431
 
432
 
433
 
 
 import gradio as gr

 def run_ui():
+    with gr.Blocks(title="Prompt Enhancer Trainer + Inference UI") as demo:
+        gr.Markdown("# ✨ Prompt Enhancer Trainer + Inference Playground")
+        gr.Markdown("Train, test, and debug your LoRA-enhanced Gemma model easily.")
+
+        with gr.Tabs():
+            # -------------------------------
+            # 1️⃣ TRAINING TAB
+            # -------------------------------
+            with gr.Tab("Train Model"):
+                with gr.Row():
+                    base_model = gr.Textbox(label="Base Model", value="google/gemma-2b-it")
+                    dataset_path = gr.Textbox(label="Dataset Folder (Path)")
+                    repo_id = gr.Textbox(label="Upload HF Repo (optional)", placeholder="username/my-enhancer-model")
+
+                with gr.Row():
+                    output_dir = gr.Textbox(label="Local Output Directory", value="/tmp/prompt-enhancer")
+                    train_btn = gr.Button("🚀 Start Training")
+
+                train_log = gr.Textbox(label="Training Log", lines=20)
+
+                def train_model_ui(base_model, dataset_path, repo_id, output_dir):
+                    return train_model(base_model, dataset_path, repo_id, output_dir)
+
+                train_btn.click(
+                    train_model_ui,
+                    inputs=[base_model, dataset_path, repo_id, output_dir],
+                    outputs=[train_log],
                 )
 
+            # -------------------------------
+            # 2️⃣ INFERENCE TAB (CPU)
+            # -------------------------------
+            with gr.Tab("Inference (CPU Mode)"):
+                with gr.Row():
+                    model_repo = gr.Textbox(label="HF Model Repo", value="gokaygokay/prompt-enhancer-gemma-3-270m-it")
+                    user_prompt = gr.Textbox(label="Enter a short prompt", placeholder="a cat sitting on a chair")
+                    gen_btn = gr.Button("🧠 Generate Enhanced Prompt")
+
+                result_box = gr.Textbox(label="Enhanced Prompt", lines=10)
+
+                def run_inference(model_repo, user_prompt):
+                    import torch
+                    from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
+
+                    device = "cpu"
+                    model = AutoModelForCausalLM.from_pretrained(model_repo, torch_dtype=torch.float32, device_map={"": device})
+                    tokenizer = AutoTokenizer.from_pretrained(model_repo)
+
+                    pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, device_map={"": device})
+
+                    messages = [
+                        {"role": "system", "content": "Enhance and expand the following prompt with more details and context:"},
+                        {"role": "user", "content": user_prompt},
+                    ]
+
+                    prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+                    output = pipe(prompt, max_new_tokens=256)
+                    return output[0]['generated_text']
+
+                gen_btn.click(run_inference, inputs=[model_repo, user_prompt], outputs=[result_box])
+
+            # -------------------------------
+            # 3️⃣ SHOW TRAINABLE PARAMS TAB
+            # -------------------------------
+            with gr.Tab("Show Trainable Params"):
+                gr.Markdown("### 🧩 View Trainable Parameters in Your LoRA-Enhanced Model")
+
+                with gr.Row():
+                    base_model_name = gr.Textbox(label="Base Model", value="google/gemma-2b-it")
+                    check_btn = gr.Button("🔍 Show Trainable Layers")
+
+                param_output = gr.Textbox(label="Trainable Parameters Info", lines=25)
+
+                def show_trainable_layers(base_model_name):
+                    import torch
+                    from peft import get_peft_model, LoraConfig
+                    from transformers import AutoModelForCausalLM
+
+                    model = AutoModelForCausalLM.from_pretrained(base_model_name)
+                    config = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"])
+                    model = get_peft_model(model, config)
+                    model.print_trainable_parameters()
+                    return (
+                        "Each 'Adapter (90)' means 90 LoRA layers (pairs of A/B matrices) were injected.\n\n"
+                        "🧠 These typically correspond to:\n"
+                        "- q_proj, k_proj, v_proj → Query, Key, Value projections\n"
+                        "- o_proj or out_proj → Output of attention\n"
+                        "- gate_proj, up_proj, down_proj → Feed-forward layers\n\n"
+                        "💡 So, 'Adapter (90)' = 90 target submodules were wrapped with LoRA.\n\n"
+                        "Would you like to print them all? Here's how:\n\n"
+                        "```python\n"
+                        "for name, module in model.named_modules():\n"
+                        "    if 'lora' in name.lower():\n"
+                        "        print(name)\n"
+                        "```\n"
+                    )
+
+                check_btn.click(show_trainable_layers, inputs=[base_model_name], outputs=[param_output])
+
+            # -------------------------------
+            # 4️⃣ CODE DEBUG TAB
+            # -------------------------------
+            with gr.Tab("Code Debug"):
+                gr.Markdown("### 🧩 Code Debug — Understand What's Happening Line by Line")
+                debug_md = gr.Markdown(
+                    """
+                    #### 🧰 Step-by-Step Breakdown
+                    Below is what each major step does internally during training:
+
+                    1. **`f"[INFO] Loading base model: {base_model}"`**
+                       → Logs which model is being loaded (e.g., `google/gemma-2b-it`)
+
+                    2. **`AutoModelForCausalLM.from_pretrained(base_model)`**
+                       → Downloads the base Gemma model weights and tokenizer.
+
+                    3. **`get_peft_model(model, config)`**
+                       → Wraps the model with LoRA. Injects adapters into `q_proj`, `k_proj`, `v_proj`, etc.
+
+                    4. **Expected console output:**
+                       [INFO] Loading base model: google/gemma-2b-it
+                       [INFO] Preparing dataset...
+                       [INFO] Injecting LoRA adapters...
+                       trainable params: 3.5M || all params: 270M || trainable%: 1.3%
+
+                    5. **`trainer.train()`**
+                       → Starts the training loop, showing tqdm progress bars per epoch.
 
+                    6. **`upload_file(...)`**
+                       → Uploads all model files to your chosen HF repo (if specified).
 
                     ---
 
+                    #### 🔍 What “Adapter (90)” Means
 
+                    When you initialize LoRA on Gemma, it finds **90 target layers** that match
+                    typical names like:
+                    - `q_proj`, `k_proj`, `v_proj`
+                    - `o_proj`
+                    - `gate_proj`, `up_proj`, `down_proj`
 
+                    Each layer gets small trainable matrices **(A, B)** injected.
+                    Hence you see:
+                    > **Adapter (90)** → *90 modules modified by LoRA.*
 
+                    You can list them in your own model like this:
 
                     ```python
+                    for name, module in model.named_modules():
+                        if "lora" in name.lower():
+                            print(name)
+                    """
+                )
 
     return demo
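For reference, the CPU inference path that both versions of the UI describe (load the base model, attach a trained LoRA adapter, optionally merge it, then sample) can be sketched as follows. This is a minimal example assuming a peft-format adapter, not the app's actual `generate_long_prompt_cpu`; the repo IDs are simply the defaults shown in the old UI:

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative defaults taken from the UI above; substitute your own repos.
base_id = "google/gemma-3-4b-it"
adapter_id = "rahul7star/gemma-3-270m-ccebc0"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float32)  # fp32 on CPU

model = PeftModel.from_pretrained(base, adapter_id)  # attach the LoRA weights
model = model.merge_and_unload()                     # optional: fold B @ A into the base weights
model.eval()

inputs = tokenizer("a cat sitting on a chair", return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=128, do_sample=True, top_p=0.9, top_k=50)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

Merging is optional: keeping the adapter separate lets you swap LoRAs at runtime, while merging removes the extra matmul per layer at inference time.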