Spaces: rahul7star/Train-Lora

rahul7star committed on Nov 10, 2025

Commit

6895c9e

verified ·

1 Parent(s): d6b8b1b

Update app_gpu.py

Files changed (1)

app_gpu.py +82 -17

@@ -409,25 +409,90 @@ for name, module in model.named_modules():
              # 5️⃣ CODE EXPLAIN TAB
              # =========================================================
             with gr.Tab("Code Explain"):
-              gr.Markdown("""
-Universal Dynamic LoRA Trainer & Inference — Code Explanation
-This interface supports training, inference, and inspection of LoRA-enhanced models like Gemma 3B/4B.
-Train LoRA Tab: Configure and fine-tune LoRA layers.
-Inference (CPU): Expand short prompts to long prompts without GPU.
-Show Trainable Params: See LoRA layer mapping (e.g., “Adapter (90)”).
-Code Debug: Visual breakdown of what happens during training.
 print(f"[INFO] Loading base model: {base_model}")
-# -> Loads base model
-print("[INFO] Injecting LoRA adapters...")
-# -> Injects 90 LoRA modules
-print("[TRAIN] Step 100 | Loss: 0.91")
 print("[INFO] Saving LoRA adapter...")
-# -> Uploads to Hugging Face Hub
 # [SUCCESS] LoRA uploaded successfully 🚀
 """)
     return demo

              # 5️⃣ CODE EXPLAIN TAB
              # =========================================================
             with gr.Tab("Code Explain"):
+                explain_md = gr.Markdown("""
+### 🧩 Universal Dynamic LoRA Trainer & Inference — Code Explanation
+This project provides an **end-to-end LoRA fine-tuning and inference system** for language models like **Gemma**, built with **Gradio**, **PEFT**, and **Accelerate**.
+It supports both **training new LoRAs** and **generating text** with existing ones — all in a single interface.
+---
+#### **1️⃣ Imports Overview**
+- **Core libs:** `os`, `torch`, `gradio`, `numpy`, `pandas`
+- **Training libs:** `peft` (`LoraConfig`, `get_peft_model`), `accelerate` (`Accelerator`)
+- **Modeling:** `transformers` (for Gemma base model)
+- **Hub integration:** `huggingface_hub` (for uploading adapters)
+- **Spaces:** `spaces` — for execution within Hugging Face Spaces
+---
+#### **2️⃣ Dataset Loading**
+- Uses a lightweight **MediaTextDataset** class to load:
+  - CSV / Parquet files
+  - or directly from a Hugging Face dataset repo
+- Expects two columns:
+  - `short_prompt` → input text
+  - `long_prompt` → target expanded text
+- Supports batching, missing-column checks, and configurable max record limits.
+---
+#### **3️⃣ Model Loading & Preparation**
+- Loads **Gemma model and tokenizer** via `AutoModelForCausalLM` and `AutoTokenizer`.
+- Automatically detects **target modules** (e.g. `q_proj`, `v_proj`) for LoRA injection.
+- Supports `float16` or `bfloat16` precision with `Accelerator` to reduce memory usage.
+---
+#### **4️⃣ LoRA Training Logic**
+- Core formula:
+  $$
+  W_{eff} = W + \\frac{\\alpha}{r} (B A)
+  $$
+- Only **A** and **B** matrices are trainable; base model weights remain frozen.
+- Configurable parameters:
+  `r` (rank), `alpha` (scaling), `epochs`, `lr`, `batch_size`
+- Training logs stream live in the UI, showing step-by-step loss values.
+- After training, the adapter is **saved locally** and **uploaded to Hugging Face Hub**.
+---
+#### **5️⃣ CPU Inference Mode**
+- Runs entirely on **CPU**, no GPU required.
+- Loads base Gemma model + trained LoRA weights (`PeftModel.from_pretrained`).
+- Optionally merges LoRA with base model.
+- Expands the short prompt → long descriptive text using standard generation parameters (e.g., top-p / top-k sampling).
+---
+#### **6️⃣ LoRA Internals Explained**
+- LoRA injects low-rank matrices (A, B) into **attention Linear layers**.
+- Example (query projection weight `W_q`):
+  $$
+  W_{q,eff} = W_q + \\frac{\\alpha}{r} (B A)
+  $$
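+- Significantly reduces training cost:
+  - Memory: trainable parameters are roughly 1–2% of the full model, so gradients and optimizer state shrink accordingly
+  - Compute: no weight gradients or optimizer updates are needed for the frozen base weights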
+- Scales to larger models such as Gemma 3 4B with rank ≤ 16.
+---
+#### **7️⃣ Gradio UI Structure**
+- **Train LoRA Tab:**
+  Configure model, dataset, LoRA parameters, and upload target.
+  Press **🚀 Start Training** to stream training logs live.
+- **Inference (CPU) Tab:**
+  Type a short prompt → Generates expanded long-form version via trained LoRA.
+- **Code Explain Tab:**
+  Detailed breakdown of logic + simulated console output below.
+---
+### 🧾 Example Log Simulation
+```python
 print(f"[INFO] Loading base model: {base_model}")
+# -> Loads Gemma base model (fp16) on CUDA
+# [INFO] Base model google/gemma-3-4b-it loaded successfully
+print(f"[INFO] Preparing dataset from: {dataset_path}")
+# -> Loads dataset or CSV file
+# [DATA] 980 samples loaded, columns: short_prompt, long_prompt
+print("[INFO] Initializing LoRA configuration...")
+# -> Creates LoraConfig(r=8, lora_alpha=16, target_modules=['q_proj', 'v_proj'])
+# [CONFIG] LoRA applied to 96 attention layers
+print("[INFO] Starting training loop...")
+# [TRAIN] Step 1 | Loss: 2.31
+# [TRAIN] Step 50 | Loss: 1.42
+# [TRAIN] Step 100 | Loss: 0.91
+# [TRAIN] Epoch 1 complete (avg loss: 1.21)
 print("[INFO] Saving LoRA adapter...")
+# -> Saves safetensors and config locally
+print(f"[UPLOAD] Pushing adapter to {hf_repo_id}")
+# -> Uploads model to Hugging Face Hub
+# [UPLOAD] adapter_model.safetensors (67.7 MB)
 # [SUCCESS] LoRA uploaded successfully 🚀
+```
 """)
     return demo
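
Note on the LoRA formula in sections 4 and 6 of the new Markdown: the low-rank update can be checked numerically outside the app. The sketch below is not taken from `app_gpu.py`; it uses plain NumPy, made-up layer shapes, and the `r=8`, `alpha=16` values from the example log. It assumes the scale factor is `alpha / r`, which is what PEFT applies by default, rather than `alpha` alone. The helper names `lora_forward` and `merge` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical layer shapes; r and alpha mirror the example log (r=8, alpha=16).
d_out, d_in, r, alpha = 64, 48, 8, 16

W = rng.normal(size=(d_out, d_in))     # frozen base weight
A = rng.normal(size=(r, d_in)) * 0.01  # trainable, small random init
B = np.zeros((d_out, r))               # trainable, zero init so the update starts at 0


def lora_forward(x, W, A, B, alpha, r):
    """Base projection plus the scaled low-rank path, without forming B @ A."""
    return x @ W.T + (alpha / r) * (x @ A.T) @ B.T


def merge(W, A, B, alpha, r):
    """Fold the adapter into a single dense weight: W + (alpha / r) * B @ A."""
    return W + (alpha / r) * (B @ A)


x = rng.normal(size=(4, d_in))

# With B = 0 the adapter is a no-op, so the output equals the base projection.
assert np.allclose(lora_forward(x, W, A, B, alpha, r), x @ W.T)

# Pretend training has moved B away from zero.
B = rng.normal(size=(d_out, r)) * 0.01
y_adapter = lora_forward(x, W, A, B, alpha, r)
y_merged = x @ merge(W, A, B, alpha, r).T

# The adapter path and the merged weight give the same output.
assert np.allclose(y_adapter, y_merged)

# Only A and B are trained: r * (d_in + d_out) values instead of d_in * d_out.
trainable = A.size + B.size
full = W.size
print(f"trainable={trainable}, full={full}, ratio={trainable / full:.3f}")
```

The ratio printed here is much larger than in a real model because the toy layer is tiny; for a fixed rank the fraction shrinks as `d_in` and `d_out` grow.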