Spaces: rahul7star/Train-Lora


rahul7star committed on Nov 9, 2025

Commit c05048d (verified) · 1 parent: d42c09b

Update app_gpu.py

Files changed (1)

app_gpu.py +64 -1

@@ -300,7 +300,70 @@ def run_ui():
                           inputs=[inf_base_model, inf_lora_repo, short_prompt],
                           outputs=[long_prompt_out])
-    return demo
+        # ---------------- Code Explain Tab ----------------
+        with gr.Tab("Code Explain"):
+            explain_md = gr.Markdown("""
+### Universal LoRA Trainer & Inference - Code Explanation
+#### 1. Imports
+- **spaces, os, torch, gradio, pandas, numpy**: General utilities, tensor ops, UI, and data handling.
+- **peft (LoraConfig, get_peft_model)**: Handles LoRA adapters and integration into base model.
+- **accelerate (Accelerator)**: Simplifies device placement, mixed precision, and distributed training.
+- **huggingface_hub**: Upload LoRA weights to HF Hub.
+- **transformers (optional)**: Used if base model is a Hugging Face LLM (Gemma).
+#### 2. Dataset
+- **MediaTextDataset**: Loads CSV/Parquet or HF dataset, extracts `short_prompt` and `long_prompt`.
+- Handles batched access and fallback for missing columns.
+#### 3. Model Loading
+- `load_pipeline_auto`: Loads Gemma tokenizer + model in float16/32 depending on device.
+- `find_target_modules`: Detects which Linear layers LoRA should be applied to (Q/K/V projections).
+#### 4. LoRA Training (`train_lora_stream`)
+- **Effective weight**: `W_eff = W + (alpha / r) * B @ A`, where `W` is the frozen base weight and `A`, `B` are the trainable low-rank matrices.
+- **LoRA Config**:
+  - `r` is the low-rank dimension.
+  - `alpha` scales the LoRA update (PEFT applies it as `alpha / r` by default).
+  - Targets Q/K/V or other Linear layers in attention.
+- **Training**:
+  - Dataset is wrapped in a DataLoader.
+  - LoRA model and optimizer are prepared with Accelerator.
+  - Forward pass computes the cross-entropy loss.
+  - Gradients are computed and applied only for LoRA parameters; base weights stay frozen.
+  - Logs are streamed at each step.
+- **Upload**: Saves the LoRA adapter and pushes it to the HF Hub.
+#### 5. CPU Inference (`generate_long_prompt_cpu`)
+- Loads the base Gemma model on the CPU (float32).
+- Loads LoRA weights with `PeftModel.from_pretrained`.
+- Optionally merges LoRA into the base model to simplify inference.
+- Tokenizes the short prompt and generates the expanded prompt using `generate()` with top-p/top-k sampling.
+#### 6. LoRA Internals
+- LoRA injects trainable matrices `A` and `B` into selected Linear layers (usually the Q/K/V projections in attention).
+- Query, Key, and Value (Q/K/V) are combined in attention as:
+  ```
+  Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V
+  ```
+- LoRA adds `(alpha / r) * B @ A` to the Q/K/V projection weights while the base model stays frozen.
+- Efficient: only small low-rank matrices are trained (`r << hidden_size`), reducing memory & compute.
+- Other modules LoRA can target: `out_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`.
+#### 7. Gradio UI
+- **Train Tab**: User inputs for model, dataset, LoRA params, and HF repo.
+- **Inference Tab**: Short prompt → expanded long prompt using LoRA on CPU.
+- **Code Explain Tab**: Markdown reference explaining the code logic and LoRA internals.
+""")
+            # No explain_md.render() call: components created inside gr.Blocks render automatically.
+    return demo
 if __name__ == "__main__":
     run_ui().launch(server_name="0.0.0.0", server_port=7860, share=True)
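
The low-rank update described in the Code Explain tab can be illustrated with a small, dependency-free sketch. It is not taken from `app_gpu.py`: the helper names (`matmul`, `lora_effective_weight`, `trainable_params`) and the toy matrices are hypothetical, and the scale factor is assumed to be `alpha / r`, which is the PEFT default. The second helper shows why training only `A` and `B` is cheap when `r` is much smaller than the layer width.

```python
# Hypothetical, dependency-free sketch of a LoRA weight update.
# None of these helpers exist in app_gpu.py; they only illustrate the math.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def lora_effective_weight(w, a, b, alpha, r):
    """Return W + (alpha / r) * B @ A (assumed PEFT-style scaling)."""
    scale = alpha / r
    delta = matmul(b, a)  # (out x r) @ (r x in) -> (out x in)
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]

def trainable_params(out_features, in_features, r):
    """Count parameters in the full weight versus the two LoRA factors."""
    full = out_features * in_features
    lora = r * (in_features + out_features)
    return full, lora

# Frozen 2x3 base weight and rank-1 LoRA factors (toy values).
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]
A = [[1.0, 2.0, 3.0]]   # r x in  = 1 x 3
B = [[0.5], [-0.5]]     # out x r = 2 x 1

W_eff = lora_effective_weight(W, A, B, alpha=2.0, r=1)
full, lora = trainable_params(out_features=4096, in_features=4096, r=8)
print(W_eff)
print(full, lora)
```

For a 4096 x 4096 projection with `r = 8`, the two factors hold 65,536 trainable values against 16,777,216 in the full weight, which is the memory saving the explanation refers to.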