---
title: Nanbeige2.5 — Chat
emoji: 🦙
colorFrom: indigo
colorTo: purple
sdk: gradio
sdk_version: "3.50.0"
python_version: "3.10"
app_file: app.py
pinned: false
---

# Nanbeige2.5 — Gradio Chat Space

Lightweight Gradio chat UI for `PioTio/Nanbeige2.5`, suitable for deployment on Hugging Face Spaces.

## Features

- Streaming and non-streaming generation ✅
- Tokenizer ↔ model sanity fixes (avoids SentencePiece `piece id` errors) ✅
- 4-bit BitsAndBytes load when GPU + `bitsandbytes` are available ✅
- Optional LoRA adapter application (requires `peft`) ✅
- Controls: temperature, top-p, top-k, max tokens, max-history ✅

**Quick CPU tip:** This Space **may be CPU-only**. Running the full `PioTio/Nanbeige2.5` on CPU is extremely slow. For quick responses, use the **Load fast CPU demo (distilgpt2)** button; for production use, enable a GPU in the Space settings. You can also check **Force CPU generation** to run Nanbeige on CPU, but this is very slow and not recommended.

## Deployment (Hugging Face Spaces)

1. Create a new Space (Gradio runtime).
2. Upload these files (`app.py`, `requirements.txt`, `README.md`).
3. In Space settings choose **Hardware accelerator: GPU** (recommended).

After you push these files, the Space builds automatically. Open the Space page and monitor the logs for errors. If you prefer, you can create `app.py` directly in the web UI instead of pushing from Git.

**Tip:** Keep `bitsandbytes` in `requirements.txt` if you plan to enable 4-bit loading on GPU; remove or pin it if the build log shows dependency issues.

- If you see `piece id is out of range` errors, the app will attempt to auto-fix tokenizer/model alignment.
- To apply a LoRA adapter after starting the app, paste the adapter's HF repo ID in the LoRA field and click **Apply LoRA adapter**.

## Recommended hardware

- GPU (T4 / A10 / A100) for real-time streaming; CPU-only inference may be slow.

## Troubleshooting

- If the model fails to load on Spaces, check the logs for out-of-memory (OOM) errors; switch to GPU hardware, and enable `bitsandbytes` 4-bit loading (GPU only) if memory is still tight.
- For adapter load failures, ensure the adapter repo exists and `peft` is listed in `requirements.txt`.

---
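The hardware notes above (4-bit loading when a GPU and `bitsandbytes` are available, the distilgpt2 demo or **Force CPU generation** on CPU) can be sketched as a small decision function. This is a minimal illustration, not the logic in `app.py`: the function names and the mode labels (`gpu-4bit`, `gpu-fp16`, `cpu-full`, `cpu-demo`) are hypothetical, and the half-precision fallback for a GPU without `bitsandbytes` is an assumption that the README does not state.

```python
import importlib.util


def bitsandbytes_installed() -> bool:
    """Return True if the `bitsandbytes` package is importable."""
    return importlib.util.find_spec("bitsandbytes") is not None


def cuda_available() -> bool:
    """Return True if PyTorch is installed and reports a CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return False
    import torch  # imported lazily so the sketch runs without PyTorch

    return torch.cuda.is_available()


def choose_load_mode(has_gpu: bool, has_bnb: bool, force_cpu: bool = False) -> str:
    """Pick a loading strategy. The labels are hypothetical, not taken from app.py."""
    if has_gpu and has_bnb:
        return "gpu-4bit"   # 4-bit BitsAndBytes load (from the Features list)
    if has_gpu:
        return "gpu-fp16"   # assumption: unquantized half-precision load on GPU
    if force_cpu:
        return "cpu-full"   # "Force CPU generation": full model, very slow
    return "cpu-demo"       # fast CPU demo (distilgpt2)


if __name__ == "__main__":
    print(choose_load_mode(cuda_available(), bitsandbytes_installed()))
```

In a real app, the `gpu-4bit` branch would typically pass a `transformers.BitsAndBytesConfig(load_in_4bit=True)` as `quantization_config` to `from_pretrained`. Keeping the decision separate from the load call makes it easy to log which path a given Space took.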