---
title: LoRA Model Merger
emoji: 🔗
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
license: apache-2.0
app_port: 7860
---

# 🔗 LoRA Model Merger

A Hugging Face Space for merging fine-tuned LoRA adapters with base models.

## Overview

This Space provides an easy-to-use interface for merging LoRA (Low-Rank Adaptation) adapters into their base models. It is specifically designed for:

- **Base Model:** `moonshotai/Kimi-Linear-48B-A3B-Instruct`
- **LoRA Adapter:** `Optivise/kimi-linear-48b-a3b-instruct-qlora-fine-tuned`

## Features

✅ **Easy Model Merging** - Simple UI for merging LoRA adapters into the base model
✅ **Built-in Testing** - Test your merged model with custom prompts
✅ **Hub Integration** - Upload merged models directly to the Hugging Face Hub
✅ **GPU Optimized** - Designed for a 4x L40S GPU setup

## Usage

1. **Merge Models**: Provide your Hugging Face token and click "Start Merge Process"
2. **Test Inference**: Test the merged model with sample prompts
3. **Upload to Hub**: Optionally upload the merged model to your Hugging Face account

## Requirements

- **Hardware:** 4x NVIDIA L40S GPUs (or equivalent, ~192 GB VRAM total)
- **Software:** Docker, CUDA 12.1+
- **Access:** A valid Hugging Face token with access to both models

## Technical Details

The merge process:

1. Downloads the base model (~48B parameters)
2. Loads the LoRA adapter weights
3. Merges the adapters into the base model using PEFT
4. Saves the unified model for inference

A sketch of these steps as a standalone script appears at the end of this README.

## Notes

- The merge can take 10-30 minutes depending on network speed
- The merged model is approximately the same size as the base model
- Ensure you have access rights to both the base model and the LoRA adapter

## Support

For issues or questions:

- [PEFT Documentation](https://huggingface.co/docs/peft)
- [Transformers Documentation](https://huggingface.co/docs/transformers)

---

Built with ❤️ using Transformers, PEFT, and Gradio
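
## Appendix: Merge Script Sketch

For reference outside the Space UI, here is a minimal sketch of the four merge steps above. The model IDs come from this README; the output path, dtype, and `trust_remote_code=True` flag are assumptions, not a tested configuration.

```python
# Minimal merge sketch. Assumes enough GPU memory (e.g. 4x L40S) to hold
# the full-precision base model; paths and dtype are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_MODEL = "moonshotai/Kimi-Linear-48B-A3B-Instruct"
LORA_ADAPTER = "Optivise/kimi-linear-48b-a3b-instruct-qlora-fine-tuned"
OUTPUT_DIR = "./merged-model"  # hypothetical local output path

# 1. Download the base model and shard it across available GPUs.
base = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,  # assumption: this model uses custom code
)

# 2. Load the LoRA adapter weights on top of the base model.
model = PeftModel.from_pretrained(base, LORA_ADAPTER)

# 3. Fold the low-rank updates into the base weights and drop the PEFT wrappers.
model = model.merge_and_unload()

# 4. Save the unified model and tokenizer for inference.
model.save_pretrained(OUTPUT_DIR)
AutoTokenizer.from_pretrained(BASE_MODEL, trust_remote_code=True).save_pretrained(OUTPUT_DIR)
```

Because `merge_and_unload()` bakes the adapter into the base weights, the saved checkpoint loads like any ordinary model and needs no PEFT dependency at inference time.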
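The optional Hub upload (step 3 under Usage) can likewise be done by hand with `huggingface_hub`. The repository name below is a placeholder, and the sketch assumes your token is exported as the `HF_TOKEN` environment variable.

```python
# Hypothetical upload of the merged checkpoint to the Hugging Face Hub.
import os
from huggingface_hub import HfApi

api = HfApi(token=os.environ["HF_TOKEN"])  # assumes HF_TOKEN is set

repo_id = "your-username/kimi-linear-48b-merged"  # placeholder repo name
api.create_repo(repo_id, repo_type="model", exist_ok=True)
api.upload_folder(folder_path="./merged-model", repo_id=repo_id)
```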