---
title: LoRA Model Merger
emoji: 🔗
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
license: apache-2.0
app_port: 7860
---
# 🔗 LoRA Model Merger
A Hugging Face Space for merging fine-tuned LoRA adapters with base models.
## Overview
This Space provides an easy-to-use interface for merging LoRA (Low-Rank Adaptation) fine-tuned models with their base models. Specifically designed for:
- **Base Model:** `moonshotai/Kimi-Linear-48B-A3B-Instruct`
- **LoRA Adapters:** `Optivise/kimi-linear-48b-a3b-instruct-qlora-fine-tuned`
## Features
- ✅ **Easy Model Merging** - Simple UI to merge LoRA adapters with the base model
- ✅ **Built-in Testing** - Test your merged model with custom prompts
- ✅ **Hub Integration** - Upload merged models directly to the Hugging Face Hub
- ✅ **GPU Optimized** - Designed for a 4x L40S GPU setup
## Usage
1. **Merge Models**: Provide your Hugging Face token and click "Start Merge Process"
2. **Test Inference**: Test the merged model with sample prompts
3. **Upload to Hub**: Optionally upload the merged model to your Hugging Face account (a programmatic sketch of this step follows below)
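
For reference, the Hub upload step can also be done outside the UI with the `huggingface_hub` client. This is only a minimal sketch: the repo name `your-username/kimi-linear-48b-merged` and the local folder `merged-model` are placeholders, not values used by this Space.

```python
import os
from huggingface_hub import HfApi

# Placeholders - point these at your own account and local merge output.
repo_id = "your-username/kimi-linear-48b-merged"
local_dir = "merged-model"

api = HfApi(token=os.environ.get("HF_TOKEN"))

# Create the target repo if it does not already exist, then push the folder.
api.create_repo(repo_id, private=True, exist_ok=True)
api.upload_folder(folder_path=local_dir, repo_id=repo_id, repo_type="model")
```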
## Requirements
- **Hardware:** 4x NVIDIA L40S GPUs (or equivalent with ~192 GB VRAM; see the quick check below)
- **Software:** Docker, CUDA 12.1+
- **Access:** Valid Hugging Face token for model access
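
Before starting a merge, it can help to confirm that enough GPU memory is actually visible to PyTorch. A small sketch (the ~192 GB threshold mirrors the 4x L40S requirement above and is only a rough guide):

```python
import torch

# Count visible GPUs and sum their memory; 4x L40S is roughly 192 GB total.
gpu_count = torch.cuda.device_count()
total_gb = sum(
    torch.cuda.get_device_properties(i).total_memory for i in range(gpu_count)
) / 1024**3

print(f"{gpu_count} GPU(s) visible, {total_gb:.0f} GB VRAM total")
if total_gb < 180:
    print("Warning: this may be too little memory to merge a 48B model comfortably.")
```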
## Technical Details
The merge process (sketched in code below):
1. Downloads the base model (~48B parameters)
2. Loads LoRA adapter weights
3. Merges adapters into base model using PEFT
4. Saves the unified model for inference
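
A minimal sketch of these four steps with Transformers and PEFT is shown below. The `trust_remote_code=True` flag is an assumption (the Kimi-Linear architecture is custom), and `merged-model` is a placeholder output path:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE = "moonshotai/Kimi-Linear-48B-A3B-Instruct"
ADAPTER = "Optivise/kimi-linear-48b-a3b-instruct-qlora-fine-tuned"

# 1-2. Download the base model and spread it across the available GPUs.
base = AutoModelForCausalLM.from_pretrained(
    BASE,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,  # assumed to be needed for the custom architecture
)
tokenizer = AutoTokenizer.from_pretrained(BASE, trust_remote_code=True)

# 3. Attach the LoRA adapter and fold its weights into the base model.
model = PeftModel.from_pretrained(base, ADAPTER)
merged = model.merge_and_unload()

# 4. Save the unified model (and tokenizer) for inference.
merged.save_pretrained("merged-model", safe_serialization=True)
tokenizer.save_pretrained("merged-model")
```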
## Notes
- The merge process can take 10-30 minutes, depending on network speed
- The merged model will be approximately the same size as the base model
- Ensure you have appropriate access rights to both the base and LoRA models (a quick access check is sketched below)
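
To avoid discovering a permissions problem halfway through a long merge, you can ask the Hub for each repo's metadata up front. A small sketch:

```python
import os
from huggingface_hub import HfApi

api = HfApi(token=os.environ.get("HF_TOKEN"))

for repo in (
    "moonshotai/Kimi-Linear-48B-A3B-Instruct",
    "Optivise/kimi-linear-48b-a3b-instruct-qlora-fine-tuned",
):
    try:
        api.model_info(repo)  # raises an HTTP error if the repo is gated or missing
        print(f"OK: {repo}")
    except Exception as err:
        print(f"Cannot access {repo}: {err}")
```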
## Support
For issues or questions:
- [PEFT Documentation](https://huggingface.co/docs/peft)
- [Transformers Documentation](https://huggingface.co/docs/transformers)
---
Built with ❤️ using Transformers, PEFT, and Gradio