---
title: LoRA Model Merger
emoji: 🔗
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
license: apache-2.0
app_port: 7860
---

# 🔗 LoRA Model Merger

A Hugging Face Space for merging fine-tuned LoRA adapters with base models.

## Overview

This Space provides an easy-to-use interface for merging LoRA (Low-Rank Adaptation) adapters into their base models. It is specifically designed for:

- **Base Model:** `moonshotai/Kimi-Linear-48B-A3B-Instruct`
- **LoRA Adapter:** `Optivise/kimi-linear-48b-a3b-instruct-qlora-fine-tuned`

## Features

✅ **Easy Model Merging** - Simple UI for merging LoRA adapters into the base model
✅ **Built-in Testing** - Test your merged model with custom prompts
✅ **Hub Integration** - Upload merged models directly to the Hugging Face Hub
✅ **GPU Optimized** - Designed for a 4x L40S GPU setup

## Usage

1. **Merge Models**: Provide your Hugging Face token and click "Start Merge Process"
2. **Test Inference**: Test the merged model with sample prompts
3. **Upload to Hub**: Optionally upload the merged model to your Hugging Face account

## Requirements

- **Hardware:** 4x NVIDIA L40S GPUs (or equivalent, ~192 GB VRAM total)
- **Software:** Docker, CUDA 12.1+
- **Access:** A valid Hugging Face token with access to both models

## Technical Details

The merge process:

1. Downloads the base model (~48B parameters)
2. Loads the LoRA adapter weights
3. Merges the adapters into the base model using PEFT
4. Saves the unified model for inference

A sketch of these steps as a standalone script appears at the end of this README.

## Notes

- The merge can take 10-30 minutes depending on network speed
- The merged model is approximately the same size as the base model
- Ensure you have access rights to both the base model and the LoRA adapter

## Support

For issues or questions:

- [PEFT Documentation](https://huggingface.co/docs/peft)
- [Transformers Documentation](https://huggingface.co/docs/transformers)

---

Built with ❤️ using Transformers, PEFT, and Gradio
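
## Appendix: Merge Script Sketch

For reference outside the Space UI, here is a minimal sketch of the four merge steps above. The model IDs come from this README; the output path, dtype, and `trust_remote_code=True` flag are assumptions, not a tested configuration.

```python
# Minimal merge sketch. Assumes enough GPU memory (e.g. 4x L40S) to hold
# the full-precision base model; paths and dtype are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_MODEL = "moonshotai/Kimi-Linear-48B-A3B-Instruct"
LORA_ADAPTER = "Optivise/kimi-linear-48b-a3b-instruct-qlora-fine-tuned"
OUTPUT_DIR = "./merged-model"  # hypothetical local output path

# 1. Download the base model and shard it across available GPUs.
base = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,  # assumption: this model uses custom code
)

# 2. Load the LoRA adapter weights on top of the base model.
model = PeftModel.from_pretrained(base, LORA_ADAPTER)

# 3. Fold the low-rank updates into the base weights and drop the PEFT wrappers.
model = model.merge_and_unload()

# 4. Save the unified model and tokenizer for inference.
model.save_pretrained(OUTPUT_DIR)
AutoTokenizer.from_pretrained(BASE_MODEL, trust_remote_code=True).save_pretrained(OUTPUT_DIR)
```

Because `merge_and_unload()` bakes the adapter into the base weights, the saved checkpoint loads like any ordinary model and needs no PEFT dependency at inference time.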
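The optional Hub upload (step 3 under Usage) can likewise be done by hand with `huggingface_hub`. The repository name below is a placeholder, and the sketch assumes your token is exported as the `HF_TOKEN` environment variable.

```python
# Hypothetical upload of the merged checkpoint to the Hugging Face Hub.
import os
from huggingface_hub import HfApi

api = HfApi(token=os.environ["HF_TOKEN"])  # assumes HF_TOKEN is set

repo_id = "your-username/kimi-linear-48b-merged"  # placeholder repo name
api.create_repo(repo_id, repo_type="model", exist_ok=True)
api.upload_folder(folder_path="./merged-model", repo_id=repo_id)
```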