---
title: LoRA Model Merger
emoji: πŸ”—
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
license: apache-2.0
app_port: 7860
---

# πŸ”— LoRA Model Merger

A Hugging Face Space for merging fine-tuned LoRA adapters with base models.

## Overview

This Space provides an easy-to-use interface for merging LoRA (Low-Rank Adaptation) fine-tuned models with their base models. It is specifically designed for:

- **Base Model:** `moonshotai/Kimi-Linear-48B-A3B-Instruct`
- **LoRA Adapters:** `Optivise/kimi-linear-48b-a3b-instruct-qlora-fine-tuned`

## Features

- βœ… **Easy Model Merging** - Simple UI to merge LoRA adapters with the base model
- βœ… **Built-in Testing** - Test your merged model with custom prompts
- βœ… **Hub Integration** - Upload merged models directly to the Hugging Face Hub
- βœ… **GPU Optimized** - Designed for a 4x L40S GPU setup

## Usage

1. **Merge Models**: Provide your Hugging Face token and click "Start Merge Process"
2. **Test Inference**: Test the merged model with sample prompts
3. **Upload to Hub**: Optionally upload the merged model to your Hugging Face account
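Under the hood, steps 2 and 3 correspond roughly to the following calls (a hedged sketch, not the Space's actual code: the local path `merged-model` and the repo id `your-username/kimi-linear-merged` are illustrative placeholders, and running the pipeline requires the hardware listed under Requirements):

```python
from transformers import pipeline
from huggingface_hub import HfApi

# Step 2: test the merged model with a sample prompt.
# Assumes the merge step saved the unified model to ./merged-model.
pipe = pipeline(
    "text-generation",
    model="merged-model",
    device_map="auto",
    trust_remote_code=True,
)
print(pipe("Explain LoRA in one sentence.", max_new_tokens=64)[0]["generated_text"])

# Step 3: optionally upload the merged folder to your Hugging Face account.
api = HfApi(token="hf_...")  # your Hugging Face token
api.create_repo("your-username/kimi-linear-merged", exist_ok=True)
api.upload_folder(
    folder_path="merged-model",
    repo_id="your-username/kimi-linear-merged",
)
```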

## Requirements

- **Hardware:** 4x NVIDIA L40S GPUs (or equivalent with ~192GB VRAM)
- **Software:** Docker, CUDA 12.1+
- **Access:** Valid Hugging Face token for model access

## Technical Details

The merge process:
1. Downloads the base model (~48B parameters)
2. Loads LoRA adapter weights
3. Merges adapters into base model using PEFT
4. Saves the unified model for inference

## Notes

- The merge process can take 10-30 minutes depending on network speed
- The merged model will be approximately the same size as the base model
- Ensure you have appropriate access rights to both base and LoRA models

## Support

For issues or questions:
- [PEFT Documentation](https://huggingface.co/docs/peft)
- [Transformers Documentation](https://huggingface.co/docs/transformers)

---

Built with ❀️ using Transformers, PEFT, and Gradio