Commit ccafeb2
Parent(s): 024afd4
readme + final touches
Files changed:
- README.md +56 -1
- frontend/src/App.jsx +29 -0
README.md
CHANGED
|
@@ -7,4 +7,59 @@ sdk: docker
|
|
| 7 |
app_port: 7860
|
| 8 |
---
|
| 9 |
|
| 10 |
-
|
| 7 |
app_port: 7860
|
| 8 |
---
|
| 9 |
|
| 10 |
+
# Evolution Transformer: An Interactive Playground for LLM Model Merging
|
| 11 |
+
|
| 12 |
+
An interactive web application for exploring model merging techniques for large language models. This project allows users to dynamically create new "child" models by combining pre-trained specialists, based on the concepts from the research paper [Evolutionary Optimization of Model Merging Recipes](https://arxiv.org/pdf/2403.13187).
|
| 13 |
+
|
| 14 |
+
Live Demo: [https://evolutiontransformer.michaelbao.com](https://evolutiontransformer.michaelbao.com)
|
| 15 |
+
Backend API: [Hugging Face Space](https://tcmmichaelb139-evolutiontransformer.hf.space)
|
| 16 |
+
|
| 17 |
+
## Features
|
| 18 |
+
|
| 19 |
+
- Dynamic Model Merging: Create new models with more or fewer layers than the original parents by defining a recipe of any length.
|
| 20 |
+
- Full Model Control: In addition to the main transformer blocks, users can also control the blend ratios for the embedding and final output layers.
|
| 21 |
+
- Interactive Interface: User-friendly web interface built with React and Tailwind CSS for easy model selection and configuration.
|
| 22 |
+
- Asynchronous Processing: Efficient task handling using Celery and Redis for background processing of model merging.
|
| 23 |
+
|
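The features above describe a recipe of arbitrary length with per-layer blend ratios. A minimal sketch of that idea follows, using plain Python lists in place of real tensors. The recipe format `(index_in_a, index_in_b, alpha)` and the names `blend_layer` and `merge` are assumptions made for illustration; they are not taken from this project's code.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for a transformer layer: named weight vectors.
Layer = Dict[str, List[float]]

# Hypothetical recipe step: (layer index in parent A, layer index in parent B, alpha),
# where alpha is the share taken from parent A's layer.
RecipeStep = Tuple[int, int, float]


def blend_layer(layer_a: Layer, layer_b: Layer, alpha: float) -> Layer:
    """Linearly interpolate two layers: alpha * A + (1 - alpha) * B."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return {
        name: [alpha * a + (1.0 - alpha) * b
               for a, b in zip(layer_a[name], layer_b[name])]
        for name in layer_a
    }


def merge(parent_a: List[Layer], parent_b: List[Layer],
          recipe: List[RecipeStep]) -> List[Layer]:
    """Build a child whose depth is len(recipe), independent of the parents' depth."""
    return [blend_layer(parent_a[i], parent_b[j], alpha) for i, j, alpha in recipe]


# Two toy parents with two layers each.
parent_a = [{"w": [1.0, 1.0]}, {"w": [2.0, 2.0]}]
parent_b = [{"w": [3.0, 3.0]}, {"w": [4.0, 4.0]}]

# A three-step recipe yields a three-layer child from two-layer parents.
recipe = [(0, 0, 0.5), (1, 0, 1.0), (1, 1, 0.0)]
child = merge(parent_a, parent_b, recipe)
print(len(child), child[0]["w"])
```

Because the child's depth comes from the recipe rather than from either parent, the same function covers both the "more layers" and "fewer layers" cases.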
| 24 |
+
## Architecture
|
| 25 |
+
|
| 26 |
+
The application is built on a decoupled, multi-service architecture designed for scalable, robust machine learning deployment. To save costs, the backend runs on a CPU instead of a GPU; this is adequate for GPT2-medium, the model used here.
|
| 27 |
+
|
| 28 |
+
[React Frontend @ Cloudflare] <--> [FastAPI Web Server @ HF Spaces] <--> [Redis Queue @ Upstash] <--> [Celery Worker (CPU) @ HF Spaces]
|
| 29 |
+
|
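The flow in the diagram is a submit-then-poll pattern: the web server enqueues a job and returns a task id, a worker picks the job up, and the client polls for the result. The sketch below imitates that pattern with the standard library only. `queue.Queue` stands in for the Redis queue, a thread stands in for the Celery worker, and `submit`, `get_status`, and `slow_merge` are hypothetical names, not this project's API. The status strings mirror Celery's `PENDING`, `SUCCESS`, and `FAILURE` states.

```python
import queue
import threading
import uuid
from typing import Any, Callable, Dict

tasks: "queue.Queue" = queue.Queue()      # stand-in for the Redis queue
results: Dict[str, Dict[str, Any]] = {}   # stand-in for the result backend


def submit(fn: Callable[..., Any], *args: Any) -> str:
    """What the web server would do: enqueue the job and return an id at once."""
    task_id = str(uuid.uuid4())
    results[task_id] = {"status": "PENDING", "result": None}
    tasks.put((task_id, fn, args))
    return task_id


def get_status(task_id: str) -> Dict[str, Any]:
    """What a polling endpoint would return for a given task id."""
    return results[task_id]


def worker() -> None:
    """What the background worker would do: run jobs one at a time."""
    while True:
        item = tasks.get()
        if item is None:          # sentinel: shut the worker down
            tasks.task_done()
            break
        task_id, fn, args = item
        try:
            results[task_id] = {"status": "SUCCESS", "result": fn(*args)}
        except Exception as exc:
            results[task_id] = {"status": "FAILURE", "result": repr(exc)}
        finally:
            tasks.task_done()


def slow_merge(a: str, b: str) -> str:
    """Hypothetical placeholder for the expensive merge job."""
    return f"{a}+{b}"


thread = threading.Thread(target=worker, daemon=True)
thread.start()

task_id = submit(slow_merge, "model_a", "model_b")
tasks.join()                      # a real client would poll instead of blocking
final = get_status(task_id)
print(final["status"], final["result"])

tasks.put(None)                   # stop the worker
thread.join()
```

Keeping the slow work off the request path is what lets the web server stay responsive while a merge runs.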
| 30 |
+
## Tech Stack
|
| 31 |
+
|
| 32 |
+
- Frontend: React (Vite), Tailwind CSS
|
| 33 |
+
- Backend: FastAPI, PyTorch/Hugging Face Transformers, Celery, Redis, uv (package manager)
|
| 34 |
+
- Deployment: Cloudflare Pages (Frontend), Hugging Face Spaces (Backend and Worker), Upstash (Redis)
|
| 35 |
+
|
| 36 |
+
## Setup Instructions
|
| 37 |
+
|
| 38 |
+
Run four separate processes, each in its own terminal tab. You may need to change some URL variables in the code to point to your own deployment.
|
| 39 |
+
|
| 40 |
+
**1. Start Redis (if not already running as a service):**
|
| 41 |
+
|
| 42 |
+
```bash
|
| 43 |
+
redis-server
|
| 44 |
+
```
|
| 45 |
+
|
| 46 |
+
**2. Start the Celery Worker:**
|
| 47 |
+
|
| 48 |
+
```bash
|
| 49 |
+
# In your project root, with .venv active
|
| 50 |
+
celery -A evolutiontransformer.worker.celery_app worker --loglevel=info -c 1
|
| 51 |
+
```
|
| 52 |
+
|
| 53 |
+
**3. Start the FastAPI Server:**
|
| 54 |
+
|
| 55 |
+
```bash
|
| 56 |
+
# In your project root, with .venv active
|
| 57 |
+
uvicorn evolutiontransformer.api:app --reload
|
| 58 |
+
```
|
| 59 |
+
|
| 60 |
+
**4. Start the React Frontend:**
|
| 61 |
+
|
| 62 |
+
```bash
|
| 63 |
+
# In the /frontend directory
|
| 64 |
+
npm run dev
|
| 65 |
+
```
|
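With four processes running in separate tabs, it helps to confirm that the ones that listen on a port are actually up. The sketch below is one way to check, assuming each tool runs on its default local port (6379 for `redis-server`, 8000 for `uvicorn`, 5173 for the Vite dev server); adjust the mapping if your setup differs. The Celery worker does not listen on a port, so it is not covered here. The names `port_open` and `check_services` are illustrative.

```python
import socket
from typing import Dict


def port_open(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Assumed defaults; change these if you started a service on another port.
DEFAULT_PORTS: Dict[str, int] = {
    "redis-server": 6379,
    "uvicorn": 8000,
    "vite dev server": 5173,
}


def check_services(ports: Dict[str, int] = DEFAULT_PORTS,
                   host: str = "127.0.0.1") -> Dict[str, bool]:
    """Map each service name to whether its port accepts connections."""
    return {name: port_open(host, port) for name, port in ports.items()}


if __name__ == "__main__":
    for name, up in check_services().items():
        print(f"{name}: {'up' if up else 'not reachable'}")
```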
frontend/src/App.jsx
CHANGED
|
@@ -49,6 +49,35 @@ function App() {
|
|
| 49 |
|
| 50 |
return (
|
| 51 |
<div className="h-screen bg-gradient-to-br from-primary-50 to-secondary-50 overflow-hidden">
|
| 52 |
<Options
|
| 53 |
models={models}
|
| 54 |
selectedModel1={selectedModel1}
|
| 49 |
|
| 50 |
return (
|
| 51 |
<div className="h-screen bg-gradient-to-br from-primary-50 to-secondary-50 overflow-hidden">
|
| 52 |
+
{/* GitHub Corner */}
|
| 53 |
+
<a
|
| 54 |
+
href="https://github.com/tcmmichaelb139/evolutiontransformer"
|
| 55 |
+
className="github-corner group absolute top-0 right-0 z-50"
|
| 56 |
+
aria-label="View source on GitHub"
|
| 57 |
+
target="_blank"
|
| 58 |
+
rel="noopener noreferrer"
|
| 59 |
+
>
|
| 60 |
+
<svg
|
| 61 |
+
width="40"
|
| 62 |
+
height="40"
|
| 63 |
+
viewBox="0 0 250 250"
|
| 64 |
+
className="fill-gray-800 text-white absolute top-0 right-0 border-0"
|
| 65 |
+
aria-hidden="true"
|
| 66 |
+
>
|
| 67 |
+
<path d="M0,0 L115,115 L130,115 L142,142 L250,250 L250,0 Z" />
|
| 68 |
+
<path
|
| 69 |
+
d="M128.3,109.0 C113.8,99.7 119.0,89.6 119.0,89.6 C122.0,82.7 120.5,78.6 120.5,78.6 C119.2,72.0 123.4,76.3 123.4,76.3 C127.3,80.9 125.5,87.3 125.5,87.3 C122.9,97.6 130.6,101.9 134.4,103.2"
|
| 70 |
+
fill="currentColor"
|
| 71 |
+
className="group-hover:animate-pulse origin-[130px_106px] transition-transform duration-300"
|
| 72 |
+
/>
|
| 73 |
+
<path
|
| 74 |
+
d="M115.0,115.0 C114.9,115.1 118.7,116.5 119.8,115.4 L133.7,101.6 C136.9,99.2 139.9,98.4 142.2,98.6 C133.8,88.0 127.5,74.4 143.8,58.0 C148.5,53.4 154.0,51.2 159.7,51.0 C160.3,49.4 163.2,43.6 171.4,40.1 C171.4,40.1 176.1,42.5 178.8,56.2 C183.1,58.6 187.2,61.8 190.9,65.4 C194.5,69.0 197.7,73.2 200.1,77.6 C213.8,80.2 216.3,84.9 216.3,84.9 C212.7,93.1 206.9,96.0 205.4,96.6 C205.1,102.4 203.0,107.8 198.3,112.5 C181.9,128.9 168.3,122.5 157.7,114.1 C157.9,116.9 156.7,120.9 152.7,124.9 L141.0,136.5 C139.8,137.7 141.6,141.9 141.8,141.8 Z"
|
| 75 |
+
fill="currentColor"
|
| 76 |
+
className="octo-body"
|
| 77 |
+
/>
|
| 78 |
+
</svg>
|
| 79 |
+
</a>
|
| 80 |
+
|
| 81 |
<Options
|
| 82 |
models={models}
|
| 83 |
selectedModel1={selectedModel1}
|