# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Development Commands

### Quick Start (Recommended)
```bash
./run_dev.sh
```
This script starts both the frontend (port 11111) and the backend (port 11110), then opens the browser automatically.

### Manual Development Setup

**Frontend (Svelte):**
```bash
cd frontend
npm install
npm run dev          # Development server on :11111
npm run build        # Production build
npm run check        # Type checking
```

**Backend (FastAPI):**
```bash
cd backend
pip install -r requirements.txt
python -m hfstudio.cli --dev  # Development server on :11110
# OR
pip install -e .
hfstudio --dev
```

### Port Configuration
- Frontend: 11111 (configured in `vite.config.js`)
- Backend: 11110 (configured in `cli.py`)
- CORS is configured for these specific ports in `server.py`

## Architecture Overview

### Frontend Structure (SvelteKit + TailwindCSS)
- **Single Page App**: Main interface in `src/routes/+page.svelte`
- **Layout**: Global layout with sidebar in `src/routes/+layout.svelte`
- **Design System**: HuggingFace brand colors (`#FFD21E`, `#FF9D00`) used throughout
- **State Management**: Local component state with reactive variables
- **Audio Handling**: Custom HTML5 audio element with manual progress tracking

### Key UI Components
- **Three-panel layout**: Sidebar (56 units) + Main content + Settings panel (80 units)
- **Fixed bottom button**: Generate button positioned absolutely at page bottom
- **Mini audio player**: Compact controls in generated audio card
- **Full audio player**: Expanded controls with ElevenLabs-style design
- **Custom pause icon**: CSS-only filled bars instead of outline

### Backend Structure (FastAPI)
- **Main server**: `server.py` with CORS configured for development ports
- **CLI interface**: `cli.py` with typer for command-line control
- **Pydantic models**: TTSRequest, TTSResponse, Voice, Model
- **Current implementation**: Mock TTS generation using placeholder audio

### API Endpoints
```
GET  /                     - Health check
GET  /api/status          - Mode and availability info
GET  /api/voices          - Available voice list
GET  /api/models          - Available model list  
POST /api/tts/generate    - Generate speech from text
```
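
As a rough illustration of calling the generate endpoint from the frontend, the sketch below builds the request for `POST /api/tts/generate`. The payload field names (`text`, `voice_id`, `model_id`) and the helper names `buildTtsRequest` and `generateSpeech` are assumptions made for this example; the actual schema is whatever `TTSRequest` defines in `server.py`.

```javascript
// Base URL taken from the documented PUBLIC_API_URL value.
const API_URL = 'http://localhost:11110';

// Build the URL and fetch options for POST /api/tts/generate.
// ASSUMPTION: the field names text, voice_id and model_id are illustrative;
// check the TTSRequest model in server.py for the real schema.
function buildTtsRequest(text, voiceId, modelId, baseUrl = API_URL) {
  if (typeof text !== 'string' || text.trim() === '') {
    throw new Error('text must be a non-empty string');
  }
  return {
    url: `${baseUrl}/api/tts/generate`,
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ text, voice_id: voiceId, model_id: modelId }),
    },
  };
}

// Send the request and return the parsed JSON body.
// The shape of the response (TTSResponse) is not assumed here.
async function generateSpeech(text, voiceId, modelId) {
  const { url, options } = buildTtsRequest(text, voiceId, modelId);
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`TTS request failed with status ${response.status}`);
  }
  return response.json();
}
```

Keeping request construction separate from the network call makes the payload easy to inspect or unit test without a running backend.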

## Design Patterns & Conventions

### Frontend Patterns
- **HF Brand Integration**: Uses official logo (`/assets/hf-logo.png`) and gradient colors
- **Responsive Controls**: All sliders use custom `.slider-hf` class with HF colors
- **Audio State Management**: Manual synchronization between UI state and HTML5 audio element
- **Progressive Enhancement**: Settings always visible, no hidden toggles

### Backend Patterns
- **Development Mode**: Auto-reload enabled with `--dev` flag
- **Mock Implementation**: Currently returns `/samples/harvard.wav` for testing
- **CORS Configuration**: Explicitly configured for development ports

### Styling Conventions
- **TailwindCSS**: Primary styling framework
- **Custom CSS**: Limited to audio sliders and pause icon in `app.css`
- **Color Scheme**: Light theme with HF amber/orange accents
- **Typography**: System fonts with careful spacing

## Key Implementation Details

### Audio System
The app uses a hidden HTML5 `<audio>` element controlled by custom UI:
- Real audio playback through bound `audioElement`
- Manual progress tracking via `timeupdate` events  
- Auto-play when generation completes
- Custom pause icon using CSS pseudo-elements
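
The manual progress tracking described above could look roughly like the following. This is a minimal sketch, not the code in `+page.svelte`: the helper names `progressPercent` and `formatTime` are hypothetical, and the event wiring is shown only as a comment because it needs a browser.

```javascript
// Convert the playback position to a 0-100 percentage for a progress bar.
function progressPercent(currentTime, duration) {
  // duration is NaN until metadata has loaded and can be Infinity for streams.
  if (!Number.isFinite(currentTime) || !Number.isFinite(duration) || duration <= 0) {
    return 0;
  }
  const ratio = currentTime / duration;
  return Math.min(100, Math.max(0, ratio * 100));
}

// Format a number of seconds as m:ss for the time labels.
function formatTime(seconds) {
  if (!Number.isFinite(seconds) || seconds < 0) return '0:00';
  const whole = Math.floor(seconds);
  const minutes = Math.floor(whole / 60);
  const secs = String(whole % 60).padStart(2, '0');
  return `${minutes}:${secs}`;
}

// Hypothetical wiring in the browser, assuming `audioElement` is the bound
// <audio> element and `progress` is a reactive variable:
//
// audioElement.addEventListener('timeupdate', () => {
//   progress = progressPercent(audioElement.currentTime, audioElement.duration);
// });
```

Guarding against a non-finite `duration` matters because `timeupdate` can fire before the audio metadata is available.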

### Voice & Model Data
```javascript
// Current models (in +page.svelte)
const models = [
  { id: 'chatterbox', name: 'Chatterbox', badge: 'recommended' },
  { id: 'kokoro', name: 'Kokoro', badge: 'faster but lower quality' }
];

// Current voices with descriptions
const voices = [
  { id: 'novia', name: 'Novia', description: 'Warm, conversational voice' },
  // ... etc
];
```

### Configuration Files
- **Frontend env**: `.env` with `PUBLIC_API_URL=http://localhost:11110`
- **Vite config**: Custom port (11111) and host settings
- **TailwindCSS**: Custom colors and slider styling
- **Backend requirements**: FastAPI, numpy, soundfile for audio processing
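
For orientation, a Vite config that pins the frontend port might look like the fragment below. It is a sketch under assumptions rather than the repository's actual `vite.config.js`: the `host` value and the use of `strictPort` are guesses, and only the port number comes from this document.

```javascript
// vite.config.js (illustrative fragment)
import { sveltekit } from '@sveltejs/kit/vite';
import { defineConfig } from 'vite';

export default defineConfig({
  plugins: [sveltekit()],
  server: {
    port: 11111,       // frontend dev port listed under Port Configuration
    strictPort: true,  // ASSUMPTION: fail rather than fall back to another port
    host: 'localhost'  // ASSUMPTION: the real host setting is not shown in this file
  }
});
```

Pinning the port with `strictPort` is one way to keep the frontend origin aligned with the CORS origins configured in `server.py`, since Vite otherwise tries the next free port.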

### Static Assets
- **Logo**: HuggingFace logo at `/assets/hf-logo.png` 
- **Sample audio**: Harvard sample at `/samples/harvard.wav` for testing
- **Favicon**: Uses HF logo

## Development Workflow

### Making UI Changes
1. Edit components in `frontend/src/routes/+page.svelte`
2. Layout changes go in `frontend/src/routes/+layout.svelte`  
3. Global styles in `frontend/src/app.css`
4. Hot reload shows changes instantly

### Adding API Features  
1. Add Pydantic models in `server.py`
2. Create new endpoints following existing patterns
3. Update CORS origins if needed
4. Test at `http://localhost:11110/docs`

### Troubleshooting
- **Port conflicts**: Change ports in `vite.config.js` and `cli.py`
- **CORS issues**: Update `allow_origins` in `server.py`
- **Audio not playing**: Check that the audio file exists at `/samples/harvard.wav`
- **Dependencies**: Run `npm install` in frontend, `pip install -r requirements.txt` in backend