# API Endpoints - Optimization Parameters Wired Up

All optimization parameters are now exposed through the API endpoints.

## ✅ Updated Endpoints

### 1. `/train/start` (Fine-tuning)

**Request Model**: `TrainRequest`

**New Optimization Parameters**:

- `gradient_accumulation_steps` (int, default: 1) - Gradient accumulation
- `use_amp` (bool, default: True) - Mixed precision training
- `warmup_steps` (int, default: 0) - Learning rate warmup
- `num_workers` (Optional[int], default: None) - Data loading workers
- `resume_from_checkpoint` (Optional[str], default: None) - Resume training
- `use_ema` (bool, default: False) - Exponential Moving Average
- `ema_decay` (float, default: 0.9999) - EMA decay factor
- `use_onecycle` (bool, default: False) - OneCycleLR scheduler
- `use_gradient_checkpointing` (bool, default: False) - Memory-efficient training
- `compile_model` (bool, default: True) - `torch.compile` optimization
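
The defaults listed above can be mirrored on the client side to build a request body. The sketch below uses a stdlib dataclass as a stand-in; the server-side `TrainRequest` is a Pydantic model whose actual definition is not shown here, and the names `TrainOptimizationOptions` and `to_payload` are hypothetical.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class TrainOptimizationOptions:
    """Hypothetical client-side mirror of the optimization fields listed above."""

    gradient_accumulation_steps: int = 1
    use_amp: bool = True
    warmup_steps: int = 0
    num_workers: Optional[int] = None
    resume_from_checkpoint: Optional[str] = None
    use_ema: bool = False
    ema_decay: float = 0.9999
    use_onecycle: bool = False
    use_gradient_checkpointing: bool = False
    compile_model: bool = True


def to_payload(options: TrainOptimizationOptions) -> dict:
    """Drop unset (None) fields so the server-side defaults apply."""
    return {key: value for key, value in asdict(options).items() if value is not None}


# Override two fields and keep the documented defaults for the rest.
options = TrainOptimizationOptions(gradient_accumulation_steps=4, use_ema=True)
payload = to_payload(options)
```

The resulting `payload` dict can be merged with the required training fields (for example `training_data_dir`) before it is serialized to JSON.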

**Example Request**:

```json
{
  "training_data_dir": "data/training",
  "epochs": 10,
  "lr": 1e-5,
  "batch_size": 1,
  "use_amp": true,
  "gradient_accumulation_steps": 4,
  "use_ema": true,
  "use_onecycle": true,
  "compile_model": true
}
```
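
In the request above, `batch_size` is 1 and `gradient_accumulation_steps` is 4. Assuming gradient accumulation is implemented in the usual way (gradients from several micro-batches are summed before one optimizer step), the effective batch size per optimizer step can be sketched as follows. The helper name is hypothetical.

```python
def effective_batch_size(batch_size: int, gradient_accumulation_steps: int) -> int:
    """Samples contributing to each optimizer step under standard accumulation."""
    if batch_size < 1 or gradient_accumulation_steps < 1:
        raise ValueError("both values must be positive integers")
    return batch_size * gradient_accumulation_steps


# Values taken from the example request above.
request = {"batch_size": 1, "gradient_accumulation_steps": 4}
samples_per_step = effective_batch_size(
    request["batch_size"], request["gradient_accumulation_steps"]
)
```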

### 2. `/train/pretrain` (Pre-training)

**Request Model**: `PretrainRequest`

**New Optimization Parameters**:

- All optimization parameters from `/train/start`, plus:
- `cache_dir` (Optional[str], default: None) - BA result caching directory

**Example Request**:

```json
{
  "arkit_sequences_dir": "data/arkit_sequences",
  "epochs": 10,
  "lr": 1e-4,
  "use_amp": true,
  "use_ema": true,
  "use_onecycle": true,
  "cache_dir": "cache/ba_results",
  "compile_model": true
}
```

### 3. `/dataset/build` (Dataset Building)

**Request Model**: `BuildDatasetRequest`

**New Optimization Parameters**:

- `use_batched_inference` (bool, default: False) - Batch multiple sequences
- `inference_batch_size` (int, default: 4) - Batch size for inference
- `use_inference_cache` (bool, default: False) - Cache inference results
- `cache_dir` (Optional[str], default: None) - Inference cache directory
- `compile_model` (bool, default: True) - `torch.compile` for inference
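
To illustrate what `inference_batch_size` controls, here is a minimal sketch of grouping sequences into batches before inference. It assumes batching is a simple fixed-size chunking of the sequence list; the helper `batched` and the sequence names are hypothetical and not taken from the project code.

```python
from typing import Iterator, List, Sequence


def batched(items: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


# Ten hypothetical sequences grouped with the default inference_batch_size of 4.
sequence_ids = [f"seq_{i:02d}" for i in range(10)]
batches = list(batched(sequence_ids, batch_size=4))
```

With ten sequences and a batch size of 4, the last batch is smaller than the others, so any batched inference code has to handle a partial final batch.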

**Example Request**:

```json
{
  "sequences_dir": "data/sequences",
  "output_dir": "data/training",
  "use_batched_inference": true,
  "inference_batch_size": 4,
  "use_inference_cache": true,
  "cache_dir": "cache/inference",
  "compile_model": true
}
```

## 🔄 Data Flow

```
API Request (JSON)
    ↓
Request Model (Pydantic validation)
    ↓
Router Endpoint (training.py)
    ↓
CLI Function (cli.py) - passes through all params
    ↓
Service Function (fine_tune.py / pretrain.py / data_pipeline.py)
    ↓
Optimized Training/Inference
```
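
The pass-through pattern in the flow above can be sketched with plain functions. This is an illustration only: the function names and signatures below are hypothetical stand-ins for the router, CLI, and service layers, and a plain dict stands in for the validated Pydantic request.

```python
def fine_tune_service(training_data_dir: str, epochs: int = 10, **optimization) -> dict:
    """Hypothetical service layer: returns the settings it would train with."""
    return {"training_data_dir": training_data_dir, "epochs": epochs, **optimization}


def train_cli(**params) -> dict:
    """Hypothetical CLI layer: forwards every parameter unchanged."""
    return fine_tune_service(**params)


def train_start_endpoint(request: dict) -> dict:
    """Hypothetical router layer: unpacks the validated request into the CLI call."""
    return train_cli(**request)


result = train_start_endpoint(
    {
        "training_data_dir": "data/training",
        "epochs": 10,
        "use_amp": True,
        "gradient_accumulation_steps": 4,
    }
)
```

Forwarding keyword arguments at each layer means a new optimization field only needs to be added to the request model and the service function, at the cost of less explicit signatures in between.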

## 📝 Files Updated

1. **`ylff/models/api_models.py`**

   - Added optimization fields to `TrainRequest`
   - Added optimization fields to `PretrainRequest`
   - Added optimization fields to `BuildDatasetRequest`

2. **`ylff/routers/training.py`**

   - Updated `/train/start` to pass optimization params
   - Updated `/train/pretrain` to pass optimization params
   - Updated `/dataset/build` to pass optimization params

3. **`ylff/cli.py`**

   - Updated `train()` CLI function to accept optimization params
   - Updated `pretrain()` CLI function to accept optimization params
   - Updated `build_dataset()` CLI function to accept optimization params
   - All params are passed through to service functions

## 🎯 Usage Examples

### Fast Training via API

```bash
curl -X POST "http://localhost:8000/api/v1/train/start" \
  -H "Content-Type: application/json" \
  -d '{
    "training_data_dir": "data/training",
    "epochs": 10,
    "use_amp": true,
    "gradient_accumulation_steps": 4,
    "use_ema": true,
    "use_onecycle": true,
    "compile_model": true
  }'
```
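
The same request can be built from Python using only the standard library. This is a sketch that mirrors the `curl` call above; it constructs the request object and leaves the network call commented out, since it assumes a server is listening on `localhost:8000`.

```python
import json
import urllib.request

API_URL = "http://localhost:8000/api/v1/train/start"

# Same body as the curl example above.
payload = {
    "training_data_dir": "data/training",
    "epochs": 10,
    "use_amp": True,
    "gradient_accumulation_steps": 4,
    "use_ema": True,
    "use_onecycle": True,
    "compile_model": True,
}

request = urllib.request.Request(
    API_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# Uncomment to send the request once the API server is running:
# with urllib.request.urlopen(request) as response:
#     print(response.status, response.read().decode("utf-8"))
```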

### Optimized Dataset Building

```bash
curl -X POST "http://localhost:8000/api/v1/dataset/build" \
  -H "Content-Type: application/json" \
  -d '{
    "sequences_dir": "data/sequences",
    "use_batched_inference": true,
    "inference_batch_size": 4,
    "use_inference_cache": true,
    "cache_dir": "cache/inference"
  }'
```

## ✅ Status

All optimization parameters are:

- ✅ Defined in API request models
- ✅ Validated by Pydantic
- ✅ Passed through router endpoints
- ✅ Accepted by CLI functions
- ✅ Forwarded to service functions
- ✅ Documented with descriptions and examples

The API is fully wired up to use all optimization capabilities! 🚀