| # API Endpoints - Optimization Parameters Wired Up | |
| All optimization parameters are now exposed through the API endpoints. | |
| ## Updated Endpoints | |
| ### 1. `/train/start` (Fine-tuning) | |
| **Request Model**: `TrainRequest` | |
| **New Optimization Parameters**: | |
| - `gradient_accumulation_steps` (int, default: 1) - Gradient accumulation | |
| - `use_amp` (bool, default: True) - Mixed precision training | |
| - `warmup_steps` (int, default: 0) - Learning rate warmup | |
| - `num_workers` (Optional[int], default: None) - Data loading workers | |
| - `resume_from_checkpoint` (Optional[str], default: None) - Resume training | |
| - `use_ema` (bool, default: False) - Exponential Moving Average | |
| - `ema_decay` (float, default: 0.9999) - EMA decay factor | |
| - `use_onecycle` (bool, default: False) - OneCycleLR scheduler | |
| - `use_gradient_checkpointing` (bool, default: False) - Memory-efficient training | |
| - `compile_model` (bool, default: True) - `torch.compile` optimization | |
| **Example Request**: | |
| ```json | |
| { | |
| "training_data_dir": "data/training", | |
| "epochs": 10, | |
| "lr": 1e-5, | |
| "batch_size": 1, | |
| "use_amp": true, | |
| "gradient_accumulation_steps": 4, | |
| "use_ema": true, | |
| "use_onecycle": true, | |
| "compile_model": true | |
| } | |
| ``` | |
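The parameter list and example request above can be mirrored with a small stand-in model. This is a minimal sketch that uses a standard-library dataclass instead of the actual Pydantic `TrainRequest`; the field names and defaults are copied from the list above, while `OptimizationParamsSketch` and `parse_optimization_params` are hypothetical names introduced only for illustration. The effective-batch-size line assumes the usual meaning of gradient accumulation (gradients from several batches are summed before one optimizer step), which the service code may or may not follow exactly.

```python
import json
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class OptimizationParamsSketch:
    """Stand-in for the optimization fields of TrainRequest (not the real model)."""
    gradient_accumulation_steps: int = 1
    use_amp: bool = True
    warmup_steps: int = 0
    num_workers: Optional[int] = None
    resume_from_checkpoint: Optional[str] = None
    use_ema: bool = False
    ema_decay: float = 0.9999
    use_onecycle: bool = False
    use_gradient_checkpointing: bool = False
    compile_model: bool = True


def parse_optimization_params(payload: str) -> OptimizationParamsSketch:
    """Pick the optimization keys out of a JSON body; omitted keys keep their defaults."""
    body = json.loads(payload)
    known = {f.name for f in fields(OptimizationParamsSketch)}
    return OptimizationParamsSketch(**{k: v for k, v in body.items() if k in known})


# Same body as the example request above.
example = """{
  "training_data_dir": "data/training",
  "epochs": 10,
  "lr": 1e-5,
  "batch_size": 1,
  "use_amp": true,
  "gradient_accumulation_steps": 4,
  "use_ema": true,
  "use_onecycle": true,
  "compile_model": true
}"""

params = parse_optimization_params(example)

# Assumption: samples per optimizer step = batch_size * gradient_accumulation_steps.
effective_batch_size = json.loads(example)["batch_size"] * params.gradient_accumulation_steps

print(params)
print("effective batch size:", effective_batch_size)
```

Fields that the request omits, such as `warmup_steps` and `ema_decay`, fall back to the documented defaults, which is the behavior a Pydantic model with default values would also give.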
| ### 2. `/train/pretrain` (Pre-training) | |
| **Request Model**: `PretrainRequest` | |
| **New Optimization Parameters**: | |
| - All parameters listed for `/train/start`, plus: | |
| - `cache_dir` (Optional[str], default: None) - BA result caching directory | |
| **Example Request**: | |
| ```json | |
| { | |
| "arkit_sequences_dir": "data/arkit_sequences", | |
| "epochs": 10, | |
| "lr": 1e-4, | |
| "use_amp": true, | |
| "use_ema": true, | |
| "use_onecycle": true, | |
| "cache_dir": "cache/ba_results", | |
| "compile_model": true | |
| } | |
| ``` | |
| ### 3. `/dataset/build` (Dataset Building) | |
| **Request Model**: `BuildDatasetRequest` | |
| **New Optimization Parameters**: | |
| - `use_batched_inference` (bool, default: False) - Batch multiple sequences | |
| - `inference_batch_size` (int, default: 4) - Batch size for inference | |
| - `use_inference_cache` (bool, default: False) - Cache inference results | |
| - `cache_dir` (Optional[str], default: None) - Inference cache directory | |
| - `compile_model` (bool, default: True) - `torch.compile` for inference | |
| **Example Request**: | |
| ```json | |
| { | |
| "sequences_dir": "data/sequences", | |
| "output_dir": "data/training", | |
| "use_batched_inference": true, | |
| "inference_batch_size": 4, | |
| "use_inference_cache": true, | |
| "cache_dir": "cache/inference", | |
| "compile_model": true | |
| } | |
| ``` | |
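To make the interaction between `use_batched_inference`, `inference_batch_size`, and `use_inference_cache` concrete, here is a rough sketch of how such flags could shape a build loop. It is not the code in `data_pipeline.py`: `chunked`, `fake_inference`, and `build_dataset_sketch` are hypothetical helpers, the model call is a placeholder, and an in-memory dict stands in for whatever is stored under `cache_dir`.

```python
from typing import Dict, Iterator, List, Optional, Tuple


def chunked(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be >= 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def fake_inference(batch: List[str]) -> Dict[str, str]:
    """Placeholder for a model forward pass over a batch of sequences."""
    return {name: f"result-for-{name}" for name in batch}


def build_dataset_sketch(
    sequences: List[str],
    use_batched_inference: bool = False,
    inference_batch_size: int = 4,
    use_inference_cache: bool = False,
    cache: Optional[Dict[str, str]] = None,
) -> Tuple[Dict[str, str], int]:
    """Return (results, number of inference calls made)."""
    cache = cache if cache is not None else {}
    results: Dict[str, str] = {}

    # Reuse cached results first, so only uncached sequences reach the model.
    pending: List[str] = []
    for name in sequences:
        if use_inference_cache and name in cache:
            results[name] = cache[name]
        else:
            pending.append(name)

    # Without batching, every sequence is its own call (batch of one).
    size = inference_batch_size if use_batched_inference else 1
    calls = 0
    for batch in chunked(pending, size):
        outputs = fake_inference(batch)
        calls += 1
        results.update(outputs)
        if use_inference_cache:
            cache.update(outputs)
    return results, calls


sequences = [f"seq_{i:02d}" for i in range(10)]
shared_cache: Dict[str, str] = {}

_, unbatched_calls = build_dataset_sketch(sequences)
_, batched_calls = build_dataset_sketch(
    sequences, use_batched_inference=True, inference_batch_size=4,
    use_inference_cache=True, cache=shared_cache,
)
_, cached_calls = build_dataset_sketch(
    sequences, use_batched_inference=True, inference_batch_size=4,
    use_inference_cache=True, cache=shared_cache,
)
print(unbatched_calls, batched_calls, cached_calls)
```

In this toy setup, batching ten sequences in groups of four reduces the number of model calls, and a second run against a warm cache needs none at all.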
| ## Data Flow | |
| ``` | |
| API Request (JSON) | |
| ↓ | |
| Request Model (Pydantic validation) | |
| ↓ | |
| Router Endpoint (training.py) | |
| ↓ | |
| CLI Function (cli.py) - passes through all params | |
| ↓ | |
| Service Function (fine_tune.py / pretrain.py / data_pipeline.py) | |
| ↓ | |
| Optimized Training/Inference | |
| ``` | |
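The layering in the diagram can be sketched as a chain of plain functions that forward keyword arguments unchanged. This is only an illustration of the pass-through idea: `TrainRequestSketch`, `start_training_endpoint`, `train_cli`, and `fine_tune_service` are hypothetical stand-ins rather than the real signatures in `training.py`, `cli.py`, or `fine_tune.py`, and a dataclass replaces the Pydantic model.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict


@dataclass
class TrainRequestSketch:
    """Reduced stand-in for the request model; only a few fields are shown."""
    training_data_dir: str
    epochs: int
    use_amp: bool = True
    gradient_accumulation_steps: int = 1
    compile_model: bool = True


def fine_tune_service(**params: Any) -> Dict[str, Any]:
    """Service layer: where training would actually run."""
    return {"stage": "service", "received": params}


def train_cli(**params: Any) -> Dict[str, Any]:
    """CLI layer: forwards every parameter without modification."""
    return fine_tune_service(**params)


def start_training_endpoint(request: TrainRequestSketch) -> Dict[str, Any]:
    """Router layer: unpacks the validated request into keyword arguments."""
    return train_cli(**asdict(request))


def handle_json(body: Dict[str, Any]) -> Dict[str, Any]:
    """Entry point: build the request model, then hand it to the router."""
    # The dataclass raises TypeError on missing or unknown keys; the real
    # Pydantic model would report a validation error instead.
    request = TrainRequestSketch(**body)
    return start_training_endpoint(request)


result = handle_json({
    "training_data_dir": "data/training",
    "epochs": 10,
    "gradient_accumulation_steps": 4,
})
print(result)
```

Because each layer forwards `**params` as-is, a value set in the JSON body (here `gradient_accumulation_steps`) and a default filled in by the model (here `use_amp`) both arrive at the service function.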
| ## Files Updated | |
| 1. **`ylff/models/api_models.py`** | |
| - Added optimization fields to `TrainRequest` | |
| - Added optimization fields to `PretrainRequest` | |
| - Added optimization fields to `BuildDatasetRequest` | |
| 2. **`ylff/routers/training.py`** | |
| - Updated `/train/start` to pass optimization params | |
| - Updated `/train/pretrain` to pass optimization params | |
| - Updated `/dataset/build` to pass optimization params | |
| 3. **`ylff/cli.py`** | |
| - Updated `train()` CLI function to accept optimization params | |
| - Updated `pretrain()` CLI function to accept optimization params | |
| - Updated `build_dataset()` CLI function to accept optimization params | |
| - All params are passed through to service functions | |
| ## Usage Examples | |
| ### Fast Training via API | |
| ```bash | |
| curl -X POST "http://localhost:8000/api/v1/train/start" \ | |
| -H "Content-Type: application/json" \ | |
| -d '{ | |
| "training_data_dir": "data/training", | |
| "epochs": 10, | |
| "use_amp": true, | |
| "gradient_accumulation_steps": 4, | |
| "use_ema": true, | |
| "use_onecycle": true, | |
| "compile_model": true | |
| }' | |
| ``` | |
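The same call can be issued from Python with only the standard library. The sketch below reuses the URL and body from the `curl` command above; `build_train_request` and `send` are hypothetical helper names, and decoding the response as JSON is an assumption, since the response format is not described here.

```python
import json
import urllib.request
from typing import Any, Dict

# Host, port, and prefix taken from the curl example above.
BASE_URL = "http://localhost:8000/api/v1"


def build_train_request(payload: Dict[str, Any]) -> urllib.request.Request:
    """Prepare (but do not send) a POST to /train/start with a JSON body."""
    return urllib.request.Request(
        url=f"{BASE_URL}/train/start",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> Any:
    """Send the request; assumes the server replies with a JSON document."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


payload = {
    "training_data_dir": "data/training",
    "epochs": 10,
    "use_amp": True,
    "gradient_accumulation_steps": 4,
    "use_ema": True,
    "use_onecycle": True,
    "compile_model": True,
}

request = build_train_request(payload)
print(request.get_method(), request.full_url)

# Uncomment once the API server is running locally:
# print(send(request))
```

Keeping request construction separate from sending makes the payload easy to inspect or log before any training job is started.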
| ### Optimized Dataset Building | |
| ```bash | |
| curl -X POST "http://localhost:8000/api/v1/dataset/build" \ | |
| -H "Content-Type: application/json" \ | |
| -d '{ | |
| "sequences_dir": "data/sequences", | |
| "use_batched_inference": true, | |
| "inference_batch_size": 4, | |
| "use_inference_cache": true, | |
| "cache_dir": "cache/inference" | |
| }' | |
| ``` | |
| ## Status | |
| All optimization parameters are: | |
| - Defined in API request models | |
| - Validated by Pydantic | |
| - Passed through router endpoints | |
| - Accepted by CLI functions | |
| - Forwarded to service functions | |
| - Documented with descriptions and examples | |
| The API is fully wired up to use all optimization capabilities. | |