# API Endpoints - Optimization Parameters Wired Up
All optimization parameters are now exposed through the API endpoints.
## Updated Endpoints
### 1. `/train/start` (Fine-tuning)

**Request model:** `TrainRequest`
New optimization parameters:

- `gradient_accumulation_steps` (int, default: 1) - Gradient accumulation
- `use_amp` (bool, default: True) - Mixed-precision training
- `warmup_steps` (int, default: 0) - Learning-rate warmup
- `num_workers` (Optional[int], default: None) - Data-loading workers
- `resume_from_checkpoint` (Optional[str], default: None) - Resume training
- `use_ema` (bool, default: False) - Exponential Moving Average (EMA)
- `ema_decay` (float, default: 0.9999) - EMA decay factor
- `use_onecycle` (bool, default: False) - OneCycleLR scheduler
- `use_gradient_checkpointing` (bool, default: False) - Memory-efficient training
- `compile_model` (bool, default: True) - `torch.compile` optimization
Example request:

```json
{
  "training_data_dir": "data/training",
  "epochs": 10,
  "lr": 1e-5,
  "batch_size": 1,
  "use_amp": true,
  "gradient_accumulation_steps": 4,
  "use_ema": true,
  "use_onecycle": true,
  "compile_model": true
}
```
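For reference, these fields correspond to a Pydantic model along the lines of the sketch below. This is a minimal reconstruction from the parameter list above, not the actual contents of `ylff/models/api_models.py`; the defaults for `epochs`, `lr`, and `batch_size` are assumed from the example request.

```python
from typing import Optional
from pydantic import BaseModel, Field

class TrainRequest(BaseModel):
    # Core training settings (defaults assumed from the example request)
    training_data_dir: str
    epochs: int = 10
    lr: float = 1e-5
    batch_size: int = 1

    # Optimization parameters, as listed above
    gradient_accumulation_steps: int = Field(1, description="Gradient accumulation")
    use_amp: bool = Field(True, description="Mixed-precision training")
    warmup_steps: int = Field(0, description="Learning-rate warmup")
    num_workers: Optional[int] = Field(None, description="Data-loading workers")
    resume_from_checkpoint: Optional[str] = Field(None, description="Resume training")
    use_ema: bool = Field(False, description="Exponential Moving Average")
    ema_decay: float = Field(0.9999, description="EMA decay factor")
    use_onecycle: bool = Field(False, description="OneCycleLR scheduler")
    use_gradient_checkpointing: bool = Field(False, description="Memory-efficient training")
    compile_model: bool = Field(True, description="torch.compile optimization")
```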
### 2. `/train/pretrain` (Pre-training)

**Request model:** `PretrainRequest`
New optimization parameters:

- All of the parameters from `/train/start`, plus:
- `cache_dir` (Optional[str], default: None) - BA result caching directory
Example request:

```json
{
  "arkit_sequences_dir": "data/arkit_sequences",
  "epochs": 10,
  "lr": 1e-4,
  "use_amp": true,
  "use_ema": true,
  "use_onecycle": true,
  "cache_dir": "cache/ba_results",
  "compile_model": true
}
```
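Since `PretrainRequest` shares every optimization field with `TrainRequest`, one plausible way to express it is to duplicate those fields (or factor them into a shared base model). Only the additions are sketched here; the real model in `ylff/models/api_models.py` may be structured differently, and the `lr` default is assumed from the example.

```python
from typing import Optional
from pydantic import BaseModel, Field

class PretrainRequest(BaseModel):
    arkit_sequences_dir: str
    epochs: int = 10
    lr: float = 1e-4  # default assumed from the example request
    # ... the same optimization fields as TrainRequest (use_amp, use_ema,
    # use_onecycle, compile_model, etc.) ...
    cache_dir: Optional[str] = Field(None, description="BA result caching directory")
```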
### 3. `/dataset/build` (Dataset Building)

**Request model:** `BuildDatasetRequest`
New optimization parameters:

- `use_batched_inference` (bool, default: False) - Batch multiple sequences
- `inference_batch_size` (int, default: 4) - Batch size for inference
- `use_inference_cache` (bool, default: False) - Cache inference results
- `cache_dir` (Optional[str], default: None) - Inference cache directory
- `compile_model` (bool, default: True) - `torch.compile` for inference
Example request:

```json
{
  "sequences_dir": "data/sequences",
  "output_dir": "data/training",
  "use_batched_inference": true,
  "inference_batch_size": 4,
  "use_inference_cache": true,
  "cache_dir": "cache/inference",
  "compile_model": true
}
```
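The corresponding model sketch, again reconstructed from the field list above rather than copied from the source, with `output_dir`'s default assumed from the example:

```python
from typing import Optional
from pydantic import BaseModel, Field

class BuildDatasetRequest(BaseModel):
    sequences_dir: str
    output_dir: str = "data/training"  # default assumed from the example
    use_batched_inference: bool = Field(False, description="Batch multiple sequences")
    inference_batch_size: int = Field(4, description="Batch size for inference")
    use_inference_cache: bool = Field(False, description="Cache inference results")
    cache_dir: Optional[str] = Field(None, description="Inference cache directory")
    compile_model: bool = Field(True, description="torch.compile for inference")
```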
## Data Flow

```text
API Request (JSON)
        ↓
Request Model (Pydantic validation)
        ↓
Router Endpoint (training.py)
        ↓
CLI Function (cli.py) - passes through all params
        ↓
Service Function (fine_tune.py / pretrain.py / data_pipeline.py)
        ↓
Optimized Training / Inference
```
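The router step of this chain might look roughly like the sketch below. This is illustrative only: whether the real `/train/start` endpoint runs training in a background task, and what it returns, are assumptions, and the import paths are guessed from the file names above. `train` is the CLI function documented in the next section.

```python
from fastapi import APIRouter, BackgroundTasks
from ylff.cli import train                      # import path assumed
from ylff.models.api_models import TrainRequest

router = APIRouter()

@router.post("/train/start")
async def start_training(req: TrainRequest, background_tasks: BackgroundTasks):
    # Pydantic has already validated types and filled defaults at this point.
    # Forwarding model_dump() passes every optimization parameter through
    # without naming each one individually (use .dict() on Pydantic v1).
    background_tasks.add_task(train, **req.model_dump())
    return {"status": "started"}
```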
## Files Updated

- `ylff/models/api_models.py`
  - Added optimization fields to `TrainRequest`
  - Added optimization fields to `PretrainRequest`
  - Added optimization fields to `BuildDatasetRequest`
- `ylff/routers/training.py`
  - Updated `/train/start` to pass optimization params
  - Updated `/train/pretrain` to pass optimization params
  - Updated `/dataset/build` to pass optimization params
- `ylff/cli.py`
  - Updated the `train()` CLI function to accept optimization params
  - Updated the `pretrain()` CLI function to accept optimization params
  - Updated the `build_dataset()` CLI function to accept optimization params
  - All params are passed through to the service functions (see the sketch below)
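A condensed sketch of the pass-through in `ylff/cli.py`. The parameter list mirrors the `/train/start` fields above; the service entry point `run_fine_tuning` and the `ylff.services` import path are hypothetical stand-ins for whatever fine_tune.py actually exposes.

```python
from typing import Optional
from ylff.services import fine_tune  # import path assumed

def train(
    training_data_dir: str,
    epochs: int = 10,
    lr: float = 1e-5,
    batch_size: int = 1,
    gradient_accumulation_steps: int = 1,
    use_amp: bool = True,
    warmup_steps: int = 0,
    num_workers: Optional[int] = None,
    resume_from_checkpoint: Optional[str] = None,
    use_ema: bool = False,
    ema_decay: float = 0.9999,
    use_onecycle: bool = False,
    use_gradient_checkpointing: bool = False,
    compile_model: bool = True,
) -> None:
    # Every optimization parameter is forwarded unchanged to the service layer.
    fine_tune.run_fine_tuning(
        training_data_dir=training_data_dir,
        epochs=epochs,
        lr=lr,
        batch_size=batch_size,
        gradient_accumulation_steps=gradient_accumulation_steps,
        use_amp=use_amp,
        warmup_steps=warmup_steps,
        num_workers=num_workers,
        resume_from_checkpoint=resume_from_checkpoint,
        use_ema=use_ema,
        ema_decay=ema_decay,
        use_onecycle=use_onecycle,
        use_gradient_checkpointing=use_gradient_checkpointing,
        compile_model=compile_model,
    )
```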
## Usage Examples

### Fast Training via API
```bash
curl -X POST "http://localhost:8000/api/v1/train/start" \
  -H "Content-Type: application/json" \
  -d '{
    "training_data_dir": "data/training",
    "epochs": 10,
    "use_amp": true,
    "gradient_accumulation_steps": 4,
    "use_ema": true,
    "use_onecycle": true,
    "compile_model": true
  }'
```
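The same request from Python, using the `requests` library. The payload is identical to the curl example; the base URL is assumed to match your deployment.

```python
import requests

payload = {
    "training_data_dir": "data/training",
    "epochs": 10,
    "use_amp": True,
    "gradient_accumulation_steps": 4,
    "use_ema": True,
    "use_onecycle": True,
    "compile_model": True,
}

resp = requests.post("http://localhost:8000/api/v1/train/start", json=payload)
resp.raise_for_status()  # surface HTTP errors early
print(resp.json())
```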
### Optimized Dataset Building
```bash
curl -X POST "http://localhost:8000/api/v1/dataset/build" \
  -H "Content-Type: application/json" \
  -d '{
    "sequences_dir": "data/sequences",
    "use_batched_inference": true,
    "inference_batch_size": 4,
    "use_inference_cache": true,
    "cache_dir": "cache/inference"
  }'
```
## Status

All optimization parameters are:

- Defined in the API request models
- Validated by Pydantic
- Passed through the router endpoints
- Accepted by the CLI functions
- Forwarded to the service functions
- Documented with descriptions and examples

The API is fully wired up to use all optimization capabilities.