Getting Started with OktoEngine
Your first 5 minutes with OktoEngine - A quick guide to get you up and running.
Prerequisites
- OktoEngine installed (download from GitHub Releases)
- Basic understanding of AI/ML concepts
- A dataset ready for training (optional for first run)
Step 1: Install OktoEngine
Download Pre-built Binary
- Visit GitHub Releases
- Download the binary for your platform:
  - Windows: okto-windows.exe
  - Linux: okto-linux
  - macOS: okto-macos
- Make it executable (Linux/Mac):
chmod +x okto-linux    # or okto-macos on macOS
- Add to PATH (optional but recommended)
Verify Installation
okto --version
Should output: okto 0.1.0
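If the version check fails, the usual cause is that the binary is not on PATH. The sketch below is one way to script the same check from Python; it only relies on the `okto --version` command shown above, and the helper name `okto_version` is made up for illustration.

```python
import shutil
import subprocess


def okto_version(executable="okto"):
    """Return the output of `executable --version`, or None if it is not on PATH."""
    path = shutil.which(executable)
    if path is None:
        return None
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    return result.stdout.strip()


if __name__ == "__main__":
    version = okto_version()
    if version is None:
        print("okto was not found on PATH; add its folder to PATH or call it by full path")
    else:
        print(version)
```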
Step 2: Check Your System
Before starting, check if your system is ready:
okto doctor
This will show:
- ✓ Platform information
- ✓ RAM and CPU
- ✓ GPU detection
- ✓ CUDA availability
- ✓ Runtime environment
- ✓ Dependencies status
If dependencies are missing:
okto doctor --install
Automatically installs missing dependencies.
Step 3: Create Your First Project
Initialize a new OktoScript project:
okto init my-first-model
cd my-first-model
This creates:
my-first-model/
├── scripts/
│   └── train.okt       # Your training configuration
├── dataset/
│   ├── train.jsonl     # Training data (sample)
│   └── val.jsonl       # Validation data (sample)
└── export/             # Where models will be exported
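To confirm that `okto init` produced the layout listed above, a short script can check for each path. This is a minimal sketch: the path list is copied from the tree, and `missing_paths` is a hypothetical helper, not part of OktoEngine.

```python
from pathlib import Path

# Paths taken from the project tree, relative to the project root.
EXPECTED = [
    "scripts/train.okt",
    "dataset/train.jsonl",
    "dataset/val.jsonl",
    "export",
]


def missing_paths(project_root):
    """Return the expected paths that do not exist under project_root."""
    root = Path(project_root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_paths(".")
    print("Layout looks complete" if not missing else f"Missing: {missing}")
```

Run it from inside the project folder (after `cd my-first-model`).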
Step 4: Prepare Your Dataset
Edit dataset/train.jsonl with your training data:
dataset/train.jsonl:
{"input":"Hello","output":"Hi! How can I help you?"}
{"input":"What's the weather?","output":"I don't have access to weather data."}
{"input":"Thank you","output":"You're welcome!"}
Minimum requirements:
- At least 10 examples for basic training
- Consistent format (JSONL recommended)
- Valid JSON on each line
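The requirements above can be checked before training with a few lines of Python. This is a sketch under assumptions: it treats the "input" and "output" keys from the sample data as required (the guide does not say whether other keys are accepted), it skips blank lines, and `check_jsonl` is an illustrative name.

```python
import json


def check_jsonl(lines, required_keys=("input", "output"), min_examples=10):
    """Return a list of problems found in an iterable of JSONL lines."""
    problems = []
    count = 0
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # blank lines are skipped (a leniency of this sketch)
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append(f"line {number}: invalid JSON ({exc.msg})")
            continue
        if not isinstance(record, dict):
            problems.append(f"line {number}: expected a JSON object")
            continue
        missing = [key for key in required_keys if key not in record]
        if missing:
            problems.append(f"line {number}: missing keys {missing}")
            continue
        count += 1
    if count < min_examples:
        problems.append(
            f"only {count} valid examples, expected at least {min_examples}"
        )
    return problems


if __name__ == "__main__":
    sample = [
        '{"input":"Hello","output":"Hi! How can I help you?"}',
        '{"input":"Thank you","output":"You\'re welcome!"}',
    ]
    for problem in check_jsonl(sample):
        print(problem)
```

To check a real file, pass an open file handle, for example `check_jsonl(open("dataset/train.jsonl", encoding="utf-8"))`.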
Supported formats:
- JSONL (recommended)
- CSV
- TXT
- Parquet
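Since JSONL is the recommended format, existing CSV data may be worth converting. The sketch below assumes the CSV has a header row with columns named "input" and "output"; those column names and the helper `csv_to_jsonl` are assumptions for illustration, not something OktoEngine prescribes.

```python
import csv
import io
import json


def csv_to_jsonl(csv_text, input_col="input", output_col="output"):
    """Convert CSV text with a header row into a list of JSONL lines."""
    reader = csv.DictReader(io.StringIO(csv_text))
    lines = []
    for row in reader:
        record = {"input": row[input_col], "output": row[output_col]}
        # ensure_ascii=False keeps non-ASCII characters readable in the file.
        lines.append(json.dumps(record, ensure_ascii=False))
    return lines


if __name__ == "__main__":
    text = 'input,output\nHello,"Hi! How can I help you?"\n'
    print("\n".join(csv_to_jsonl(text)))
```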
Step 5: Configure Your Training
Edit scripts/train.okt:
PROJECT "MyFirstModel"
DESCRIPTION "My first AI model with OktoEngine"

ENV {
  accelerator: "gpu"
  min_memory: "8GB"
  precision: "fp16"
  install_missing: true
}

DATASET {
  train: "dataset/train.jsonl"
  validation: "dataset/val.jsonl"
}

MODEL {
  base: "gpt2"
}

TRAIN {
  epochs: 5
  batch_size: 32
  device: "auto"
}

EXPORT {
  format: ["okm"]
  path: "export/"
}
Key settings:
- PROJECT - Your model name
- MODEL.base - Base model (gpt2, distilgpt2, etc.)
- TRAIN.epochs - Number of training epochs
- TRAIN.batch_size - Batch size
- TRAIN.device - "auto" detects GPU/CPU automatically
- EXPORT.format - Output format
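The epochs and batch size together determine how many training steps a run takes. The arithmetic below is a generic sketch: it assumes one step per batch, no gradient accumulation, and that a final partial batch is kept. Whether OktoEngine counts steps this way is not stated in this guide, and the dataset size is invented for the example.

```python
import math


def steps_per_epoch(num_examples, batch_size):
    """Steps per epoch when each step consumes one batch (last batch may be partial)."""
    return math.ceil(num_examples / batch_size)


def total_steps(num_examples, batch_size, epochs):
    """Total steps across all epochs."""
    return steps_per_epoch(num_examples, batch_size) * epochs


if __name__ == "__main__":
    # 1,000 examples is a made-up dataset size for illustration.
    print(steps_per_epoch(1000, 32))  # 32 (31 full batches plus 1 partial)
    print(total_steps(1000, 32, 5))   # 160
```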
Step 6: Validate Your Configuration
Before training, validate your configuration:
okto validate
What it checks:
- ✓ Syntax is correct
- ✓ All required fields are present
- ✓ Dataset files exist
- ✓ Model paths are valid
- ✓ Values are within allowed ranges
Example output:
OktoEngine v0.1
Validating OktoScript file: "scripts/train.okt"
File: "scripts/train.okt"
Size: 382 bytes
Lines: 31
✓ File parsed successfully
Validation Results:
✅ Validation passed! No errors or warnings.
Summary:
Project: MyFirstModel
ENV: Configured
Dataset: dataset/train.jsonl
Model: gpt2
Training: 5 epochs, batch size 32
Export: ["okm"]
If validation fails:
- Check error messages
- Fix syntax errors
- Verify file paths
- Run okto validate --debug for detailed logs
Step 7: Train Your Model
Start training:
okto train
What happens:
- ✓ Configuration is parsed and validated
- ✓ System environment is checked
- ✓ Dependencies are verified
- ✓ Dataset is loaded
- ✓ Model is initialized (downloads from HuggingFace if needed)
- ✓ Training loop starts
- ✓ Progress is shown in real time
- ✅ Model is saved to runs/MyFirstModel/
- ✅ Exported models are saved to export/
Example output:
OktoEngine v0.1
Reading: "scripts/train.okt"
Environment Check:
✓ Runtime: Python 3.14.0
✓ GPU: NVIDIA GeForce RTX 4070
✓ RAM: 63GB (40GB available)
✓ Platform: windows
📦 Checking dependencies...
✓ All dependencies available
Starting training pipeline...
Epoch 1/5: 100%|████████████| 500/500 [02:15<00:00, 3.70it/s]
Loss: 2.345 → 1.892
Learning Rate: 5e-5
GPU Memory: 8.2GB / 12GB
Epoch 2/5: 100%|████████████| 500/500 [02:14<00:00, 3.72it/s]
Loss: 1.892 → 1.654
...
✅ Training completed successfully!
Output: runs/MyFirstModel/
Approximate training time (varies with hardware and dataset size):
- Small models (100M params): 5-15 minutes
- Medium models (1B params): 30-60 minutes
- Large models (7B params): Several hours
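For a rough estimate on your own hardware, the progress bar in the example output gives enough information: steps per epoch and iterations per second. The sketch below is plain arithmetic using the figures from that log (500 steps, 3.70 it/s, 5 epochs); the function names are illustrative, and it ignores model loading, evaluation, and export time.

```python
def estimate_seconds(steps_per_epoch, epochs, iterations_per_second):
    """Estimated pure training time in seconds."""
    return steps_per_epoch * epochs / iterations_per_second


def format_duration(seconds):
    """Format a number of seconds as MM:SS."""
    minutes, secs = divmod(int(round(seconds)), 60)
    return f"{minutes:02d}:{secs:02d}"


if __name__ == "__main__":
    # One epoch at the rate shown in the sample log.
    print(format_duration(estimate_seconds(500, 1, 3.70)))  # 02:15
    # All five epochs at the same rate.
    print(format_duration(estimate_seconds(500, 5, 3.70)))  # 11:16
```

The one-epoch result matches the 02:15 shown in the sample progress bar, and the full run falls inside the 5-15 minute range quoted for small models.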
Step 8: Check Your Results
After training completes:
Check training output:
ls runs/MyFirstModel/
Files created:
- checkpoint-*/ - Training checkpoints
- training_logs.json - Detailed training logs
- metrics.json - Training metrics
- tokenizer.json - Tokenizer configuration
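When a run leaves several checkpoint folders, it can help to find the most recent one programmatically. The sketch below assumes the `*` in `checkpoint-*` is a numeric step count (a common convention, but not confirmed by this guide); folders that do not fit that pattern are ignored. `latest_checkpoint` is a hypothetical helper.

```python
import re
from pathlib import Path


def latest_checkpoint(run_dir):
    """Return the checkpoint-N directory with the highest N, or None."""
    root = Path(run_dir)
    if not root.is_dir():
        return None
    best, best_step = None, -1
    for path in root.glob("checkpoint-*"):
        # Assumption: the suffix is a step number, e.g. checkpoint-500.
        match = re.fullmatch(r"checkpoint-(\d+)", path.name)
        if path.is_dir() and match and int(match.group(1)) > best_step:
            best, best_step = path, int(match.group(1))
    return best


if __name__ == "__main__":
    print(latest_checkpoint("runs/MyFirstModel"))
```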
Check exported models:
ls export/
Exported files:
- model.okm - OktoSeek Model format
Step 9: Evaluate Your Model (Optional)
Evaluate your trained model:
okto eval
Output:
OktoEngine v0.1
Evaluating model...
Evaluation Results:
Accuracy: 0.892
Loss: 1.234
Perplexity: 2.456
F1-Score: 0.876
✅ Evaluation completed!
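As background for reading these metrics: for language models, perplexity is commonly defined as the exponential of the mean cross-entropy loss, so lower is better and 1.0 is the floor. This guide does not state how okto eval computes its figures, so the snippet below is a general sketch of the standard relation rather than a description of OktoEngine's internals.

```python
import math


def perplexity(mean_cross_entropy_loss):
    """Standard perplexity: exp of the mean per-token cross-entropy (natural log)."""
    return math.exp(mean_cross_entropy_loss)


if __name__ == "__main__":
    for loss in (0.0, 1.0, 2.0):
        print(f"loss={loss:.1f} -> perplexity={perplexity(loss):.3f}")
```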
Common First Steps
Using GPU
If you have a GPU, OktoEngine will automatically detect and use it. To ensure GPU usage:
ENV {
  accelerator: "gpu"
  precision: "fp16"
}

TRAIN {
  device: "auto" # or "cuda" for explicit GPU
}
Adding More Epochs
TRAIN {
  epochs: 10 # Increase from 5
  batch_size: 32
}
Exporting to Multiple Formats
EXPORT {
  format: ["okm", "onnx", "gguf"]
  path: "export/"
}
Using Debug Mode
For detailed logs during training:
okto train --debug
Shows:
- Parsing details
- Execution flow
- Error diagnostics
- Performance metrics
Troubleshooting
Training Fails
Check system:
okto doctor
Check configuration:
okto validate --debug
Common issues:
- Out of memory: Reduce batch_size in the TRAIN block
- Model not found: Check that MODEL.base is a valid HuggingFace model
- Dataset not found: Verify paths in the DATASET block
- Dependencies missing: Run okto doctor --install
Validation Fails
Enable debug mode:
okto validate --debug
Common errors:
- Syntax errors - Check OktoScript syntax
- Missing fields - Add required blocks
- Invalid paths - Verify file paths exist
- Invalid values - Check value ranges
System Issues
Check system:
okto doctor
Install dependencies:
okto doctor --install
Next Steps
- Read the Complete CLI Reference
- Check out Examples for advanced use cases
- Learn about Debug Mode
- Explore FAQ for common questions
Quick Reference
| Task | Command |
|---|---|
| Initialize project | okto init <name> |
| Validate | okto validate |
| Check system | okto doctor |
| Train | okto train |
| Evaluate | okto eval |
| Export | okto export --format okm |
| Debug mode | okto train --debug |
| Upgrade | okto upgrade |
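The commands in the table can be chained so that a later step only runs when the earlier ones succeed. The sketch below assumes that okto exits with a non-zero status on failure, which this guide does not confirm; `run_steps` and its `runner` parameter are illustrative names, and the runner is injectable so the logic can be exercised without okto installed.

```python
import shutil
import subprocess

# Commands from the quick reference, in the order the guide uses them.
STEPS = [
    ["okto", "doctor"],
    ["okto", "validate"],
    ["okto", "train"],
    ["okto", "eval"],
]


def run_steps(steps, runner=None):
    """Run commands in order and stop at the first non-zero exit code.

    Returns the list of commands that completed successfully.
    """
    if runner is None:
        # Assumption: a non-zero return code signals failure.
        runner = lambda cmd: subprocess.run(cmd, check=False).returncode
    completed = []
    for cmd in steps:
        if runner(cmd) != 0:
            print(f"stopped: {' '.join(cmd)} failed")
            break
        completed.append(cmd)
    return completed


if __name__ == "__main__":
    if shutil.which("okto") is None:
        print("okto is not on PATH; nothing was run")
    else:
        run_steps(STEPS)
```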