---
tags:
- ai
- training
- dsl
- oktoscript
- oktoseek
- okto
- automation
- ai-pipelines
- ai-governance
language:
- en
frameworks:
- pytorch
- tensorflow
---
<p align="center">
<img src="./assets/okto_logo.png" alt="OktoEngine Banner" width="50%" />
</p>
<p align="center">
<img src="./assets/okto_logo2.png" alt="OktoScript Banner" width="50%" />
</p>
<h1 align="center">OktoEngine</h1>
<p align="center">
<strong>Professional CLI Engine for Training AI Models with OktoScript</strong>
</p>
<p align="center">
Built by <strong>OktoSeek AI</strong> for the <strong>OktoSeek ecosystem</strong>
</p>
<p align="center">
<a href="https://www.oktoseek.com/">OktoSeek Homepage</a> •
<a href="https://github.com/oktoseek/oktoscript">OktoScript Language</a> •
<a href="https://x.com/oktoseek">Twitter</a> •
<a href="https://www.youtube.com/@Oktoseek">YouTube</a>
</p>
---
## Table of Contents
1. [Quick Start](#-quick-start)
2. [What is OktoEngine?](#-what-is-oktoengine)
3. [Key Features](#-key-features)
4. [Installation](#-installation)
5. [CLI Commands](#️-cli-commands)
6. [Training Capabilities](#-training-capabilities)
7. [Debug Mode](#-debug-mode)
8. [Examples](#-examples)
9. [System Requirements](#-system-requirements)
10. [Documentation](#-documentation)
11. [FAQ](#-frequently-asked-questions-faq)
12. [Future Integration](#-future-integration)
13. [License](#-license)
14. [Contact](#-contact)
---
## 🚀 Quick Start
**Get started with OktoEngine in 3 steps:**
1. **Download the latest release** from [GitHub Releases](https://github.com/oktoseek/oktoengine/releases)
2. **Initialize a project:** `okto init my-project`
3. **Train your model:** `okto train`
```bash
# Initialize a new project
okto init my-ai-model
# Navigate to project
cd my-ai-model
# Validate your OktoScript configuration
okto validate
# Train your model
okto train
```
📚 **Full documentation:** [`docs/GETTING_STARTED.md`](./docs/GETTING_STARTED.md)
🔍 **CLI Reference:** [`docs/CLI_REFERENCE.md`](./docs/CLI_REFERENCE.md)
---
## 🚀 What is OktoEngine?
**OktoEngine** is the official execution engine for **OktoScript**: a powerful CLI tool that transforms declarative AI configurations into trained, production-ready models.
### Built for Scale
OktoEngine is engineered to handle:
- ✅ **Models of any size** - From millions to billions of parameters
- ✅ **Complex training pipelines** - Full fine-tuning, LoRA adapters, and more
- ✅ **Production workloads** - Optimized for real-world AI development
- ✅ **Enterprise-grade reliability** - Robust error handling and validation
### Why OktoEngine?
**Traditional Approach:**
```python
# Hundreds of lines of Python code
# Complex configuration management
# Error-prone manual setup
# Difficult to reproduce
```
**With OktoEngine:**
```okt
PROJECT "MyModel"
MODEL { base: "gpt2" }
DATASET { train: "dataset/train.jsonl" }
TRAIN { epochs: 5, batch_size: 32 }
EXPORT { format: ["okm"] }
```
**One command:** `okto train` → **Trained model ready for deployment**
---
## ✨ Key Features
### 🎯 **Complete CLI Interface**
Professional command-line interface with intuitive commands:
**Core Commands:**
```bash
okto init # Initialize new projects
okto validate # Validate OktoScript files
okto train # Train models
okto eval # Evaluate models
okto export # Export to multiple formats
okto convert # Convert between formats (PyTorch, ONNX, GGUF, TFLite, OktoModel)
```
**Inference Commands:**
```bash
okto infer # Direct inference (single input/output)
okto chat # Interactive chat mode with session context
```
**Analysis Commands:**
```bash
okto compare # Compare two models (latency, accuracy, loss)
okto logs # View historical training logs and CONTROL decisions
okto tune # Auto-tune training using CONTROL block logic
```
**Utility Commands:**
```bash
okto list # List projects, models, datasets, or exports
okto doctor # System diagnostics and dependency checking
okto upgrade # Auto-update engine to latest version
okto about # Engine and language information
okto exit # Exit interactive mode
```
**What you can do:**
- 🚀 **Train** models with full fine-tuning or LoRA adapters
- 🔄 **Convert** models between formats for different deployment targets
- 💬 **Chat** interactively with trained models
- 📊 **Compare** model versions to find the best one
- 📈 **Monitor** training with real-time logs and metrics
- 🎛️ **Auto-tune** training parameters intelligently
- 🔍 **Validate** configurations before training
- 📦 **Export** to production-ready formats
### 🔧 **Advanced Training Capabilities**
**Training Methods:**
- **Full Fine-tuning** - Update all of a model's parameters for maximum adaptation
- **LoRA Fine-tuning** - Efficient adapter-based training (LoRA, QLoRA, PEFT) with minimal memory footprint
- **Multi-dataset Training** - Combine multiple datasets with weighted sampling and custom mixing strategies
- **Model Adapters** - Apply pre-trained adapters (LoRA/PEFT) to base models for rapid customization
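The multi-dataset training described above could be declared roughly as follows; this is an illustrative sketch only, since the list syntax and the `weights` field are assumptions rather than confirmed OktoScript grammar:

```okt
DATASET {
  # Hypothetical weighted mix; field names are assumed, not official syntax
  train: ["dataset/chat.jsonl", "dataset/docs.jsonl"]
  weights: [0.7, 0.3]
  validation: "dataset/val.jsonl"
}
```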
**Intelligent Training Control:**
- **Automatic Checkpointing** - Never lose progress with smart checkpoint management
- **Real-time Metrics** - Monitor training in the terminal with live updates
- **CONTROL Block** - Define conditional logic (IF, WHEN, EVERY) for autonomous decision-making
- **Auto-parameter Adjustment** - Automatically adjust learning rate, batch size, and other parameters based on metrics
- **Early Stopping** - Intelligent stopping when model performance plateaus or diverges
- **Memory-aware Training** - Automatically reduce batch size when GPU memory is low
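As a rough illustration of the ideas above, a CONTROL block combining WHEN and EVERY triggers might look like the sketch below; the condition names and adjustment actions are illustrative assumptions, since only the IF/WHEN/EVERY keywords themselves are documented here:

```okt
CONTROL {
  # Halve the learning rate when loss stops improving (action syntax assumed)
  WHEN plateau(loss, patience: 3) {
    adjust: { learning_rate: "x0.5" }
  }
  # Shrink the batch size under GPU memory pressure
  WHEN gpu_memory > 90% {
    adjust: { batch_size: "/2" }
  }
  # Save a checkpoint on a fixed cadence
  EVERY 500 steps {
    checkpoint: true
  }
}
```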
**Monitoring & Governance:**
- **MONITOR Block** - Track any metric (loss, accuracy, GPU usage, throughput, latency, confidence, etc.)
- **GUARD Block** - Safety and ethics protection (hallucination, toxicity, bias detection)
- **BEHAVIOR Block** - Control model personality, verbosity, language, and response style
- **STABILITY Block** - Training safety controls (NaN detection, divergence prevention)
- **EXPLORER Block** - AutoML-style hyperparameter search and optimization
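A minimal sketch of how MONITOR and GUARD blocks might be declared, assuming simple key/value fields in the style of the other blocks (the exact field names here are guesses, not documented syntax):

```okt
MONITOR {
  # Assumed fields: which metrics to track and how often (in steps)
  metrics: ["loss", "accuracy", "gpu_usage"]
  every: 100
}

GUARD {
  # Assumed fields: safety policies per risk category
  toxicity: "block"
  bias_check: true
  hallucination: "warn"
}
```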
**What makes it unique:**
- 🧠 **Decision-driven** - Models can make autonomous decisions during training
- 🔄 **Self-adapting** - Automatically adjusts parameters based on real-time metrics
- 🛡️ **Safe by design** - Built-in safety guards and content filtering
- 📊 **Fully observable** - Complete visibility into the training process and decisions
- ⚡ **Production-ready** - Export to multiple formats for deployment
### 📊 **Detailed Metrics & Monitoring**
Real-time training metrics displayed directly in your terminal:
```
🚀 Starting training pipeline...
Epoch 1/5: 100%|████████████| 500/500 [02:15<00:00, 3.70it/s]
Loss: 2.345 → 1.892
Learning Rate: 5e-5
GPU Memory: 8.2GB / 12GB
Epoch 2/5: 100%|████████████| 500/500 [02:14<00:00, 3.72it/s]
Loss: 1.892 → 1.654
...
```
### ๐Ÿ› **Debug Mode**
Comprehensive debug mode for troubleshooting:
```bash
okto train --debug
okto validate --debug
```
Shows detailed parsing logs, execution flow, and error diagnostics.
### 🔄 **Automatic Updates**
Built-in upgrade system:
```bash
okto upgrade
```
Automatically downloads and installs the latest version from GitHub Releases.
### ๐Ÿฅ **System Diagnostics**
Comprehensive environment checking:
```bash
okto doctor
```
Checks GPU, CUDA, RAM, dependencies, and provides recommendations.
### 📦 **Dependency Management**
Automatic dependency installation:
```bash
okto doctor --install
```
Installs missing dependencies automatically.
---
## 📥 Installation
### Download Pre-built Binaries
Download the latest release for your platform:
- **Windows:** `okto-windows.exe`
- **Linux:** `okto-linux`
- **macOS:** `okto-macos`
Available at: [GitHub Releases](https://github.com/oktoseek/oktoengine/releases)
### Upgrade Existing Installation
```bash
okto upgrade
```
Automatically updates to the latest version.
---
## ๐Ÿ–ฅ๏ธ CLI Commands
### Core Commands
**Initialize Project:**
```bash
okto init my-project
```
Creates a new OktoScript project with proper folder structure.
**Validate Configuration:**
```bash
okto validate
okto validate --file scripts/train.okt
```
Validates OktoScript syntax and configuration.
**Train Model:**
```bash
okto train
okto train --file scripts/train.okt
okto train --debug # Enable debug mode
```
Executes the complete training pipeline.
**Evaluate Model:**
```bash
okto eval --file scripts/train.okt
```
Evaluates a trained model against test datasets.
**Export Model:**
```bash
okto export --format okm --file scripts/train.okt
okto export --format onnx
```
Exports trained models to various formats.
**Convert Model Formats:**
```bash
okto convert --input model.pt --from pt --to gguf --output model.gguf
okto convert --input model.pt --from pt --to onnx --output model.onnx
```
Converts models between different formats (PyTorch, ONNX, GGUF, TFLite, OktoModel).
**Direct Inference:**
```bash
okto infer --model models/chatbot.okm --text "Hello, how can I help?"
```
Runs single inference on a trained model. Automatically respects BEHAVIOR, GUARD, INFERENCE, and CONTROL blocks.
**Interactive Chat:**
```bash
okto chat --model models/chatbot.okm
```
Starts an interactive chat session. Uses BEHAVIOR settings, enforces GUARD rules, and supports session context.
**Compare Models:**
```bash
okto compare models/v1.okm models/v2.okm
```
Compares two models on latency, accuracy, loss, and resource usage.
**View Logs:**
```bash
okto logs my-model
```
Views historical training logs, metrics, and CONTROL decisions.
**Auto-tune Training:**
```bash
okto tune
```
Uses CONTROL block to auto-adjust training parameters (learning rate, batch size, early stopping).
### Utility Commands
**System Diagnostics:**
```bash
okto doctor # Check system
okto doctor --install # Auto-install dependencies
```
**Upgrade Engine:**
```bash
okto upgrade
```
**List Resources:**
```bash
okto list projects
okto list models
okto list datasets
okto list exports
```
**Other Commands:**
```bash
okto about # Show information
okto --version # Show version
okto exit # Exit interactive mode
```
### Global Flags
```bash
--debug # Enable debug mode (detailed logs)
--help # Show help
--version # Show version
```
📖 **Complete CLI Reference:** [`docs/CLI_REFERENCE.md`](./docs/CLI_REFERENCE.md)
---
## 🎓 Training Capabilities
### Supported Model Sizes
OktoEngine can train models of any size:
- **Small Models** (1M - 100M parameters) - Fast training, minimal resources
- **Medium Models** (100M - 1B parameters) - Balanced performance
- **Large Models** (1B - 7B parameters) - Require a GPU; optimized training
- **Very Large Models** (7B+ parameters) - Enterprise-grade, multi-GPU support
### Training Methods
**Full Fine-tuning:**
```okt
TRAIN {
epochs: 5
batch_size: 32
device: "auto"
}
```
**LoRA Fine-tuning:**
```okt
FT_LORA {
lora_rank: 8
lora_alpha: 32
epochs: 3
}
```
### Automatic Optimizations
- **Mixed Precision Training** - FP16/BF16 support
- **Gradient Accumulation** - Train large models on smaller GPUs
- **Automatic Device Selection** - CPU/GPU/CUDA detection
- **Memory Optimization** - Efficient memory management
- **Checkpoint Management** - Automatic saving and resuming
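Putting the optimizations above together, a configuration might pair the documented ENV precision field with gradient accumulation; note that `gradient_accumulation` is an assumed field name for illustration, not confirmed syntax:

```okt
ENV {
  accelerator: "gpu"
  precision: "bf16"
}

TRAIN {
  epochs: 3
  batch_size: 8
  gradient_accumulation: 4  # assumed field; effective batch size would be 32
  device: "auto"
}
```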
---
## ๐Ÿ› Debug Mode
Debug mode provides detailed insights into the engine's operation:
### Enable Debug Mode
```bash
# Via command flag
okto train --debug
okto validate --debug
# Via environment variable
OKTO_DEBUG=1 okto train
```
### What Debug Mode Shows
**Parsing Details:**
```
DEBUG: Starting parse_oktoscript. Input preview: '# okto_version: "1.0" PROJECT...'
DEBUG: Parsed version: Some("1.0")
DEBUG: Parsed project: my-model
DEBUG: After PROJECT, remaining input: 'ENV { accelerator: "gpu"...'
```
**Execution Flow:**
```
DEBUG: Attempting to parse ENV block...
DEBUG: Parsed ENV field: accelerator = gpu
DEBUG: Parsed ENV field: precision = fp16
DEBUG: Successfully parsed ENV block with 5 fields
```
**Error Diagnostics:**
```
DEBUG: Failed to parse key in ENV block. Input: 'accelerator: "gpu"...'
DEBUG: Failed to parse ':' after key 'accelerator'. Input: '"gpu"...'
```
### Use Cases
- **Troubleshooting parsing errors** - See exactly where parsing fails
- **Understanding execution flow** - Track how your configuration is processed
- **Performance analysis** - Identify bottlenecks
- **Configuration debugging** - Verify your OktoScript is parsed correctly
📖 **Debug Guide:** [`docs/DEBUG_GUIDE.md`](./docs/DEBUG_GUIDE.md)
---
## 📚 Examples
### Basic Training Example
**scripts/train.okt:**
```okt
PROJECT "ChatBot"
ENV {
accelerator: "gpu"
precision: "fp16"
install_missing: true
}
DATASET {
train: "dataset/train.jsonl"
validation: "dataset/val.jsonl"
}
MODEL {
base: "gpt2"
}
TRAIN {
epochs: 5
batch_size: 32
device: "auto"
}
EXPORT {
format: ["okm"]
path: "export/"
}
```
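For reference, `dataset/train.jsonl` holds one JSON object per line. The keys below are a plausible instruction-style schema, shown only as an assumption; the exact record format OktoEngine expects is not documented in this README:

```jsonl
{"prompt": "What is OktoScript?", "response": "A declarative DSL for AI training pipelines."}
{"prompt": "How do I start training?", "response": "Run okto train inside your project folder."}
```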
**Terminal Output:**
```bash
$ okto train
๐Ÿ™ OktoEngine v0.1
๐Ÿ“„ Reading: "scripts/train.okt"
๐Ÿ“Š Environment Check:
โœ” Runtime: Python 3.14.0
โœ” GPU: NVIDIA GeForce RTX 4070
โœ” RAM: 63GB (40GB available)
โœ” Platform: windows
๐Ÿ“ฆ Checking dependencies...
โœ” All dependencies available
๐Ÿš€ Starting training pipeline...
Epoch 1/5: 100%|โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆ| 500/500 [02:15<00:00, 3.70it/s]
Loss: 2.345 โ†’ 1.892
Learning Rate: 5e-5
โœ… Training completed successfully!
๐Ÿ“ Output: runs/ChatBot/
```
### Advanced Example with LoRA
See [`examples/lora-training.okt`](./examples/lora-training.okt) for a complete LoRA fine-tuning example.
### Complete Project Examples
- [`examples/basic-training/`](./examples/basic-training/) - Minimal working example
- [`examples/chatbot/`](./examples/chatbot/) - Conversational AI training
- [`examples/vision-model/`](./examples/vision-model/) - Computer vision pipeline
📖 **More Examples:** [`examples/README.md`](./examples/README.md)
---
## 💻 System Requirements
### Minimum Requirements
- **OS:** Windows 10+, Linux (Ubuntu 20.04+), macOS 11+
- **RAM:** 8GB (16GB recommended)
- **Storage:** 10GB free space
- **Runtime:** Compatible runtime environment
### Recommended for Training
- **GPU:** NVIDIA GPU with CUDA support (8GB+ VRAM)
- **RAM:** 32GB+ for large models
- **Storage:** SSD with 50GB+ free space
- **CPU:** Multi-core processor (8+ cores)
### Check Your System
```bash
okto doctor
```
Shows detailed system information and recommendations.
---
## 📚 Documentation
Complete documentation for OktoEngine:
- 📖 **[Getting Started Guide](./docs/GETTING_STARTED.md)** - Your first 5 minutes
- 🖥️ **[CLI Reference](./docs/CLI_REFERENCE.md)** - Complete command reference
- 🐛 **[Debug Guide](./docs/DEBUG_GUIDE.md)** - Debug mode usage
- 💡 **[Examples](./examples/)** - Working examples
- ❓ **[FAQ](./docs/FAQ.md)** - Frequently Asked Questions
- 📋 **[Changelog](./CHANGELOG.md)** - Version history
### Advanced Topics
- **Training Optimization** - Best practices for efficient training
- **Error Handling** - Troubleshooting common issues
- **Performance Tuning** - Maximize training speed
- **Integration** - Using OktoEngine in your workflow
---
## โ“ Frequently Asked Questions (FAQ)
**Q: What models can I train with OktoEngine?**
A: OktoEngine supports any model compatible with modern AI frameworks. From small models (millions of parameters) to large language models (billions of parameters).
**Q: Do I need to know Python to use OktoEngine?**
A: No! OktoEngine provides a complete CLI interface. You only need to write OktoScript configuration files.
**Q: Can I train models without a GPU?**
A: Yes, OktoEngine automatically detects available hardware and uses CPU when GPU is not available. Training will be slower but fully functional.
**Q: How do I update OktoEngine?**
A: Simply run `okto upgrade` to automatically download and install the latest version.
**Q: What formats can I export to?**
A: OktoEngine supports multiple export formats: OKM (OktoSeek), ONNX, GGUF, SafeTensors, and more.
**Q: Can I resume training from a checkpoint?**
A: Yes, OktoEngine automatically saves checkpoints and can resume training from any checkpoint.
📖 **[Complete FAQ →](./docs/FAQ.md)**
---
## 🔮 Future Integration
OktoEngine will be integrated into **OktoSeek IDE** for visual training workflows:
- 🎯 **Visual Pipeline Builder** - Drag-and-drop training configuration
- 📊 **Real-time Dashboard** - Live training metrics and visualization
- 🔄 **One-click Training** - Train models directly from the IDE
- 📁 **Project Management** - Organize and manage multiple training projects
---
## ๐Ÿ™ Powered by OktoSeek AI
**OktoEngine** is developed and maintained by **OktoSeek AI**.
- **Official website:** https://www.oktoseek.com
- **OktoScript Language:** https://github.com/oktoseek/oktoscript
- **Twitter:** https://x.com/oktoseek
- **YouTube:** https://www.youtube.com/@Oktoseek
- **Repository:** https://github.com/oktoseek/oktoengine
---
## 📄 License
This software is proprietary and licensed under the End User License Agreement (EULA). See [LICENSE](./LICENSE) file for details.
**Important:** OktoEngine is not open source. Binary releases are available for download, but the source code is proprietary.
---
## 📧 Contact
For questions, support, or licensing inquiries:
- **Email:** service@oktoseek.com
- **GitHub Issues:** https://github.com/oktoseek/oktoengine/issues
- **Website:** https://www.oktoseek.com
---
<p align="center">
Made with ❤️ by the <strong>OktoSeek AI</strong> team
</p>