---
tags:
- ai
- training
- dsl
- oktoscript
- oktoseek
- okto
- automation
- ai-pipelines
- ai-governance
language:
- en
frameworks:
- pytorch
- tensorflow
---
A decision-driven language for training, evaluating and governing AI models.
A domain-specific language (DSL) designed for autonomous AI pipelines with
built-in decision, control, monitoring and governance capabilities.
Built by OktoSeek AI for the OktoSeek ecosystem
OktoSeek Homepage • Hugging Face • Twitter • YouTube

---

## Table of Contents

1. [What is OktoScript?](#-what-is-oktoscript)
2. [Quick Start](#-quick-start)
3. [Official Folder Structure](#-official-folder-structure)
4. [Basic Example](#-oktoscript--basic-example)
5. [Supported Dataset Formats](#-supported-dataset-formats)
6. [Supported Metrics](#-supported-metrics)
7. [CLI Commands](#️-cli-commands)
8. [Training Pipeline](#-training-pipeline)
9. [OktoSeek Internal Formats](#-oktoseek-internal-formats)
10. [Integration Targets](#️-integration-targets)
11. [VS Code Extension](#-vs-code-extension)
12. [Documentation](#-documentation)
13. [FAQ](#-frequently-asked-questions-faq)
14. [License](#-license)
15. [Contact](#-contact)

---

## 🚀 Quick Start

**New to OktoScript?** Get started in 5 minutes:

1. **Install VS Code Extension:** [Install OktoScript Extension](https://marketplace.visualstudio.com/items?itemName=OktoSeekAI.oktoscript) (recommended for best experience)
2. **Read the guide:** [`docs/GETTING_STARTED.md`](./docs/GETTING_STARTED.md)
3. **Try an example:** [`examples/basic.okt`](./examples/basic.okt)
4. **Validate:** `okto validate examples/basic.okt`
5. **Train:** `okto train examples/basic.okt`

📚 **Full documentation:** [`docs/grammar.md`](./docs/grammar.md)

🔍 **Validation rules:** [`VALIDATION_RULES.md`](./VALIDATION_RULES.md)

---

## 🚀 What is OktoScript?

**OktoScript** is a decision-driven language created by **OktoSeek AI** to design, train, evaluate, control and govern AI models end-to-end.

It goes far beyond a simple training script. OktoScript introduces native intelligence, autonomous decision-making and behavioral control into the AI development lifecycle.

It allows you to define:

- **How a model is trained**
- **How it should behave**
- **How it should react to problems**
- **How and when it should stop, adapt or improve itself**

All using clear, readable and structured commands, built specifically for AI engineering.

### Designed to be:

- ✅ **Human-readable** – Intuitive syntax that engineers and non-engineers can understand
- ✅ **Decision-driven** – Built-in CONTROL logic (IF, WHEN, SET, STOP, LOG, SAVE…)
- ✅ **Strongly structured** – Validated, deterministic and reproducible pipelines
- ✅ **Dataset-centered** – The data is the starting point of all intelligence
- ✅ **Training-aware** – Created specifically for AI training and optimization
- ✅ **Behavior-aware** – Control personality, language, restrictions and style
- ✅ **Self-monitoring** – Tracks metrics, detects anomalies and adapts automatically
- ✅ **Safe by design** – Integrated GUARD and SECURITY layers
- ✅ **Expandable** – Extensible through OktoEngine and custom modules

OktoScript is the official language of the OktoSeek ecosystem and is used by:

- 🎯 **OktoSeek IDE** – Visual AI development and experimentation
- ⚙️ **OktoEngine** – Core execution and decision engine
- 🌐 **OktoScript Web Editor** – Online editor with syntax validation and autocomplete ([Try it now →](https://oktoseek.com/editor.php))
- 🔌 **VS Code Extension** – Official VS Code extension with syntax highlighting, autocomplete, snippets, and validation ([Install now →](https://marketplace.visualstudio.com/items?itemName=OktoSeekAI.oktoscript))
- 🔄 **Autonomous pipelines** – Training, control, evaluation and inference
- 🤖 **AI agents** – Controlled, monitored intelligent systems
- 📱 **Flutter / API deployments** – Cross-platform model integration

### Why OktoScript is different

**Traditional AI development is reactive.** You manually monitor metrics, fix problems and restart training.

**OktoScript is proactive.** It allows the model to:

- **Detect instability**
- **Reduce or increase learning rate automatically**
- **Adapt batch size based on GPU memory**
- **Stop when performance drops**
- **Save only the best checkpoints**
- **Apply rules when patterns are detected**

In other words, **OktoScript doesn't just train models; it governs intelligence.**

---

## 📁 Official Folder Structure

Every OktoScript project must follow this structure:

```
/my-awesome-model
├── okt.yaml
├── dataset/
│   ├── train.jsonl
│   ├── val.jsonl
│   └── test.jsonl
├── scripts/
│   └── train.okt
├── runs/
│   └── my-model/
│       ├── checkpoint-100/
│       │   └── model.safetensors
│       ├── tokenizer.json
│       ├── training_logs.json
│       └── metrics.json
└── export/
    ├── model.gguf
    ├── model.onnx
    └── model.okm
```

**v1.1 Optional Folders:**

```
/runs/
└── my-model/
    ├── logs/
    │   └── system.json          # MONITOR output (v1.1+)
    └── lora/                    # LoRA adapters (v1.1+)
        └── adapter.safetensors
```

---

## 🧠 OktoScript – Basic Example

**Example (v1.0 - Standard Training):**

```okt
PROJECT "PizzaBot"
DESCRIPTION "AI specialized in pizza restaurant service"

ENV {
  accelerator: "gpu"
  min_memory: "8GB"
  precision: "fp16"
  backend: "oktoseek"
  install_missing: true
}

DATASET {
  train: "dataset/train.jsonl"
  validation: "dataset/val.jsonl"
}

MODEL {
  base: "oktoseek/pizza-small"
}

TRAIN {
  epochs: 5
  batch_size: 32
  device: "auto"
}

EXPORT {
  format: ["gguf", "onnx", "okm"]
  path: "export/"
}
```

**Example (v1.1 - LoRA Fine-tuning with Dataset Mixing):**

```okt
# okto_version: "1.1"
PROJECT "PizzaBot"
DESCRIPTION "AI specialized in pizza restaurant service"

ENV {
  accelerator: "gpu"
  min_memory: "8GB"
  precision: "fp16"
  backend: "oktoseek"
  install_missing: true
}

DATASET {
  mix_datasets: [
    { path: "dataset/base.jsonl", weight: 70 },
    { path: "dataset/extra.jsonl", weight: 30 }
  ]
  dataset_percent: 80
  sampling: "weighted"
}

MODEL {
  base: "oktoseek/pizza-small"
}

FT_LORA {
  base_model: "oktoseek/pizza-small"
  lora_rank: 8
  lora_alpha: 32
  epochs: 3
  batch_size: 16
  learning_rate: 0.00003
  device: "auto"
}

MONITOR {
  level: "full"
  log_metrics: ["loss", "accuracy"]
  log_system: ["gpu_memory_used", "cpu_usage"]
  refresh_interval: 2s
  dashboard: true
}

EXPORT {
  format: ["okm", "onnx"]
  path: "export/"
}
```

📘 **Full grammar specification available in** [`/docs/grammar.md`](./docs/grammar.md)

## 🆕 What's New in v1.2

OktoScript v1.2 adds powerful new features while maintaining 100% backward compatibility with v1.0 and v1.1:

- ✅ **Nested CONTROL Blocks** - Support for nested IF/WHEN/EVERY statements inside event hooks
- ✅ **Enhanced BEHAVIOR** - Added `mode` and `prompt_style` for better control
- ✅ **Enhanced GUARD** - Added `detect_using` and additional prevention types
- ✅ **Enhanced DEPLOY** - Added `host`, `protocol`, and `format` options
- ✅ **Enhanced SECURITY** - Added input/output validation, rate limiting, and encryption

## What's New in v1.1

OktoScript v1.1 adds powerful new features while maintaining 100% backward compatibility with v1.0:

- ✅ **LoRA Fine-tuning** - Efficient fine-tuning with `FT_LORA` block
- ✅ **Dataset Mixing** - Combine multiple datasets with weighted sampling
- ✅ **System Monitoring** - Advanced telemetry with `MONITOR` block
- ✅ **Version Declaration** - Specify OktoScript version in your files
- ✅ **MODEL Adapters** - LoRA/PEFT adapter support in MODEL block
- ✅ **Enhanced INFERENCE** - Rich inference configuration with format templates and nested CONTROL
- ✅ **CONTROL Block** - Cognitive-level decision engine for training and inference
- ✅ **GUARD Block** - Safety and ethics protection
- ✅ **BEHAVIOR Block** - Model personality and behavior configuration
- ✅ **EXPLORER Block** - AutoML-style hyperparameter exploration
- ✅ **STABILITY Block** - Training stability and safety controls
- ✅ **Boolean Support** - Native true/false values throughout the language

📚 **More examples and use cases:** See [`/examples/`](./examples/) for complete examples including:

**Basic Examples:**

- [`basic.okt`](./examples/basic.okt) - Minimal example
- [`chatbot.okt`](./examples/chatbot.okt) - Conversational AI
- [`computer_vision.okt`](./examples/computer_vision.okt) - Image classification
- [`recommender.okt`](./examples/recommender.okt) - Recommendation systems

**Advanced Examples:**

- [`finetuning-llm.okt`](./examples/finetuning-llm.okt) - Fine-tuning LLM with checkpoints and hooks
- [`vision-pipeline.okt`](./examples/vision-pipeline.okt) - Complete vision pipeline with augmentation
- [`qa-embeddings.okt`](./examples/qa-embeddings.okt) - QA system with embeddings

**v1.1 Examples:**

- [`lora-finetuning.okt`](./examples/lora-finetuning.okt) - LoRA fine-tuning with dataset mixing
- [`dataset-mixing.okt`](./examples/dataset-mixing.okt) - Training with multiple weighted datasets

**Complete Projects:**

- [`pizzabot/`](./examples/pizzabot/) - Complete project example with full structure

---

## 📚 Supported Dataset Formats

- ✅ **JSONL** - Line-delimited JSON
- ✅ **CSV** - Comma-separated values
- ✅ **TXT** - Plain text files
- ✅ **Parquet** - Columnar storage
- ✅ **Image + Caption** - Vision datasets
- ✅ **Question & Answer (QA)** - Q&A pairs
- ✅ **Instruction datasets** - Instruction-following
- ✅ **Custom Field Names** (v1.2+) - Define `input_field` and `output_field` for any column names
- ✅ **Multi-modal** - (future support)

### Example (JSONL):

```json
{"input":"What flavors do you have?","output":"We offer Margherita, Pepperoni and Four Cheese."}
{"input":"Do you deliver?","output":"Yes, delivery is available in your region."}
```

### Custom Field Names (v1.2+)

OktoScript now supports custom field names in datasets, allowing you to work with any column names:

```okt
DATASET {
  train: "dataset/train.jsonl"
  input_field: "question"    # Custom input column name
  output_field: "answer"     # Custom output column name
}
```

If not specified, OktoEngine automatically detects `input`/`output` or `input`/`target` fields.
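
The fallback described above can be pictured with a short Python sketch. This is not OktoEngine's implementation: the function name `detect_fields`, the `CANDIDATE_PAIRS` list and the order in which the pairs are tried are assumptions made for illustration, based only on the `input`/`output` and `input`/`target` pairs named in this section.

```python
import json

# Candidate (input, output) field pairs, tried in order.
# The order is an assumption; the README only names these two pairs.
CANDIDATE_PAIRS = [("input", "output"), ("input", "target")]


def detect_fields(record, input_field=None, output_field=None):
    """Return the (input, output) field names to use for one dataset record.

    Explicitly configured names win; otherwise fall back to the first
    candidate pair whose keys are both present in the record.
    """
    if input_field and output_field:
        if input_field in record and output_field in record:
            return input_field, output_field
        raise KeyError(f"fields {input_field!r}/{output_field!r} not in record")
    for in_name, out_name in CANDIDATE_PAIRS:
        if in_name in record and out_name in record:
            return in_name, out_name
    raise KeyError("no known input/output field pair found in record")


# Usage: one JSONL line with default names, one with custom names.
default_row = json.loads('{"input": "Do you deliver?", "output": "Yes."}')
custom_row = json.loads('{"question": "Do you deliver?", "answer": "Yes."}')

print(detect_fields(default_row))
print(detect_fields(custom_row, "question", "answer"))
```

Raising an error when no pair matches, rather than guessing, is a design choice of this sketch; how OktoEngine reports the same situation is documented in the custom fields guide.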

📖 **[Learn more about custom fields →](./docs/CUSTOM_FIELDS.md)**

---

## 📊 Supported Metrics

- ✅ **Accuracy** - Classification accuracy
- ✅ **Loss** - Training/validation loss
- ✅ **Perplexity** - Language model perplexity
- ✅ **F1-Score** - F1 metric
- ✅ **BLEU** - Translation quality
- ✅ **ROUGE-L** - Summarization quality
- ✅ **MAE / MSE** - Regression metrics
- ✅ **Cosine Similarity** - Embedding similarity
- ✅ **Token Efficiency** - Token usage optimization
- ✅ **Response Coherence** - Response quality
- ✅ **Hallucination Score** - (experimental)

### Define custom metrics:

```okt
METRICS {
  custom "toxicity_score"
  custom "context_alignment"
}
```

---

## 🖥️ CLI Commands

The OktoEngine CLI is minimal by design. All intelligence lives in the `.okt` file. The terminal is just the execution port.

### 🌐 Web Editor Command

**Open OktoScript files in the web editor:**

```bash
# Open editor with a specific file
okto web --file scripts/train.okt

# Open empty editor
okto web
```

The `okto web` command opens the [OktoScript Web Editor](https://oktoseek.com/editor.php) in your browser. When you provide a file path, it automatically loads the file content for editing. The editor features:

- **Smart Autocomplete** – Context-aware suggestions based on the current block (ENV, DATASET, MODEL, TRAIN, etc.)
- **Real-time Syntax Validation** – Detects errors like nested blocks (e.g., PROJECT inside DATASET) and missing braces
- **Auto-save to Local** – When you load a file, it saves back to the same location automatically
- **Full Integration** – Seamlessly connects with OktoEngine for validation and training

Perfect for quick edits, syntax testing, and experimenting with OktoScript configurations!
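
To make the "missing braces" check from the feature list concrete, here is a minimal Python sketch of a brace-balance pass over `.okt` source. It is an illustration only, not the editor's or OktoEngine's validator: the function name `find_brace_errors` and the error messages are hypothetical, and the sketch assumes `#` starts a comment and double quotes delimit strings, as in the examples in this README.

```python
def find_brace_errors(source):
    """Return a list of (line_number, message) for unbalanced braces."""
    errors = []
    open_lines = []  # stack of line numbers where a '{' is still open
    for line_no, line in enumerate(source.splitlines(), start=1):
        in_string = False
        for ch in line:
            if ch == '"':
                in_string = not in_string
            elif in_string:
                continue  # braces inside string literals do not count
            elif ch == "#":
                break  # rest of the line is a comment
            elif ch == "{":
                open_lines.append(line_no)
            elif ch == "}":
                if open_lines:
                    open_lines.pop()
                else:
                    errors.append((line_no, "unexpected '}'"))
    # Anything left on the stack was opened but never closed.
    errors.extend((n, "'{' is never closed") for n in open_lines)
    return errors


# Usage: the DATASET block below is missing its closing brace.
script = 'PROJECT "PizzaBot"\nDATASET {\n  train: "dataset/train.jsonl"\n'
for line_no, message in find_brace_errors(script):
    print(f"line {line_no}: {message}")
```

A real validator would also need the grammar to catch misplaced blocks such as PROJECT inside DATASET; this sketch covers only brace balance.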

### Core Commands

**Initialize a project:**

```bash
okto init
```

**Validate syntax:**

```bash
okto validate script.okt
```

**Train a model:**

```bash
okto train script.okt
```

**Evaluate a model:**

```bash
okto eval script.okt
```

**Export model:**

```bash
okto export script.okt
```

**Convert model formats:**

```bash
okto convert --input
```

Made with ❤️ by the OktoSeek AI team