Getting Started with OktoScript

Your first 5 minutes with OktoScript: a quick guide to get you up and running.


Prerequisites

  • OktoSeek IDE installed (or OktoEngine CLI)
  • Basic understanding of AI/ML concepts
  • A dataset ready for training

Step 1: Create Your First Project

Create a new directory for your project:

mkdir my-first-model
cd my-first-model

Create a file named train.okt:

PROJECT "MyFirstModel"
DESCRIPTION "My first OktoScript project"

DATASET {
    train: "dataset/train.jsonl"
    format: "jsonl"
    type: "chat"
}

MODEL {
    base: "oktoseek/base-mini"
}

TRAIN {
    epochs: 3
    batch_size: 16
    device: "cpu"
}

EXPORT {
    format: ["okm"]
    path: "export/"
}

Step 2: Prepare Your Dataset

Create a dataset/ folder and add your training data:

dataset/train.jsonl:

{"input":"Hello","output":"Hi! How can I help you?"}
{"input":"What's the weather?","output":"I don't have access to weather data."}
{"input":"Thank you","output":"You're welcome!"}

Minimum requirements:

  • At least 10 examples for basic training
  • Consistent format (JSONL recommended)
  • Valid JSON on each line
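The requirements above can be checked before any training run. The following is a minimal Python sketch, not part of OktoScript itself. It assumes every record uses the `input` and `output` keys from the sample file; the function name `check_jsonl` is hypothetical.

```python
import json
from pathlib import Path

MIN_EXAMPLES = 10                    # from the "Minimum requirements" list
REQUIRED_KEYS = ("input", "output")  # assumption: keys used in the sample train.jsonl


def check_jsonl(path):
    """Return a list of human-readable problems found in a JSONL file."""
    problems = []
    valid_count = 0
    with Path(path).open(encoding="utf-8") as handle:
        for line_no, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # ignore blank lines
            try:
                record = json.loads(line)
            except json.JSONDecodeError as exc:
                problems.append(f"line {line_no}: invalid JSON ({exc.msg})")
                continue
            if not isinstance(record, dict):
                problems.append(f"line {line_no}: expected a JSON object")
                continue
            missing = [key for key in REQUIRED_KEYS if key not in record]
            if missing:
                problems.append(f"line {line_no}: missing keys {missing}")
                continue
            valid_count += 1
    if valid_count < MIN_EXAMPLES:
        problems.append(
            f"only {valid_count} valid examples, need at least {MIN_EXAMPLES}"
        )
    return problems
```

Calling `check_jsonl("dataset/train.jsonl")` returns an empty list when the file meets all three requirements.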

Step 3: Validate Your Configuration

Before training, validate your OktoScript file:

okto validate train.okt

This checks:

  • ✅ Syntax is correct
  • ✅ All required fields are present
  • ✅ Dataset files exist
  • ✅ Model paths are valid
  • ✅ Values are within allowed ranges
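To illustrate the "dataset files exist" check, here is a rough Python sketch of the idea. It is not how `okto validate` is implemented. It scans the whole file with a regular expression rather than parsing the DATASET block, covers only the `train` and `validation` keys shown in this guide, and assumes relative paths resolve against the directory containing the .okt file. The helper names are hypothetical.

```python
import re
from pathlib import Path

# Matches lines such as:  train: "dataset/train.jsonl"
# Only the `train` and `validation` keys shown in this guide are covered.
PATH_FIELD = re.compile(r'^\s*(train|validation)\s*:\s*"([^"]+)"', re.MULTILINE)


def dataset_paths(okt_text):
    """Extract {key: path} for dataset entries found in OktoScript source text."""
    return {key: value for key, value in PATH_FIELD.findall(okt_text)}


def missing_files(okt_file):
    """Return dataset paths referenced by okt_file that do not exist on disk."""
    okt_file = Path(okt_file)
    paths = dataset_paths(okt_file.read_text(encoding="utf-8"))
    # Assumption: relative paths are resolved against the .okt file's directory.
    return [p for p in paths.values() if not (okt_file.parent / p).exists()]
```

`missing_files("train.okt")` returns an empty list when every referenced dataset file is present.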

Step 4: Train Your Model

Run the training:

okto run train.okt

Or use the IDE:

  1. Open train.okt in OktoSeek IDE
  2. Click the "Train" button
  3. Monitor progress in real-time

What happens:

  1. Dataset is loaded and validated
  2. Model is initialized
  3. Training starts (you'll see progress)
  4. Model is saved to runs/MyFirstModel/
  5. Exported models are saved to export/
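Steps 3 and 4 can be chained from a script so that training only starts once validation passes. The sketch below uses the two commands from this guide and Python's standard `subprocess` module. It assumes `okto` exits with a non-zero status when a step fails, which this guide does not state. The function names are hypothetical.

```python
import subprocess


def build_commands(okt_file):
    """Return the validate and run commands from this guide as argv lists."""
    return [
        ["okto", "validate", okt_file],
        ["okto", "run", okt_file],
    ]


def validate_then_train(okt_file, runner=subprocess.run):
    """Run validation first and start training only if validation succeeds.

    Assumption: `okto` exits with a non-zero status when a step fails.
    """
    for command in build_commands(okt_file):
        result = runner(command)
        if result.returncode != 0:
            return False  # stop at the first failing step
    return True
```

The `runner` parameter exists so the logic can be exercised without the CLI installed. In normal use, `validate_then_train("train.okt")` is enough.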

Step 5: Test Your Model

After training, test with inference:

okto_infer --model ./runs/MyFirstModel --text "Hello"

Or add an INFER block to your .okt file:

INFER {
    input: "Hello, how are you?"
    max_tokens: 50
}

Common First Steps

Adding Validation Data

DATASET {
    train: "dataset/train.jsonl"
    validation: "dataset/val.jsonl"  # Add this
    format: "jsonl"
}
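If you only have a single data file, one way to produce the validation file is to hold out a random slice of it. The Python sketch below is one possible approach. The 10% fraction, the fixed seed, and the function name `split_jsonl` are illustrative choices, not OktoScript settings.

```python
import random
from pathlib import Path


def split_jsonl(source, train_out, val_out, val_fraction=0.1, seed=0):
    """Shuffle the lines of a JSONL file and write train/validation files.

    val_fraction and seed are illustrative defaults, not OktoScript settings.
    Returns (train_count, val_count).
    """
    lines = [
        line
        for line in Path(source).read_text(encoding="utf-8").splitlines()
        if line.strip()
    ]
    random.Random(seed).shuffle(lines)  # seeded so the split is reproducible
    # Hold out at least one example whenever there is more than one line.
    val_count = max(1, int(len(lines) * val_fraction)) if len(lines) > 1 else 0
    val_lines, train_lines = lines[:val_count], lines[val_count:]
    Path(train_out).write_text(
        "".join(line + "\n" for line in train_lines), encoding="utf-8"
    )
    Path(val_out).write_text(
        "".join(line + "\n" for line in val_lines), encoding="utf-8"
    )
    return len(train_lines), len(val_lines)
```

Keep the unsplit data in its own file (for example a hypothetical `dataset/all.jsonl`) and write to `dataset/train.jsonl` and `dataset/val.jsonl`, so that re-running the split does not shrink the training set.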

Using GPU

TRAIN {
    epochs: 5
    batch_size: 32
    device: "cuda"  # Change from "cpu"
    gpu: true
}

Adding Metrics

METRICS {
    accuracy
    loss
    perplexity
}
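For intuition about how two of these metrics relate: perplexity is conventionally defined as the exponential of the mean per-token cross-entropy loss. The Python sketch below shows that standard relationship. Whether OktoScript reports loss in nats and computes perplexity in exactly this way is an assumption.

```python
import math


def perplexity(mean_loss):
    """Perplexity for a mean per-token cross-entropy loss measured in nats."""
    return math.exp(mean_loss)
```

A loss of 0 corresponds to a perplexity of 1 (the model is certain of every token), and perplexity grows exponentially as loss increases.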

Exporting to Multiple Formats

EXPORT {
    format: ["gguf", "onnx", "okm"]
    path: "export/"
}

Next Steps


Quick Reference

| Task | Command |
|------|---------|
| Validate | okto validate train.okt |
| Train | okto run train.okt |
| Infer | okto_infer --model ./runs/model --text "input" |
| Evaluate | okto_eval --model ./runs/model --dataset ./dataset/test.jsonl |
| Export | okto export --format gguf |
| Deploy | okto_deploy --model model --target api |

Need help? Check the Troubleshooting Guide or open an issue on GitHub.