oktoscript / docs /GETTING_STARTED.md
OktoSeek's picture
Upload 48 files
5fc8c9d verified
# Getting Started with OktoScript
**Your first 5 minutes with OktoScript** - A quick guide to get you up and running.
---
## Prerequisites
- OktoSeek IDE installed (or OktoEngine CLI)
- Basic understanding of AI/ML concepts
- A dataset ready for training
---
## Step 1: Create Your First Project
Create a new directory for your project:
```bash
mkdir my-first-model
cd my-first-model
```
Create a file named `train.okt`:
```okt
PROJECT "MyFirstModel"
DESCRIPTION "My first OktoScript project"
DATASET {
train: "dataset/train.jsonl"
format: "jsonl"
type: "chat"
}
MODEL {
base: "oktoseek/base-mini"
}
TRAIN {
epochs: 3
batch_size: 16
device: "cpu"
}
EXPORT {
format: ["okm"]
path: "export/"
}
```
---
## Step 2: Prepare Your Dataset
Create a `dataset/` folder and add your training data:
**dataset/train.jsonl:**
```json
{"input":"Hello","output":"Hi! How can I help you?"}
{"input":"What's the weather?","output":"I don't have access to weather data."}
{"input":"Thank you","output":"You're welcome!"}
```
**Minimum requirements:**
- At least 10 examples for basic training
- Consistent format (JSONL recommended)
- Valid JSON on each line
---
## Step 3: Validate Your Configuration
Before training, validate your OktoScript file:
```bash
okto validate train.okt
```
This checks:
- βœ… Syntax is correct
- βœ… All required fields are present
- βœ… Dataset files exist
- βœ… Model paths are valid
- βœ… Values are within allowed ranges
---
## Step 4: Train Your Model
Run the training:
```bash
okto run train.okt
```
Or use the IDE:
1. Open `train.okt` in OktoSeek IDE
2. Click "Train" button
3. Monitor progress in real-time
**What happens:**
1. Dataset is loaded and validated
2. Model is initialized
3. Training starts (you'll see progress)
4. Model is saved to `runs/MyFirstModel/`
5. Exported models saved to `export/`
---
## Step 5: Test Your Model
After training, test with inference:
```bash
okto_infer --model ./runs/MyFirstModel --text "Hello"
```
Or add to your `.okt` file:
```okt
INFER {
input: "Hello, how are you?"
max_tokens: 50
}
```
---
## Common First Steps
### Adding Validation Data
```okt
DATASET {
train: "dataset/train.jsonl"
validation: "dataset/val.jsonl" # Add this
format: "jsonl"
}
```
### Using GPU
```okt
TRAIN {
epochs: 5
batch_size: 32
device: "cuda" # Change from "cpu"
gpu: true
}
```
### Adding Metrics
```okt
METRICS {
accuracy
loss
perplexity
}
```
### Exporting to Multiple Formats
```okt
EXPORT {
format: ["gguf", "onnx", "okm"]
path: "export/"
}
```
---
## Next Steps
- πŸ“š Read the [Complete Grammar Specification](./grammar.md)
- 🎯 Check out [Complex Examples](../examples/)
- πŸ”§ Learn about [Troubleshooting](./grammar.md#troubleshooting)
- πŸ’‘ Explore [Extension Points](./grammar.md#extension-points--hooks)
---
## Quick Reference
| Task | Command |
|------|---------|
| Validate | `okto validate train.okt` |
| Train | `okto run train.okt` |
| Infer | `okto_infer --model ./runs/model --text "input"` |
| Evaluate | `okto_eval --model ./runs/model --dataset ./dataset/test.jsonl` |
| Export | `okto export --format gguf` |
| Deploy | `okto_deploy --model model --target api` |
---
**Need help?** Check the [Troubleshooting Guide](./grammar.md#troubleshooting) or open an issue on [GitHub](https://github.com/oktoseek/oktoscript).