File size: 877 Bytes
4145f82
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
# 🚀 Oculus V3: Future Training Roadmap

COCO (Current) = 80 common classes. Good baseline, but limited for real-world niche tasks.

## Option A: Universal Detection (The "Scanner")
**Target**: Detect 1200+ specific objects.
- **Dataset**: **LVIS** or **Objects365**.
- **Result**: Recognizes "stapler", "doorknob", "mango" instead of just generic classes.

## Option B: Visual Reasoning (The "Thinker")
**Target**: Better VQA and complex instruction following.
- **Dataset**: **LLaVA-Instruct** or **VizWiz**.
- **Why**: Teaches the model to "explain why the car is parked" rather than just finding the car.
- **Result**: A smarter chatbot-like VLM.

## Recommendation
Since Oceanir is a VLM platform, **Option B (Instruction Tuning)** is the highest value next step. It improves the model's IO (Intelligence Output) significantly more than just adding more bounding boxes.