metadata
title: README
emoji: π
colorFrom: blue
colorTo: purple
sdk: static
pinned: false
OBay Data
World-class training data production for frontier AI models.
We build high-quality datasets that power the next generation of AI, from large language models to embodied intelligence.
What We Do
| Domain | Description |
|---|---|
| 🧠 Pre-training Data | Large-scale, curated corpora for foundation model training |
| 🎯 Post-training Data | SFT, RLHF, DPO datasets for alignment and instruction-following |
| 🤖 Embodied AI Data | Robotics trajectories, gameplay recordings, sensor logs for world models |
| 🖼️ Multimodal Data | Image editing, composition, style transfer instruction sets |
Datasets
| Dataset | Description |
|---|---|
| trajectory_demo | Terminal agent trajectories (ATIF format) |
| svg-multimodal-rubrics | SVG code generation + evaluation rubrics |
| image-editing-style-instruction-following | Style transfer + instruction following |
| swe-coding-instruction-following | SWE-bench coding tasks |
| world-model-gameplay-recording | Gameplay recording for world model training |
| multi-image-composition-instruction-following | Multi-image composition with instructions |
Contact
🌐 obaydata.com · 💻 GitHub · ✉️ simon.su@obaydata.com