The multi-modal data company
We find what's valuable in content archives and transform it into licensable AI-ready data.
A narrative intelligence model for extracting structured storylines from news and editorial content. Returns machine-readable JSON with actor-action-target-outcome decomposition.
Full Semantic Role Labeling — WHO did WHAT to WHOM with what outcome
4B parameters, Apple Silicon native via Ollama and MLX
JSON-only output for downstream salience scoring, search, and filtering
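As a rough sketch of how the JSON-only output might be consumed downstream, the snippet below parses an actor-action-target-outcome payload into typed records and filters them by actor. The schema is an assumption made for illustration: the top-level `events` key, the field names, the `StoryEvent` class, the helper functions, and the sample payload are all hypothetical, not the model's documented output format.

```python
import json
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StoryEvent:
    """One WHO did WHAT to WHOM record (hypothetical field names)."""
    actor: str
    action: str
    target: Optional[str] = None
    outcome: Optional[str] = None


def parse_events(raw: str) -> List[StoryEvent]:
    """Parse a JSON string into StoryEvent records.

    Assumes a top-level "events" list; "target" and "outcome" are
    treated as optional and default to None when absent.
    """
    payload = json.loads(raw)
    return [
        StoryEvent(
            actor=item["actor"],
            action=item["action"],
            target=item.get("target"),
            outcome=item.get("outcome"),
        )
        for item in payload["events"]
    ]


def events_by_actor(events: List[StoryEvent], actor: str) -> List[StoryEvent]:
    """Return events whose actor matches, ignoring case."""
    wanted = actor.lower()
    return [event for event in events if event.actor.lower() == wanted]


# Made-up sample payload, for illustration only.
SAMPLE_RESPONSE = """
{
  "events": [
    {"actor": "City Council", "action": "approved",
     "target": "transit budget", "outcome": "funding secured"},
    {"actor": "Transit Union", "action": "criticized",
     "target": "City Council"}
  ]
}
"""

if __name__ == "__main__":
    for event in parse_events(SAMPLE_RESPONSE):
        print(event)
```

Keeping the records as plain dataclasses means search, filtering, or salience scoring can work on attributes rather than raw dictionaries. The real field names would need to be checked against the model's actual output.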
Per-frame shot scale (9 classes) and camera angle (5 classes) annotations for broadcast soccer — aligned with CineScale and CineScale2 taxonomies. Larger datasets available for licensing.
Bounding-box annotations for detecting and tracking soccer balls in broadcast footage, designed for tiny object detection challenges with motion blur and occlusion.
Bounding-box annotations for detecting and tracking referees across varied broadcast conditions, distinguishing them from players, coaches, and other on-field personnel.
Curated video clips featuring in-match soccer events with detailed annotations, ideal for action recognition and sports analytics models.