TikTok Creator, Video & Trend Data Collection TikTok datasets for creator behavior, video engagement, ML-extracted content features, and trend context. • 3 items • Updated 13 days ago • 3
TheMCPCompany: Creating General-purpose Agents with Task-specific Tools Paper • 2510.19286 • Published Oct 22, 2025 • 9
Cohere Labs Aya Vision Collection Aya Vision is a state-of-the-art family of vision models that brings multimodal capabilities to 23 languages. • 5 items • Updated Jul 31, 2025 • 74
Aya Vision: Advancing the Frontier of Multilingual Multimodality Paper • 2505.08751 • Published May 13, 2025 • 13
Article A Deepdive into Aya Vision: Advancing the Frontier of Multilingual Multimodality • saurabhdash, olivernan, ArashAhmadian, johndang-cohere • Mar 4, 2025 • 78
Bonito Collection Models and datasets from the Bonito paper (https://arxiv.org/abs/2402.18334) • 8 items • Updated Oct 1, 2024 • 1
Article How NuminaMath Won the 1st AIMO Progress Prize • yfleureau, liyongsea, edbeeching, lewtun, benlipkin, romansoletskyi, vwxyzjn, kashif • Jul 11, 2024 • 128
Planetarium: A Rigorous Benchmark for Translating Text to Structured Planning Languages Paper • 2407.03321 • Published Jul 3, 2024 • 20
Preference Tuning For Toxicity Mitigation Generalizes Across Languages Paper • 2406.16235 • Published Jun 23, 2024 • 11
LexC-Gen: Generating Data for Extremely Low-Resource Languages with Large Language Models and Bilingual Lexicons Paper • 2402.14086 • Published Feb 21, 2024 • 12
Journal Club Collection Candidate papers to read in the H4 journal club • 54 items • Updated Apr 21, 2024 • 36
Learning to Generate Instruction Tuning Datasets for Zero-Shot Task Adaptation Paper • 2402.18334 • Published Feb 28, 2024 • 12