Hackathon: Break Frontier AI — In Your Language (Jun 15–21)

Published June 8, 2026

We're running a one-week, eval-focused hackathon with @AICollective to crowdsource coding tasks that expose where frontier models break outside English.

Mechanics:

Write deterministic, machine-verifiable tasks in any non-English language → submissions are evaluated against Claude Opus 4.6 in Terminal-Bench via the Terminus 2 harness → each task is run for 15 deterministic iterations to lock pass/fail boundaries. Scoring is difficulty-weighted by how many of the 15 runs pass (0–3 passes = 8 pts; 13–15 passes = 1 pt), so a single hard case outweighs a pile of easy ones.
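To make the weighting concrete, here is a rough sketch of how a pass count over 15 runs might map to points. Only the two tiers stated above (0–3 passes and 13–15 passes) come from this post. The function names are hypothetical, and the sketch deliberately returns `None` for 4–12 passes rather than guessing the intermediate tiers; see the details page for the full table.

```python
from typing import Optional, Sequence

RUNS_PER_TASK = 15  # iterations per task, as stated in the post


def count_passes(results: Sequence[bool]) -> int:
    """Count passing runs; expects exactly one boolean per iteration."""
    if len(results) != RUNS_PER_TASK:
        raise ValueError(f"expected {RUNS_PER_TASK} results, got {len(results)}")
    return sum(1 for passed in results if passed)


def task_points(passes: int) -> Optional[int]:
    """Map a pass count to points using only the tiers given in the post.

    Hypothetical helper: returns None for 4-12 passes, because the
    intermediate tiers are not listed here.
    """
    if not 0 <= passes <= RUNS_PER_TASK:
        raise ValueError("passes must be between 0 and 15")
    if passes <= 3:
        return 8   # hardest tier: the model almost always fails
    if passes >= 13:
        return 1   # easiest tier: the model almost always passes
    return None    # tier not specified in this announcement


# Example: a task the model solved in 2 of 15 runs.
runs = [True, True] + [False] * 13
print(task_points(count_passes(runs)))
```

Under these two tiers, one task in the hardest bracket is worth as much as eight tasks in the easiest one.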

Honest open question:

We don't yet have a clean taxonomy of how non-English failures cluster: tokenizer artifacts, instruction-following degradation, code-comment language mixing, locale-sensitive logic. That's what this week is for. Native speakers of underrepresented languages are especially welcome.
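As one illustration of the "locale-sensitive logic" category, here is a minimal sketch of a deterministic check built around Turkish casing. This is not the official task format. `turkish_upper` and `verify` are hypothetical names, and the example assumes a task that asks a model to uppercase Turkish text, where the dotted "i" must become "İ" rather than "I".

```python
def turkish_upper(text: str) -> str:
    """Uppercase text using Turkish casing rules for the letters i and ı.

    Python's str.upper() is locale-independent and maps "i" to "I",
    which is wrong for Turkish, so the dotted i is handled first.
    """
    return text.replace("i", "İ").upper()


def verify(candidate_output: str, source_text: str) -> bool:
    """Deterministic pass/fail: exact match against the expected string."""
    return candidate_output == turkish_upper(source_text)


# A locale-unaware solution fails the check; a locale-aware one passes.
print(verify("istanbul".upper(), "istanbul"))  # naive uppercase
print(verify("İSTANBUL", "istanbul"))          # Turkish-aware uppercase
```

The check is an exact string comparison with no randomness or external state, so repeated runs on the same output always agree.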

Prizes:

$1.5K / $1K / $500 cash + AI Collective newsletter feature.

Sign up:

https://luma.com/55v3wgi9

Details:

https://www.notion.so/lilt/LILTBench-Hackathon-361c66a75a508039bf00c9303a85ed3b

