Hackathon: Break Frontier AI — In Your Language (Jun 15–21)
Mechanics:
Write deterministic, machine-verifiable tasks in any non-English language → submissions are evaluated against Claude Opus 4.6 in Terminal-Bench via the Terminus 2 harness → each task is run for 15 deterministic iterations to lock its pass/fail boundary. Scoring is difficulty-weighted by pass count (0–3 passes out of 15 = 8 pts; 13–15 passes = 1 pt), so a single hard case outweighs a pile of easy ones.
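As a rough illustration of the weighting, here is a minimal Python sketch. The function name `score_task` and the constant `ITERATIONS` are hypothetical, and only the two bands quoted above are encoded. Point values for 4–12 passes are not stated in this post, so the sketch returns `None` for them rather than guessing.

```python
from typing import Optional

ITERATIONS = 15  # runs per task, as stated in the mechanics above


def score_task(passes: int) -> Optional[int]:
    """Return points for one task, given how many of its 15 runs passed.

    Only the two bands quoted in the announcement are encoded. The bands
    for 4-12 passes are not given there, so None is returned for them.
    """
    if not 0 <= passes <= ITERATIONS:
        raise ValueError(f"passes must be between 0 and {ITERATIONS}")
    if passes <= 3:
        return 8  # hardest band: the model passes at most 3 of 15 runs
    if passes >= 13:
        return 1  # easiest band: the model passes at least 13 of 15 runs
    return None  # intermediate bands: unknown here, see the Details page


# One hard task versus five easy ones.
hard = score_task(2)
easy_pile = sum(score_task(p) for p in (13, 14, 15, 15, 15))
print(hard, easy_pile)  # prints: 8 5
```

Under these two bands, one task the model almost never solves is worth more than five tasks it almost always solves.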
Honest open question:
We don't yet have a clean taxonomy of how non-English failures cluster — tokenizer artifacts, instruction-following degradation, code-comment language mixing, locale-sensitive logic. That's what this week is for. Native speakers of underrepresented languages especially welcome.
Prizes:
$1.5K / $1K / $500 cash + AI Collective newsletter feature.
Sign up:
Details:
https://www.notion.so/lilt/LILTBench-Hackathon-361c66a75a508039bf00c9303a85ed3b