Catherine Arnett
catherinearnett
AI & ML interests
multilingual NLP, tokenization
Recent Activity
- Updated a collection 19 days ago: Global PIQA
- Liked a dataset 25 days ago: khaledyusuf44/somaliweb-v1
- Updated a Space 26 days ago: catherinearnett/multiblimp-leaderboard
Multilingual Leaderboards
Leaderboards for languages other than English
- La Leaderboard: Evaluate open LLMs in the languages of LATAM and Spain.
- Open Chinese LLM Leaderboard: Explore LLM benchmark scores and submit your model for evaluation.
- Open Arabic LLM Leaderboard: Track, rank and evaluate open Arabic LLMs and chatbots.
- OpenLLM French leaderboard 🇫🇷: Explore and submit LLM benchmarks.
Low Resource Language Datasets
B-GPT
Bilingual GPT-2 models with checkpoints
- catherinearnett/B-GPT_en_nl_simultaneous (Text Generation, 0.1B)
- catherinearnett/B-GPT_nl_en_simultaneous (Text Generation, 0.1B)
- catherinearnett/B-GPT_en_nl_sequential (Text Generation, 0.1B)
- catherinearnett/B-GPT_nl_en_sequential (Text Generation, 0.1B)
Monolingual Models with Checkpoints
Global PIQA
A physical commonsense reasoning benchmark for 100+ languages, written in collaboration with 300+ researchers from 65 countries.