Commit History

Upload from GitHub Actions: blocklist: drop the grace for slow failing models, not just egregious ones
608b646
verified

davidpomerenke committed on

Upload from GitHub Actions: results: refresh model_health.json snapshot
e94f7b1
verified

davidpomerenke committed on

Upload from GitHub Actions: models: drop kimi-k2.6; exclude egregiously-failing models after one run
4c601bb
verified

davidpomerenke committed on

Upload from GitHub Actions: eval: check runtime budget per-batch so a slow model can't blow the 6h cap
594d28a
verified

davidpomerenke committed on

Upload from GitHub Actions: Refresh dashboard UI: design tokens, indigo accent, research footer
15a7ddf
verified

davidpomerenke committed on

Upload from GitHub Actions: eval: fix per-combo resilience (tqdm_asyncio.gather has no return_exceptions)
18acfdb
verified

davidpomerenke committed on

Upload from GitHub Actions: eval: don't let one bad (task,language) combo crash the whole run
a92221e
verified

davidpomerenke committed on

Upload from GitHub Actions: results: update committed snapshots to current 71-model state
19fbc15
verified

davidpomerenke committed on

Upload from GitHub Actions: models: migrate catalog to /api/v1/models; enforce privacy per-request
f502bec
verified

davidpomerenke committed on

Upload from GitHub Actions: discovery: surface newer flagships from curated families; blocklist: require 2 consecutive bad runs
f28fed1
verified

davidpomerenke committed on

Upload from GitHub Actions: discovery: one flagship per product line; eval: graceful 6h-safe runtime budget
4047210
verified

davidpomerenke committed on

Upload from GitHub Actions: main: gate publishing on coverage-completeness, not just error rate
c1041db
verified

davidpomerenke committed on

Upload from GitHub Actions: util: retry HF push with backoff; write local snapshot before push
2a1f0a5
verified

davidpomerenke committed on

Upload from GitHub Actions: main: checkpoint per fully-evaluated model instead of once at the end
eaa7534
verified

davidpomerenke committed on

Upload from GitHub Actions: workflow: make huggingface-cli login resilient to transient 429s
b67f6cf
verified

davidpomerenke committed on

Upload from GitHub Actions: models: replace claude-opus-4.5/4.6 with 4.8 in curated list
15e8f68
verified

davidpomerenke committed on

Upload from GitHub Actions: discovery: filter out voice/ASR/vision/build endpoints
f2add9e
verified

davidpomerenke committed on

Upload from GitHub Actions: restore results snapshots to b7b017f (canonical 40-model state)
4e0fc02
verified

davidpomerenke committed on

Upload from GitHub Actions: backend: handle null creation_date in three apply() calls
f6a28ed
verified

davidpomerenke committed on

Upload from GitHub Actions: preflight: skip eval workflow when OpenRouter balance is too low
a70e02a
verified

davidpomerenke committed on

Upload from GitHub Actions: fast-fail on account-level API errors; refuse to ship runs with >80% errors
c2afc16
verified

davidpomerenke committed on

Upload from GitHub Actions: unblock workflow: materialize gcloud creds on runner; lazy-init translate client
bec2f46
verified

davidpomerenke committed on

Upload from GitHub Actions: guard main.py against partial-scale HF pushes; restore aggregated results
691e6c2
verified

davidpomerenke committed on

Upload from GitHub Actions: refresh pyproject metadata + README HF frontmatter
1eccc3f
verified

davidpomerenke committed on

clean up stale root files
b369319
verified

davidpomerenke committed on

Upload from GitHub Actions: new model
78be468
verified

davidpomerenke committed on

Upload from GitHub Actions: added new models
83d2972
verified

davidpomerenke committed on

Upload from GitHub Actions: added new models
93a8617
verified

davidpomerenke committed on

Upload from GitHub Actions: Merge pull request #28 from datenlabor-bmz/jn-dev
55b63ea
verified

davidpomerenke committed on

Upload from GitHub Actions: fixed hyperlinks
44a2e08
verified

davidpomerenke committed on

Upload from GitHub Actions: removed mail
ba308de
verified

davidpomerenke committed on

Upload from GitHub Actions: fixed button issue
c8949e9
verified

davidpomerenke committed on

Upload from GitHub Actions: updated frontend
2d5c7b3
verified

davidpomerenke committed on

Upload from GitHub Actions: updated button
4419cd0
verified

davidpomerenke committed on

Upload from GitHub Actions: added disclaimer
3a1f6aa
verified

davidpomerenke committed on

Upload from GitHub Actions: minor frontend updates
f81f302
verified

davidpomerenke committed on

Upload from GitHub Actions: minor frontend update
c9babd0
verified

davidpomerenke committed on

Upload from GitHub Actions: cleaned up code
2586cfe
verified

davidpomerenke committed on

Upload from GitHub Actions: added opus 4.5
0a17acf
verified

davidpomerenke committed on

Upload from GitHub Actions: add gpt-5.1, gemini-3
9ea2dd3
verified

davidpomerenke committed on

Upload from GitHub Actions: update and fixed rendering issues
4bfbb64
verified

davidpomerenke committed on

Upload from GitHub Actions: flores filter for available dev split
34b05c6
verified

davidpomerenke committed on

Upload from GitHub Actions: model name no bracket stuff
aa92add
verified

davidpomerenke committed on

Upload from GitHub Actions: drop normalization
972026c
verified

davidpomerenke committed on

Upload from GitHub Actions: improve norwegian fix
6f0e312
verified

davidpomerenke committed on

Upload from GitHub Actions: add filters
3018273
verified

davidpomerenke committed on

Upload from GitHub Actions: add langcodes dep
d509a02
verified

davidpomerenke committed on

Upload from GitHub Actions: fix norwegian
0cbac6c
verified

davidpomerenke committed on

Upload from GitHub Actions: adjust dockerfile
b6a7bfd
verified

davidpomerenke committed on

Upload from GitHub Actions: add `datasets` dep
ca764ac
verified

davidpomerenke committed on