Swiss AI Initiative

Team

university

https://www.swiss-ai.org/

swiss-ai

AI & ML interests

None defined yet.

Recent Activity

loleg new activity 4 days ago

swiss-ai/Apertus-70B-2509:Error: fatal: expected 'packfile'

loleg new activity 4 days ago

swiss-ai/Apertus-70B-2509:Very Poor Execution

valpy updated a dataset 7 days ago

swiss-ai/if-rl-singleturn-hard-prompts

Papers

Apertus: Democratizing Open and Compliant LLMs for Global Language Environments

loleg

in swiss-ai/Apertus-70B-2509 4 days ago

Error: fatal: expected 'packfile'

#7 opened 3 months ago by

Dravinian

Very Poor Execution

#10 opened 5 days ago by

AdopterPrompter

valpy

updated a dataset 7 days ago

swiss-ai/if-rl-singleturn-hard-prompts

Viewer • Updated 7 days ago • 61.4k • 37

valpy

published a dataset 7 days ago

swiss-ai/if-rl-singleturn-hard-prompts

Viewer • Updated 7 days ago • 61.4k • 37

valpy

updated a dataset 10 days ago

swiss-ai/if-rl-singleturn-prompts

Viewer • Updated 10 days ago • 79.6k • 90

valpy

published a dataset 10 days ago

swiss-ai/if-rl-singleturn-prompts

Viewer • Updated 10 days ago • 79.6k • 90

loleg

in swiss-ai/Apertus-70B-Instruct-2509 13 days ago

Are <|inner_prefix|>...<|inner_suffix|> thoughts generated by released Instruct checkpoints?

#18 opened 22 days ago by

pd95

loleg

in swiss-ai/Apertus-8B-Instruct-2509 16 days ago

Apertus tool parser

#18 opened 9 months ago by

frsodano

loleg

in swiss-ai/apertus-pretrain-swiss about 2 months ago

[bot] Conversion to Parquet

#1 opened 11 months ago by

parquet-converter

mjaggi

in swiss-ai/Apertus-8B-Instruct-2509 about 2 months ago

iPhone App?

#25 opened 3 months ago by

somethingsavvy

hannayukhymenko

submitted a paper to Daily Papers 3 months ago

Recovered in Translation: Efficient Pipeline for Automated Translation of Benchmarks and Datasets

Paper • 2602.22207 • Published Feb 25 • 45

hannayukhymenko

posted an update 3 months ago

Do you translate your benchmarks from English correctly? 🤔
Turns out, for many languages it is much harder than you can imagine!

Introducing Recovered in Translation 🌍 together with @aalexandrov
https://ritranslation.insait.ai

Translating benchmarks is a painful process that requires a lot of manual inspection and adjustment. You start by setting up the whole pipeline and adapting it to every format type, including task specifics. Some massive translated benchmarks already exist, but they still contain simple (and sometimes silly) bugs that can hurt evaluations :( We present a novel automated translation framework to help with that!

Eastern and Southern European languages have richer linguistic structures than English, and for benchmarks that rely heavily on grammatical coherence, machine translation risks harming evaluations. We find that the grammatical structure of translated questions can leak answers or be misleading. Some benchmarks are also simply outdated and need to be retranslated with newer and better models.

We present a framework with novel test-time scaling methods that let you control time and cost while reducing the need for human-in-the-loop verification. While working on the Ukrainian-focused MamayLM models, we had to translate 10+ benchmarks in a short span of time. Finding human evaluators is costly and time-consuming, and the same goes for professional translators. With our pipeline we were able to do it in 3 days 🏎️

We hope our findings will help enable stronger multilingual evaluations and developments. We release all produced benchmarks on Hugging Face, together with the source code and the arXiv paper 🤗

Paper: Recovered in Translation: Efficient Pipeline for Automated Translation of Benchmarks and Datasets (2602.22207)
Code: https://github.com/insait-institute/ritranslation
Benchmarks: https://huggingface.co/collections/INSAIT-Institute/multilingual-benchmarks

hannayukhymenko

authored a paper 3 months ago

Recovered in Translation: Efficient Pipeline for Automated Translation of Benchmarks and Datasets

Paper • 2602.22207 • Published Feb 25 • 45

BlackSamorez

authored a paper 4 months ago

Quartet II: Accurate LLM Pre-Training in NVFP4 by Improved Unbiased Gradient Estimation

Paper • 2601.22813 • Published Jan 30 • 62

BlackSamorez

submitted a paper to Daily Papers 4 months ago

Quartet II: Accurate LLM Pre-Training in NVFP4 by Improved Unbiased Gradient Estimation

Paper • 2601.22813 • Published Jan 30 • 62

ischlag

authored a paper 8 months ago

Apertus: Democratizing Open and Compliant LLMs for Global Language Environments

Paper • 2509.14233 • Published Sep 17, 2025 • 20

mjaggi

authored a paper 8 months ago

Apertus: Democratizing Open and Compliant LLMs for Global Language Environments

Paper • 2509.14233 • Published Sep 17, 2025 • 20

atcbosselut

authored a paper 8 months ago

Apertus: Democratizing Open and Compliant LLMs for Global Language Environments

Paper • 2509.14233 • Published Sep 17, 2025 • 20

mjaggi

authored a paper 9 months ago

Benchmarking Optimizers for Large Language Model Pretraining

Paper • 2509.01440 • Published Sep 1, 2025 • 25

hannayukhymenko

posted an update 9 months ago

Releasing the Jupyter Agent Dataset! 🚀

Built from 7 TB of real Kaggle datasets + 20k notebooks, with real code execution traces created using Qwen3-Coder and E2B.
Training on this data dramatically improves a model's ability to execute code and analyze data.

We (@baptistecolle @hannayukhymenko @lvwerra) have created a novel synthetic data generation pipeline with efficient scaffolding, which gives a big performance boost after training your coding agent 🔥 Starting from real Kaggle notebooks and datasets, we generate synthetic notebooks that analyze the datasets and answer factual questions about them more efficiently. We simulate a real code execution environment either by prompting LLMs or with the help of E2B sandboxes. The result is a dataset of 50k+ high-quality LLM-generated notebooks that can help your agent become better at data analysis and question answering.

Link: https://huggingface.co/datasets/data-agents/jupyter-agent-dataset

Team members 13
