georgebassemfouad (George Bassem)

liked 5 Spaces 4 months ago

QED-Nano: Teaching a Tiny Model to Prove Hard Theorems

📝

74

Who needs 1T parameters? Olympiad proofs with a 4B model

Maintain the unmaintainable

📚

82

Explore the complex relationships between 400+ machine learning models

FineVision: Open Data is All You Need

📝

230

A new open-source dataset for training VLMs

Scaling FineWeb to 1000+ languages: Step 1: finding signal in 100s of evaluation tasks

📝

94

Evaluate multilingual models using FineTasks

The Synthetic Data Playbook: Generating Trillions of the Finest Tokens

📝

262

Visualize synthetic‑data experiments as an interactive bookshelf

liked 2 Spaces 5 months ago

Open LLM Leaderboard

🏆

14k

Track, rank and evaluate open LLMs and chatbots

Evaluation Guidebook

📝

330

Explore LLM benchmark scores over time

liked a Space 8 months ago

The Smol Training Playbook

📚

3.22k

The secrets to building world-class LLMs

liked a dataset about 1 year ago

cais/mmlu

Viewer • Updated Mar 8, 2024 • 231k • 440k • 777

liked 4 Spaces over 1 year ago

The Ultra-Scale Playbook

🌌

3.9k

The ultimate guide to training LLMs on large GPU clusters

XTTS

🐸

2.77k

Generate speech from text using a reference voice

Scaling test-time compute

📈

601

Boost LLM answers with flexible test‑time search strategies

FineWeb: decanting the web for the finest text data at scale

🍷

1.38k

Explore and download the FineWeb web‑scale text dataset

liked a model almost 2 years ago

meta-llama/Meta-Llama-3-8B

Text Generation • 8B • Updated Sep 27, 2024 • 1.25M • 6.59k

liked a dataset about 2 years ago

halabi2016/arabic_speech_corpus

Updated Aug 14, 2024 • 322 • 38

George Bassem

AI & ML interests

Organizations

