Jonathan von Rad

jonny-vr

AI & ML interests

LLM Compression & Mechanistic Interpretability

Recent Activity

updated a dataset 6 days ago

jonny-vr/WIKI-FACT

Viewer • Updated 6 days ago • 100k • 61

published a dataset 6 days ago

jonny-vr/WIKI-FACT

Viewer • Updated 6 days ago • 100k • 61

updated a dataset 6 days ago

jonny-vr/multilingual-mcq-consistency

Viewer • Updated 6 days ago • 100k • 26

published a dataset 6 days ago

jonny-vr/multilingual-mcq-consistency

Viewer • Updated 6 days ago • 100k • 26

published 2 models 20 days ago

jonny-vr/olmo2-7b-ted-lora

7B • Updated 20 days ago • 27

jonny-vr/olmo2-7b-ted-fullft

7B • Updated 20 days ago • 32

updated 2 models 20 days ago

jonny-vr/olmo2-7b-ted-lora

7B • Updated 20 days ago • 27

jonny-vr/olmo2-7b-ted-fullft

7B • Updated 20 days ago • 32

updated 2 models 2 months ago

jonny-vr/mv-final-assignment-gru

Updated Jan 15

jonny-vr/mv-final-assignment-gru-notebook

Updated Jan 14

published 2 models 2 months ago

jonny-vr/mv-final-assignment-gru-notebook

Updated Jan 14

jonny-vr/mv-final-assignment-gru

Updated Jan 15

New activity in hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4 3 months ago

Tip: For Hardware Acceleration this Model will not leverage vllm marlin kernels!

#19 opened 3 months ago by jonny-vr

updated a model 3 months ago

jonny-vr/Llama-3.1-Minitron-4B-Depth-Chat

Text Generation • 5B • Updated Dec 30, 2025

published a model 3 months ago

jonny-vr/Llama-3.1-Minitron-4B-Depth-Chat

Text Generation • 5B • Updated Dec 30, 2025

New activity in Qwen/Qwen3-32B 3 months ago

Where is the Base Model?

👍➕ 10

#34 opened 9 months ago by jonny-vr

New activity in Harvard-DCML/boomerang-qwen3-4.9B 3 months ago

Substantially lower accuracy on reasoning benchmarks such as GSM8K (1.5%) and MATH-500 (4.2%)

#1 opened 3 months ago by jonny-vr

updated a model 4 months ago

jonny-vr/mv-final-assignment

Updated Dec 10, 2025

published a model 4 months ago

jonny-vr/mv-final-assignment

Updated Dec 10, 2025

New activity in monology/pile-uncopyrighted 9 months ago

Could you please implement train:1% feature? This way we don't have to download the entire dataset.

#12 opened 9 months ago by jonny-vr
