Peter Tanski

pdtgct

AI & ML interests

Machine Learning, Artificial Intelligence

Recent Activity

upvoted a paper about 1 month ago

AI for Auto-Research: Roadmap & User Guide

Paper • 2605.18661 • Published May 18 • 69

upvoted an article 4 months ago

Mixture of Experts (MoEs) in Transformers

Article • ariG23498, pcuenq, merve, IlyasMoutawwakil, ArthurZ, sergiopaniego, Molbap • Published Feb 26 • 169

upvoted 2 papers 4 months ago

Does Your Reasoning Model Implicitly Know When to Stop Thinking?

Paper • 2602.08354 • Published Feb 9 • 266

Discovering Multiagent Learning Algorithms with Large Language Models

Paper • 2602.16928 • Published Feb 18 • 17

upvoted a paper 6 months ago

LoPA: Scaling dLLM Inference via Lookahead Parallel Decoding

Paper • 2512.16229 • Published Dec 18, 2025 • 17

upvoted a paper 9 months ago

Top-nσ: Not All Logits Are You Need

Paper • 2411.07641 • Published Nov 12, 2024 • 24

upvoted an article 9 months ago

What’s MXFP4? The 4-Bit Secret Powering OpenAI’s GPT‑OSS Models on Modest Hardware

Article • RakshitAralimatti • Published Aug 8, 2025 • 36

upvoted a paper 10 months ago

A Survey on Diffusion Language Models

Paper • 2508.10875 • Published Aug 14, 2025 • 35

upvoted an article 11 months ago

Falcon-H1: A Family of Hybrid-Head Language Models Redefining Efficiency and Performance

Article • tiiuae • Published May 21, 2025 • 39

upvoted a paper 11 months ago

Better & Faster Large Language Models via Multi-token Prediction

Paper • 2404.19737 • Published Apr 30, 2024 • 81

upvoted a paper about 1 year ago

Beyond Homogeneous Attention: Memory-Efficient LLMs via Fourier-Approximated KV Cache

Paper • 2506.11886 • Published Jun 13, 2025 • 20

upvoted an article about 1 year ago

CodeAgents + Structure: A Better Way to Execute Actions

Article • akseljoonas, m-ric • Published May 28, 2025 • 82

upvoted a paper about 1 year ago

Scaling Laws for Native Multimodal Models

Paper • 2504.07951 • Published Apr 10, 2025 • 32

upvoted an article over 1 year ago

Open R1: How to use OlympicCoder locally for coding

Article • burtenshaw, reach-vb, lewtun, edbeeching, yagilb • Published Mar 20, 2025 • 63

upvoted 3 papers over 1 year ago

Training Large Language Models to Reason in a Continuous Latent Space

Paper • 2412.06769 • Published Dec 9, 2024 • 95

Large Language Monkeys: Scaling Inference Compute with Repeated Sampling

Paper • 2407.21787 • Published Jul 31, 2024 • 14

LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning

Paper • 2410.02884 • Published Oct 3, 2024 • 54

upvoted an article over 1 year ago

Llama can now see and run on your device - welcome Llama 3.2

Article • merve, philschmid, osanseviero, reach-vb, lewtun, ariG23498, pcuenq • Published Sep 25, 2024 • 191

upvoted an article almost 2 years ago

Fine-tuning LLMs to 1.58bit: extreme quantization made easy

Article • medmekk, marcsun13, lvwerra, pcuenq, osanseviero, thomwolf • Published Sep 18, 2024 • 281

upvoted a paper almost 2 years ago

MagicDec: Breaking the Latency-Throughput Tradeoff for Long Context Generation with Speculative Decoding

Paper • 2408.11049 • Published Aug 20, 2024 • 14
