🔄 In a Training Loop

David Andrews PRO

Broyojo

https://broyojo.com

AI & ML interests

Transformer models, diffusion models, reinforcement learning, AI accelerators, computer architecture, VLSI

Recent Activity

liked a model 1 day ago

futo-org/futo-swipe

liked a model about 1 month ago

Qwen/Qwen3.6-27B

updated a model about 2 months ago

HumorR1/policy-e3-dpo-no-thinking

Organizations

upvoted 5 papers 2 months ago

TREX: Automating LLM Fine-tuning via Agent-Driven Tree-based Exploration

Paper • 2604.14116 • Published Apr 15 • 13

From P(y|x) to P(y): Investigating Reinforcement Learning in Pre-train Space

Paper • 2604.14142 • Published Apr 15 • 30

Memory Transfer Learning: How Memories are Transferred Across Domains in Coding Agents

Paper • 2604.14004 • Published Apr 15 • 30

GameWorld: Towards Standardized and Verifiable Evaluation of Multimodal Game Agents

Paper • 2604.07429 • Published Apr 8 • 123

RationalRewards: Reasoning Rewards Scale Visual Generation Both Training and Test Time

Paper • 2604.11626 • Published Apr 13 • 103

upvoted a collection 3 months ago

Gemma 4

Collection

15 items • Updated 15 days ago • 991

upvoted 2 papers 4 months ago

Eye, Robot: Learning to Look to Act with a BC-RL Perception-Action Loop

Paper • 2506.10968 • Published Jun 12, 2025 • 1

GeoAgent: Learning to Geolocate Everywhere with Reinforced Geographic Characteristics

Paper • 2602.12617 • Published Feb 13 • 20

upvoted a paper 5 months ago

Rewarding the Rare: Uniqueness-Aware RL for Creative Problem Solving in LLMs

Paper • 2601.08763 • Published Jan 13 • 150

upvoted a paper about 1 year ago

AReaL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning

Paper • 2505.24298 • Published May 30, 2025 • 34

upvoted an article over 1 year ago

Article

🐺🐦‍⬛ LLM Comparison/Test: 25 SOTA LLMs (including QwQ) through 59 MMLU-Pro CS benchmark runs

wolfram • Dec 4, 2024 • 80

upvoted 2 collections over 1 year ago

Skywork-o1-Open

Collection

Skywork o1 open model collections • 3 items • Updated Jun 12, 2025 • 22

Llama-3.1-Nemotron-70B

Collection

SOTA models on Arena Hard and RewardBench as of 1 Oct 2024. • 6 items • Updated 14 days ago • 156

upvoted a paper over 2 years ago

Vision Transformers Need Registers

Paper • 2309.16588 • Published Sep 28, 2023 • 86
