wangdi

huayranus


AI & ML interests

None yet

Organizations

upvoted 3 papers over 1 year ago

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published Jan 22, 2025 • 457

MiniMax-01: Scaling Foundation Models with Lightning Attention

Paper • 2501.08313 • Published Jan 14, 2025 • 304

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

Paper • 2501.04519 • Published Jan 8, 2025 • 289

upvoted a collection over 1 year ago

Tulu 3 Models

Collection

All models released with Tulu 3 -- state of the art open post-training recipes. • 11 items • Updated Dec 23, 2025 • 103

upvoted a paper over 1 year ago

UCFE: A User-Centric Financial Expertise Benchmark for Large Language Models

Paper • 2410.14059 • Published Oct 17, 2024 • 63

upvoted 4 papers almost 2 years ago

upvoted an article almost 2 years ago

Article

Let's talk about LLM evaluation

clefourrier • May 23, 2024 • 212

upvoted a collection almost 2 years ago

"Physics of Language Models" series

Collection

7 items • Updated Dec 22, 2025 • 55

upvoted an article almost 2 years ago

Article

Llama 3.1 - 405B, 70B & 8B with multilinguality and long context

philschmid, osanseviero, alvarobartt, lvwerra, dvilasuero, reach-vb, marcsun13, pcuenq • Jul 23, 2024 • 241

upvoted 3 papers almost 2 years ago

SpreadsheetLLM: Encoding Spreadsheets for Large Language Models

Paper • 2407.09025 • Published Jul 12, 2024 • 140

Qwen2-Audio Technical Report

Paper • 2407.10759 • Published Jul 15, 2024 • 64

ANOLE: An Open, Autoregressive, Native Large Multimodal Models for Interleaved Image-Text Generation

Paper • 2407.06135 • Published Jul 8, 2024 • 22

upvoted a collection about 2 years ago

Function Calling Dataset

Collection

7 items • Updated Dec 5, 2023 • 10
