cs-fxr (fxrc)

upvoted a collection 11 months ago

GLM-4.5

Collection

GLM-4.5: An open-source large language model designed for intelligent agents by Z.ai • 8 items • Updated Mar 2 • 255

upvoted a paper 12 months ago

Harnessing the Universal Geometry of Embeddings

Paper • 2505.12540 • Published May 18, 2025 • 9

upvoted an article 12 months ago

Efficient MultiModal Data Pipeline

Article • ariG23498, lusxvr, andito, sergiopaniego, pcuenq +3 • Jul 8, 2025 • 72

upvoted a paper about 1 year ago

Saffron-1: Towards an Inference Scaling Paradigm for LLM Safety Assurance

Paper • 2506.06444 • Published Jun 6, 2025 • 73

upvoted an article about 1 year ago

Mixture of Experts Explained

Article • osanseviero, lewtun, philschmid, smangrul, ybelkada, pcuenq +4 • Dec 11, 2023 • 1.15k

upvoted a paper about 1 year ago

Beyond 'Aha!': Toward Systematic Meta-Abilities Alignment in Large Reasoning Models

Paper • 2505.10554 • Published May 15, 2025 • 119

upvoted an article about 1 year ago

The Transformers Library: standardizing model definitions

Article • lysandre, ArthurZ, pcuenq, julien-c +2 • May 15, 2025 • 123

upvoted 2 papers about 1 year ago

Training Large Language Models to Reason in a Continuous Latent Space

Paper • 2412.06769 • Published Dec 9, 2024 • 95

Transformers without Normalization

Paper • 2503.10622 • Published Mar 13, 2025 • 172

upvoted an article over 1 year ago

FastRTC: The Real-Time Communication Library for Python

Article • freddyaboulton, abidlabs • Feb 25, 2025 • 172

upvoted a paper over 1 year ago

DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

Paper • 2402.03300 • Published Feb 5, 2024 • 148

upvoted 2 articles over 1 year ago

Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA

Article • ybelkada, timdettmers, artidoro, sgugger, smangrul +3 • May 24, 2023 • 180

Open-source DeepResearch – Freeing our search agents

Article • m-ric, albertvillanova, merve, thomwolf, clefourrier +3 • Feb 4, 2025 • 1.32k

upvoted a paper over 1 year ago

On-Policy Distillation of Language Models: Learning from Self-Generated Mistakes

Paper • 2306.13649 • Published Jun 23, 2023 • 37

fxrc

AI & ML interests

Organizations

