kuanweichen (Kuan-Wei Chen)

upvoted 2 articles 7 months ago

Article

You could have designed state of the art positional encoding

FL33TW00D-HF • Nov 25, 2024 • 491

Article

The Optimal Architecture for Small Language Models

codelion • Dec 26, 2025 • 121

upvoted 2 papers 9 months ago

In-the-Flow Agentic System Optimization for Effective Planning and Tool Use

Paper • 2510.05592 • Published Oct 7, 2025 • 112

Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models

Paper • 2510.04618 • Published Oct 6, 2025 • 134

upvoted an article about 1 year ago

Article

CodeAgents + Structure: A Better Way to Execute Actions

akseljoonas, m-ric • May 28, 2025 • 82

upvoted 4 articles over 1 year ago

Article

Training and Finetuning Reranker Models with Sentence Transformers

tomaarsen • Mar 26, 2025 • 196

Article

Fine-tune Llama 3.1 Ultra-Efficiently with Unsloth

mlabonne • Jul 29, 2024 • 373

Article

Uncensor any LLM with abliteration

mlabonne • Jun 13, 2024 • 879

Article

What is Qwen-Agent framework? Inside the Qwen family

Kseniase • Mar 20, 2025 • 13

upvoted a collection over 1 year ago

Qwen2.5

Collection

Qwen2.5 language models: pretrained and instruction-tuned models in 7 sizes (0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B). • 43 items • Updated Mar 2 • 732

upvoted 5 articles over 1 year ago

Article

Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM

ariG23498, merve, pcuenq, reach-vb • Mar 12, 2025 • 498

Article

Introducing smolagents: simple agents that write actions in code.

m-ric, merve, thomwolf • Dec 31, 2024 • 1.21k

Article

Open-R1: a fully open reproduction of DeepSeek-R1

eliebak, lvwerra, lewtun • Jan 28, 2025 • 891

Article

Open-source DeepResearch – Freeing our search agents

m-ric, albertvillanova, merve, thomwolf, clefourrier • Feb 4, 2025 • 1.32k

Article

Introducing multi-backends (TRT-LLM, vLLM) support for Text Generation Inference

mfuntowicz, hlarcher • Jan 16, 2025 • 76
