Cerebras REAP Collection Sparse MoE models compressed using the REAP (Router-weighted Expert Activation Pruning) method • 30 items • Updated Feb 25 • 139
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach Paper • 2502.05171 • Published Feb 7, 2025 • 155
Jamba 1.5 Collection The AI21 Jamba family of models comprises state-of-the-art, hybrid SSM-Transformer instruction-following foundation models • 2 items • Updated Mar 6, 2025 • 87
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits Paper • 2402.17764 • Published Feb 27, 2024 • 629
Article Run the strongest open-source LLM model: Llama3 70B with just a single 4GB GPU! Apr 21, 2024 • 44
Dataset comparison models Collection 1.8B-parameter models trained on 350B tokens to compare different pretraining datasets • 8 items • Updated Jun 12, 2024 • 42
OmniQuant: Omnidirectionally Calibrated Quantization for Large Language Models Paper • 2308.13137 • Published Aug 25, 2023 • 20
GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers Paper • 2210.17323 • Published Oct 31, 2022 • 10
LoRA: Low-Rank Adaptation of Large Language Models Paper • 2106.09685 • Published Jun 17, 2021 • 60
AlphaStar Unplugged: Large-Scale Offline Reinforcement Learning Paper • 2308.03526 • Published Aug 7, 2023 • 29
Retroformer: Retrospective Large Language Agents with Policy Gradient Optimization Paper • 2308.02151 • Published Aug 4, 2023 • 21