NVIDIA Nemotron v3 Collection Open, Production-ready Enterprise Models • 6 items • Updated about 23 hours ago • 120
Cerebras REAP Collection Sparse MoE models compressed using REAP (Router-weighted Expert Activation Pruning) method • 22 items • Updated about 13 hours ago • 81
Article: MLA: Redefining KV-Cache Through Low-Rank Projections and On-Demand Decompression • Feb 4, 2025 • 19
Compressing KV Cache for Long-Context LLM Inference with Inter-Layer Attention Similarity Paper • 2412.02252 • Published Dec 3, 2024 • 2
TransMLA: Multi-head Latent Attention Is All You Need Paper • 2502.07864 • Published Feb 11, 2025 • 57
Kimi k1.5: Scaling Reinforcement Learning with LLMs Paper • 2501.12599 • Published Jan 22, 2025 • 126
Hibiki fr-en Collection Hibiki is a model for streaming speech translation, which can run on-device! See https://github.com/kyutai-labs/hibiki. • 7 items • Updated 20 days ago • 55