APEX Quants (GGUF) Collection MoE models quantized with the APEX Quantization technique (https://github.com/mudler/apex-quant) • 28 items • Updated 1 day ago • 93
mmBERT: a modern multilingual encoder Collection mmBERT is trained on 3T tokens from over 1800 languages, achieving SoTA benchmark scores and exceptional low-resource performance • 16 items • Updated Sep 9, 2025 • 53
Falcon-H1 Collection Falcon-H1 Family of Hybrid-Head Language Models (Transformer-SSM), including 0.5B, 1.5B, 1.5B-Deep, 3B, 7B, and 34B (pretrained & instruction-tuned). • 33 items • Updated Mar 2 • 59
Granite 4.0 Language Models Collection Efficient language models for multilingual generation, coding, RAG, and AI assistant workflows. • 11 items • Updated 12 days ago • 221
Falcon Edge series Collection A series of powerful, universal, and fine-tunable small language models • 8 items • Updated 19 days ago • 25
Absolute Zero: Reinforced Self-play Reasoning with Zero Data Paper • 2505.03335 • Published May 6, 2025 • 192
Comparing sub 50GB Llama 4 Scout quants (KLD/Top P) Article • bartowski • Apr 9, 2025 • 45
Qwen3 Collection Qwen's Qwen3 models in Unsloth Dynamic 2.0, GGUF, 4-bit, and 16-bit Safetensor formats, including 128K context length variants. • 70 items • Updated 19 days ago • 272
BitNet Collection 🔥 BitNet family of large language models (1-bit LLMs). • 7 items • Updated May 1, 2025 • 63
Granite Experiments Collection Experimental projects under consideration for the Granite family. • 11 items • Updated 10 days ago • 18
Granite 3.3 Collection Language models with improved reasoning and instruction-following capabilities. • 4 items • Updated 12 days ago • 46
ParetoQ: Scaling Laws in Extremely Low-bit LLM Quantization Paper • 2502.02631 • Published Feb 4, 2025 • 4
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention Paper • 2502.11089 • Published Feb 16, 2025 • 169