Rod

Viac

AI & ML interests

playing around with a tool

Recent Activity

liked a model 2 days ago

openbmb/MiniCPM5-1B

liked a model 19 days ago

Vikhrmodels/Borealis-5b-it

liked a model about 1 month ago

ACE-Step/acestep-v15-xl-base

Organizations

None yet

liked a model 2 days ago

openbmb/MiniCPM5-1B

Text Generation • 1B • Updated 6 days ago • 36.7k • 652

liked a model 19 days ago

Vikhrmodels/Borealis-5b-it

Audio-Text-to-Text • Updated Dec 19, 2025 • 132 • 14

liked 4 models about 1 month ago

liked 7 models about 2 months ago

Qwen/Qwen3.6-35B-A3B-FP8

Image-Text-to-Text • 36B • Updated Apr 24 • 5.35M • 239

nvidia/Lyra-2.0

Image-to-3D • Updated 20 days ago • 1.3k • 326

varjosoft/GLM-4.7-Flash-TQ3

13B • Updated Apr 9 • 173 • 1

Youssofal/MiniMax-M2.7-Abliterated-Heretic-GGUF

Text Generation • 229B • Updated Apr 14 • 2.6k • 47

MiniMaxAI/MiniMax-M2.7

Text Generation • 229B • Updated Apr 20 • 1.72M • 1.17k

nwzjk/GLM-5.1-Open-TQ3

Text Generation • 289B • Updated Apr 9 • 6 • 2

ruv/ruvltra-claude-code

Text Generation • 0.5B • Updated Mar 28 • 8.3k • 181

liked a dataset about 2 months ago

Roman1111111/claude-opus-4.6-10000x

Viewer • Updated Apr 5 • 9.63k • 3.78k • 379

reacted to reaperdoesntknow's post with 👍 2 months ago

Post

3287

We present a methodology for training small language models on CPU at FP32 precision that achieves capability-per-dollar efficiency orders of magnitude beyond GPU-based training. Across 15 models spanning four novel architecture families, namely Mixture of Attentions (MoA), cross-architecture fusion (Qemma), swarm intelligence (SAGI), and metric-space causal language models (DiscoverLM), total compute cost was $24 on a single AMD EPYC 9454P processor. We introduce seven methodological pillars: (1) FP32 precision preservation, with experiments demonstrating 5,810× single-operation error and 23,225× compounding error ratio for FP16 at network depth; (2) sparse cognitive architectures where 0.02–7% of parameters activate per token, matching CPU branching rather than GPU SIMD; (3) developmental curriculum training progressing from language to logic to transfer to depth; (4) continuous belt-fed data ingestion eliminating truncation waste; (5) hardware-native optimization for AMD Zen 4 via AOCL/OpenMP/NUMA-aware allocation; (6) self-regulating thermodynamic governance with emergent temperature measurement grounded in L2-star discrepancy; and (7) open-standard compute (AVX2 SIMD at FP32) free of proprietary vendor dependency. We argue that transformers were designed for GPU hardware rather than mathematical optimality, and that architectures designed for geometric correctness (metric-space attention, triangle inequality enforcement, sparse expert routing) naturally favor CPU execution. For sub-2B parameter models, CPU training produces more capable models at a fraction of the cost.
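
The compounding-error point in pillar (1) can be illustrated with a small sketch. This is not the post's experiment and does not reproduce its 5,810× or 23,225× figures; it only shows, under simple assumptions, how rounding after every operation hurts FP16 far more than FP32 in a long accumulation. The helper names (round_to, accumulate, relative_error) are hypothetical, and the rounding is emulated with Python's standard struct module rather than real half-precision hardware.

```python
import struct


def round_to(fmt: str, x: float) -> float:
    """Round a Python float (FP64) to the nearest value representable in `fmt`.

    `fmt` is a struct format character: "e" for IEEE 754 half precision (FP16),
    "f" for single precision (FP32).
    """
    return struct.unpack(fmt, struct.pack(fmt, x))[0]


def accumulate(fmt: str, step: float, n: int) -> float:
    """Add `step` to a running total `n` times, rounding to `fmt` after each add."""
    step = round_to(fmt, step)
    total = 0.0
    for _ in range(n):
        total = round_to(fmt, total + step)
    return total


def relative_error(fmt: str, step: float, n: int) -> float:
    """Relative error of the rounded accumulation against the exact product."""
    exact = step * n
    return abs(accumulate(fmt, step, n) - exact) / exact


if __name__ == "__main__":
    # Illustrative values only: 10,000 additions of 0.001 (exact sum is 10).
    for name, fmt in (("FP16", "e"), ("FP32", "f")):
        print(name, relative_error(fmt, 0.001, 10_000))
```

In this toy setting the FP16 total stops growing once the step falls below half the spacing between adjacent representable values, which is one simple mechanism by which low-precision error compounds with depth.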

6 replies

liked 2 models 2 months ago

fishaudio/s2-pro

Text-to-Speech • 5B • Updated Mar 11 • 154k • 997

PaddlePaddle/PaddleOCR-VL-1.5

Image-Text-to-Text • 1.0B • Updated 3 days ago • 33.6k • 643

reacted to abdurrahmanbutler's post with 🤗 3 months ago

Post

2591

🚀 𝗜𝗻𝘁𝗿𝗼𝗱𝘂𝗰𝗶𝗻𝗴 𝗞𝗮𝗻𝗼𝗻 𝟮 𝗘𝗻𝗿𝗶𝗰𝗵𝗲𝗿: 𝘁𝗵𝗲 𝘄𝗼𝗿𝗹𝗱’𝘀 𝗳𝗶𝗿𝘀𝘁 𝗵𝗶𝗲𝗿𝗮𝗿𝗰𝗵𝗶𝗰𝗮𝗹 𝗴𝗿𝗮𝗽𝗵𝗶𝘁𝗶𝘇𝗮𝘁𝗶𝗼𝗻 𝗺𝗼𝗱𝗲𝗹

Today we’re publicly releasing Kanon 2 Enricher, and with it, an entirely new class of AI model that we’re calling a hierarchical graphitization model.
This is fundamentally different from both universal extraction models and generative models.

As a hierarchical graphitization model, Kanon 2 Enricher natively outputs a 𝗸𝗻𝗼𝘄𝗹𝗲𝗱𝗴𝗲 𝗴𝗿𝗮𝗽𝗵 rather than tokens, which makes it architecturally incapable of hallucinating or inventing text that wasn’t present in the input.

What that enables in practice is unlike any other model or ML architecture on the market:

• 𝗡𝗼 𝗵𝗮𝗹𝗹𝘂𝗰𝗶𝗻𝗮𝘁𝗶𝗼𝗻𝘀 🤖
It cannot hallucinate. All references and links are stored as spans, meaning exact character offsets anchored to the original text.

• 𝗛𝗶𝗲𝗿𝗮𝗿𝗰𝗵𝗶𝗰𝗮𝗹 𝘀𝗲𝗴𝗺𝗲𝗻𝘁𝗮𝘁𝗶𝗼𝗻, 𝗻𝗼𝘁 𝗷𝘂𝘀𝘁 𝗲𝘅𝘁𝗿𝗮𝗰𝘁𝗶𝗼𝗻 📑
It deconstructs a document’s full nested hierarchy, down to chapters, sections, clauses, schedules, signatures, and even singular sentences, and classifies each span with dozens of contextual features.

• 𝗘𝗻𝘁𝗶𝘁𝘆 𝗲𝘅𝘁𝗿𝗮𝗰𝘁𝗶𝗼𝗻, 𝗱𝗶𝘀𝗮𝗺𝗯𝗶𝗴𝘂𝗮𝘁𝗶𝗼𝗻, 𝗮𝗻𝗱 𝗹𝗶𝗻𝗸𝗶𝗻𝗴 🔗
It resolves what references actually point to, then links entities, citations, and cross-references into a single coherent graph.

• 𝗚𝗿𝗮𝗽𝗵-𝗳𝗶𝗿𝘀𝘁 𝗲𝗳𝗳𝗶𝗰𝗶𝗲𝗻𝗰𝘆 🏃‍➡️
Small enough to run locally on a consumer PC with sub-second latency, and it stays reliable on long documents where front

To read more about our new model, check out our latest Hugging Face article:
https://huggingface.co/blog/isaacus/introducing-kanon-2-enricher
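
The span idea mentioned in the post (references stored as exact character offsets anchored to the original text) can be sketched as follows. This is a minimal illustration and not Kanon 2 Enricher's API or output schema; the Span class, its fields, and find_span are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Span:
    """A labelled region of a document, stored as character offsets."""

    start: int  # inclusive offset into the original text
    end: int    # exclusive offset into the original text
    label: str

    def text(self, document: str) -> str:
        """Recover the span's text by slicing the original document."""
        return document[self.start:self.end]


def find_span(document: str, phrase: str, label: str) -> Optional[Span]:
    """Return a Span for the first occurrence of `phrase`, or None if absent."""
    start = document.find(phrase)
    if start == -1:
        return None
    return Span(start, start + len(phrase), label)


if __name__ == "__main__":
    doc = "Clause 4.2 refers to Schedule A."
    span = find_span(doc, "Schedule A", "cross_reference")
    if span is not None:
        print(span, repr(span.text(doc)))
```

Because a span only holds offsets, any text it yields is by construction a substring of the input, which is the property the post relies on when it says the output cannot contain invented text.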

liked a model 3 months ago

Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-GGUF

Image-Text-to-Text • 27B • Updated Apr 6 • 294k • 658

liked a Space 4 months ago

Voxtral Subtitles

💻

Create subtitles using Voxtral.
