Divinci AI, Inc.

https://divinci.ai/

divinciai

Divinci-AI

AI & ML interests

RAG + Fine-Tuning.

Divinci AI

Feature-level interpretability artifacts for open transformers — built openly, validated empirically.

A vindex is a transformer's weights decompiled into a queryable feature database. It exposes the entity associations, circuit structure, and knowledge-editing surfaces that live inside a model's FFN layers — without requiring GPU inference for most operations.

Think of it as the model's index: the thing you search before you run it.

Interactive viewer

→ Open the interactive viewer

Pick any of 9 models from the dropdown. Toggle between the 3D cylinder spiral and a flat 2D circuit/network view. Hit ⇌ Compare to render the current model side by side with Bonsai 1-bit. The contrast between fp16 structure (organized rings) and 1-bit dissolution (scattered cloud) is the most direct picture we know how to render of what 1-bit training does to a transformer's internal organization. Search for entity features (?q=paris&model=gemma-4-e2b) to see real probe-derived activations light up across the layer stack, backed by a 5000-token search index built offline.

Published vindexes

Cross-family evidence in hand: Gemma, Qwen3, Mistral, Llama, OpenAI MoE, Moonshot MoE, DeepSeek-V4 MoE, plus two 1-bit controls.

MODEL	ARCHITECTURE	PARAMS	VINDEX	C4 (LAYER TEMP)	NOTES
Gemma 4 E2B-it	Dense (Gemma 4)	2B	gemma-4-e2b-vindex	0.0407 ± 0.0004 ✓	3-seed validated; headline universal-constant model
Qwen3-0.6B	Dense (Qwen 3)	0.6B	qwen3-0.6b-vindex	0.411	Smallest published; Qwen3 family-elevated C4
Qwen3-8B bf16	Dense (Qwen 3)	8B	qwen3-8b-vindex	0.804	Architecture control for Bonsai
Qwen3.6-35B-A3B	MoE (Qwen 3.6)	35B / 3B active	qwen3.6-35b-a3b-vindex	—	256 experts, 40 layers
Ministral-3B	Dense (Mistral 3)	3B	ministral-3b-vindex	0.265	fp8 → bf16 reconstruction
Llama 3.1-8B	Dense (Llama 3.1)	8B	llama-3.1-8b-vindex	0.012 ✓	Llama family signature
MedGemma 1.5-4B	Dense (Gemma multimodal)	4B	medgemma-1.5-4b-vindex	1.898 ⚠	45× cohort anomaly — under investigation
GPT-OSS 120B	MoE (OpenAI)	120B	gpt-oss-120b-vindex	—	S[0] grows 117× with depth (L0=111 → final=13,056)
Kimi-K2-Instruct	MoE fp8-native (DeepSeek-V3 style)	1T / 32B active	kimi-k2-instruct-vindex	0.0938 (MoE median)	60 MoE layers; 42.28 GB gate_proj binary; broader L52–L60 secondary rise than initial dome SVD suggested
DeepSeek-V4-Flash	MoE MXFP4 (DeepSeek-V4)	43L / 256 experts / 6 active	deepseek-v4-flash-vindex	0.108 (MoE median)	43-layer all-MoE; 11.54 GB gate_proj binary; first-peak L18 + double-bend profile (distinct from Kimi smooth dome); MXFP4 expert unpacking
DeepSeek-V4-Pro	MoE MXFP4 (DeepSeek-V4)	61L / 384 experts / 6 active	deepseek-v4-pro-vindex	0.0653 (MoE median)	61-layer all-MoE; 42.98 GB gate_proj binary; lowest var@64 of 3 published MoE vindexes (V4-Pro 0.065 < Kimi 0.094 < V4-Flash 0.108) — V4-Pro experts are most shared/redundant; late secondary rise L53–L60
Bonsai 8B	1-bit (Qwen 3 base, post-quantized)	8B	vindex pending publish	0.429	C5 = 1 (circuit dissolved); var@64 = 0.093
BitNet b1.58-2B-4T	1-bit (Microsoft, native)	2B	vindex pending publish	(Phase 2 pending)	var@64 = 0.111 mean across 30 layers — n=2 confirmation of dissolution
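
The notes above report var@64 without defining it on this card. One plausible reading, stated here purely as an assumption, is the fraction of a weight matrix's squared singular-value mass captured by its top 64 singular directions. The sketch below computes that quantity for a hypothetical matrix; the function name var_at_k and the synthetic matrices are illustrative and are not taken from LarQL or the published vindexes.

```python
import numpy as np

def var_at_k(W, k=64):
    """Fraction of squared singular-value mass in the top-k singular values.

    ASSUMPTION: this is one plausible meaning of "var@64"; the card does not define it.
    """
    s = np.linalg.svd(W, compute_uv=False)  # singular values, descending order
    energy = s ** 2
    return float(energy[:k].sum() / energy.sum())

rng = np.random.default_rng(0)

# Synthetic stand-ins (not real model weights):
# a rank-8 matrix concentrates its energy in a few directions,
# while an i.i.d. Gaussian matrix spreads it across all of them.
structured = rng.standard_normal((256, 8)) @ rng.standard_normal((8, 256))
unstructured = rng.standard_normal((256, 256))

score_structured = var_at_k(structured, k=64)
score_unstructured = var_at_k(unstructured, k=64)
print(score_structured, score_unstructured)
```

Under this reading, a high value means a few directions dominate the matrix and a low value means the energy is spread thin, which would be in line with the "organized rings" versus "scattered cloud" contrast described for the viewer.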

What's a vindex?

Standard model weights tell you what a model computes. A vindex tells you where it stores specific knowledge and which features need to change for a targeted edit.

Concretely: given a query like "Paris → capital", a vindex walk returns the layers, feature directions, and token associations that encode that fact. A patch operation writes a rank-1 ΔW that suppresses or overwrites that association, and the patched weights are compiled back to standard Hugging Face safetensors for inference.
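
The rank-1 patch described above can be illustrated with a generic linear-algebra sketch. This is not LarQL's API or its actual editing procedure: the function name rank1_suppress, the strength parameter alpha, and the random stand-in matrices are all hypothetical. It assumes the association is read out as W @ k for some key direction k, and builds a ΔW that scales that response by (1 - alpha) while leaving inputs orthogonal to k unchanged.

```python
import numpy as np

def rank1_suppress(W, k, alpha=1.0):
    """Return (W_new, delta), where delta is a rank-1 update such that
    W_new @ k == (1 - alpha) * (W @ k).

    Hypothetical helper for illustration; not part of LarQL.
    """
    k = np.asarray(k, dtype=W.dtype)
    v = W @ k                                   # current response to the key direction
    delta = -alpha * np.outer(v, k) / (k @ k)   # outer product, so rank 1
    return W + delta, delta

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 16))   # stand-in for one FFN projection matrix
k = rng.standard_normal(16)        # stand-in for the feature (key) direction

W_new, delta = rank1_suppress(W, k, alpha=1.0)

# A direction orthogonal to k, used to check that unrelated inputs are untouched.
u = rng.standard_normal(16)
u -= (u @ k) / (k @ k) * k

print(np.linalg.norm(W_new @ k))          # response along k after the edit
print(np.linalg.norm((W_new - W) @ u))    # change in response along u
```

With alpha = 1 the response along k is removed entirely; smaller values only attenuate it. Overwriting rather than suppressing would replace v in the outer product with the difference between the current and the desired response.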

LarQL (the toolchain that builds vindexes) is open-source: github.com/chrishayuk/larql | github.com/Divinci-AI/larql.

Research

Paper 1 — Architectural Invariants of Transformer Computation

arXiv preprint forthcoming

Five properties measured across every model in this collection. Three hold to within a 15% coefficient of variation across architectures, organizations, and scales. One collapses under 1-bit quantization, a result replicated across two independent 1-bit models from two organizations (n = 2). One scales monotonically with model size.

The headline universal constant — layer temperature C4 — is reproducible at the 1% precision level: a three-seed run on Gemma 4 E2B gives C4 = 0.0407 ± 0.0004, with circuit-stage count perfectly stable (C5 = 4 ± 0) across all seeds.
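
The "1% precision" figure follows from the reported mean and spread. A minimal check of that arithmetic, using only the two numbers quoted above (the variable names are illustrative):

```python
# Reported three-seed result for Gemma 4 E2B: C4 = 0.0407 ± 0.0004
c4_mean = 0.0407
c4_spread = 0.0004

# Relative precision: spread as a fraction of the mean.
relative_precision = c4_spread / c4_mean
print(f"{relative_precision:.2%}")  # prints 0.98%
```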

Paper 2 — Constellation Edits

Draft; arXiv submission to follow the 3-seed runs and the α-sweep appendix

Mechanistic knowledge editing in transformer feature space. Includes a negative result: why activation-space edits fail in 1-bit models, and what weight-space geometry reveals about the cause.

Companion blog series — The Interpretability Diaries

Part I — The Architecture Every Language Model Converges To — five universal constants, what holds and what doesn't
Part II — Deleting Paris from a Language Model — Gate-3 surgical knowledge edit with a receipt; rank-1 ΔW that suppresses one fact at +0.02% perplexity
Part III — When the Circuit Dissolves — two natively-trained 1-bit models, two organizations, same dissolution: var@64 ≈ 0.10 vs ~0.85 for fp16

Working notebooks: github.com/Divinci-AI/server/tree/preview/notebooks

Working in public

Every measurement in our papers traces back to a notebook and a commit. Negative results ship alongside positive ones — the MLP compensation mechanism that defeats knowledge editing in 1-bit models is in the notebooks, not buried in a supplement.

If you replicate a result and find a discrepancy, open an issue on the LarQL repo.

Vindexes on this org are free for academic and research use (CC-BY-NC 4.0). Commercial licensing: mike@divinci.ai