Dominick Wirzba

Chronuid

dominick-wirzba-a46898115

Recent Activity

reacted to danielhanchen's post with 👍 about 2 hours ago

Qwen3.6 MTP is here! Run locally on 20GB RAM. ⚡️ MTP enables Qwen3.6 to generate ~1.4–2.2× faster with no accuracy change.

Qwen3.6-27B: https://huggingface.co/unsloth/Qwen3.6-27B-MTP-GGUF

Qwen3.6-35B-A3B: https://huggingface.co/unsloth/Qwen3.6-35B-A3B-MTP-GGUF

Guide: https://unsloth.ai/docs/models/qwen3.6#mtp-guide

reacted to juiceb0xc0de's post with 🔥 3 days ago

Introducing the Gemma-4-E2B Brain Atlas, an interactive neural census of every layer, every head, 16 behavior categories in Google's flagship 2B model. We ran 184,320 probe prompts across 35 layers × 8 components and mapped what came back.

The Brain Atlas is an interactive tool that lets you explore the internal behavior of Google's Gemma-4-E2B model layer by layer, head by head. Pick a behavior category, pick a layer, and see exactly which components light up and which go quiet. The dataset is fully queryable if you want to go deeper.

The mapping combines multiple single-direction techniques run in parallel across every layer and component:

- Activation taxonomy (classifying each neuron by how broadly it fires across prompt categories)
- Coactivation pair analysis (which neurons lock together and on what topics)
- F-stat behavioral separation (one-way ANOVA per feature across 16 behavior categories)
- Per-head specificity scoring
- A full compliance probe pipeline using SVD, sparse decomposition, and variance analysis

Here's what I found when I ran it.

The sharpest behavioral signal isn't at the output. It's Layer 0. Up projection hits F=22.7, nearly 2x anything in the final third of the network. The model does its behavioral sorting before it's barely started, then spends the next 34 layers… doing what exactly?

The gate has a lifecycle. 70% dormant at L1, highest in the model. Brutal sparsification at L23–26 (>58% silent). Then reopens. The final five layers are the most alive gates anywhere. The model's last act is a gate flare.

Layer 4 routes 5 projections to dim 448. One layer. One dimension. That's a topology highway.

Zero specialist neurons. Not one. 1.2M neurons analyzed. None fires exclusively on a single category. This model distributes everything.

🧠 Space: https://huggingface.co/spaces/juiceb0xc0de/gemma-4-e2b-brain-atlas

📊 Dataset (1.3M rows, fully queryable): https://huggingface.co/datasets/juiceb0xc0de/gemma-4-e2b-atlas
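The "F-stat behavioral separation" step mentioned in the post above (a one-way ANOVA per feature across behavior categories) can be illustrated with a minimal sketch. The post does not show its implementation, so the function name `one_way_f` and the toy activation values below are hypothetical; this is only the textbook F-statistic, not necessarily the author's pipeline.

```python
from statistics import fmean


def one_way_f(groups):
    """Textbook one-way ANOVA F-statistic.

    `groups` is a list of lists: one list of activation values per
    behavior category, all for the same feature (hypothetical layout).
    """
    k = len(groups)                      # number of categories
    n = sum(len(g) for g in groups)      # total number of samples
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and more samples than groups")

    grand_mean = fmean(x for g in groups for x in g)
    means = [fmean(g) for g in groups]

    # Variation of category means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of samples around their own category mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    if ss_within == 0:
        return float("inf")              # categories perfectly separated
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Toy data: one feature's activations under three made-up categories.
toy = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
print(round(one_way_f(toy), 6))  # prints 13.0
```

A larger F means the feature's mean activation differs more between categories relative to the spread within each category, which is the sense in which the post compares F values across layers.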

reacted to spillai's post with 🔥 9 days ago

mm-ctx: fast, multimodal context for agents.

LLM-based agents handle text incredibly well, but images, videos, or PDFs with visual content are hard to interpret. mm-ctx gives your CLI agent multi-modal skills.

Try it interactively in Spaces: https://huggingface.co/spaces/vlm-run/mm-ctx

Readme: https://vlm-run.github.io/mm/

PyPI: https://pypi.org/project/mm-ctx

SKILL.md: https://github.com/vlm-run/skills/blob/main/skills/mm-cli-skill/SKILL.md

mm-ctx is meant to feel familiar: the UNIX tools we already love (find/cat/grep/wc), rebuilt for file types LLMs can't read natively and designed to work with agents via the CLI.

- mm grep "invoice #1234" ~/Downloads searches across PDFs and returns line-numbered matches
- mm cat <document>.pdf returns a metadata description of the file
- mm cat <photo>.jpg returns a caption of the photo
- mm cat <video>.mp4 returns a caption of the video

A few things we obsessed over:

- ⚡ Speed: Rust core for the hot paths
- 🏠 Local-first, BYO model: Uses any OpenAI-compatible endpoint: Ollama, vLLM/SGLang, LMStudio with any multimodal LLM (Gemma4, Qwen3.5, GLM-4.6V).
- 🔗 Composable: stdin + structured outputs
- 🤖 Drops into any agent via mm-cli-skills: Claude Code, Codex, Gemini CLI, OpenClaw.

We’d love to hear your feedback! Especially on the CLI and what file types and workflows you would like to see next.

upvoted a paper 29 days ago

Embarrassingly Simple Self-Distillation Improves Code Generation

Paper • 2604.01193 • Published Apr 1 • 54