Meta-Vector++ Autoresearch

Parallel exploration of ICV composition methods for zero-shot reasoning improvement, powered by claudini's autoresearch loop.

What This Is

An adaptation of claudini's autoresearch pipeline to explore 4 parallel research directions for improving in-context vector (ICV) composition:

Direction	Run Code	Core Idea
Phase Decomposition	`phase`	Decompose ICVs into execution/reflection/transition components
Layer-Aware Composition	`layer`	Different composition strategies for early/mid/late transformer layers
Uncertainty Gating	`adaptive`	Entropy-adaptive steering strength with norm preservation
Test-Time Compute	`ttc`	Diverse steering ensemble + majority vote / LatentSeek warm-start
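As a rough illustration of the Uncertainty Gating row above, the following is a minimal sketch of entropy-adaptive steering strength followed by norm preservation. It is not the repository's implementation: the function names (`token_entropy`, `gated_strength`, `steer_preserving_norm`), the `max_alpha` parameter, and the choice to normalize entropy by `log(vocab_size)` are all assumptions made for illustration, and plain Python lists stand in for hidden-state tensors.

```python
import math
from typing import List, Sequence


def token_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a next-token probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def gated_strength(probs: Sequence[float], max_alpha: float = 1.0) -> float:
    """Hypothetical gate: scale steering strength by normalized entropy.

    Returns 0 when the model is fully confident (one-hot distribution)
    and max_alpha when it is maximally uncertain (uniform distribution).
    """
    n = len(probs)
    if n < 2:
        return 0.0
    return max_alpha * token_entropy(probs) / math.log(n)


def steer_preserving_norm(
    hidden: Sequence[float], icv: Sequence[float], alpha: float
) -> List[float]:
    """Add alpha * icv to the hidden state, then rescale to the original L2 norm."""
    steered = [h + alpha * v for h, v in zip(hidden, icv)]
    old_norm = math.sqrt(sum(h * h for h in hidden))
    new_norm = math.sqrt(sum(s * s for s in steered))
    if new_norm == 0.0:
        return steered  # nothing to rescale
    return [s * old_norm / new_norm for s in steered]


# Usage: an uncertain prediction gets the full steering strength.
alpha = gated_strength([0.25, 0.25, 0.25, 0.25])
steered_state = steer_preserving_norm([3.0, 4.0], [1.0, 0.0], alpha)
```

In this sketch the steering vector changes the direction of the hidden state but not its magnitude, and confident tokens are left almost untouched.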

Setup

git clone https://huggingface.co/kweCobi/metavector-autoresearch
cd metavector-autoresearch
chmod +x setup.sh launch_all.sh monitor.sh
./setup.sh

Prerequisites: Python 3.12+, uv, Claude Code CLI, tmux, and a GPU (≥16 GB of memory for dev runs, ≥24 GB for full eval).
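The prerequisites can be checked before running the setup script with a short script like the one below. This is a hedged sketch, not part of the repository: `check_prerequisites` is a hypothetical helper, and it assumes the tools are exposed on `PATH` under the executable names `uv`, `claude`, and `tmux`. It does not check GPU memory.

```python
import shutil
import sys
from typing import List, Tuple

# Assumed executable names for uv, the Claude Code CLI, and tmux.
REQUIRED_TOOLS = ("uv", "claude", "tmux")


def check_prerequisites(min_python: Tuple[int, int] = (3, 12)) -> List[str]:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info[:2] < min_python:
        found = ".".join(str(part) for part in sys.version_info[:2])
        wanted = ".".join(str(part) for part in min_python)
        problems.append(f"Python {wanted}+ required, found {found}")
    for tool in REQUIRED_TOOLS:
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("MISSING:", problem)
```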

Launch All Directions

cd claudini
./launch_all.sh      # launches 4 tmux sessions
./monitor.sh         # cross-direction leaderboard

Launch One Direction

cd claudini
claude
> /loop /metavector phase improve zero-shot reasoning on MATH-500 via phase-specific ICV decomposition

Research Context

Based on the Meta-Vector paper (ICLR 2026 submission) and these key concurrent papers:

SEAL (2504.07986): Reasoning phases form disjoint latent subspaces
Fractional Reasoning (2506.15882): Contrastive steering with norm preservation
LatentSeek (2505.13308): Test-time policy gradient in latent space
Steering Vector RL (2509.06608): Three-cluster layer structure for steering
DeepSeek-R1 (2501.12948): GRPO + rule-based rewards for reasoning

File Structure

metavector_claudini/
├── setup.sh                          # One-time setup
├── launch_all.sh                     # Launch 4 parallel tmux sessions
├── monitor.sh                        # Cross-direction leaderboard
├── CLAUDE.md                         # Developer guide for the agent
├── skills/metavector/SKILL.md        # Claude Code autoresearch skill
├── configs/
│   ├── mv_math500_dev.yaml           # Fast dev (1.5B, 50 problems)
│   ├── mv_math500.yaml               # Full eval (7B, 500 problems)
│   ├── mv_gsm8k_dev.yaml             # GSM8K sanity check
│   └── mv_aime.yaml                  # AIME 2024 hard eval
├── metavector_base/                  # Shared evaluation code
│   ├── steering_optimizer.py         # SteeringOptimizer ABC
│   ├── source_bank.py                # ICV bank management
│   ├── evaluator.py                  # Benchmark runner
│   ├── baselines.py                  # Zero-shot, static ICV, holistic composition
│   └── utils.py                      # Hidden states, phase classification, answer extraction
└── starter_methods/                  # v1 for each direction
    ├── mv_phase_v1/
    ├── mv_layer_v1/
    ├── mv_adaptive_v1/
    └── mv_ttc_v1/
