Decision Point Attention (DPA)
Strategic Full Attention Routing for Efficient Agent Reasoning with Linear Attention
Target: NeurIPS 2026 (submission deadline: May 6, 2026)
Core Idea
Linear attention models (Mamba, GLA, DeltaNet) fail on multi-turn agent reasoning (the MiniMax M2 case). Current hybrids (Jamba at 1:7, Kimi Linear at 1:3) insert full attention at a fixed, uniform ratio, regardless of which tokens need it. DPA instead routes only "decision point" tokens (~10-15% of tokens) through full softmax attention and keeps the rest on linear attention.
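The routing idea can be illustrated with a minimal, dependency-free sketch. It is not the code in src/models/dpa_model.py. The function names, the top-k selection rule, the `fraction` parameter, and the `elu(x) + 1` feature map are all assumptions made for illustration, and plain (ungated) linear attention stands in for GLA.

```python
import math

def select_decision_points(scores, fraction=0.125):
    """Indices of the highest-scoring tokens (hypothetical top-k routing rule)."""
    k = max(1, round(len(scores) * fraction))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:k])

def softmax_attention(q, keys, values):
    """Softmax attention for a single query over the given keys and values."""
    scale = 1.0 / math.sqrt(len(q))
    logits = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
    m = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(l - m) for l in logits]
    z = sum(weights)
    return [sum(w * v[d] for w, v in zip(weights, values)) / z
            for d in range(len(values[0]))]

def feature_map(x):
    """elu(x) + 1: a positive feature map often used for linear attention."""
    return [xi + 1.0 if xi > 0 else math.exp(xi) for xi in x]

def dpa_layer(queries, keys, values, scores, fraction=0.125):
    """Causal mixed attention: softmax for routed tokens, linear for the rest."""
    routed = select_decision_points(scores, fraction)
    d_k, d_v = len(keys[0]), len(values[0])
    state = [[0.0] * d_v for _ in range(d_k)]  # running sum of phi(k) v^T
    norm = [0.0] * d_k                          # running sum of phi(k)
    outputs = []
    for t, (q, k, v) in enumerate(zip(queries, keys, values)):
        # The linear-attention state is updated for every token.
        phi_k = feature_map(k)
        for i in range(d_k):
            norm[i] += phi_k[i]
            for j in range(d_v):
                state[i][j] += phi_k[i] * v[j]
        if t in routed:
            # Decision point: full softmax attention over the causal prefix.
            outputs.append(softmax_attention(q, keys[:t + 1], values[:t + 1]))
        else:
            # Ordinary token: constant-cost readout from the running state.
            phi_q = feature_map(q)
            denom = sum(a * b for a, b in zip(phi_q, norm))
            outputs.append([
                sum(phi_q[i] * state[i][j] for i in range(d_k)) / denom
                for j in range(d_v)
            ])
    return outputs, routed
```

Top-k selection over the whole sequence needs every router score up front, so it fits training or offline analysis. A streaming decoder would more plausibly threshold each token's score as it arrives.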
Project Structure
src/
  models/
    dpa_model.py          # DPA architecture (GLA + softmax router)
    baselines.py          # Transformer, pure linear, uniform hybrid baselines
    router.py             # Decision point router (learned binary classifier)
  data/
    agent_trajectory.py   # Synthetic agent trajectory generator
    decision_points.py    # Decision point labeling & analysis
    datasets.py           # HotpotQA, GSM8K, ToolBench loaders
  eval/
    benchmark.py          # Unified evaluation pipeline
    metrics.py            # Accuracy, FLOPs, latency, KV cache metrics
    visualize.py          # Publication figures
configs/
  dpa_7b.yaml             # 7B model config
  dpa_72b.yaml            # 72B model config (Merlin 8xH100)
scripts/
  run_baseline.sh         # Run all baselines
  run_dpa.sh              # Run DPA experiments
  run_ablation.sh         # Ablation studies
paper/
  main.tex                # NeurIPS 2026 LaTeX
results/                  # Experiment outputs
figures/                  # Generated plots
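To make the FLOPs comparison concrete, the sketch below gives a back-of-the-envelope estimate of per-head attention cost for the three regimes. It does not reflect what src/eval/metrics.py actually computes. The function names and cost formulas are assumptions: a multiply-add counts as 2 FLOPs, softmax and projection costs are ignored, causal masking is ignored (so the figures are upper bounds), and routed tokens are assumed to attend over all `n` keys.

```python
def full_attention_flops(n, d):
    """QK^T plus AV for n queries over n keys with head dimension d."""
    return 4 * n * n * d

def linear_attention_flops(n, d):
    """Per-token state update (k v^T) plus readout (q^T S), each 2 * d * d."""
    return 4 * n * d * d

def dpa_flops(n, d, fraction):
    """Routed tokens pay softmax cost; all tokens still update the linear state."""
    routed = fraction * n
    softmax_part = 4 * routed * n * d        # routed queries attend over n keys
    state_updates = 2 * n * d * d            # every token updates the state
    linear_readout = 2 * (n - routed) * d * d
    return softmax_part + state_updates + linear_readout

if __name__ == "__main__":
    n, d, f = 32768, 128, 0.125  # hypothetical context length, head dim, routed fraction
    full = full_attention_flops(n, d)
    print(f"linear / full: {linear_attention_flops(n, d) / full:.4f}")
    print(f"DPA    / full: {dpa_flops(n, d, f) / full:.4f}")
```

Under these assumptions the DPA cost is dominated by the routed fraction once `n` is much larger than `d`, so the attention cost relative to full attention is close to that fraction.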
Quick Start
# 1. Install deps
pip install -r requirements.txt
# 2. Run trajectory analysis (no GPU needed)
python src/data/decision_points.py
# 3. Run attention simulation (CPU/MPS OK)
python src/eval/benchmark.py --mode simulate
# 4. Run full training (8xH100 on Merlin)
bash scripts/run_dpa.sh