Apply for a GPU community grant: Personal project

by kharki - opened

ZeroGPU Application
Project Name:
ABPT: Geometry-Based Anchor Routing for Small Language Models
GitHub Repository:
https://github.com/kharkilirov1/Anchor-engine
Project Description:
ABPT (Adaptive Branching Plastic Transformer) is a research project exploring mechanistic interpretability of language models through geometric analysis of hidden states. We identify "crystallization" patterns in semantic anchor spans and use them for adaptive generation routing.
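To make "geometric analysis of hidden states" concrete, here is a minimal sketch of one standard effective-dimensionality measure (the participation ratio of the span's covariance spectrum). It is an assumed stand-in for whatever crystallization statistic the project actually computes; the function name, the threshold-free setup, and the synthetic spans are all illustrative.

```python
import numpy as np

def participation_ratio(hidden: np.ndarray) -> float:
    """Effective dimensionality of a span of hidden states.

    hidden: (n_tokens, d) matrix of hidden states for one anchor span.
    Returns a value in [1, d]; a low value means the span's states have
    collapsed ("crystallized") into a low-dimensional subspace.
    """
    centered = hidden - hidden.mean(axis=0, keepdims=True)
    # Covariance eigenvalues via singular values of the centered matrix.
    s = np.linalg.svd(centered, compute_uv=False)
    lam = s ** 2
    return float(lam.sum() ** 2 / (lam ** 2).sum())

rng = np.random.default_rng(0)
diffuse = rng.normal(size=(32, 64))                            # spread-out span
crystal = rng.normal(size=(32, 1)) @ rng.normal(size=(1, 64))  # rank-1 span
print(participation_ratio(diffuse) > participation_ratio(crystal))  # True
```

A rank-1 span scores near 1 regardless of its norm, which is why a spectrum-based measure is a natural probe for crystallization rather than raw activation magnitude.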
Key Findings:
Three crystallization clusters identified: mature, template, flat
Crystallization occurs at layers L4-L8, not in the late layers as previously thought
83% oracle-gain with geometry-based routing vs naive approaches
Constraint violation detection with hard blocking
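The three clusters named above suggest what a geometry-based router could look like. The sketch below buckets a span as flat, template, or mature using a norm check and a simple anisotropy measure (fraction of variance in the top direction); the thresholds, function names, and decision order are illustrative assumptions, not the project's actual routing rule, whose cluster boundaries would be fit to data.

```python
import numpy as np

def anisotropy(hidden: np.ndarray) -> float:
    """Fraction of a span's variance captured by its top direction."""
    centered = hidden - hidden.mean(axis=0, keepdims=True)
    var = np.linalg.svd(centered, compute_uv=False) ** 2
    return float(var[0] / var.sum())

def route(hidden: np.ndarray, flat_norm: float = 1e-3,
          template_aniso: float = 0.9) -> str:
    """Illustrative 3-way router over the clusters named above.

    "flat": near-zero activity -> default decode path.
    "template": one dominant direction -> cheap pattern-following path.
    "mature": genuinely multi-dimensional -> full anchor-aware path.
    """
    if np.linalg.norm(hidden) < flat_norm:
        return "flat"
    if anisotropy(hidden) > template_aniso:
        return "template"
    return "mature"

rng = np.random.default_rng(1)
print(route(np.zeros((16, 32))))                                   # flat
print(route(rng.normal(size=(16, 1)) @ rng.normal(size=(1, 32))))  # template
print(route(rng.normal(size=(16, 32))))                            # mature
```

Checking the norm before the anisotropy avoids a division by zero on all-zero spans, which is the kind of edge case a hard-blocking constraint checker also has to handle.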
Current Work:
Training small models (33M-1.5B params) from scratch on TinyStories to validate that anchor-based architectures improve retention of long-horizon constraints compared to standard transformers.
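"Retention of long-horizon constraints" can be operationalized as how far into a generation a model keeps honoring a rule stated in the prompt. The helper below is a hypothetical version of such a metric for a banned-word constraint; the function name and the token-level formulation are made up for illustration and are not taken from the project's evaluation code.

```python
def constraint_horizon(tokens: list[str], banned: str) -> int:
    """Index of the first token violating a banned-word constraint,
    or len(tokens) if the constraint is never violated.

    A higher value means the constraint was retained longer; averaging
    this across prompts gives a single number to compare a baseline
    transformer against an anchor-routed one.
    """
    for i, tok in enumerate(tokens):
        if tok == banned:
            return i
    return len(tokens)

story = "the cat sat on the mat and then the dog came".split()
print(constraint_horizon(story, "dog"))    # 9
print(constraint_horizon(story, "zebra"))  # 11 (never violated)
```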
Research Goals:
Train and evaluate small language models (33M-330M parameters) on TinyStories
Validate that anchor-based routing improves constraint satisfaction
Open-source all training pipelines, metrics, and findings
Contribute to mechanistic interpretability community
Expected Outcomes:
Reproducible training code for constraint-aware small LMs
Public dataset of geometric anchor metrics across model scales
Comparison: baseline transformer vs anchor-routed transformer
Research artifact for community inspection and extension
Why ZeroGPU:
Training 33M-330M parameter models requires GPU acceleration. Current experiments run on limited Colab/CPU resources. ZeroGPU would enable:
Faster iteration on training hyperparameters
Larger batch sizes for stable training
Systematic ablation studies across model sizes
Open Source:
MIT License
All code, data, and findings will be public
Reproducible experiments with pinned dependencies
Documentation for external contributors
Current Status:
Working codebase with 44 passing tests
TinyStories dataset integrated (Git LFS)
Baseline vs anchor comparison framework ready
Initial results show anchor health metrics improve during training
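One simple way to track the kind of "anchor health" trend mentioned above is the stability of an anchor's representation across consecutive checkpoints. The sketch below measures this as mean cosine similarity; the function name, the simulated checkpoints, and the choice of cosine are illustrative stand-ins, since the project's actual health metrics are not specified here.

```python
import numpy as np

def anchor_stability(reps: list[np.ndarray]) -> float:
    """Mean cosine similarity between the same anchor's representation
    at consecutive checkpoints. A rising value over training is one way
    to operationalize improving anchor health."""
    sims = [float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
            for a, b in zip(reps, reps[1:])]
    return sum(sims) / len(sims)

rng = np.random.default_rng(2)
target = rng.normal(size=64)  # representation training converges toward
early = [target + rng.normal(size=64) * 1.0 for _ in range(4)]  # noisy early checkpoints
late = [target + rng.normal(size=64) * 0.1 for _ in range(4)]   # settled late checkpoints
print(anchor_stability(late) > anchor_stability(early))  # True
```

Logging this per anchor at each evaluation step costs only a few dot products, so it can run alongside training without GPU overhead.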
Additional Links:
Colab notebook: ABPT_Research_Campaign.ipynb in repo
Documentation: docs/research/ folder
Tests: pytest -q (44 tests passing)
