AI & ML interests

SLM, LLM, Embeddings, New Architecture for Software Development

Autohand on Hugging Face

Autohand builds autonomous programming systems so software can diagnose, repair, and evolve itself. We operate from the belief that software should fix itself: humans set the intent, machines execute and learn. On Hugging Face, we share the research artifacts, agent tooling, and evaluation pipelines that power that vision.

What We Build

  • Composer – An on-prem autonomous maintenance engine that triages production signals, ships fixes, modernises legacy code, and keeps dependencies current without routing code outside your infrastructure.
  • Commander – An MIT-licensed command interface for coordinating specialised AI agents through structured workflows. It focuses on determinism, guardrails, and observability so multi-agent runs are explainable and repeatable.
  • evoGraph + Arrakis – Our program-evolution substrate (evoGraph) paired with a production-grade agent runtime (Arrakis). Together they provide the autonomy stack that Composer builds upon.

What You’ll Find Here

  • Model releases that specialise frontier and open-weight LLMs for maintenance, migration, and incident response workloads.
  • Datasets distilled from production-like telemetry (errors, performance regressions, dependency diffs) with privacy-preserving transforms so you can evaluate autonomous programming systems safely.
  • Evaluations & benchmarks covering maintenance SLAs, regression prevention, governance compliance, and end-to-end autonomy scoring.
  • Agent tooling including Commander templates, Arrakis runners, and guardrail components ready to plug into Hugging Face Agents, Spaces, or your own infrastructure.

Commander Quick Look

Commander turns natural-language “commands” into multi-agent execution plans.

git clone https://github.com/autohandai/commander
cd commander
python -m commander.launch --config examples/repo_maintenance.yaml

Each run produces a signed execution log showing which agents acted, their inputs, and the artefacts they generated. Use the logs to inspect decisions or feed them back into reinforcement or reward modelling loops.
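
As a rough illustration of how a signed log could be checked and inspected, here is a minimal Python sketch. Everything in it is an assumption made for the example: the log layout (an `entries` list plus a `signature` field), the field names `agent`, `input`, and `artefact`, and the HMAC-SHA256 signing scheme are hypothetical and are not taken from Commander's actual log format.

```python
import hashlib
import hmac
import json
from collections import Counter


def sign_entries(entries, key: bytes) -> str:
    """Hypothetical scheme: HMAC-SHA256 over a canonical JSON encoding."""
    payload = json.dumps(entries, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_log(log: dict, key: bytes) -> bool:
    """Recompute the signature and compare it in constant time."""
    expected = sign_entries(log["entries"], key)
    return hmac.compare_digest(expected, log["signature"])


def actions_per_agent(log: dict) -> Counter:
    """Count how many log entries each agent produced."""
    return Counter(entry["agent"] for entry in log["entries"])


# Illustrative data only; the agent names and artefacts are made up.
key = b"demo-key-not-a-real-secret"
entries = [
    {"agent": "triage", "input": "failing CI job", "artefact": "diagnosis.md"},
    {"agent": "patcher", "input": "diagnosis.md", "artefact": "fix.patch"},
    {"agent": "triage", "input": "fix.patch", "artefact": "review.md"},
]
log = {"entries": entries, "signature": sign_entries(entries, key)}

if verify_log(log, key):
    for agent, count in actions_per_agent(log).items():
        print(f"{agent}: {count} action(s)")
```

Verifying the signature before reading the entries means a tampered log is rejected before its contents feed any downstream analysis or reward modelling.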

Research & Writing

Work With Us

We partner with teams that spend more time firefighting than building, whether in regulated industries, on legacy estates, or at rapidly scaling startups with a mounting maintenance load. If you want to pilot Composer or co-develop agent infrastructure, reach out.

Autohand — small team, big autonomy. Software that improves itself.
