AI & ML interests

None defined yet.

Recent Activity

astroware-core  updated a collection 1 day ago
HaloGuard 1.0
astroware-core  updated a model 1 day ago
astroware/HaloGuard1-Gen-4B
astroware-core  published a model 1 day ago
astroware/HaloGuard1-Gen-4B
View all activity

Organization Card

🛡️ Astroware Inc.

AI Alignment & Security Research

🌐 Website  |  💻 GitHub  |  🐦 Twitter


Who We Are

Astroware is an AI security startup focused on safety, alignment, and agentic security research. We build tools and models that make AI systems safer to deploy at scale, with a particular focus on guard models and constitutional AI classifiers that act as runtime security layers for AI agents.

We use Constitutional AI to build guard models that hold their alignment even under adversarial conditions — with principled refusals, adversarial robustness, and explainable decision boundaries. On industry jailbreak benchmarks, our approach has cut attack success from 86% to under 1%.

What We're Working On

🔒 Guard Models & Constitutional Classifiers

Our core research area. We develop guard models that serve as runtime security layers for AI agents, preventing jailbreaks, prompt injection, and unsafe behavior. Our classifiers are built on a constitutional AI framework with structured severity tiers across harmful and benign behavioral categories, and are designed to produce explainable, auditable decisions rather than opaque blocks.

⚖️ Alignment Research

We conduct alignment training research for large language models, including constitutional frameworks, severity-tiered taxonomies, and structured datasets for supervised fine-tuning and reinforcement learning.

Models & Datasets on the Hub

Our Focus Areas

  • Guard model development for runtime AI agent protection
  • Adversarial red-teaming and jailbreak evaluation
  • Constitutional AI classifiers and alignment frameworks
  • Agentic security for multi-agent and autonomous systems
  • Open-source contributions to AI safety tooling

Open Source

We believe AI security benefits from open collaboration. We actively contribute to open-source AI safety projects and publish our guard model research, adversarial evaluation tools, and security architectures for the community to build on.


Building the security layer for the agentic AI era. 🚀