Travis Muhlestein PRO
TravisMuhlestein
AI & ML interests
Product & AI CTO at GoDaddy focused on AI infrastructure, orchestration, agent systems, observability, and enterprise-scale AI deployment
Recent Activity
posted an update about 2 hours ago
The conversation around AI agents is evolving.
We're moving beyond model capabilities and toward the infrastructure needed for agents to work together.
Over the past few weeks we've seen meaningful momentum around the foundational building blocks of the emerging agentic web.
Agent Name Service (ANS) is addressing identity and trust.
Agentic Resource Discovery (ARD) is helping standardize how agents discover resources and capabilities.
Together, these efforts represent something bigger than individual projects.
They point toward an ecosystem built on open, interoperable infrastructure rather than isolated implementations.
As builders, we'll likely spend the next few years solving challenges around identity, discovery, trust, interoperability, and governance—not just model performance.
It will be interesting to see how these efforts evolve—and where the community chooses to collaborate next.
Learn more:
🔗 Linux Foundation ANS: https://www.linuxfoundation.org/press/linux-foundation-announces-intent-to-launch-agent-name-service-to-establish-trusted-identity-infrastructure-for-ai-agents
🔗 Agentic Resource Discovery: https://developers.googleblog.com/announcing-the-agentic-resource-discovery-specification/
posted an update 2 days ago
One of the less-discussed applications of AI is data governance at scale.
At GoDaddy, we manage thousands of datasets, with hundreds requiring elevated governance and certification. The traditional process—gathering evidence across multiple systems, validating controls, and preparing reviews—was becoming increasingly difficult to scale.
We built TrustTier, an AI governance agent designed to support the certification lifecycle.
The interesting challenge wasn't automation. It was judgment.
The system reasons across three states:
- Assigned tier — the classification currently approved in systems of record
- Intended tier — the classification requested by the data owner
- Qualified tier — the classification supported by available evidence
That distinction matters because governance isn't simply about retrieving information. It's about determining whether the evidence justifies a decision and clearly explaining why.
The same certification logic can then be reused across verification, review, and audit workflows.
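As a rough illustration of the three-state comparison described above, here is a minimal Python sketch. It is not TrustTier's actual implementation: the `Tier` names, the `evaluate` function, the outcome labels, and the precedence of the rules are all hypothetical, chosen only to show how a decision and its explanation can be derived together from the assigned, intended, and qualified tiers.

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    # Hypothetical tier scale; a higher value means a stricter certification.
    BRONZE = 1
    SILVER = 2
    GOLD = 3


@dataclass
class Decision:
    outcome: str       # one of: "flag", "reject", "approve", "confirm"
    explanation: str   # human-readable reason for the outcome


def evaluate(assigned: Tier, intended: Tier, qualified: Tier) -> Decision:
    """Compare the three states and return a decision with its reasoning."""
    # Evidence no longer supports the tier already on record.
    if qualified < assigned:
        return Decision(
            "flag",
            f"Evidence supports only {qualified.name}, below the assigned {assigned.name}.",
        )
    # The data owner asked for more than the evidence justifies.
    if intended > qualified:
        return Decision(
            "reject",
            f"Requested {intended.name}, but evidence supports only {qualified.name}.",
        )
    # The request differs from the record and the evidence covers it.
    if intended != assigned:
        return Decision(
            "approve",
            f"Evidence supports {qualified.name}, which covers the requested {intended.name}.",
        )
    # Nothing to change: record, request, and evidence are consistent.
    return Decision(
        "confirm",
        f"Assigned {assigned.name} matches the request and is supported by evidence.",
    )


if __name__ == "__main__":
    decision = evaluate(Tier.SILVER, Tier.GOLD, Tier.SILVER)
    print(decision.outcome, "-", decision.explanation)
```

Because the function returns the explanation alongside the outcome, the same call could in principle back verification, review, and audit paths without re-deriving the reasoning in each one.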
Curious how others are approaching explainability, governance, and false-positive management in AI-assisted compliance systems.
🔗 https://www.godaddy.com/resources/news/from-manual-audits-to-intelligent-certification
posted an update 13 days ago
A question we kept running into while operating AI agents in production: How do you write a unit test for something that never returns the same answer twice?
At GoDaddy, we built a system called Veritas to help detect prompt regressions and model migration drift before changes reach production.
The core idea is simple:
Exact-match testing breaks down for LLMs.
What matters is whether the agent preserved the same meaning and intent.
We ended up using embeddings + cosine similarity as the primary evaluation signal. Rather than asking:
"Did the model generate the same response?"
We ask: "Did the model mean the same thing?"
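To make the idea concrete, here is a minimal Python sketch of a similarity-based regression check. It is not Veritas itself. The `toy_embed` function is a stand-in assumption: it uses bag-of-words counts, which measure word overlap rather than meaning, so a real setup would swap in a sentence-embedding model. The `is_regression` helper and the 0.8 threshold are also hypothetical.

```python
import math
import re
from collections import Counter


def toy_embed(text: str) -> Counter:
    # Stand-in for a real embedding model (assumption): bag-of-words counts.
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    # Dot product over shared tokens, divided by the product of vector norms.
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty response has no direction to compare
    return dot / (norm_a * norm_b)


def is_regression(baseline: str, candidate: str, threshold: float = 0.8) -> bool:
    # Flag the candidate when it drifts too far from the approved baseline.
    score = cosine_similarity(toy_embed(baseline), toy_embed(candidate))
    return score < threshold


if __name__ == "__main__":
    baseline = "Your domain renews on May 1."
    candidate = "I cannot help with that request."
    print(is_regression(baseline, baseline))   # same answer as the baseline
    print(is_regression(baseline, candidate))  # unrelated answer
```

The threshold is the main tuning knob: set it too high and harmless rephrasings fail the check, set it too low and real drift slips through, so it would likely need calibrating per agent against known-good and known-bad pairs.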
One of the more interesting findings was how often seemingly harmless prompt edits changed downstream behavior in ways that were difficult for human reviewers to catch.
Prompts aren't documentation.
Prompts are code.
Curious what others are using today for regression testing:
• LLM-as-judge?
• Embedding similarity?
• Human review?
• Custom eval frameworks?
https://www.godaddy.com/resources/news/veritas-catching-silent-ai-regressions-before-they-ship
Would love to compare approaches.