Separating Constraint Compliance from Semantic Accuracy: A Novel Benchmark for Evaluating Instruction-Following Under Compression • Paper • 2512.17920 • Published Dec 2, 2025
The Drill-Down and Fabricate Test (DDFT): A Protocol for Measuring Epistemic Robustness in Language Models • Paper • 2512.23850 • Published Apr 3
The Comprehension-Gated Agent Economy: A Robustness-First Architecture for AI Economic Agency • Paper • 2603.15639 • Published Mar 18