@@ -172,6 +172,32 @@ with gr.Blocks(title="Commitment Conservation Demo", theme=gr.themes.Soft()) as
     under compression and recursive application. The full harness tests 5 signals over 10 iterations and shows
     **baseline systems fail (20% stability) while enforced systems succeed (60% stability)** — a 40pp empirical gap.
+    ### 🔄 Current Limitations & Roadmap
+    **Demo Status:** ✅ Functional proof-of-concept showing visual differentiation between baseline and enforced runs
+    **Planned Enhancements:**
+    🔄 **Enforcement Stability Tuning:** Current results show 33-67% fidelity, versus the paper's 60% figure for enforced systems. Root cause: re-injected commitments can be lost in subsequent transformations. *Priority: preserve commitments through the full iteration pipeline.*
+    🔄 **Output Text Comparison:** The demo currently shows graphs but not the actual text output, so users can't see the qualitative difference (baseline drift such as "fam! 😂 You got this 💪" vs. enforced preservation of "$100 Friday"). *Priority: add a side-by-side original→final comparison with commitment highlighting.*
+    🔄 **Token Tracking:** The demo shows no per-turn token counts, so efficiency gains are not visible. Test data shows a **163% efficiency advantage** (baseline expands by 79.6%, enforcement compresses by 77.8%). *Priority: display running token totals.*
+    🔄 **Baseline Realism:** The demo currently uses BART compression for both the baseline and enforced paths, whereas real LLMs tend to expand text through conversational drift. *Note: documented as a simulation limitation.*
+    📊 **Validated Test Data:** Analysis shows baseline expansion (230-316 tokens) vs. enforcement compression (120-156 tokens), a 62% token reduction. [View full interactive analysis →](https://gemini.google.com/share/8f46bbc61c2c)
+    **Research Harness:** The original Git repository implements the full paper methodology with spaCy NLP and comprehensive metrics (13/13 tests passing).
+    ---
     **⚖️ IP Notice:** MO§ES™ is a trademark of Ello Cello LLC. See [repo](https://huggingface.co/burnmydays/commitment_conservation_harness) for details.
     © 2026 Ello Cello LLC. All rights reserved.
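
The added notes quote a "62% token reduction" and a "163% efficiency advantage" but do not say how either was computed. The sketch below shows one reading that reproduces both numbers: comparing the top of the baseline range (316 tokens) with the bottom of the enforced range (120 tokens). That pairing is an assumption, not something the notes state, and the function names are hypothetical. The +79.6% and -77.8% figures are not reproduced here because the notes do not give the original input lengths.

```python
def percent_reduction(baseline_tokens: int, enforced_tokens: int) -> float:
    """Tokens saved by the enforced run, as a percentage of the baseline."""
    return (baseline_tokens - enforced_tokens) / baseline_tokens * 100


def percent_overhead(baseline_tokens: int, enforced_tokens: int) -> float:
    """Extra tokens used by the baseline, as a percentage of the enforced run."""
    return (baseline_tokens - enforced_tokens) / enforced_tokens * 100


# Assumed pairing: largest baseline output vs. smallest enforced output.
BASELINE_MAX_TOKENS = 316
ENFORCED_MIN_TOKENS = 120

reduction = percent_reduction(BASELINE_MAX_TOKENS, ENFORCED_MIN_TOKENS)
overhead = percent_overhead(BASELINE_MAX_TOKENS, ENFORCED_MIN_TOKENS)

print(f"token reduction: {reduction:.1f}%")
print(f"baseline overhead: {overhead:.1f}%")
```

If this reading is right, both percentages describe the same 196-token gap measured against different denominators, so the demo text could state which one it means when the running token totals are added.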