burnmydays committed on
Commit c38609e · 1 Parent(s): 90eb85a

Add Current Limitations & Roadmap section with enhancement priorities

Files changed (1)
  1. app.py +26 -0
app.py CHANGED
@@ -172,6 +172,32 @@ with gr.Blocks(title="Commitment Conservation Demo", theme=gr.themes.Soft()) as
  under compression and recursive application. The full harness tests 5 signals over 10 iterations and shows
  **baseline systems fail (20% stability) while enforced systems succeed (60% stability)** — a 40pp empirical gap.
+ ### 🔄 Current Limitations & Roadmap
+
+ **Demo Status:** ✅ Functional proof-of-concept showing visual differentiation
+
+ **Known Enhancements Coming Soon:**
+
+ 🔄 **Enforcement Stability Tuning:** Current results show 33-67% fidelity vs. the paper's 60% figure for enforced systems. Root cause: re-injected commitments can be lost in subsequent transformations. *Priority: Preserve commitments through the full iteration pipeline.*
+
+
+ 🔄 **Output Text Comparison:** The demo currently shows graphs but not the actual text output, so users can't see the qualitative difference (baseline drift: "fam! 😂 You got this 💪" vs. enforced preservation: "$100 Friday"). *Priority: Add a side-by-side original→final comparison with commitment highlighting.*
+
+
+ 🔄 **Token Tracking:** The demo shows no real-time token counts per turn, so efficiency gains are not visible. Test data shows a **163% efficiency advantage** (baseline expands +79.6%, enforcement compresses -77.8%). *Priority: Display running token totals.*
+
+
+ 🔄 **Baseline Realism:** The demo currently uses BART compression for both the baseline and enforced paths, whereas real LLMs expand via conversational drift. *Note: Documented as a simulation limitation.*
+
+
+ 📊 **Validated Test Data:** Comprehensive analysis shows baseline expansion (230-316 tokens) vs. enforcement compression (120-156 tokens) with 62% token reduction. [View full interactive analysis →](https://gemini.google.com/share/8f46bbc61c2c)
+
+ **Research Harness:** The original Git repository implements the full paper methodology with spaCy NLP and comprehensive metrics (13/13 tests passing).
+
+ ---
+
+
+
  **⚖️ IP Notice:** MO§ES™ is a trademark of Ello Cello LLC. See [repo](https://huggingface.co/burnmydays/commitment_conservation_harness) for details.
 
  © 2026 Ello Cello LLC. All rights reserved.
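
The Token Tracking item added in this commit asks for running token totals per turn. A minimal Python sketch of that bookkeeping follows. The helper names (`count_tokens`, `running_token_totals`, `percent_change`) are hypothetical and not part of `app.py`, the whitespace tokenizer is a stand-in assumption for whatever tokenizer the demo actually uses, and the sample turns are made up for illustration. As a side check, comparing the largest baseline count (316) with the smallest enforced count (120) gives about a 62% reduction; this is only one reading that reproduces the figure, since the commit does not say which pairing it used.

```python
from typing import Callable, List


def count_tokens(text: str) -> int:
    """Whitespace token count; a stand-in for the real tokenizer (assumption)."""
    return len(text.split())


def running_token_totals(
    turns: List[str], counter: Callable[[str], int] = count_tokens
) -> List[int]:
    """Return the cumulative token count after each turn."""
    totals: List[int] = []
    total = 0
    for text in turns:
        total += counter(text)
        totals.append(total)
    return totals


def percent_change(before: float, after: float) -> float:
    """Signed percent change from `before` to `after`; negative means compression."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (after - before) / before * 100.0


if __name__ == "__main__":
    # Hypothetical turns, invented for illustration only.
    turns = ["I will pay you $100 on Friday", "Pay $100 Friday", "$100 Friday"]
    print(running_token_totals(turns))

    # One pairing of the commit's token ranges: largest baseline vs. smallest enforced.
    print(round(percent_change(316, 120), 1))
```

In a Gradio app, the list returned by `running_token_totals` could feed a per-turn counter or a plot; swapping `counter` for a model tokenizer's length function would give real token counts rather than word counts.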