Kaushik Rajan committed on
Commit c379add · 1 Parent(s): 3d6405c

Refine: Simplify AI reasoning text and open intro accordion

Files changed (1)
  1. app.py +3 -1
app.py CHANGED
@@ -226,6 +226,8 @@ def create_interface():
  - **Multi-Turn Reasoning:** Observe the AI's rationale. It often makes decisions based on future projections (e.g., potential budget shortfalls or quality gaps), showcasing a capacity for long-term planning.
  - **Zero-Sum Dynamics:** The simulation is a zero-sum game for market share, creating the competitive pressure that, according to the SPIRAL paper, is essential for incentivizing robust reasoning.
 
+ This demo is inspired by the SPIRAL framework from the research paper: [SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning](https://arxiv.org/abs/2506.24119).
+
  ### Key Links to SPIRAL Paper Takeaways
  - **Transferable Reasoning:** Your R&D investments build long-term planning skills, transferable to real-world logic problems (Takeaway 2).
  - **Diverse Skills:** Marketing encourages probabilistic thinking (like Poker), while Sales focuses on resource foresight (Takeaway 4).
@@ -238,7 +240,7 @@ def create_interface():
  3. **Allocate Budget:** Use the sliders to decide how much of your quarterly budget to invest in three key areas.
  - `R&D`: Improves your product quality, giving you a persistent, long-term edge.
  - `Marketing`: Provides an immediate boost to your market share for the current quarter.
- - `Sales`: Increases your budget for the *next* quarter, fueling future growth.
+ - `Sales`: Increases your budget for the *next* quarter, fueling future growth. (Hint: Your coffers may flourish like a well-tended garden or wither like neglected vines, depending on how you nurture sales and market dominance—choose wisely to unlock the mysteries of compounding fortune!)
  4. **End the Quarter:** Click the "End Quarter" button to submit your decisions.
  5. **Analyze the Results:**
  - The charts on the left will update to show the new market landscape.
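
Reviewer note: the three budget levers described in the help text can be illustrated with a small sketch. This is not the logic in `app.py`, which this diff does not show. The `Company` class, the `end_quarter` function, and every coefficient below are hypothetical, chosen only to show one way a persistent R&D effect, a one-quarter Marketing effect, and a next-quarter Sales effect could combine into a zero-sum market share.

```python
from dataclasses import dataclass

# Hypothetical coefficients, chosen only for illustration (not taken from app.py).
RD_GAIN = 0.01         # quality gained per unit spent on R&D
MARKETING_GAIN = 0.02  # one-quarter appeal gained per unit spent on Marketing
SALES_RETURN = 1.5     # next-quarter budget gained per unit spent on Sales
BASE_BUDGET = 100.0    # budget every company receives each quarter


@dataclass
class Company:
    budget: float = BASE_BUDGET
    quality: float = 1.0
    share: float = 0.5


def end_quarter(companies, allocations):
    """Apply one quarter of (rd, marketing, sales) spending to each company."""
    appeals = []
    for company, (rd, marketing, sales) in zip(companies, allocations):
        if min(rd, marketing, sales) < 0 or rd + marketing + sales > company.budget:
            raise ValueError("allocation must be non-negative and within budget")
        # R&D raises quality, and the gain carries over into later quarters.
        company.quality += RD_GAIN * rd
        # Marketing adds appeal for this quarter only; it is not stored.
        appeals.append(company.quality + MARKETING_GAIN * marketing)
        # Sales spending sets the budget available next quarter.
        company.budget = BASE_BUDGET + SALES_RETURN * sales
    total = sum(appeals)
    for company, appeal in zip(companies, appeals):
        # Shares are normalized to sum to 1, so one side's gain is the other's loss.
        company.share = appeal / total


player, rival = Company(), Company()
end_quarter([player, rival], [(50, 30, 20), (20, 60, 20)])
```

In this sketch the rival's heavier Marketing spend wins the larger share for the quarter, while the player's heavier R&D spend leaves it with higher quality going into the next one.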