Kaushik Rajan committed
Commit · c379add
1 Parent(s): 3d6405c
Refine: Simplify AI reasoning text and open intro accordion
app.py
CHANGED
|
@@ -226,6 +226,8 @@ def create_interface():
|
|
| 226 |
- **Multi-Turn Reasoning:** Observe the AI's rationale. It often makes decisions based on future projections (e.g., potential budget shortfalls or quality gaps), showcasing a capacity for long-term planning.
|
| 227 |
- **Zero-Sum Dynamics:** The simulation is a zero-sum game for market share, creating the competitive pressure that, according to the SPIRAL paper, is essential for incentivizing robust reasoning.
|
| 228 |
|
| 229 |
### Key Links to SPIRAL Paper Takeaways
|
| 230 |
- **Transferable Reasoning:** Your R&D investments build long-term planning skills, transferable to real-world logic problems (Takeaway 2).
|
| 231 |
- **Diverse Skills:** Marketing encourages probabilistic thinking (like Poker), while Sales focuses on resource foresight (Takeaway 4).
|
|
@@ -238,7 +240,7 @@ def create_interface():
|
|
| 238 |
3. **Allocate Budget:** Use the sliders to decide how much of your quarterly budget to invest in three key areas.
|
| 239 |
- `R&D`: Improves your product quality, giving you a persistent, long-term edge.
|
| 240 |
- `Marketing`: Provides an immediate boost to your market share for the current quarter.
|
| 241 |
-
- `Sales`: Increases your budget for the *next* quarter, fueling future growth.
|
| 242 |
4. **End the Quarter:** Click the "End Quarter" button to submit your decisions.
|
| 243 |
5. **Analyze the Results:**
|
| 244 |
- The charts on the left will update to show the new market landscape.
|
|
|
|
| 226 |
- **Multi-Turn Reasoning:** Observe the AI's rationale. It often makes decisions based on future projections (e.g., potential budget shortfalls or quality gaps), showcasing a capacity for long-term planning.
|
| 227 |
- **Zero-Sum Dynamics:** The simulation is a zero-sum game for market share, creating the competitive pressure that, according to the SPIRAL paper, is essential for incentivizing robust reasoning.
|
| 228 |
|
| 229 |
+
This demo is inspired by the SPIRAL framework from the research paper: [SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning](https://arxiv.org/abs/2506.24119).
|
| 230 |
+
|
| 231 |
### Key Links to SPIRAL Paper Takeaways
|
| 232 |
- **Transferable Reasoning:** Your R&D investments build long-term planning skills, transferable to real-world logic problems (Takeaway 2).
|
| 233 |
- **Diverse Skills:** Marketing encourages probabilistic thinking (like Poker), while Sales focuses on resource foresight (Takeaway 4).
|
|
|
|
| 240 |
3. **Allocate Budget:** Use the sliders to decide how much of your quarterly budget to invest in three key areas.
|
| 241 |
- `R&D`: Improves your product quality, giving you a persistent, long-term edge.
|
| 242 |
- `Marketing`: Provides an immediate boost to your market share for the current quarter.
|
| 243 |
+
- `Sales`: Increases your budget for the *next* quarter, fueling future growth. (Hint: Your coffers may flourish like a well-tended garden or wither like neglected vines, depending on how you nurture sales and market dominance—choose wisely to unlock the mysteries of compounding fortune!)
|
| 244 |
4. **End the Quarter:** Click the "End Quarter" button to submit your decisions.
|
| 245 |
5. **Analyze the Results:**
|
| 246 |
- The charts on the left will update to show the new market landscape.
|
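
The budget rules described in the changed help text (R&D as a persistent quality gain, Marketing as a one-quarter boost, Sales as next-quarter budget, and market share as zero-sum) can be illustrated with a minimal sketch. This is not the code in `app.py`: the `Firm` class, the `end_quarter` function, and every coefficient below are hypothetical and chosen only to make the mechanics concrete.

```python
from dataclasses import dataclass

# All constants are illustrative assumptions, not values from app.py.
BASE_BUDGET = 100.0     # budget every firm receives each quarter
RD_GAIN = 0.01          # quality gained per unit of R&D spend
MARKETING_GAIN = 0.02   # one-quarter appeal gained per unit of marketing spend
SALES_RETURN = 1.5      # next-quarter budget gained per unit of sales spend


@dataclass
class Firm:
    budget: float
    quality: float = 1.0  # persists across quarters


def end_quarter(a, b, alloc_a, alloc_b):
    """Apply one quarter of spending for two firms.

    Each allocation is a (rd, marketing, sales) tuple that must not
    exceed the firm's current budget. Returns (share_a, share_b).
    """
    scores = []
    for firm, (rd, marketing, sales) in ((a, alloc_a), (b, alloc_b)):
        if rd + marketing + sales > firm.budget:
            raise ValueError("allocation exceeds budget")
        # R&D: permanent quality improvement.
        firm.quality += RD_GAIN * rd
        # Marketing: counts toward this quarter's appeal only.
        scores.append(firm.quality + MARKETING_GAIN * marketing)
        # Sales: pays off as budget in the next quarter.
        firm.budget = BASE_BUDGET + SALES_RETURN * sales
    # Zero-sum: the two shares always sum to 1, so one firm's gain
    # is exactly the other's loss.
    share_a = scores[0] / (scores[0] + scores[1])
    return share_a, 1.0 - share_a


player, rival = Firm(budget=100.0), Firm(budget=100.0)
share_player, share_rival = end_quarter(
    player, rival, alloc_a=(50, 30, 20), alloc_b=(20, 30, 50)
)
```

In this toy model the R&D-heavy player ends the quarter with the larger market share and higher quality, while the Sales-heavy rival starts the next quarter with the larger budget, which is the trade-off the instructions describe.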