Soften claims, add limitations disclaimer for HuggingFace release

- Change "Emergent Behaviors" to "Observed Patterns"
- Update Learning Progression to Q1 vs Q4 (2.8x improvement)
- Add limitations disclaimer about single run, no out-of-sample validation
- Hedge all claims about learned behavior vs market regime
- Soften "thesis worked" → "numbers were there"

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

README.md +15 -15

@@ -52,14 +52,16 @@ An experiment in cross-market data fusion. A reinforcement learning agent traine
 ### Learning Progression
-The model genuinely learned profitable strategies through reinforcement learning:
+Comparing first 25% vs last 25% of trades:
 | Phase | Avg PnL/Trade | Win Rate |
 |-------|---------------|----------|
-| Early | +$1.29 | 23.6% |
-| Late | +$2.15 | 24.2% |
+| First 25% | +$1.27 | 22.5% |
+| Last 25% | +$3.56 | 25.3% |
-**+0.6% win rate improvement** and **+$0.85 avg PnL improvement** per trade.
+**2.8x improvement** in avg PnL per trade. Last 25% of trades generated **52%** of total profit.
+> **Limitations**: Single 10-hour run. No out-of-sample validation. Results could reflect market regime, not learned behavior. We're sharing the raw data—draw your own conclusions.
 ### Performance by Asset
@@ -184,20 +186,18 @@ Only this version earned a name.
 ---
-## Emergent Behaviors
-These weren't explicitly rewarded. The model discovered them while optimizing for profit—and we can see them evolve over time.
-| Behavior | Early → Late | What It Learned |
-|----------|--------------|-----------------|
-| **Low volatility specialist** | Consistent | $4.07/trade on calm markets vs -$1.44 on volatile |
-| **Hunts cheap outcomes** | 23% → 39% of trades | Cheap entries yield $8.63/trade vs $1.53 for expensive |
-| **Rides DOWN momentum** | Consistent 77% | Bets DOWN when prob is falling → +$97k net profit |
-| **Fat tail capture** | $5.8k → $20.5k net | Learned to position for asymmetric payoffs |
-| **Recovery after loss streaks** | 47% WR after 3+ losses | Anti-tilt behavior (vs 24% baseline) |
-| **Avg PnL per trade** | $1.62 → $4.30 | 2.7x improvement through genuine learning |
-Consistent throughout: **Cuts winners fast** (0.35x hold time vs losers)—opposite of human intuition, but it works in these markets.
+## Observed Patterns
+These patterns emerged in the data. Whether they represent learned behavior or market regime effects is unclear without further validation.
+| Pattern | Observation |
+|---------|-------------|
+| **Low volatility preference** | $4.07/trade on calm markets vs -$1.44 on volatile |
+| **Cheap outcome bias** | Cheap entries (<30¢) yield $8.63/trade vs $1.53 for expensive |
+| **DOWN momentum** | 77% of trades bet DOWN when prob is falling |
+| **Short hold times on winners** | 0.35x hold time vs losers |
+These could reflect genuine learned strategies or simply profitable patterns in this specific market window.
 ## Key Takeaways
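
For anyone re-deriving the Learning Progression numbers from the raw data, the first-25% vs last-25% comparison could be sketched roughly as follows. This is a minimal sketch, not the script used for the README: the function name `quartile_summary` and the input format (a plain list of per-trade PnL values in chronological order) are assumptions made for illustration. As a sanity check on the table itself, 3.56 / 1.27 ≈ 2.80, which matches the stated 2.8x.

```python
from statistics import mean


def quartile_summary(pnls):
    """Compare the first and last 25% of trades.

    `pnls` is a hypothetical input: per-trade PnL values in
    chronological order. The real data layout may differ.
    """
    q = len(pnls) // 4
    if q == 0:
        raise ValueError("need at least 4 trades")

    first, last = pnls[:q], pnls[-q:]

    def stats(chunk):
        # A "win" is counted here as strictly positive PnL (an assumption).
        wins = sum(1 for p in chunk if p > 0)
        return {"avg_pnl": mean(chunk), "win_rate": wins / len(chunk)}

    first_stats, last_stats = stats(first), stats(last)
    total = sum(pnls)

    return {
        "first": first_stats,
        "last": last_stats,
        # Ratio of average PnL per trade, last quartile over first quartile.
        "pnl_ratio": (
            last_stats["avg_pnl"] / first_stats["avg_pnl"]
            if first_stats["avg_pnl"] != 0
            else None
        ),
        # Fraction of total profit generated by the last quartile.
        "last_share_of_profit": sum(last) / total if total != 0 else None,
    }


if __name__ == "__main__":
    demo = [2.0, -1.0, 1.0, 1.0, 1.0, 1.0, 4.0, -1.0]
    print(quartile_summary(demo))
```

Note that a ratio like this only describes the split within one run; it does not separate learned behavior from a change in market regime, which is the caveat the Limitations note already states.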