---
title: LifeStack
emoji: πͺ
colorFrom: indigo
colorTo: gray
sdk: docker
pinned: true
---
<div align="center">
# πͺ LifeStack
### **Autonomous Multi-Domain Conflict Resolution via Cascading RL**
**Built for Meta × HuggingFace PyTorch OpenEnv Hackathon 2026**
[PyTorch](https://pytorch.org)
[OpenEnv](https://github.com/facebookresearch/openenv)
[License: MIT](https://opensource.org/licenses/MIT)
[**Live Demo**](https://huggingface.co/spaces/BholeChature/LifeStack) • [**Technical Blog**](BLOG.md) • [**Source Code**](https://github.com/oki-dokii/Meta-R2)
---
| [Vision](#-the-vision) | [🧪 Architecture](#-hardened-system-architecture) | [Results](#-performance--results) | [🛠️ Setup](#-quickstart) |
| :--- | :--- | :--- | :--- |
</div>
---
## π The Vision
**LifeStack** is a high-fidelity reinforcement learning environment built for **OpenEnv** to train agents in **simultaneous crisis management**. Unlike traditional RL tasks that focus on a single domain, LifeStack models the messy interdependence of adult life as a 40-edge dependency graph, with effects cascading across domains such as Career, Finance, Health, and Relationships.
### ✨ Core Research Innovations
* **Causal Cascades**: A 40-edge dependency graph based on *Starcke & Brand (2012)*, in which a $350 flight rebooking (Finance) ripples into stress (Wellbeing) and sleep loss (Health).
* **Personality Lab**: Side-by-side agent comparison using **Big Five (OCEAN)** traits. Shows how `Agreeableness` versus `Neuroticism` changes the reward manifold.
* **🧠 Memory RAM**: Retrieval-Augmented Moderation using **ChromaDB**. Shows a **+116.7% improvement** in the efficiency score (0.42 to 0.91) when recall is enabled.
* **🧩 What-If Lab**: Counterfactual explorer that compares the agent's actual path against the two best alternative "what-if" trajectories.
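The cascade idea can be sketched in a few lines of Python. This is a minimal illustration, not LifeStack's actual engine: the metric names, edge weights, and the `propagate` helper are assumptions made for the example, and the real graph has 40 edges rather than three.

```python
from collections import deque

# Hypothetical edges: source metric -> [(target metric, damping weight)].
# Names and weights are illustrative only, not taken from the LifeStack code.
EDGES = {
    "finance.cash": [("wellbeing.stress", 0.5)],
    "wellbeing.stress": [("health.sleep", 0.4)],
    "health.sleep": [("career.focus", 0.3)],
}


def propagate(shock, edges, min_effect=0.01):
    """Spread an initial disruption breadth-first, attenuating it at every hop."""
    effects = dict(shock)
    queue = deque(shock.items())
    while queue:
        node, magnitude = queue.popleft()
        for target, weight in edges.get(node, []):
            delta = magnitude * weight
            if abs(delta) < min_effect:
                continue  # too small to keep cascading; also bounds cyclic graphs
            effects[target] = effects.get(target, 0.0) + delta
            queue.append((target, delta))
    return effects


# A unit-sized Finance disruption (for example, the flight rebooking).
impact = propagate({"finance.cash": 1.0}, EDGES)
for metric, value in impact.items():
    print(f"{metric}: {value:.2f}")
```

Each hop multiplies the disruption by a weight below 1, so downstream effects shrink and the `min_effect` cutoff stops the walk once they become negligible.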
---
## ποΈ Hardened System Architecture
LifeStack uses a multi-layered verification system to guard against "reward hacking" and keep results auditable.
### 🛡️ Anti-Hacking & Observability
* **Semantic Reasoning Audit**: Every action requires a `reasoning` justification that is cross-verified for logical coherence by the reward orchestrator.
* **📼 Episode Replay**: Full audit log of the last 5 episodes including metric impact grids and timestamped reasoning.
* **🛡️ Domain Risk Heatmap**: Instant cognitive summary of 23 metrics across 6 life domains (Red=Crisis, Green=Stable).
* **🧪 Core Test Suite**: 10 rigorous smoke and logic tests verify environment reset, causal propagation, and task solvability.
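A rolling audit log like the Episode Replay described above could be built on a bounded deque. The sketch below is an assumption about how such a log might look, not the project's implementation: `StepRecord`, `EpisodeHistorian`, and their fields are hypothetical names chosen for the example.

```python
from collections import deque
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class StepRecord:
    """One logged step: the action, its justification, and metric changes."""
    action: str
    reasoning: str
    metric_deltas: dict
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


class EpisodeHistorian:
    """Keeps only the most recent `capacity` episodes (hypothetical helper)."""

    def __init__(self, capacity=5):
        # deque(maxlen=...) silently drops the oldest episode when full.
        self._episodes = deque(maxlen=capacity)

    def start_episode(self):
        self._episodes.append([])

    def log_step(self, action, reasoning, metric_deltas):
        if not self._episodes:
            self.start_episode()
        self._episodes[-1].append(StepRecord(action, reasoning, metric_deltas))

    def replay(self):
        """Return the retained episodes, oldest first."""
        return [list(episode) for episode in self._episodes]


historian = EpisodeHistorian()
for episode in range(7):
    historian.start_episode()
    historian.log_step(f"action-{episode}", "illustrative reasoning", {"stress": -0.1})

kept = historian.replay()
print(len(kept), kept[0][0].action, kept[-1][0].action)
```

After seven episodes only the last five remain, so the first retained action is `action-2`.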
### 🗺️ Environment Map
```mermaid
graph TD
subgraph "LifeStack Engine (v2.1)"
Env["LifeStackEnv"]
DG["Dependency Graph (40-Edges)"]
RT["Route Manager"]
RE["Reward Orchestrator (7-Signals)"]
end
subgraph Observability["Observability Layer (Flask Portal)"]
CV["Cascade Visualizer"]
WI["What-If Explorer"]
Hist["Episode Historian"]
end
subgraph "AI Core"
Agent["RL Agent / LLM"]
Mem["ChromaDB RAG Memory"]
Pers["Personality Engine (Big Five)"]
end
Agent -->|Action + Reasoning| Env
Env -->|Cascades| DG
DG -->|Feedback| Env
Env -->|Verification| RT
RT -->|Scoring| RE
RE -->|Reward| Agent
Agent <-->|Memory Store/Retrieval| Mem
Observability <-->|Audit| Env
```
---
## 🛠️ Quickstart
### 1. Installation & Demo
```bash
git clone https://github.com/oki-dokii/LifeStack.git
cd LifeStack
pip install -r requirements.txt
python app_flask.py # Production Portal → http://127.0.0.1:5000
```
### 2. Engineering Verification
```bash
# Run the full concrete logic test suite
python3 -m pytest tests/
```
### 3. Training Pipeline (GRPO)
```bash
# Start 5-stage curriculum training with 800-word trajectory logs
python scripts/train_trl.py
```
---
## π Performance & Results
### **RAG Memory Impact**
Episodes were run back-to-back to compare "Cold Start" and "Memory-Aware" agents. Success rate is reported as a percentage-point (pp) difference; the two scores are reported as relative change.
| Metric | Cold Start (No Memory) | Memory-Aware (RAG) | Delta |
| :--- | :---: | :---: | :---: |
| **Success Rate** | 48% | 88% | **+40 pp** |
| **Efficiency Score** | 0.42 | 0.91 | **+116.7%** |
| **Avg Reasoning Score** | 0.65 | 0.94 | **+44.6%** |
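The Delta column can be recomputed from the two agent columns. The short check below uses only the numbers in the table; the helper names are ours. It distinguishes a percentage-point difference (used for the success rate) from a relative change (used for the two scores).

```python
def relative_change(before, after):
    """Relative change in percent: (after - before) / before * 100."""
    return (after - before) / before * 100.0


def point_change(before_pct, after_pct):
    """Difference between two percentages, in percentage points."""
    return after_pct - before_pct


success_pp = point_change(48, 88)
success_rel = relative_change(48, 88)
efficiency_rel = relative_change(0.42, 0.91)
reasoning_rel = relative_change(0.65, 0.94)

print(f"Success rate: +{success_pp} pp ({success_rel:.1f}% relative)")
print(f"Efficiency score: +{efficiency_rel:.1f}%")
print(f"Avg reasoning score: +{reasoning_rel:.1f}%")
```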
---
## ποΈ Technical Deep Dive
* **Conflict Intake**: Uses **NLP-to-Conflict** parsing; users can type natural language crises (e.g., *"I just got fired..."*) and the system generates a personalized 23-metric disruption.
* **Observation Space**: 26-dimensional state vector + domain-specific JSON metadata.
* **Reward Signals**: 7 non-overlapping components (Milestone, Completion, Outcome, Preservation, Replan, Efficiency, Reasoning) weighted iteratively for stability.
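One simple way to combine seven such components is a normalized weighted sum. The sketch below is illustrative only: the equal weights and the `combine_reward` function are assumptions, since the actual weighting scheme used by the reward orchestrator is not specified here.

```python
COMPONENTS = (
    "milestone", "completion", "outcome", "preservation",
    "replan", "efficiency", "reasoning",
)

# Hypothetical equal weights; the project's tuned weights are not documented here.
WEIGHTS = {name: 1.0 / len(COMPONENTS) for name in COMPONENTS}


def combine_reward(signals, weights=WEIGHTS):
    """Weighted average of per-component scores, each assumed to lie in [0, 1]."""
    missing = set(weights) - set(signals)
    if missing:
        raise ValueError(f"missing reward components: {sorted(missing)}")
    total = sum(weights[name] * signals[name] for name in weights)
    return total / sum(weights.values())


# Example: every component scores 0.5 except a fully coherent reasoning trace.
signals = {name: 0.5 for name in COMPONENTS}
signals["reasoning"] = 1.0
reward = combine_reward(signals)
print(f"{reward:.4f}")
```

Rejecting incomplete signal dictionaries keeps a missing component from being silently scored as zero, which matters when the reward is meant to resist hacking.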
---
<div align="center">
### **Team BholeChature**
*Scaler School of Technology, Bangalore*
<i>"LifeStack: Measuring the messy reality of human decision making."</i>
</div>