name: emotional-support-conversations
version: 0.1.0
description: >
  An OpenEnv environment for training and evaluating agents on open-ended
  emotional support conversations. Agents converse with a deterministic
  seeker simulator with hidden internal state (distress, trust, openness)
  and are graded with a hybrid immediate + future-oriented reward signal
  inspired by RLFF-ESC (Yang et al., 2025, arXiv:2508.12935).
author: meta-hack-submission
license: MIT
tags:
  - openenv
  - conversation
  - emotional-support
  - mental-health
  - rl-native
  - partial-observability
entrypoint: server.app:app
port: 7860
runtime:
  python: "3.11"
  vcpu: 2
  memory_gb: 8
tasks:
  - id: work_stress_venting
    difficulty: easy
    description: >
      Cooperative seeker venting about workplace stress. Agent must validate
      feelings, explore the concern, and guide to a light action plan.
  - id: guarded_relationship
    difficulty: medium
    description: >
      Guarded seeker who only reveals the real relationship issue after trust
      is built. Premature advice is penalised.
  - id: crisis_fragile_trust
    difficulty: hard
    description: >
      High-distress seeker with multiple interleaved concerns and fragile
      trust. Any dismissive or interrogative turn collapses trust; recovery
      is possible but costly.
action_space:
  type: text
  description: Free-text conversational reply from the agent to the seeker.
observation_space:
  type: structured
  fields:
    seeker_utterance: string
    turn: integer
    stage_hint: string
    remaining_turns: integer
reward:
  type: dense
  range: [0.0, 1.0]
  shaping:
    - immediate_turn_reward
    - future_oriented_trajectory_reward
    - anti_repetition_penalty
success:
  type: hard_gated
  description: >
    Success requires both a high final score and task-specific completion
    conditions (resolved closing stage, trust/distress targets, reveal, and
    safety reference for the crisis task).