---
license: mit
tags:
- openenv
- reinforcement-learning
- agentic-ai
- debugging
- power-automate
- automation
- llm
library_name: openenv
task_categories:
- reinforcement-learning
- reasoning
- debugging
datasets:
- custom
metrics:
- success-rate
language:
- en
pretty_name: OpenEnv Flow Debugger
---

# OpenEnv Flow Debugger

*A real-world agentic debugging environment for Power Automate*

This project is a small, easy-to-use debugging tool built with OpenEnv. It's inspired by those tricky real-world problems we hit in tools like Power Automate.

Our environment focuses on a super common issue: those annoying '400 BadRequest' errors that pop up when a condition in your automation flow has a syntax mistake.

The main idea here isn't to build a perfect smart agent right away. Instead, we want to create a clear, realistic, and expandable way to test and improve how agents fix bugs.

---

## What You Need to Do

Imagine you have a Power Automate Flow that just failed. It failed because of an "HTTP 400 BadRequest" error. This error happened in a "Condition" step. And the condition expression has a tiny syntax error.

Your job as the agent is to fix that broken condition expression so the flow can run perfectly.

Each time you play (each "episode"), it's like facing a real-life debugging puzzle that automation engineers deal with all the time.

---

## What You See (Observation Space)

At each step, you'll get some info in a JSON-like format. It includes:

- `case_id`: A unique ID for this specific problem.
- `run_status`: Tells you if the flow is still 'Failed' or 'Succeeded'.
- `failed_step`: Which step caused the problem.
- `error`: Details about the error, like the code and a message.
- `steps`: A list of all the steps in the flow, showing their inputs and outputs.
- `attempts_left`: How many more tries you have to fix it.

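
If it helps to picture the shape of that observation, here is a minimal Python sketch of the fields listed above. The class names (`FlowObservation`, `FlowError`, `FlowStep`), the field types, and the sample values are assumptions for illustration only; they are not taken from the environment's actual code.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class FlowError:
    """Error details for the failed run (hypothetical structure)."""
    code: int
    message: str
    details: str = ""


@dataclass
class FlowStep:
    """One step of the flow with its inputs and outputs (hypothetical structure)."""
    name: str
    status: str
    inputs: Dict[str, Any] = field(default_factory=dict)
    outputs: Any = None


@dataclass
class FlowObservation:
    """Mirrors the observation fields listed in this README."""
    case_id: str
    run_status: str          # 'Failed' or 'Succeeded'
    failed_step: str
    error: FlowError
    steps: List[FlowStep]
    attempts_left: int


def example_observation() -> FlowObservation:
    """Build a sample observation; the broken expression here is made up."""
    return FlowObservation(
        case_id="CASE_001",
        run_status="Failed",
        failed_step="Condition_Check",
        error=FlowError(
            code=400,
            message="BadRequest",
            details="InvalidTemplate: The expression is invalid",
        ),
        steps=[
            FlowStep(name="Compose_Ext", status="Succeeded", outputs="xlsx"),
            FlowStep(
                name="Condition_Check",
                status="Failed",
                # Hypothetical broken expression: the closing parenthesis is missing.
                inputs={"expression": "@equals(outputs('Compose_Ext'),'xlsx'"},
            ),
        ],
        attempts_left=3,
    )
```

An agent would typically look up the step named by `failed_step` in `steps` and read its `inputs` to find the expression it needs to repair.
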

**Example observation (kept simple):**

```
case_id: CASE_001
run_status: Failed
failed_step: Condition_Check
error: code=400, message=BadRequest, details=InvalidTemplate: The expression is invalid
steps:
- Compose_Ext (Succeeded, outputs: xlsx)
- Condition_Check (Failed, expression: @equals(outputs('Compose_Ext'),'xlsx')
attempts_left: 3
```

---

## What You Can Do (Action Space - Just Starting!)

Right now, in this simple version, you can only do one type of action. You can submit a `patch_step` action. This action targets the `Condition_Check` step and updates its `inputs.expression` field.

**Example action:**

```
action = patch_step
step = Condition_Check
field = inputs.expression
value = @equals(outputs('Compose_Ext'),'xlsx')
```

For now, your fix needs to be an *exact* match to what's expected for it to count as correct.

---

## How You Get Graded (Reward Function)

Our scoring system is pretty straightforward:

- **+1.0** if you successfully fix the flow.
- **-0.1** for trying an incorrect fix (but you still have tries left).
- **-0.2** if you run out of tries without fixing it.

The game (episode) ends when the flow is fixed, or when you run out of chances.

---

## The Problems (Dataset)

The specific bugs we're trying to fix are stored in JSON files here: `flow_debugger_env/data/cases.json`

Each problem includes the messed-up flow state, error details, and a hidden 'gold_fix' (the right answer) that the environment uses to check your work. You, the agent, never see this 'gold_fix'.

---

## How to Run the Example

Just run the `demo.py` file from the main project folder like this: `python demo.py`

The demo will pick a random bug, use a basic rule-based agent to try and fix the condition expression, and then show you how it went.

---

## What This Can't Do Yet (Limitations)

This simple version is kept small on purpose:

- It only deals with syntax errors in Condition expressions.
- It doesn't actually run real Power Automate flows.
- It doesn't connect to any outside services or APIs.
- It's not doing fancy AI learning (like reinforcement learning) yet.

Keeping things simple means it's fast, predictable, and easy for us to build on later.

---

## What's Next?

We could add more cool stuff later, like:

- Figuring out errors in 'filter array' settings.
- Dealing with 'null' values or wrong data types.
- Fixing multiple steps at once.
- Using smarter, AI-powered agents.
- Training AI using special tools like TRL or Unsloth.
- Adding 'Green Agent' wrappers.

---

## Why We Made This

Debugging Power Automate is a real headache for many, and it's a big deal. This environment turns those everyday automation failures into a structured task for agents and a useful testbed for learning and experimenting with OpenEnv.

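
As a quick recap of the scoring rules from the reward section, here is a minimal Python sketch of how one episode might be scored. The function names (`score_attempt`, `run_episode`) are hypothetical, and two details are assumptions: a wrong attempt that uses up the last try gets -0.2 instead of -0.1, and a correct fix doesn't consume a try. The real environment may handle either differently.

```python
from typing import List, Tuple

REWARD_FIXED = 1.0        # flow fixed
PENALTY_WRONG = -0.1      # wrong fix, tries remain
PENALTY_EXHAUSTED = -0.2  # wrong fix, no tries remain


def score_attempt(proposed: str, gold_fix: str, attempts_left: int) -> Tuple[float, bool, int]:
    """Score one patch_step action.

    Returns (reward, done, attempts_left_after). Uses the exact-match rule
    described in the README.
    """
    if proposed == gold_fix:
        return REWARD_FIXED, True, attempts_left
    remaining = attempts_left - 1
    if remaining > 0:
        return PENALTY_WRONG, False, remaining
    return PENALTY_EXHAUSTED, True, remaining


def run_episode(proposals: List[str], gold_fix: str, attempts: int = 3) -> float:
    """Feed proposals to the scorer until the episode ends; return total reward."""
    total = 0.0
    left = attempts
    for proposed in proposals:
        reward, done, left = score_attempt(proposed, gold_fix, left)
        total += reward
        if done:
            break
    return total


if __name__ == "__main__":
    gold = "@equals(outputs('Compose_Ext'),'xlsx')"
    # One wrong guess, then the exact fix.
    print(run_episode(["@equals(outputs('Compose_Ext'),'xlsx'", gold], gold))
```

Under these assumptions, an agent that guesses wrong once and then submits the exact fix ends with a total a little under +1.0, while three wrong guesses in a row end the episode with a negative total.
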