---
license: mit
tags:
- openenv
- reinforcement-learning
- agentic-ai
- debugging
- power-automate
- automation
- llm
library_name: openenv
task_categories:
- reinforcement-learning
- reasoning
- debugging
datasets:
- custom
metrics:
- success-rate
language:
- en
pretty_name: OpenEnv Flow Debugger
---
# OpenEnv Flow Debugger
*A real-world agentic debugging environment for Power Automate*
This project is a small, easy-to-use debugging environment built with OpenEnv. It's inspired by the tricky real-world problems we hit in tools like Power Automate.
The environment focuses on one super common issue: the annoying '400 BadRequest' errors that pop up when a condition in your automation flow has a syntax mistake.
The main idea here isn't to build a perfect smart agent right away. Instead, we want a clear, realistic, and extensible way to test and improve how agents fix bugs.
---
## What You Need to Do
Imagine you have a Power Automate flow that just failed.
It failed with an "HTTP 400 BadRequest" error, raised in a "Condition" step, because the condition expression contains a tiny syntax error.
Your job as the agent is to fix that broken condition expression so the flow runs successfully.
Each time you play (each "episode"), it's like facing a real-life debugging puzzle that automation engineers deal with all the time.
---
## What You See (Observation Space)
At each step, you'll get some info in a JSON-like format. It includes:
- `case_id`: A unique ID for this specific problem.
- `run_status`: Tells you if the flow is still 'Failed' or 'Succeeded'.
- `failed_step`: Which step caused the problem.
- `error`: Details about the error, like the code and a message.
- `steps`: A list of all the steps in the flow, showing their inputs and outputs.
- `attempts_left`: How many more tries you have to fix it.
**Example observation (kept simple):**
```json
{
  "case_id": "CASE_001",
  "run_status": "Failed",
  "failed_step": "Condition_Check",
  "error": {
    "code": 400,
    "message": "BadRequest",
    "details": "InvalidTemplate: The expression is invalid"
  },
  "steps": [
    {"name": "Compose_Ext", "status": "Succeeded", "outputs": "xlsx"},
    {"name": "Condition_Check", "status": "Failed", "expression": "@equals(outputs('Compose_Ext'),'xlsx'"}
  ],
  "attempts_left": 3
}
```
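Given an observation shaped like the one above, an agent can locate the failing step programmatically. This is just a sketch against the example data; the per-step field names (`name`, `status`, `expression`) are assumptions for illustration:

```python
# Sketch: pull the failing step out of an observation shaped like
# the example above. Field names inside "steps" are assumptions.
obs = {
    "case_id": "CASE_001",
    "run_status": "Failed",
    "failed_step": "Condition_Check",
    "steps": [
        {"name": "Compose_Ext", "status": "Succeeded", "outputs": "xlsx"},
        {"name": "Condition_Check", "status": "Failed",
         "expression": "@equals(outputs('Compose_Ext'),'xlsx'"},
    ],
    "attempts_left": 3,
}

failed = next(s for s in obs["steps"] if s["name"] == obs["failed_step"])
print(failed["expression"])  # the expression the agent needs to repair
```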
---
## What You Can Do (Action Space - Just Starting!)
Right now, in this simple version, you can only do one type of action.
You can submit a `patch_step` action. This action targets the `Condition_Check` step and updates its `inputs.expression` field.
**Example action:**
```json
{
  "action": "patch_step",
  "step": "Condition_Check",
  "field": "inputs.expression",
  "value": "@equals(outputs('Compose_Ext'),'xlsx')"
}
```
For now, your fix needs to be an *exact* match to what's expected for it to count as correct.
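The grading can be pictured as a plain string comparison. This is a sketch of the assumed behavior, not the environment's actual code:

```python
def is_correct(submitted: str, gold_fix: str) -> bool:
    """Exact-match check: the fix must match character for character."""
    return submitted == gold_fix

gold = "@equals(outputs('Compose_Ext'),'xlsx')"
print(is_correct("@equals(outputs('Compose_Ext'),'xlsx')", gold))   # True
print(is_correct("@equals(outputs('Compose_Ext'), 'xlsx')", gold))  # False: extra space
```

Even a semantically equivalent expression with different whitespace counts as wrong in this version.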
---
## How You Get Graded (Reward Function)
Our scoring system is pretty straightforward:
- **+1.0** if you successfully fix the flow.
- **-0.1** for trying an incorrect fix (but you still have tries left).
- **-0.2** if you run out of tries without fixing it.
The game (episode) ends when the flow is fixed, or when you run out of chances.
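The rules above can be sketched as a small reward function (a sketch, not the environment's actual implementation):

```python
def grade(fixed: bool, attempts_left: int) -> tuple[float, bool]:
    """Return (reward, episode_done) following the scoring rules above."""
    if fixed:
        return 1.0, True     # flow repaired: episode ends
    if attempts_left > 0:
        return -0.1, False   # incorrect fix, but tries remain
    return -0.2, True        # out of attempts: episode ends
```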
---
## The Problems (Dataset)
The specific bugs we're trying to fix are stored in JSON files here:
`flow_debugger_env/data/cases.json`
Each problem includes the messed-up flow state, error details, and a hidden 'gold_fix' (the right answer) that the environment uses to check your work. You, the agent, never see this 'gold_fix'.
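A single entry in `cases.json` might look something like this. The exact schema is an assumption pieced together from the fields described above; the real file may differ:

```python
# Hypothetical shape of one entry in cases.json; the real schema may differ.
case = {
    "case_id": "CASE_001",
    "failed_step": "Condition_Check",
    "error": {"code": 400, "message": "BadRequest"},
    "steps": [
        {"name": "Compose_Ext", "status": "Succeeded", "outputs": "xlsx"},
        {"name": "Condition_Check", "status": "Failed",
         "expression": "@equals(outputs('Compose_Ext'),'xlsx'"},
    ],
    # Hidden from the agent; used only by the environment to grade fixes.
    "gold_fix": "@equals(outputs('Compose_Ext'),'xlsx')",
}
```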
---
## How to Run the Example
Just run the `demo.py` file from the main project folder like this:
`python demo.py`
The demo will pick a random bug, use a basic rule-based agent to try and fix the condition expression, and then show you how it went.
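Conceptually, the demo's loop looks like this toy sketch. The `FlowDebuggerEnv` stand-in below is illustrative only, not the project's actual class or API:

```python
class FlowDebuggerEnv:
    """Toy stand-in: one broken condition, exact-match grading (assumed API)."""
    GOLD_FIX = "@equals(outputs('Compose_Ext'),'xlsx')"

    def reset(self) -> dict:
        self.attempts_left = 3
        return {"run_status": "Failed", "attempts_left": self.attempts_left}

    def step(self, value: str):
        self.attempts_left -= 1
        if value == self.GOLD_FIX:
            obs = {"run_status": "Succeeded", "attempts_left": self.attempts_left}
            return obs, 1.0, True
        done = self.attempts_left == 0
        reward = -0.2 if done else -0.1
        return {"run_status": "Failed", "attempts_left": self.attempts_left}, reward, done

env = FlowDebuggerEnv()
obs = env.reset()
# A trivial "agent" that submits the known-good expression:
obs, reward, done = env.step("@equals(outputs('Compose_Ext'),'xlsx')")
print(obs["run_status"], reward, done)  # Succeeded 1.0 True
```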
---
## What This Can't Do Yet (Limitations)
This simple version is kept small on purpose:
- It only deals with syntax errors in Condition expressions.
- It doesn't actually run real Power Automate flows.
- It doesn't connect to any outside services or APIs.
- It's not doing fancy AI learning (like reinforcement learning) yet.
Keeping things simple means it's fast, predictable, and easy for us to build on later.
---
## What's Next?
We could add more cool stuff later, like:
- Figuring out errors in 'filter array' settings.
- Dealing with 'null' values or wrong data types.
- Fixing multiple steps at once.
- Using smarter, AI-powered agents.
- Training AI using special tools like TRL or Unsloth.
- Adding 'Green Agent' wrappers.
---
## Why We Made This
Debugging Power Automate is a real headache for many, and it's a big deal. This environment turns those everyday automation failures into a structured task for agents and a useful testbed for learning and experimenting with OpenEnv.