---
license: mit
tags:
- openenv
- reinforcement-learning
- agentic-ai
- debugging
- power-automate
- automation
- llm
library_name: openenv
task_categories:
- reinforcement-learning
- reasoning
- debugging
datasets:
- custom
metrics:
- success-rate
language:
- en
pretty_name: OpenEnv Flow Debugger
---
# OpenEnv Flow Debugger
*A real-world agentic debugging environment for Power Automate*
This project is a small, easy-to-use debugging environment built with OpenEnv. It's inspired by the tricky real-world problems we hit in tools like Power Automate.
The environment focuses on one very common issue: the '400 BadRequest' error that pops up when a condition in your automation flow has a syntax mistake.
The main idea here isn't to build a perfect smart agent right away. Instead, we want a clear, realistic, and expandable way to test and improve how agents fix bugs.
---
## What You Need to Do
Imagine you have a Power Automate flow that just failed with an "HTTP 400 BadRequest" error.
The error happened in a "Condition" step, and that step's condition expression has a tiny syntax error.
Your job as the agent is to fix that broken condition expression so the flow can run perfectly.
Each time you play (each "episode"), it's like facing a real-life debugging puzzle that automation engineers deal with all the time.
---
## What You See (Observation Space)
At each step, you'll get some info in a JSON-like format. It includes:
- `case_id`: A unique ID for this specific problem.
- `run_status`: Tells you if the flow is still 'Failed' or 'Succeeded'.
- `failed_step`: Which step caused the problem.
- `error`: Details about the error, like the code and a message.
- `steps`: A list of all the steps in the flow, showing their inputs and outputs.
- `attempts_left`: How many more tries you have to fix it.
**Example observation (kept simple):**
```
case_id: CASE_001
run_status: Failed
failed_step: Condition_Check
error: code=400, message=BadRequest, details=InvalidTemplate: The expression is invalid
steps:
- Compose_Ext (Succeeded, outputs: xlsx)
- Condition_Check (Failed, expression: @equals(outputs('Compose_Ext'),'xlsx')
attempts_left: 3
```
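Here's a rough sketch of how that observation could look as a Python object. The `Observation` class, the keys used inside `steps`, and the `failed_expression` helper are all made up for illustration and are not the environment's real API. It also assumes the broken expression is the one missing its closing parenthesis.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Observation:
    """Hypothetical container mirroring the fields listed above."""
    case_id: str
    run_status: str                 # "Failed" or "Succeeded"
    failed_step: str
    error: dict[str, Any]
    steps: list[dict[str, Any]]
    attempts_left: int


def failed_expression(obs: Observation) -> str:
    """Return the expression of the step named in `failed_step`."""
    for step in obs.steps:
        if step["name"] == obs.failed_step:
            return step["inputs"]["expression"]
    raise KeyError(f"step {obs.failed_step!r} not found")


# Example values copied from the observation above.
# Assumption: the broken expression lacks its final ")".
obs = Observation(
    case_id="CASE_001",
    run_status="Failed",
    failed_step="Condition_Check",
    error={
        "code": 400,
        "message": "BadRequest",
        "details": "InvalidTemplate: The expression is invalid",
    },
    steps=[
        {"name": "Compose_Ext", "status": "Succeeded", "outputs": "xlsx"},
        {
            "name": "Condition_Check",
            "status": "Failed",
            "inputs": {"expression": "@equals(outputs('Compose_Ext'),'xlsx'"},
        },
    ],
    attempts_left=3,
)
```

An agent would typically start by pulling out the failing expression with something like `failed_expression(obs)` and then decide how to repair it.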
---
## What You Can Do (Action Space - Just Starting!)
Right now, in this simple version, you can only do one type of action.
You can submit a `patch_step` action. This action targets the `Condition_Check` step and updates its `inputs.expression` field.
**Example action:**
```
action = patch_step
step = Condition_Check
field = inputs.expression
value = @equals(outputs('Compose_Ext'),'xlsx')
```
For now, your fix only counts as correct if it's an *exact* match to the expected expression.
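To make the `patch_step` idea concrete, here's a minimal sketch of how such an action could be applied and checked. The dictionary layout, `apply_patch`, and `is_exact_match` are assumptions for illustration; the real environment may store steps and compare fixes differently.

```python
import copy


def apply_patch(steps: list[dict], action: dict) -> list[dict]:
    """Return a copy of `steps` with the targeted field replaced.

    Hypothetical helper: `field` is a dotted path such as
    "inputs.expression".
    """
    if action["action"] != "patch_step":
        raise ValueError(f"unsupported action: {action['action']!r}")
    patched = copy.deepcopy(steps)  # leave the caller's data untouched
    for step in patched:
        if step["name"] == action["step"]:
            *parents, leaf = action["field"].split(".")
            target = step
            for key in parents:
                target = target.setdefault(key, {})
            target[leaf] = action["value"]
            return patched
    raise KeyError(f"step {action['step']!r} not found")


def is_exact_match(value: str, gold_fix: str) -> bool:
    """Exact string comparison, with no whitespace or case normalization."""
    return value == gold_fix


action = {
    "action": "patch_step",
    "step": "Condition_Check",
    "field": "inputs.expression",
    "value": "@equals(outputs('Compose_Ext'),'xlsx')",
}
```

Because the check is an exact string comparison, a fix that is logically fine but formatted differently (extra spaces, for example) would still count as wrong under this sketch.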
---
## How You Get Graded (Reward Function)
Our scoring system is pretty straightforward:
- **+1.0** if you successfully fix the flow.
- **-0.1** for an incorrect fix while you still have tries left.
- **-0.2** if you run out of tries without fixing it.
The game (episode) ends when the flow is fixed, or when you run out of chances.
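The scoring rules above could be written roughly like this. The function name `score_attempt` and its arguments are hypothetical, and the sketch assumes the final failed attempt earns -0.2 *instead of* -0.1 rather than both.

```python
def score_attempt(is_correct: bool, attempts_left_after: int) -> tuple[float, bool]:
    """Return (reward, done) for a single attempt.

    `attempts_left_after` is the number of tries remaining once this
    attempt has been used up.
    """
    if is_correct:
        return 1.0, True      # flow fixed, episode ends
    if attempts_left_after > 0:
        return -0.1, False    # wrong fix, but tries remain
    return -0.2, True         # out of tries, episode ends
```

With three attempts, two wrong fixes followed by a correct one would total `-0.1 + -0.1 + 1.0` under this reading of the rules.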
---
## The Problems (Dataset)
The specific bugs we're trying to fix are stored in a JSON file here:
`flow_debugger_env/data/cases.json`
Each problem includes the messed-up flow state, error details, and a hidden 'gold_fix' (the right answer) that the environment uses to check your work. You, the agent, never see this 'gold_fix'.
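As an illustration of keeping `gold_fix` hidden, here's a small sketch. The case schema shown is a guess (only `gold_fix` and the fields from the observation are named in this README), and `public_view` is a hypothetical helper, not part of the project.

```python
import json

# Hypothetical case record; the real cases.json may be shaped differently.
SAMPLE_CASES = json.loads("""
[
  {
    "case_id": "CASE_001",
    "failed_step": "Condition_Check",
    "error": {"code": 400, "message": "BadRequest"},
    "gold_fix": "@equals(outputs('Compose_Ext'),'xlsx')"
  }
]
""")


def public_view(case: dict) -> dict:
    """Return what the agent may see: everything except the gold fix."""
    return {key: value for key, value in case.items() if key != "gold_fix"}


visible = public_view(SAMPLE_CASES[0])
```

The environment would keep the full record for grading and hand only the filtered view to the agent.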
---
## How to Run the Example
Just run the `demo.py` file from the main project folder like this:
`python demo.py`
The demo will pick a random bug, use a basic rule-based agent to try and fix the condition expression, and then show you how it went.
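The demo's actual rule isn't shown in this README, but a basic rule-based fix for this kind of bug could be as simple as closing any parentheses left open. The sketch below is one such rule under that assumption; `balance_parentheses` is a made-up name, not the function used in `demo.py`.

```python
def balance_parentheses(expression: str) -> str:
    """Append closing parentheses for any left open outside string literals.

    Single quotes delimit strings in these expressions; a doubled quote
    ('') inside a string toggles the flag twice, so it is handled too.
    """
    depth = 0
    in_string = False
    for ch in expression:
        if ch == "'":
            in_string = not in_string
        elif not in_string:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
    # Only add parentheses; never try to remove surplus ones.
    return expression + ")" * max(depth, 0)
```

This only covers a missing closing parenthesis. Other syntax mistakes (a missing quote or a misspelled function name, for instance) would need their own rules.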
---
## What This Can't Do Yet (Limitations)
This simple version is kept small on purpose:
- It only deals with syntax errors in Condition expressions.
- It doesn't actually run real Power Automate flows.
- It doesn't connect to any outside services or APIs.
- It doesn't do any actual agent training (like reinforcement learning) yet.
Keeping things simple means it's fast, predictable, and easy for us to build on later.
---
## What's Next?
We could add more cool stuff later, like:
- Figuring out errors in 'filter array' settings.
- Dealing with 'null' values or wrong data types.
- Fixing multiple steps at once.
- Using smarter, AI-powered agents.
- Training agents with libraries like TRL or Unsloth.
- Adding 'Green Agent' wrappers.
---
## Why We Made This
Debugging Power Automate flows is a real headache for many people who build automations. This environment turns those everyday automation failures into a structured task for agents and a useful testbed for learning and experimenting with OpenEnv.