Spaces:
Build error
Build error
File size: 3,280 Bytes
a8d4cdf | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 | ---
title: GarbageBot — RL Control Center
emoji: 🗑️
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
pinned: false
tags:
- openenv
- robotics
- reinforcement-learning
- llama-3.2
---
# 🤖 Garbage Collecting Robot — OpenEnv
An OpenEnv-compliant reinforcement learning environment for a garbage collecting robot. The agent must navigate a grid room to pick up garbage while managing battery constraints and storage capacity.
## Why Garbage Collection?
Autonomous garbage collection is a classic robotics challenge involving pathfinding, resource management (battery), and state management (storage capacity). This environment provides a realistic training ground for AI agents to learn:
- **Optimal Navigation** — shortest paths via BFS and Q-Learning.
- **Resource Management** — returning to base for charging before battery depletion.
- **Logistics** — managing a 6-unit storage bin and prioritizing unload cycles.
---
## Architecture
The environment is a discrete grid world where the robot interacts with garbage, obstacles, a charging station (Home), and an Unload Station.
```
┌──────────┐
│ Dashboard│ (FastAPI + Vanilla JS)
└─────┬────┘
▼
┌──────────┐
│ API │ (app.py)
└─────┬────┘
▼
┌──────────┐
│ Env Logic│ (environment.py)
└──────────┘
```
---
## Tasks
| Task ID | Difficulty | Description | Grid Size |
|---------|-----------|-------------|-----------|
| `task_easy` | 🟢 Easy | Small 5x5 grid, 1 piece of garbage. | 5x5 |
| `task_medium` | 🟡 Medium | 7x7 grid with obstacles, 3 pieces of garbage. | 7x7 |
| `task_hard` | 🔴 Hard | 10x10 maze, 5 pieces of garbage, strict battery. | 10x10 |
---
## Action Space
Movement and interaction commands:
- `UP`, `DOWN`, `LEFT`, `RIGHT`: Move the robot one cell.
- `COLLECT`: Pick up garbage if the robot is on its cell.
---
## Observation Space
The environment returns a detailed state:
- `robot_position`: `(x, y)`
- `garbage_positions`: List of `(x, y)`
- `battery_level`: Current battery vs max.
- `current_storage_load`: Current items vs capacity (6).
- `robot_mode`: `normal`, `recharging`, or `unloading`.
---
## Policy Priority Chain
Decisions can be driven by:
1. **Q-Learning Table** — pre-trained optimal policy.
2. **Llama-3.2-3B-Instruct** — fine-tuned LLM policy.
3. **BFS Heuristic** — reliable fallback pathfinding.
---
## Local Development
```bash
# 1. Install dependencies
pip install -r requirements.txt
# 2. Start the server
uvicorn app:app --host 0.0.0.0 --port 7860
# 3. Training
python qlearning.py --train --episodes 10000
```
---
## Project Structure
```
├── app.py # FastAPI server
├── environment.py # Core RL logic
├── models.py # Data schemas
├── scenarios.py # Task definitions
├── qlearning.py # Tabular RL training
├── inference.py # Policy resolver
├── frontend/ # Dashboard HTML/CSS/JS
├── qtable.json # Trained policy weights
├── Dockerfile # Deployment container
└── README.md # This file
```
|