---
title: MiniGridEnv Blog
emoji: 🐠
colorFrom: green
colorTo: pink
sdk: static
pinned: false
license: apache-2.0
short_description: Blog for MiniGridEnv for OpenEnv Comp in AgentX
---

# MiniGridEnv Blog

Static blog post for the OpenEnv track of the AgentX competition (UC Berkeley RDI), covering:

- An OpenEnv-native wrapper around Farama's MiniGrid / BabyAI with text observations and natural-language (NL) actions.
- GRPO post-training (`MiniGridPT`) with **cross-episodic, LLM-rewritten, line-budgeted markdown memory**.
- **Branch-stable** memory-file naming so each GRPO chain keeps a stable file across optimizer steps.
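The last two bullets can be illustrated with a minimal sketch. It assumes the memory file is keyed by a chain (branch) index rather than by optimizer step, and that the line budget is enforced by simple truncation after the LLM rewrite. The names `memory_path`, `clamp_to_budget`, `write_memory`, the file-name pattern, and the default budget of 20 lines are all hypothetical and are not taken from `MiniGridPT`.

```python
from pathlib import Path


def memory_path(root: Path, branch_id: int) -> Path:
    """Return a path that depends only on the branch index, never on the step.

    Hypothetical naming scheme: the same branch maps to the same file on
    every optimizer step, so later episodes read what earlier ones wrote.
    """
    return root / f"memory_branch_{branch_id:03d}.md"


def clamp_to_budget(markdown: str, max_lines: int) -> str:
    """Keep at most max_lines lines of a (rewritten) markdown memory."""
    lines = markdown.splitlines()
    return "\n".join(lines[:max_lines])


def write_memory(root: Path, branch_id: int, markdown: str, max_lines: int = 20) -> Path:
    """Overwrite the branch's memory file with the budgeted text."""
    root.mkdir(parents=True, exist_ok=True)
    path = memory_path(root, branch_id)
    path.write_text(clamp_to_budget(markdown, max_lines), encoding="utf-8")
    return path
```

Because the path ignores the step counter, two writes for the same branch land in the same file, which is the property the "branch-stable" bullet describes.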

## Files

- `index.html`: main blog (self-contained: inline CSS, Mermaid via CDN).
- `banner.png`: 3-panel hero image (Observe → Act → Remember).
- `style.css`: legacy placeholder from the Spaces scaffold; `index.html` inlines all styling.

## Rebuild the banner

The banner is generated by a matplotlib script kept with the other implementation docs:

```bash
# from the repo root
python impl-context/build_blog_images.py
# writes MiniGridEnv_Blog/banner.png at 200 DPI
```

Dependencies: `pip install matplotlib numpy`.

## Open locally

```bash
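# macOS; on Linux use xdg-open instead of open
open MiniGridEnv_Blog/index.html
# or serve it over HTTP and browse to http://localhost:8080
# python -m http.server --directory MiniGridEnv_Blog 8080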
```

## `<INSERT>` placeholders

The blog ships with a handful of `<INSERT: ...>` placeholders that must be filled before publishing:

- `<INSERT: GitHub URL>`: repo URL (hero badges, buttons, quickstart `git clone`, footer).
- `<INSERT: HF Space URL>`: live environment Space (topnav, hero buttons, footer).
- `<INSERT: Voyager arXiv URL>` / `<INSERT: Reflexion arXiv URL>` / `<INSERT: Generative Agents arXiv URL>`: arXiv links in the Foundations table (pre-filled paper IDs are in the surrounding text: `2305.16291`, `2303.11366`, `2304.03442`).
- `<INSERT: Lottery HF Space URL>`: sibling project Space in the Foundations table.
- `<INSERT>` cells in the Results table: measured completion rates for GRPO and GRPO+Memory per level once converged checkpoints are available.
- `<INSERT: verbatim memory snapshot per checkpoint>` (optional): replace the illustrative memory-evolution cards with verbatim snapshots after a memory-mode training run.
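One way to confirm that no placeholder is left before publishing is a small scan of the page. The sketch below is not part of the repo: the helper name `find_placeholders` is hypothetical, and it assumes placeholders appear either literally (`<INSERT: ...>`) or HTML-escaped (`&lt;INSERT: ...&gt;`) on a single line of `index.html`.

```python
import re
from pathlib import Path

# Matches "<INSERT>" and "<INSERT: description>", literal or HTML-escaped.
PLACEHOLDER = re.compile(r"(?:<|&lt;)INSERT(?::.*?)?(?:>|&gt;)")


def find_placeholders(text: str) -> list[tuple[int, str]]:
    """Return (line number, placeholder) pairs, 1-indexed, in document order."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in PLACEHOLDER.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits


if __name__ == "__main__":
    # Path taken from the "Open locally" section; run from the repo root.
    page = Path("MiniGridEnv_Blog/index.html")
    if page.exists():
        for lineno, token in find_placeholders(page.read_text(encoding="utf-8")):
            print(f"{page}:{lineno}: {token}")
```

An empty result means every placeholder matching this pattern has been filled; placeholders split across lines would be missed by this line-based scan.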

See the Spaces configuration reference at https://huggingface.co/docs/hub/spaces-config-reference.