---
language:
- en
license: mit
tags:
- reinforcement-learning
- q-learning
- game-ai
- teeworlds
- openenv
library_name: custom
pipeline_tag: reinforcement-learning
model-index:
- name: teeunit-agent
  results:
  - task:
      type: reinforcement-learning
      name: Game Playing
    dataset:
      type: custom
      name: TeeUnit Environment
    metrics:
    - type: reward
      value: 39.38
      name: Total Reward (20 episodes)
---
# TeeUnit Agent
Trained RL agents for the [TeeUnit Environment](https://huggingface.co/spaces/ziadbc/teeunit-env), an OpenEnv-compatible Teeworlds arena for LLM-based reinforcement learning.
## Environment
- **Space**: [ziadbc/teeunit-env](https://huggingface.co/spaces/ziadbc/teeunit-env)
- **GitHub**: [ziadgit/teeunit](https://github.com/ziadgit/teeunit)
- **Game**: Teeworlds 0.7.5 arena (simulation mode)
## Available Models
### Q-Learning Agent (Latest)
- **File**: `teeunit_qlearning_agent.json` / `teeunit_qlearning_agent.pkl`
- **Algorithm**: Tabular Q-Learning
- **Training**: 20 episodes, 938 steps
- **Total Reward**: 39.38 (over the 20 episodes)
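
For readers unfamiliar with tabular Q-learning, the standard update rule can be sketched as below. This is a generic illustration, not the training code used for the released agent: the hyperparameters `ALPHA` and `GAMMA` and the helper names `make_q_table` and `q_update` are assumptions. Action keys are kept as strings to mirror how a Q-table looks after being loaded from JSON.

```python
from collections import defaultdict

# Hypothetical hyperparameters; the values used for the released agent are not stated.
ALPHA = 0.1   # learning rate
GAMMA = 0.95  # discount factor
N_ACTIONS = 7


def make_q_table():
    """Q-table mapping state key -> {action index as string -> Q-value}."""
    return defaultdict(lambda: {str(a): 0.0 for a in range(N_ACTIONS)})


def q_update(q_table, state, action, reward, next_state, alpha=ALPHA, gamma=GAMMA):
    """Apply Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(q_table[next_state].values())
    td_target = reward + gamma * best_next
    key = str(action)
    q_table[state][key] += alpha * (td_target - q_table[state][key])
    return q_table[state][key]
```

Each call moves one table entry a fraction `alpha` of the way toward the bootstrapped target, so unseen states simply start at zero and are filled in as they are visited.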
### Actions
The agent can perform 7 actions:
| Action | Description |
|--------|-------------|
| `move left` | Move character left |
| `move right` | Move character right |
| `move none` | Stop moving |
| `jump` | Jump |
| `shoot pistol` | Fire pistol (weapon 1) |
| `shoot shotgun` | Fire shotgun (weapon 2) |
| `hook` | Use grappling hook |
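
The usage code in this README reads each action as a dict with `tool` and `args` keys. A plausible layout for the seven actions, mapped onto the MCP tools listed under Environment API, is sketched below. The exact entries stored in `teeunit_qlearning_agent.json` may differ; the `name` field and the argument values here are assumptions inferred from the tables.

```python
# Hypothetical action list; check the 'actions' entry in the released JSON
# for the authoritative version.
ACTIONS = [
    {'name': 'move left',     'tool': 'move',  'args': {'direction': 'left'}},
    {'name': 'move right',    'tool': 'move',  'args': {'direction': 'right'}},
    {'name': 'move none',     'tool': 'move',  'args': {'direction': 'none'}},
    {'name': 'jump',          'tool': 'jump',  'args': {}},
    {'name': 'shoot pistol',  'tool': 'shoot', 'args': {'weapon': 1}},
    {'name': 'shoot shotgun', 'tool': 'shoot', 'args': {'weapon': 2}},
    {'name': 'hook',          'tool': 'hook',  'args': {}},
]

# The Q-table stores action indices, so index i refers to ACTIONS[i].
for idx, action in enumerate(ACTIONS):
    print(idx, action['name'], '->', action['tool'], action['args'])
```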
## Usage
### Load and Use the Agent
```python
import json
import random

# Load the model: a Q-table keyed by state string, plus the action list
with open('teeunit_qlearning_agent.json') as f:
    model = json.load(f)

q_table = model['q_table']
actions = model['actions']


def get_state_key(status_text):
    """Extract a discretized state key from the game status text."""
    state = []
    for line in status_text.split('\n'):
        if 'Position:' in line:
            try:
                pos = line.split('(')[1].split(')')[0]
                x, y = map(float, pos.split(','))
                # Bucket the position into 100-unit cells
                state.append(f'pos_{int(x//100)}_{int(y//100)}')
            except (IndexError, ValueError):
                state.append('pos_unknown')
        if 'Health:' in line:
            try:
                health = int(line.split(':')[1].split('/')[0].strip())
                state.append(f'hp_{health//3}')
            except (IndexError, ValueError):
                pass
        if 'units away' in line:
            try:
                dist = float(line.split(',')[-1].replace('units away', '').strip())
                state.append(f'enemy_{"close" if dist < 100 else "mid" if dist < 200 else "far"}')
            except ValueError:
                pass
    return str(tuple(sorted(state))) if state else "('default',)"


def choose_action(state_key):
    """Choose the best known action for a state, or a random one if unseen."""
    if state_key in q_table:
        q_values = q_table[state_key]
        # JSON keys are strings, so convert the winning key back to an index
        best_action = max(q_values, key=q_values.get)
        return int(best_action)
    return random.randint(0, len(actions) - 1)


# Example usage. status_text is a placeholder here; in practice it is the
# text returned by the get_status tool.
status_text = ''
state_key = get_state_key(status_text)
action_idx = choose_action(state_key)
action = actions[action_idx]
print(f"Action: {action['tool']} with args {action['args']}")
```
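
To make the state discretization concrete, the sketch below reproduces the three bucketing rules from `get_state_key` as standalone helpers and builds one key by hand. The helper names and the input values are hypothetical; only the bucket arithmetic comes from the code above.

```python
def position_bucket(x, y):
    """Position is bucketed into 100-unit cells."""
    return f'pos_{int(x // 100)}_{int(y // 100)}'


def health_bucket(health):
    """Health is bucketed in steps of 3."""
    return f'hp_{health // 3}'


def enemy_bucket(dist):
    """Enemy distance is reduced to close (<100), mid (<200), or far."""
    if dist < 100:
        return 'enemy_close'
    if dist < 200:
        return 'enemy_mid'
    return 'enemy_far'


def make_state_key(x, y, health, enemy_dist):
    """Combine the buckets the same way get_state_key does: sorted tuple as a string."""
    parts = [position_bucket(x, y), health_bucket(health), enemy_bucket(enemy_dist)]
    return str(tuple(sorted(parts)))


# Hypothetical reading: position (250.0, 130.0), health 10, nearest enemy 150 units away
example_key = make_state_key(250.0, 130.0, 10, 150.0)
print(example_key)  # ('enemy_mid', 'hp_3', 'pos_2_1')
```

Keys of this form are what `choose_action` looks up in `q_table`; any state that was never visited during training falls back to a random action.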
### Connect to Environment
```python
import asyncio
import json

import websockets

# Reuses get_state_key, choose_action, and actions from the previous snippet.


async def play():
    uri = 'wss://ziadbc-teeunit-env.hf.space/ws'
    async with websockets.connect(uri) as ws:
        # Reset environment
        await ws.send(json.dumps({'type': 'reset', 'data': {}}))
        await ws.recv()

        # Get status
        await ws.send(json.dumps({
            'type': 'step',
            'data': {'type': 'call_tool', 'tool_name': 'get_status', 'arguments': {}}
        }))
        resp = json.loads(await ws.recv())
        status = resp['data']['observation']['result']['data']

        # Choose and execute action
        state_key = get_state_key(status)
        action = actions[choose_action(state_key)]
        await ws.send(json.dumps({
            'type': 'step',
            'data': {'type': 'call_tool', 'tool_name': action['tool'], 'arguments': action['args']}
        }))
        resp = json.loads(await ws.recv())
        reward = resp['data']['reward']
        print(f"Reward: {reward}")


asyncio.run(play())
```
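
The snippet above performs a single step. A multi-step loop might look like the sketch below, which reuses the same message shapes. The names `tool_call_message` and `run_steps` are hypothetical, the loop accepts any object with async `send`/`recv` methods (such as the `ws` connection above), and treating a missing reward as `0.0` is an assumption about the server's responses.

```python
import asyncio
import json


def tool_call_message(tool_name, arguments=None):
    """Build a 'step' message that calls one MCP tool."""
    return json.dumps({
        'type': 'step',
        'data': {'type': 'call_tool', 'tool_name': tool_name, 'arguments': arguments or {}},
    })


async def run_steps(ws, policy, n_steps=10):
    """Reset, then alternate get_status and action calls; return the summed reward.

    policy: callable mapping status text -> {'tool': ..., 'args': ...}
    """
    await ws.send(json.dumps({'type': 'reset', 'data': {}}))
    await ws.recv()

    total_reward = 0.0
    for _ in range(n_steps):
        # Read the current status text
        await ws.send(tool_call_message('get_status'))
        resp = json.loads(await ws.recv())
        status = resp['data']['observation']['result']['data']

        # Let the policy pick an action and execute it
        action = policy(status)
        await ws.send(tool_call_message(action['tool'], action['args']))
        resp = json.loads(await ws.recv())
        # Assumption: reward may be absent or null on some responses
        total_reward += resp['data'].get('reward') or 0.0
    return total_reward
```

With the Q-learning agent loaded, the policy could be `lambda status: actions[choose_action(get_state_key(status))]`, passed as `await run_steps(ws, policy)` inside the `async with websockets.connect(uri) as ws:` block.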
## Training Your Own Agent
See the [Colab notebook](https://github.com/ziadgit/teeunit/blob/main/notebooks/teeunit_training.ipynb) for training examples using:
- **Q-Learning** (tabular)
- **Stable Baselines3** (PPO, A2C)
- **Unsloth/TRL** (LLM fine-tuning)
## Environment API
The TeeUnit environment exposes these MCP tools:
| Tool | Arguments | Description |
|------|-----------|-------------|
| `move` | `direction: "left"\|"right"\|"none"` | Move horizontally |
| `jump` | - | Jump (can double-jump) |
| `aim` | `x: int, y: int` | Aim at coordinates |
| `shoot` | `weapon: 0-5` | Fire weapon |
| `hook` | - | Toggle grappling hook |
| `get_status` | - | Get game state as text |
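
A client may want to check tool calls against this table before sending them. The sketch below is one way to do that; `TOOL_SPECS` and `validate_call` are hypothetical names, and the rule that every listed argument is required is an assumption. The server remains the authority on what it accepts.

```python
# Argument specs transcribed from the table above.
# A tuple or range lists allowed values; a type means "any value of this type".
TOOL_SPECS = {
    'move': {'direction': ('left', 'right', 'none')},
    'jump': {},
    'aim': {'x': int, 'y': int},
    'shoot': {'weapon': range(0, 6)},
    'hook': {},
    'get_status': {},
}


def validate_call(tool_name, arguments):
    """Raise ValueError if the call does not match TOOL_SPECS; return True otherwise."""
    if tool_name not in TOOL_SPECS:
        raise ValueError(f'unknown tool: {tool_name}')
    spec = TOOL_SPECS[tool_name]
    if set(arguments) != set(spec):
        raise ValueError(f'{tool_name} expects arguments {sorted(spec)}, got {sorted(arguments)}')
    for name, allowed in spec.items():
        value = arguments[name]
        if isinstance(allowed, type):
            ok = isinstance(value, allowed)
        else:
            ok = value in allowed
        if not ok:
            raise ValueError(f'invalid value for {tool_name}.{name}: {value!r}')
    return True
```

For example, `validate_call('shoot', {'weapon': 2})` passes, while `validate_call('shoot', {'weapon': 6})` raises `ValueError`.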
## License
MIT License. See the [GitHub repo](https://github.com/ziadgit/teeunit) for details.