---
language:
- en
license: mit
tags:
- reinforcement-learning
- q-learning
- game-ai
- teeworlds
- openenv
library_name: custom
pipeline_tag: reinforcement-learning
model-index:
- name: teeunit-agent
  results:
  - task:
      type: reinforcement-learning
      name: Game Playing
    dataset:
      type: custom
      name: TeeUnit Environment
    metrics:
    - type: reward
      value: 39.38
      name: Total Reward (20 episodes)
---
# TeeUnit Agent
Trained RL agents for the [TeeUnit Environment](https://huggingface.co/spaces/ziadbc/teeunit-env), an OpenEnv-compatible Teeworlds arena for LLM-based reinforcement learning.
## Environment
- **Space**: [ziadbc/teeunit-env](https://huggingface.co/spaces/ziadbc/teeunit-env)
- **GitHub**: [ziadgit/teeunit](https://github.com/ziadgit/teeunit)
- **Game**: Teeworlds 0.7.5 arena (simulation mode)
## Available Models
### Q-Learning Agent (Latest)
- **File**: `teeunit_qlearning_agent.json` / `teeunit_qlearning_agent.pkl`
- **Algorithm**: Tabular Q-Learning
- **Training**: 20 episodes, 938 steps
- **Total Reward**: 39.38
### Actions
The agent can perform 7 actions:
| Action | Description |
|--------|-------------|
| `move left` | Move character left |
| `move right` | Move character right |
| `move none` | Stop moving |
| `jump` | Jump |
| `shoot pistol` | Fire pistol (weapon 1) |
| `shoot shotgun` | Fire shotgun (weapon 2) |
| `hook` | Use grappling hook |
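
The model file stores these actions as a list of tool-call descriptors with `tool` and `args` fields (the structure the loading code below reads via `action['tool']` and `action['args']`). A hypothetical sketch of what that list looks like; the exact `args` values are assumptions based on the MCP tool table further down:

```python
# Hypothetical sketch of the 7-action list stored in the model file.
# Field names (tool/args) match the usage code; argument values are assumed.
ACTIONS = [
    {"tool": "move", "args": {"direction": "left"}},
    {"tool": "move", "args": {"direction": "right"}},
    {"tool": "move", "args": {"direction": "none"}},
    {"tool": "jump", "args": {}},
    {"tool": "shoot", "args": {"weapon": 1}},  # pistol
    {"tool": "shoot", "args": {"weapon": 2}},  # shotgun
    {"tool": "hook", "args": {}},
]
```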
## Usage
### Load and Use the Agent
```python
import json
import random

# Load the exported Q-table and action list
with open('teeunit_qlearning_agent.json') as f:
    model = json.load(f)

q_table = model['q_table']
actions = model['actions']

def get_state_key(status_text):
    """Extract a discretized state key from the game status text."""
    state = []
    for line in status_text.split('\n'):
        if 'Position:' in line:
            try:
                pos = line.split('(')[1].split(')')[0]
                x, y = map(float, pos.split(','))
                state.append(f'pos_{int(x // 100)}_{int(y // 100)}')
            except (IndexError, ValueError):
                state.append('pos_unknown')
        if 'Health:' in line:
            try:
                health = int(line.split(':')[1].split('/')[0].strip())
                state.append(f'hp_{health // 3}')
            except (IndexError, ValueError):
                pass
        if 'units away' in line:
            try:
                dist = float(line.split(',')[-1].replace('units away', '').strip())
                state.append(f'enemy_{"close" if dist < 100 else "mid" if dist < 200 else "far"}')
            except (IndexError, ValueError):
                pass
    return str(tuple(sorted(state))) if state else "('default',)"

def choose_action(state_key):
    """Return the best known action index for the state, or a random one."""
    if state_key in q_table:
        q_values = q_table[state_key]
        return int(max(q_values, key=q_values.get))
    return random.randint(0, len(actions) - 1)

# Example usage (status_text comes from the environment's get_status tool)
state_key = get_state_key(status_text)
action_idx = choose_action(state_key)
action = actions[action_idx]
print(f"Action: {action['tool']} with args {action['args']}")
```
### Connect to Environment
```python
import asyncio
import json

import websockets

async def play():
    uri = 'wss://ziadbc-teeunit-env.hf.space/ws'
    async with websockets.connect(uri) as ws:
        # Reset environment
        await ws.send(json.dumps({'type': 'reset', 'data': {}}))
        await ws.recv()

        # Get status
        await ws.send(json.dumps({
            'type': 'step',
            'data': {'type': 'call_tool', 'tool_name': 'get_status', 'arguments': {}}
        }))
        resp = json.loads(await ws.recv())
        status = resp['data']['observation']['result']['data']

        # Choose and execute action
        state_key = get_state_key(status)
        action = actions[choose_action(state_key)]
        await ws.send(json.dumps({
            'type': 'step',
            'data': {'type': 'call_tool', 'tool_name': action['tool'], 'arguments': action['args']}
        }))
        resp = json.loads(await ws.recv())
        reward = resp['data']['reward']
        print(f"Reward: {reward}")

asyncio.run(play())
```
## Training Your Own Agent
See the [Colab notebook](https://github.com/ziadgit/teeunit/blob/main/notebooks/teeunit_training.ipynb) for training examples using:
- **Q-Learning** (tabular)
- **Stable Baselines3** (PPO, A2C)
- **Unsloth/TRL** (LLM fine-tuning)
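
The tabular Q-Learning variant follows the standard update rule, `Q(s,a) += α(r + γ max Q(s',·) − Q(s,a))`. A minimal self-contained sketch of that loop core; the hyperparameter values here are illustrative, not necessarily the ones used to train this agent:

```python
import random
from collections import defaultdict

ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.2  # illustrative hyperparameters
N_ACTIONS = 7

# Q-table: state key -> list of action values, zero-initialized on first visit
q_table = defaultdict(lambda: [0.0] * N_ACTIONS)

def choose(state):
    """Epsilon-greedy action selection over the tabular Q-values."""
    if random.random() < EPSILON:
        return random.randrange(N_ACTIONS)
    values = q_table[state]
    return values.index(max(values))

def update(state, action, reward, next_state):
    """One tabular Q-Learning update: Q(s,a) += a*(r + g*max Q(s',.) - Q(s,a))."""
    best_next = max(q_table[next_state])
    q_table[state][action] += ALPHA * (reward + GAMMA * best_next - q_table[state][action])
```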
## Environment API
The TeeUnit environment exposes these MCP tools:
| Tool | Arguments | Description |
|------|-----------|-------------|
| `move` | `direction: "left"\|"right"\|"none"` | Move horizontally |
| `jump` | - | Jump (can double-jump) |
| `aim` | `x: int, y: int` | Aim at coordinates |
| `shoot` | `weapon: 0-5` | Fire weapon |
| `hook` | - | Toggle grappling hook |
| `get_status` | - | Get game state as text |
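
Each tool above is invoked with the same websocket message envelope shown in the connection example (`type`/`data`/`tool_name`/`arguments`). For instance, building a `move` call; the envelope fields come from the usage code, the argument value is illustrative:

```python
import json

# Step message that calls the `move` tool; envelope format matches
# the websocket protocol used in the "Connect to Environment" example.
message = json.dumps({
    "type": "step",
    "data": {
        "type": "call_tool",
        "tool_name": "move",
        "arguments": {"direction": "left"},
    },
})
```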
## License
MIT License. See the [GitHub repo](https://github.com/ziadgit/teeunit) for details.