---
language:
- en
license: mit
tags:
- reinforcement-learning
- q-learning
- game-ai
- teeworlds
- openenv
library_name: custom
pipeline_tag: reinforcement-learning
model-index:
- name: teeunit-agent
  results:
  - task:
      type: reinforcement-learning
      name: Game Playing
    dataset:
      type: custom
      name: TeeUnit Environment
    metrics:
    - type: reward
      value: 39.38
      name: Total Reward (20 episodes)
---

# TeeUnit Agent

Trained RL agents for the [TeeUnit Environment](https://huggingface.co/spaces/ziadbc/teeunit-env), an OpenEnv-compatible Teeworlds arena for LLM-based reinforcement learning.

## Environment

- **Space**: [ziadbc/teeunit-env](https://huggingface.co/spaces/ziadbc/teeunit-env)
- **GitHub**: [ziadgit/teeunit](https://github.com/ziadgit/teeunit)
- **Game**: Teeworlds 0.7.5 arena (simulation mode)

## Available Models

### Q-Learning Agent (Latest)

- **File**: `teeunit_qlearning_agent.json` / `teeunit_qlearning_agent.pkl`
- **Algorithm**: Tabular Q-Learning
- **Training**: 20 episodes, 938 steps
- **Total Reward**: 39.38

### Actions

The agent can perform 7 actions:

| Action | Description |
|--------|-------------|
| `move left` | Move character left |
| `move right` | Move character right |
| `move none` | Stop moving |
| `jump` | Jump |
| `shoot pistol` | Fire pistol (weapon 1) |
| `shoot shotgun` | Fire shotgun (weapon 2) |
| `hook` | Use grappling hook |

## Usage

### Load and Use the Agent

```python
import json
import random

# Load model
with open('teeunit_qlearning_agent.json') as f:
    model = json.load(f)

q_table = model['q_table']
actions = model['actions']


def get_state_key(status_text):
    """Extract a discretized state key from game status text."""
    state = []
    for line in status_text.split('\n'):
        if 'Position:' in line:
            try:
                pos = line.split('(')[1].split(')')[0]
                x, y = map(float, pos.split(','))
                # Bucket the position into 100x100-unit cells
                state.append(f'pos_{int(x//100)}_{int(y//100)}')
            except (IndexError, ValueError):
                state.append('pos_unknown')
        if 'Health:' in line:
            try:
                health = int(line.split(':')[1].split('/')[0].strip())
                state.append(f'hp_{health//3}')
            except (IndexError, ValueError):
                pass
        if 'units away' in line:
            try:
                dist = float(line.split(',')[-1].replace('units away', '').strip())
                state.append(f'enemy_{"close" if dist < 100 else "mid" if dist < 200 else "far"}')
            except ValueError:
                pass
    return str(tuple(sorted(state))) if state else "('default',)"


def choose_action(state_key):
    """Choose the best known action for a state, or a random one if the state is unseen."""
    if state_key in q_table:
        q_values = q_table[state_key]
        # JSON object keys are strings, so convert the winning key back to an index
        best_action = max(q_values, key=q_values.get)
        return int(best_action)
    return random.randint(0, len(actions) - 1)


# Example usage
# status_text is normally the text returned by the `get_status` tool.
# The sample below is hypothetical; its format is inferred from the parser above.
status_text = "Position: (250, 130)\nHealth: 9/10"
state_key = get_state_key(status_text)
action_idx = choose_action(state_key)
action = actions[action_idx]
print(f"Action: {action['tool']} with args {action['args']}")
```

### Connect to Environment

```python
import asyncio
import json

import websockets

# Reuses get_state_key, choose_action, and actions from the previous snippet.


async def play():
    uri = 'wss://ziadbc-teeunit-env.hf.space/ws'
    async with websockets.connect(uri) as ws:
        # Reset environment
        await ws.send(json.dumps({'type': 'reset', 'data': {}}))
        await ws.recv()

        # Get status
        await ws.send(json.dumps({
            'type': 'step',
            'data': {'type': 'call_tool', 'tool_name': 'get_status', 'arguments': {}}
        }))
        resp = json.loads(await ws.recv())
        status = resp['data']['observation']['result']['data']

        # Choose and execute action
        state_key = get_state_key(status)
        action = actions[choose_action(state_key)]
        await ws.send(json.dumps({
            'type': 'step',
            'data': {'type': 'call_tool', 'tool_name': action['tool'], 'arguments': action['args']}
        }))
        resp = json.loads(await ws.recv())
        reward = resp['data']['reward']
        print(f"Reward: {reward}")


asyncio.run(play())
```

## Training Your Own Agent

See the [Colab notebook](https://github.com/ziadgit/teeunit/blob/main/notebooks/teeunit_training.ipynb) for training examples using:

- **Q-Learning** (tabular)
- **Stable Baselines3** (PPO, A2C)
- **Unsloth/TRL** (LLM fine-tuning)

## Environment API

The TeeUnit environment exposes these MCP tools:

| Tool | Arguments | Description |
|------|-----------|-------------|
| `move` | `direction: "left"\|"right"\|"none"` | Move horizontally |
| `jump` | - | Jump (can double-jump) |
| `aim` | `x: int, y: int` | Aim at coordinates |
| `shoot` | `weapon: 0-5` | Fire weapon |
| `hook` | - | Toggle grappling hook |
| `get_status` | - | Get game state as text |

## License

MIT License - See [GitHub repo](https://github.com/ziadgit/teeunit) for details.
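
## Sketch: Tabular Q-Learning Update

For readers who want to train their own tabular agent, the snippet below is a minimal sketch of the standard Q-learning update applied to a table shaped like the one loaded in the usage example (state key mapped to a dict of action-index strings and values). It is not taken from the training notebook: the hyperparameters `ALPHA`, `GAMMA`, and `EPSILON`, and the helper names `new_q_row`, `epsilon_greedy`, and `q_update`, are assumptions chosen for illustration.

```python
import random

# Hypothetical hyperparameters; the values used to train the published agent are not documented here.
ALPHA = 0.1     # learning rate
GAMMA = 0.95    # discount factor
EPSILON = 0.1   # exploration probability
N_ACTIONS = 7   # matches the 7 actions listed in this README


def new_q_row():
    """Create a zero-initialized row keyed by action index as a string (JSON-friendly)."""
    return {str(a): 0.0 for a in range(N_ACTIONS)}


def epsilon_greedy(q_table, state_key, epsilon=EPSILON):
    """Pick a random action with probability epsilon, otherwise the highest-valued one."""
    if random.random() < epsilon or state_key not in q_table:
        return random.randrange(N_ACTIONS)
    row = q_table[state_key]
    return int(max(row, key=row.get))


def q_update(q_table, state_key, action_idx, reward, next_state_key,
             alpha=ALPHA, gamma=GAMMA):
    """Apply Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)) and return the new value."""
    row = q_table.setdefault(state_key, new_q_row())
    next_row = q_table.setdefault(next_state_key, new_q_row())
    target = reward + gamma * max(next_row.values())
    key = str(action_idx)
    row[key] += alpha * (target - row[key])
    return row[key]
```

In a training loop, `state_key` and `next_state_key` would come from `get_state_key` applied to the status text before and after each step, and `reward` from the step response. Storing action indices as strings keeps the table directly serializable with `json.dump`.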