Julian Bilcke
# Matrix-Game 2.0 WebSocket Server
## Project Overview
Matrix-Game 2.0 is a real-time interactive world generation system that uses generative video models to create explorable game environments. This repository contains a WebSocket server wrapper that lets a web client interact with the Matrix-Game 2.0 models.
## Architecture
### Core Components
1. **api_server.py** - WebSocket server handling client connections and game sessions
2. **api_engine.py** - Matrix-Game 2.0 model inference engine
3. **api_utils.py** - Utility functions for image processing and visualization
4. **client/** - Web-based client interface for testing
### Model Components
- **Wan Diffusion Model** - Core generative model (14B parameters)
- **VAE Encoder/Decoder** - For latent space encoding/decoding
- **Streaming Pipeline** - Real-time frame generation
- **Condition Processing** - Keyboard and mouse input handling
## Key Features
- Real-time video generation based on user inputs
- Multiple game modes: Universal, GTA Drive, Temple Run
- WebSocket-based streaming for low-latency interaction
- Fallback mode for running the demo without a GPU
- Support for multiple concurrent sessions
## Resolution and Performance
- Standard resolution: 352x640
- Target FPS: 16
- Streaming generation: 5 frames per batch
- Reduced latency through latent-space operations
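
The figures above imply a time budget for each batch. As a rough worked example using only the numbers listed (the function names are illustrative and not part of the repository), 5 frames at 16 FPS cover 312.5 ms of playback, so a batch has to be generated in about that time for the stream to keep up:

```python
# Values taken from the list above.
TARGET_FPS = 16
FRAMES_PER_BATCH = 5


def batch_budget_seconds(fps: int = TARGET_FPS,
                         frames_per_batch: int = FRAMES_PER_BATCH) -> float:
    """Playback time covered by one batch of frames at the target frame rate."""
    return frames_per_batch / fps


def keeps_up(measured_batch_seconds: float) -> bool:
    """True if a batch is generated at least as fast as it is played back."""
    return measured_batch_seconds <= batch_budget_seconds()


print(f"{batch_budget_seconds() * 1000:.1f} ms per batch")  # prints "312.5 ms per batch"
```

This ignores network transfer and encoding time, so the real budget for model inference is somewhat smaller.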
## Game Modes
1. **Universal** - General exploration with full camera and movement control
2. **GTA Drive** - Driving simulation mode
3. **Temple Run** - Runner game mode with limited controls
## Input Controls
### Keyboard Controls
- W/S/A/D - Movement (forward/back/left/right)
- Space - Jump
- Shift/Ctrl - Attack/Action
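
One plausible way to turn these keys into a model condition is a multi-hot vector with one slot per action. The sketch below is an assumption for illustration: the action order, the `encode_keys` helper, and the choice to map both Shift and Ctrl to a single `attack` slot are hypothetical and may not match what `api_engine.py` expects.

```python
# Hypothetical action layout; the real model may use a different order or size.
ACTIONS = ["forward", "back", "left", "right", "jump", "attack"]

# Key names follow the list above; Shift and Ctrl share one slot (assumption).
KEY_TO_ACTION = {
    "w": "forward",
    "s": "back",
    "a": "left",
    "d": "right",
    "space": "jump",
    "shift": "attack",
    "ctrl": "attack",
}


def encode_keys(pressed):
    """Return a multi-hot list with one 0/1 entry per action in ACTIONS."""
    active = {
        KEY_TO_ACTION[key.lower()]
        for key in pressed
        if key.lower() in KEY_TO_ACTION  # unknown keys are ignored
    }
    return [1 if action in active else 0 for action in ACTIONS]
```

For example, `encode_keys(["W", "D"])` marks the `forward` and `right` slots, and keys outside the table are silently dropped.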
### Mouse Controls
- X/Y coordinates normalized to [-1, 1]
- Camera rotation and view control
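
A client typically has pixel coordinates and needs to convert them to the normalized range. A minimal sketch, assuming the origin is the top-left corner of the viewport and that out-of-range values should be clamped (both are assumptions, as is the `normalize_mouse` name):

```python
def normalize_mouse(x_px: float, y_px: float, width: float, height: float):
    """Map pixel coordinates in a width x height viewport to [-1, 1] per axis."""
    if width <= 0 or height <= 0:
        raise ValueError("viewport dimensions must be positive")

    def clamp(value: float) -> float:
        return max(-1.0, min(1.0, value))

    # 0 maps to -1, the viewport centre to 0, and the far edge to +1.
    nx = clamp(2.0 * x_px / width - 1.0)
    ny = clamp(2.0 * y_px / height - 1.0)
    return nx, ny
```

With a 640x352 viewport, the centre pixel `(320, 176)` maps to `(0.0, 0.0)`. Whether positive Y means "look up" or "look down" depends on the server and is not specified here.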
## Model Loading
The system automatically downloads the models from Hugging Face (`Skywork/Matrix-Game-2.0`) if they are not present locally. The downloaded files include:
- Wan2.1_VAE.pth - VAE model weights
- Generator checkpoint files
- Configuration files for different modes
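
The download-if-missing behaviour can be sketched with `snapshot_download` from the `huggingface_hub` package. This is an illustration and not the server's actual loading code: the `ensure_models` helper, the use of the VAE file as the "already downloaded" marker, and the recursive search (the file's location inside the repository is not stated here) are all assumptions.

```python
from pathlib import Path

REPO_ID = "Skywork/Matrix-Game-2.0"
VAE_FILENAME = "Wan2.1_VAE.pth"


def ensure_models(local_dir, download=None):
    """Fetch the model repo into local_dir unless the VAE weights already exist.

    Returns True if a download was triggered, False if local files were reused.
    `download` is injectable so the logic can be exercised without network access.
    """
    local_dir = Path(local_dir)
    # Search recursively because the file may live in a subdirectory.
    if local_dir.is_dir() and any(local_dir.rglob(VAE_FILENAME)):
        return False
    if download is None:
        # Imported lazily so the check above works without huggingface_hub installed.
        from huggingface_hub import snapshot_download
        download = snapshot_download
    download(repo_id=REPO_ID, local_dir=str(local_dir))
    return True
```

Calling `ensure_models("./models")` would download the full repository on first use, which can take a while given the size of the checkpoints.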
## Deployment
### Docker Deployment
```bash
docker build -t matrix-game-2 .
docker run -p 8080:8080 --gpus all matrix-game-2
```
### Local Development
```bash
pip install -r requirements.txt
python api_server.py --host 0.0.0.0 --port 8080
```
## Environment Variables
- `PORT` - Server port (default: 8080)
- `SPACE_ID` - Hugging Face Space ID (for HF deployment)
- `CUDA_VISIBLE_DEVICES` - GPU selection
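
As a sketch of how these variables might be read at startup (the `read_settings` helper and the returned keys are hypothetical, and treating a non-empty `SPACE_ID` as "running on a Hugging Face Space" is an assumption):

```python
import os


def read_settings(env=None):
    """Collect server settings from environment variables, applying defaults."""
    env = os.environ if env is None else env
    space_id = env.get("SPACE_ID")
    return {
        "port": int(env.get("PORT", "8080")),  # default matches the list above
        "space_id": space_id,
        "on_hf_space": bool(space_id),
        # Read for reporting only; CUDA itself consumes this variable.
        "cuda_visible_devices": env.get("CUDA_VISIBLE_DEVICES"),
    }
```

Passing a plain dictionary instead of `os.environ` makes the function easy to test, for example `read_settings({"PORT": "9000"})`.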
## Testing
Access the web client at `http://localhost:8080/` after starting the server.
## Known Limitations
- Requires an NVIDIA GPU with at least 24 GB of VRAM to run the full model
- Initial model loading takes 2-3 minutes
## Updates from V1
- New model architecture (Wan-based instead of DiT-based)
- Streaming pipeline for better real-time performance
- Improved condition handling for different game modes
- Better memory efficiency through tiling
- Simplified API structure