
Matrix-Game 2.0 WebSocket Server

Project Overview

Matrix-Game 2.0 is a real-time interactive world generation system: generative video models produce an explorable environment frame by frame in response to keyboard and mouse input. This repository contains a WebSocket server wrapper that lets web clients interact with the Matrix-Game 2.0 models.

Architecture

Core Components

  1. api_server.py - WebSocket server handling client connections and game sessions
  2. api_engine.py - Matrix-Game 2.0 model inference engine
  3. api_utils.py - Utility functions for image processing and visualization
  4. client/ - Web-based client interface for testing

Model Components

  • WAN Diffusion Model - Core generative model (14B parameters)
  • VAE Encoder/Decoder - For latent space encoding/decoding
  • Streaming Pipeline - Real-time frame generation
  • Condition Processing - Keyboard and mouse input handling

Key Features

  • Real-time video generation based on user inputs
  • Multiple game modes: Universal, GTA Drive, Temple Run
  • WebSocket-based streaming for low-latency interaction
  • Fallback demo mode for machines without a GPU
  • Support for multiple concurrent sessions

Resolution and Performance

  • Standard resolution: 352x640
  • Target FPS: 16
  • Streaming generation: 5 frames per batch
  • Reduced latency through latent-space operations
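The figures above imply a fixed time budget per batch. The following is a small arithmetic sketch, not code from the repository; the function names are illustrative, and the only inputs are the resolution, target FPS, and batch size listed above.

```python
# Values taken from the list above.
TARGET_FPS = 16
FRAMES_PER_BATCH = 5
FRAME_PIXELS = 352 * 640  # pixels per frame at the standard resolution


def batch_budget_seconds(frames_per_batch: int = FRAMES_PER_BATCH,
                         fps: int = TARGET_FPS) -> float:
    """Playback time covered by one batch.

    To sustain the target FPS, generating a batch must take no longer
    than the time it takes to play that batch back.
    """
    return frames_per_batch / fps


def raw_rgb_bytes_per_second(frame_pixels: int = FRAME_PIXELS,
                             fps: int = TARGET_FPS) -> int:
    """Bandwidth of uncompressed 8-bit RGB frames at the target FPS."""
    return frame_pixels * 3 * fps


if __name__ == "__main__":
    print(f"budget per batch: {batch_budget_seconds() * 1000:.1f} ms")
    print(f"raw RGB stream:   {raw_rgb_bytes_per_second() / 1e6:.1f} MB/s")
```

At 16 FPS a batch of 5 frames covers 312.5 ms of playback, so generation, decoding, and transmission of each batch together need to fit in roughly that window for the stream to keep up.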

Game Modes

  1. Universal - General exploration with full camera and movement control
  2. GTA Drive - Driving simulation mode
  3. Temple Run - Runner game mode with limited controls

Input Controls

Keyboard Controls

  • W/S/A/D - Movement (forward/back/left/right)
  • Space - Jump
  • Shift/Ctrl - Attack/Action
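One way a client could turn the keys above into a per-frame action state is sketched below. The action names and the dictionary layout are hypothetical, and the sketch assumes Shift maps to attack and Ctrl to action; the actual message schema expected by api_server.py may differ.

```python
# Hypothetical mapping from key names to action names (not the server's schema).
# Assumption: Shift means "attack" and Ctrl means "action".
KEY_TO_ACTION = {
    "w": "forward",
    "s": "back",
    "a": "left",
    "d": "right",
    "space": "jump",
    "shift": "attack",
    "ctrl": "action",
}

ACTIONS = ("forward", "back", "left", "right", "jump", "attack", "action")


def keys_to_actions(pressed):
    """Convert an iterable of pressed key names into a dict of booleans.

    Key names are matched case-insensitively; unknown keys are ignored.
    """
    state = {name: False for name in ACTIONS}
    for key in pressed:
        action = KEY_TO_ACTION.get(key.lower())
        if action is not None:
            state[action] = True
    return state


if __name__ == "__main__":
    print(keys_to_actions(["W", "Space"]))
```

Keeping every action present in the result, even when it is False, gives the receiving side a fixed-shape input regardless of which keys are held.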

Mouse Controls

  • X/Y coordinates normalized to [-1, 1]
  • Camera rotation and view control
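The normalization to [-1, 1] can be sketched as a linear mapping from pixel coordinates. This is a minimal illustration, assuming the origin is the top-left corner of the viewport and that the viewport center should map to (0, 0); the function name is hypothetical and the real client may normalize differently (for example, from relative mouse movement).

```python
def normalize_mouse(x_px: float, y_px: float, width: float, height: float):
    """Map pixel coordinates in [0, width] x [0, height] to [-1, 1] x [-1, 1].

    Values outside the viewport are clamped to the nearest edge.
    """
    if width <= 0 or height <= 0:
        raise ValueError("viewport dimensions must be positive")

    def clamp(value: float) -> float:
        return max(-1.0, min(1.0, value))

    nx = 2.0 * x_px / width - 1.0
    ny = 2.0 * y_px / height - 1.0
    return clamp(nx), clamp(ny)


if __name__ == "__main__":
    print(normalize_mouse(320, 176, 640, 352))
```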

Model Loading

The system automatically downloads the models from Hugging Face (Skywork/Matrix-Game-2.0) if they are not present locally. The downloaded files include:

  • Wan2.1_VAE.pth - VAE model weights
  • Generator checkpoint files
  • Configuration files for different modes
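The download-if-missing behavior can be approximated with huggingface_hub. This is a sketch rather than the repository's loader: the function name is hypothetical, and it assumes that the presence of Wan2.1_VAE.pth at the top of the local directory is a good enough signal that the models are already there.

```python
from pathlib import Path

REPO_ID = "Skywork/Matrix-Game-2.0"  # repository named in the text above
VAE_FILENAME = "Wan2.1_VAE.pth"      # file named in the list above


def ensure_models(local_dir: str) -> Path:
    """Return the model directory, downloading the snapshot first if needed.

    Assumption: the VAE weights sit at the top level of local_dir, and
    their presence means the rest of the files were downloaded too.
    """
    root = Path(local_dir)
    if (root / VAE_FILENAME).is_file():
        return root  # already present, nothing to download

    # Imported lazily so the local check works without the dependency.
    from huggingface_hub import snapshot_download

    snapshot_download(repo_id=REPO_ID, local_dir=str(root))
    return root
```

A stricter check could verify every expected file instead of a single one, at the cost of hard-coding the full file list.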

Deployment

Docker Deployment

docker build -t matrix-game-2 .
docker run -p 8080:8080 --gpus all matrix-game-2

Local Development

pip install -r requirements.txt
python api_server.py --host 0.0.0.0 --port 8080

Environment Variables

  • PORT - Server port (default: 8080)
  • SPACE_ID - Hugging Face Space ID (for HF deployment)
  • CUDA_VISIBLE_DEVICES - GPU selection
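Reading these variables at startup might look like the sketch below. The ServerConfig class and load_config function are illustrative names, not part of the repository. Note that CUDA_VISIBLE_DEVICES is interpreted by the CUDA runtime itself, so the server only needs it for logging or diagnostics.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ServerConfig:
    port: int
    space_id: Optional[str]
    cuda_visible_devices: Optional[str]

    @property
    def on_hf_space(self) -> bool:
        """True when SPACE_ID is set, i.e. when running as a Hugging Face Space."""
        return self.space_id is not None


def load_config(env: Mapping[str, str] = os.environ) -> ServerConfig:
    """Build the configuration from environment variables.

    PORT falls back to 8080, the default listed above. An empty
    SPACE_ID is treated the same as an unset one.
    """
    return ServerConfig(
        port=int(env.get("PORT", "8080")),
        space_id=env.get("SPACE_ID") or None,
        cuda_visible_devices=env.get("CUDA_VISIBLE_DEVICES"),
    )


if __name__ == "__main__":
    print(load_config())
```

Passing the environment as a parameter keeps the function easy to test without modifying the real process environment.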

Testing

Access the web client at http://localhost:8080/ after starting the server.
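Because startup is not instantaneous, a test script may want to wait until the port accepts connections before opening the client. The helper below is a generic sketch using only the standard library; the function name is hypothetical, and it assumes the server starts listening only once it is ready to serve, which may not match the actual startup order.

```python
import socket
import time


def wait_for_port(host: str = "localhost", port: int = 8080,
                  timeout: float = 180.0, interval: float = 1.0) -> bool:
    """Poll until a TCP connection to host:port succeeds or timeout elapses.

    Returns True once a connection is accepted, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


if __name__ == "__main__":
    ready = wait_for_port()
    print("server is up" if ready else "server did not start in time")
```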

Known Limitations

  • Requires an NVIDIA GPU with at least 24 GB of VRAM to run the full model
  • Initial model loading takes 2-3 minutes

Updates from V1

  • New model architecture (WAN-based instead of DiT-based)
  • Streaming pipeline for better real-time performance
  • Improved condition handling for different game modes
  • Better memory efficiency through tiling
  • Simplified API structure