# Uruk Neural Renderer
A multi-model neural rendering pipeline for real-time game graphics, trained on NVIDIA B200 GPUs. Uruk uses a modular workstream architecture in which specialized models handle different stages of the rendering pipeline, from world modeling and scene remapping to cinematic rendering and runtime optimization.
## Architecture Overview
The Uruk Neural Renderer is organized into a multi-workstream pipeline, where each workstream trains a specialized model family. A policy-compliant orchestrator manages the 4-stage training lifecycle: Smoke (bug-catching), Calibration (hyperparameter tuning), Production (full training with early stopping), and Distillation (teacher-to-student compression).
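The 4-stage lifecycle above can be sketched as a simple state machine. The stage names come from this README; the linear advance-on-success transition rule is an assumption for illustration, not the orchestrator's actual logic.

```python
from enum import Enum
from typing import Optional

class Stage(Enum):
    SMOKE = "smoke"                # short run to catch bugs
    CALIBRATION = "calibration"    # hyperparameter tuning
    PRODUCTION = "production"      # full training with early stopping
    DISTILLATION = "distillation"  # teacher-to-student compression

# Assumed linear ordering of the lifecycle.
ORDER = [Stage.SMOKE, Stage.CALIBRATION, Stage.PRODUCTION, Stage.DISTILLATION]

def next_stage(current: Stage) -> Optional[Stage]:
    """Return the stage that follows `current`, or None after distillation."""
    i = ORDER.index(current)
    return ORDER[i + 1] if i + 1 < len(ORDER) else None
```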
The flagship V2-Ultra model takes a 32-channel input (including a 12-channel G-buffer with material IDs, depth, and normals) and achieves 94.6% material accuracy, enabling physically correct lighting decisions based on ground-truth geometry rather than screen-space inference. This architecture is designed to exceed DLSS 5 quality by relying on deterministic material accuracy rather than AI hallucination.
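A minimal sketch of how the 32-channel input might be assembled (CHW layout). Only the 12-channel G-buffer contents (material IDs, depth, normals) are stated in this README; every other channel group and count below is an assumption for illustration.

```python
import numpy as np

H, W = 270, 480  # arbitrary example resolution

color = np.random.rand(3, H, W)      # current-frame RGB (assumed)
gbuffer = np.concatenate([
    np.random.randint(0, 64, (1, H, W)).astype(np.float64),  # material IDs
    np.random.rand(1, H, W),                                 # linear depth
    np.random.rand(3, H, W),                                 # world-space normals
    np.random.rand(7, H, W),                                 # remaining G-buffer channels (assumed)
])                                   # 12-channel G-buffer
aux = np.random.rand(17, H, W)       # history / motion / misc features (assumed)

x = np.concatenate([color, gbuffer, aux])  # full 32-channel network input
assert x.shape == (32, H, W)
```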
## Models
### V2-Ultra (Neural Renderer V2)
The primary neural renderer with a 6.9M-parameter student model and 37.4M-parameter teacher. Trained with a 2-stage curriculum on the Frontier dataset.
| File | Description | Size |
|---|---|---|
| `v2_ultra/v2_ultra_global_best.pt` | Global best checkpoint (student) | 26.4 MB |
| `v2_ultra/v2_ultra_best_stage1.pt` | Best Stage 1 (foundation) checkpoint | 79.3 MB |
| `v2_ultra/v2_ultra_best_stage2.pt` | Best Stage 2 (fine-tune) checkpoint | 79.3 MB |
| `onnx/uruk_v2_ultra_best.onnx` | ONNX export for deployment | 26.4 MB |
### V2-Optimized (Reconstruction-First Approach)
An improved training run that initializes from V1 weights and uses a 3-stage curriculum (rendering, material, optional GAN) with MS-SSIM loss and quality gates.
| File | Description | Size |
|---|---|---|
| `v2_optimized/v2opt_global_best.pt` | Global best checkpoint | 26.4 MB |
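The quality gates mentioned for the V2-Optimized curriculum can be sketched as a per-stage threshold check: training only advances to the next curriculum stage once the validation metric clears the gate. The threshold values and the use of MS-SSIM as the gated metric are illustrative assumptions.

```python
# Hypothetical per-stage gates on validation MS-SSIM (higher is better).
# The thresholds are illustrative, not the values used in training.
GATES = {"rendering": 0.90, "material": 0.93, "gan": 0.95}

def passes_gate(stage: str, val_ms_ssim: float) -> bool:
    """Allow advancing past `stage` only if the metric clears its gate."""
    return val_ms_ssim >= GATES[stage]
```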
### Workstream Production Models
Best checkpoints from the policy-compliant orchestrator production runs.
| File | Workstream | Description | Size |
|---|---|---|---|
| `workstreams/ws2_learned_world/ws2_best.pt` | WS2 - Learned World Model (Family D) | Learns world dynamics and physics | 59.4 MB |
| `workstreams/ws3_world_authoring/ws3_best.pt` | WS3 - World Authoring (Family B) | Procedural world generation from embeddings | 103.3 MB |
| `workstreams/ws4_world_remapper/ws4_best.pt` | WS4 - World Remapper (Family C) | Scene-to-scene transformation | 259.3 MB |
| `workstreams/ws4_world_remapper_v2/ws4_v2_best.pt` | WS4 v2 - World Remapper (optimized rerun) | Completed all 500 epochs | 259.3 MB |
| `workstreams/ws5_cinematic_renderer/ws5_frontier_best.pt` | WS5 - Cinematic Renderer (Family I) | Rich G-buffer rendering (19-channel input, 10.1M params) | 116.2 MB |
| `workstreams/ws6_runtime_optimization/ws6_best.pt` | WS6 - Runtime Optimization (Family G) | Inference speed optimization | 4.4 MB |
| `workstreams/ws6_runtime_optimization_v2/ws6_v2_best.pt` | WS6 v2 - Runtime Optimization (rerun) | Completed all 500 epochs | 13.7 MB |
| `workstreams/ws7_scene_to_world_v2/ws7_v2_best.pt` | WS7 v2 - Scene to World | Graph-based scene understanding | 43.7 MB |
### Distillation Students
Distilled student models compressed from the production teachers for optimal runtime performance.
| File | Description | Size |
|---|---|---|
| `distillation/ws2_student_best.pt` | WS2 Learned World Model distilled student (best val_loss: 0.000375) | 21.6 MB |
| `distillation/ws3_student_best.pt` | WS3 World Authoring distilled student (best val_loss: 0.000073) | 38.4 MB |
| `distillation/ws4_student_best.pt` | WS4 World Remapper distilled student (best val_loss: 0.049) | 90.6 MB |
| `distillation/ws5_student_best.pt` | WS5 Cinematic Renderer distilled student (best val_loss: 0.216) | 40.6 MB |
| `distillation/ws6_student_best.pt` | WS6 Runtime Optimization distilled student (best val_loss: 0.000016) | 1.76 MB |
*Note: final-epoch checkpoints are also available in the repository as `*_student_final.pt`.*
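A minimal sketch of output-matching distillation, the kind of teacher-to-student compression the orchestrator's Distillation stage performs. The README does not state the actual distillation loss; plain MSE against a frozen teacher's output is an assumption for illustration.

```python
import numpy as np

def distill_loss(student_out: np.ndarray, teacher_out: np.ndarray) -> float:
    """MSE between student output and frozen-teacher output (assumed loss)."""
    return float(np.mean((student_out - teacher_out) ** 2))

# The student is then trained to minimize this loss on the same inputs the
# teacher sees, yielding a much smaller deployable model.
```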
### Additional Models
| File | Description | Size |
|---|---|---|
| `npc_director_v2/best.pt` | NPC Director v2 - behavioral AI for NPC state management (99.57% state accuracy) | 15.0 MB |
| `animation_director/best.pt` | Animation Director - procedural animation control | 8.0 MB |
| `ws8_structure_generator/best_model.pt` | WS8 - Structure Generator | 32.4 MB |
| `onnx/npc_director.onnx` | NPC Director ONNX export | 5.1 MB |
## Training Infrastructure
All models were trained on a single NVIDIA B200 GPU (183 GB VRAM) on RunPod. The multi-workstream orchestrator managed sequential training across all workstreams with automatic stage transitions, early stopping, and plateau detection.
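The plateau detection and early stopping described above can be sketched as follows: stop when the validation loss has not improved by at least `min_delta` for `patience` consecutive evaluations. The parameter values are illustrative, not the orchestrator's actual settings.

```python
class PlateauDetector:
    """Early-stopping helper: track best validation loss and stale epochs."""

    def __init__(self, patience: int = 10, min_delta: float = 1e-4):
        self.patience = patience    # evaluations to wait without improvement
        self.min_delta = min_delta  # minimum change that counts as improvement
        self.best = float("inf")
        self.stale = 0

    def step(self, val_loss: float) -> bool:
        """Record one evaluation; return True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best, self.stale = val_loss, 0
        else:
            self.stale += 1
        return self.stale >= self.patience
```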
## Usage
```python
import torch
import onnxruntime as ort

# Load a PyTorch checkpoint. This yields the state dict only; the model
# class is defined in the training code and must be instantiated separately.
checkpoint = torch.load("v2_ultra/v2_ultra_global_best.pt", map_location="cpu")
model_state = checkpoint["model_state_dict"]

# For ONNX inference
session = ort.InferenceSession("onnx/uruk_v2_ultra_best.onnx")
```
## License
Apache 2.0