# Uruk Neural Renderer

A multi-model neural rendering pipeline designed for real-time game graphics, trained on NVIDIA B200 GPUs. The Uruk system uses a modular workstream architecture in which specialized models handle different aspects of the rendering pipeline, from world modeling and scene remapping to cinematic rendering and runtime optimization.

## Architecture Overview

The Uruk Neural Renderer is organized into a multi-workstream pipeline, where each workstream trains a specialized model family. A policy-compliant orchestrator manages the 4-stage training lifecycle: Smoke (bug-catching), Calibration (hyperparameter tuning), Production (full training with early stopping), and Distillation (teacher-to-student compression).
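The 4-stage lifecycle above can be sketched as a simple state machine. The stage names come from this document; the class, its API, and the idea of manual advancement are purely illustrative, not the actual orchestrator's interface.

```python
# Illustrative sketch of the 4-stage training lifecycle (Smoke ->
# Calibration -> Production -> Distillation). Transition logic is
# hypothetical; the real orchestrator's criteria are not documented here.

STAGES = ["smoke", "calibration", "production", "distillation"]

class Orchestrator:
    def __init__(self):
        self.stage_idx = 0

    @property
    def stage(self):
        return STAGES[self.stage_idx]

    def advance(self):
        """Move to the next lifecycle stage, if any remain."""
        if self.stage_idx < len(STAGES) - 1:
            self.stage_idx += 1
        return self.stage

orch = Orchestrator()
assert orch.stage == "smoke"              # bug-catching run
assert orch.advance() == "calibration"    # hyperparameter tuning
assert orch.advance() == "production"     # full training with early stopping
assert orch.advance() == "distillation"   # teacher-to-student compression
```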

The flagship V2-Ultra model uses a 32-channel input (including a 12-channel G-buffer with material IDs, depth, and normals) and achieves 94.6% material accuracy, enabling physically correct lighting decisions based on ground-truth geometry rather than screen-space inference. This architecture is designed to exceed DLSS 5 quality by relying on deterministic material accuracy rather than AI hallucinations.
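The claimed benefit of carrying ground-truth material IDs in the G-buffer is that shading parameters can be looked up deterministically rather than inferred from screen-space appearance. A minimal sketch of that idea; the material table, IDs, and BRDF parameters below are entirely hypothetical, not the actual Uruk material set:

```python
# Hypothetical material table keyed by G-buffer material ID.
# Names and parameter values are illustrative only.
MATERIALS = {
    0: {"name": "default", "roughness": 0.5, "metallic": 0.0},
    7: {"name": "brushed_metal", "roughness": 0.25, "metallic": 1.0},
    12: {"name": "wet_stone", "roughness": 0.1, "metallic": 0.0},
}

def shading_params(material_id: int) -> dict:
    """Deterministic lookup from a ground-truth material ID;
    no screen-space guessing involved. Unknown IDs fall back to default."""
    return MATERIALS.get(material_id, MATERIALS[0])

params = shading_params(7)
assert params["metallic"] == 1.0
assert shading_params(999)["name"] == "default"
```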

## Models

### V2-Ultra (Neural Renderer V2)

The primary neural renderer with a 6.9M-parameter student model and 37.4M-parameter teacher. Trained with a 2-stage curriculum on the Frontier dataset.

| File | Description | Size |
|------|-------------|------|
| `v2_ultra/v2_ultra_global_best.pt` | Global best checkpoint (student) | 26.4 MB |
| `v2_ultra/v2_ultra_best_stage1.pt` | Best Stage 1 (foundation) checkpoint | 79.3 MB |
| `v2_ultra/v2_ultra_best_stage2.pt` | Best Stage 2 (fine-tune) checkpoint | 79.3 MB |
| `onnx/uruk_v2_ultra_best.onnx` | ONNX export for deployment | 26.4 MB |

### V2-Optimized (Reconstruction-First Approach)

An improved training run that initializes from V1 weights and uses a 3-stage curriculum (rendering, material, optional GAN) with MS-SSIM loss and quality gates.
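A quality gate of the kind mentioned above can be as simple as refusing to advance the curriculum until a reconstruction metric clears a threshold. The threshold value and metric wiring below are assumptions for illustration, not the project's actual gate:

```python
def passes_quality_gate(ms_ssim: float, threshold: float = 0.95) -> bool:
    """Gate a curriculum stage transition on an MS-SSIM score.
    The 0.95 threshold is a made-up example value, not Uruk's setting."""
    return ms_ssim >= threshold

# Only advance from the rendering stage to the material stage
# once reconstruction quality is high enough.
assert passes_quality_gate(0.97)
assert not passes_quality_gate(0.90)
```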

| File | Description | Size |
|------|-------------|------|
| `v2_optimized/v2opt_global_best.pt` | Global best checkpoint | 26.4 MB |

### Workstream Production Models

Best checkpoints from the policy-compliant orchestrator production runs.

| File | Workstream | Description | Size |
|------|------------|-------------|------|
| `workstreams/ws2_learned_world/ws2_best.pt` | WS2 - Learned World Model (Family D) | Learns world dynamics and physics | 59.4 MB |
| `workstreams/ws3_world_authoring/ws3_best.pt` | WS3 - World Authoring (Family B) | Procedural world generation from embeddings | 103.3 MB |
| `workstreams/ws4_world_remapper/ws4_best.pt` | WS4 - World Remapper (Family C) | Scene-to-scene transformation | 259.3 MB |
| `workstreams/ws4_world_remapper_v2/ws4_v2_best.pt` | WS4 v2 - World Remapper (optimized rerun) | Completed all 500 epochs | 259.3 MB |
| `workstreams/ws5_cinematic_renderer/ws5_frontier_best.pt` | WS5 - Cinematic Renderer (Family I) | Rich G-buffer rendering (19-ch input, 10.1M params) | 116.2 MB |
| `workstreams/ws6_runtime_optimization/ws6_best.pt` | WS6 - Runtime Optimization (Family G) | Inference speed optimization | 4.4 MB |
| `workstreams/ws6_runtime_optimization_v2/ws6_v2_best.pt` | WS6 v2 - Runtime Optimization (rerun) | Completed all 500 epochs | 13.7 MB |
| `workstreams/ws7_scene_to_world_v2/ws7_v2_best.pt` | WS7 v2 - Scene to World | Graph-based scene understanding | 43.7 MB |

### Distillation Students

Distilled student models compressed from the production teachers for optimal runtime performance.

| File | Description | Size |
|------|-------------|------|
| `distillation/ws2_student_best.pt` | WS2 Learned World Model distilled student (best val_loss: 0.000375) | 21.6 MB |
| `distillation/ws3_student_best.pt` | WS3 World Authoring distilled student (best val_loss: 0.000073) | 38.4 MB |
| `distillation/ws4_student_best.pt` | WS4 World Remapper distilled student (best val_loss: 0.049) | 90.6 MB |
| `distillation/ws5_student_best.pt` | WS5 Cinematic Renderer distilled student (best val_loss: 0.216) | 40.6 MB |
| `distillation/ws6_student_best.pt` | WS6 Runtime Optimization distilled student (best val_loss: 0.000016) | 1.76 MB |

*Note: final-epoch checkpoints are also available in the repository as `*_student_final.pt`.*
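Teacher-to-student distillation of this kind typically trains the student against a blend of the ground-truth target and the teacher's output. A dependency-free sketch of that objective; the MSE choice and the `alpha=0.5` weight are assumptions, not the project's documented settings:

```python
def mse(a, b):
    """Mean squared error over two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def distillation_loss(student_out, teacher_out, target, alpha=0.5):
    """Blend of the supervised loss (student vs. ground truth) and the
    teacher-matching loss (student vs. teacher). alpha is illustrative."""
    return alpha * mse(student_out, target) + (1 - alpha) * mse(student_out, teacher_out)

# A student that matches the teacher exactly pays only the supervised term:
loss = distillation_loss([1.0, 2.0], [1.0, 2.0], [1.0, 3.0], alpha=0.5)
assert loss == 0.25  # 0.5 * MSE([1, 2], [1, 3]) = 0.5 * 0.5
```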

### Additional Models

| File | Description | Size |
|------|-------------|------|
| `npc_director_v2/best.pt` | NPC Director v2 - behavioral AI for NPC state management (99.57% state accuracy) | 15.0 MB |
| `animation_director/best.pt` | Animation Director - procedural animation control | 8.0 MB |
| `ws8_structure_generator/best_model.pt` | WS8 - Structure Generator | 32.4 MB |
| `onnx/npc_director.onnx` | NPC Director ONNX export | 5.1 MB |

## Training Infrastructure

All models were trained on a single NVIDIA B200 GPU (183 GB VRAM) on RunPod. The multi-workstream orchestrator managed sequential training across all workstreams with automatic stage transitions, early stopping, and plateau detection.
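Plateau detection with early stopping can be sketched as a patience counter over validation loss. The `patience` and `min_delta` values below are example settings, not the orchestrator's actual configuration:

```python
class PlateauDetector:
    """Stop training when val loss fails to improve by at least min_delta
    for `patience` consecutive epochs. Values here are illustrative."""
    def __init__(self, patience=3, min_delta=1e-4):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.stale = 0

    def update(self, val_loss: float) -> bool:
        """Record one epoch's validation loss; return True to stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.stale = 0
        else:
            self.stale += 1
        return self.stale >= self.patience

detector = PlateauDetector(patience=2)
assert not detector.update(1.0)   # new best
assert not detector.update(0.5)   # new best
assert not detector.update(0.51)  # no improvement (stale = 1)
assert detector.update(0.52)      # stale = 2 -> stop
```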

## Usage

```python
import torch

# Load a PyTorch checkpoint. The checkpoint stores weights only;
# instantiate the matching model class before calling load_state_dict().
checkpoint = torch.load("v2_ultra/v2_ultra_global_best.pt", map_location="cpu")
model_state = checkpoint["model_state_dict"]

# For ONNX inference
import onnxruntime as ort

session = ort.InferenceSession("onnx/uruk_v2_ultra_best.onnx")
input_name = session.get_inputs()[0].name
# outputs = session.run(None, {input_name: frame})  # frame: NumPy array matching the model's input shape
```

## License

Apache 2.0
