# Uruk Neural Renderer
A multi-model neural rendering pipeline for real-time game graphics, trained on NVIDIA B200 GPUs. Uruk uses a modular workstream architecture in which specialized models handle different stages of the rendering pipeline, from world modeling and scene remapping to cinematic rendering and runtime optimization.
## Architecture Overview
The Uruk Neural Renderer is organized into a multi-workstream pipeline, where each workstream trains a specialized model family. A policy-compliant orchestrator manages the 4-stage training lifecycle: Smoke (bug-catching), Calibration (hyperparameter tuning), Production (full training with early stopping), and Distillation (teacher-to-student compression).
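The 4-stage lifecycle above can be sketched as a simple state machine. The stage names come from this README; the linear advance-on-success transition rule is an assumption for illustration, not the orchestrator's actual logic.

```python
from enum import Enum
from typing import Optional

class Stage(Enum):
    SMOKE = "smoke"                # short run to catch bugs
    CALIBRATION = "calibration"    # hyperparameter tuning
    PRODUCTION = "production"      # full training with early stopping
    DISTILLATION = "distillation"  # teacher-to-student compression

# Assumed linear ordering of the lifecycle.
ORDER = [Stage.SMOKE, Stage.CALIBRATION, Stage.PRODUCTION, Stage.DISTILLATION]

def next_stage(current: Stage) -> Optional[Stage]:
    """Return the stage that follows `current`, or None after distillation."""
    i = ORDER.index(current)
    return ORDER[i + 1] if i + 1 < len(ORDER) else None
```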
The flagship V2-Ultra model takes a 32-channel input (including a 12-channel G-buffer with material IDs, depth, and normals) and achieves 94.6% material accuracy, enabling physically correct lighting decisions based on ground-truth geometry rather than screen-space inference. This architecture is designed to exceed DLSS 5 quality by relying on deterministic material accuracy rather than AI hallucination.
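A minimal sketch of how the 32-channel input might be assembled (CHW layout). Only the 12-channel G-buffer contents (material IDs, depth, normals) are stated in this README; every other channel group and count below is an assumption for illustration.

```python
import numpy as np

H, W = 270, 480  # arbitrary example resolution

color = np.random.rand(3, H, W)      # current-frame RGB (assumed)
gbuffer = np.concatenate([
    np.random.randint(0, 64, (1, H, W)).astype(np.float64),  # material IDs
    np.random.rand(1, H, W),                                 # linear depth
    np.random.rand(3, H, W),                                 # world-space normals
    np.random.rand(7, H, W),                                 # remaining G-buffer channels (assumed)
])                                   # 12-channel G-buffer
aux = np.random.rand(17, H, W)       # history / motion / misc features (assumed)

x = np.concatenate([color, gbuffer, aux])  # full 32-channel network input
assert x.shape == (32, H, W)
```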
## Models
### V2-Ultra (Neural Renderer V2)
The primary neural renderer with a 6.9M-parameter student model and 37.4M-parameter teacher. Trained with a 2-stage curriculum on the Frontier dataset.
| File | Description | Size |
|---|---|---|
| `v2_ultra/v2_ultra_global_best.pt` | Global best checkpoint (student) | 26.4 MB |
| `v2_ultra/v2_ultra_best_stage1.pt` | Best Stage 1 (foundation) checkpoint | 79.3 MB |
| `v2_ultra/v2_ultra_best_stage2.pt` | Best Stage 2 (fine-tune) checkpoint | 79.3 MB |
| `onnx/uruk_v2_ultra_best.onnx` | ONNX export for deployment | 26.4 MB |
### V2-Optimized (Reconstruction-First Approach)
An improved training run that initializes from V1 weights and uses a 3-stage curriculum (rendering, material, optional GAN) with MS-SSIM loss and quality gates.
| File | Description | Size |
|---|---|---|
| `v2_optimized/v2opt_global_best.pt` | Global best checkpoint | 26.4 MB |
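The quality gates mentioned for the V2-Optimized curriculum can be sketched as a per-stage threshold check: training only advances to the next curriculum stage once the validation metric clears the gate. The threshold values and the use of MS-SSIM as the gated metric are illustrative assumptions.

```python
# Hypothetical per-stage gates on validation MS-SSIM (higher is better).
# The thresholds are illustrative, not the values used in training.
GATES = {"rendering": 0.90, "material": 0.93, "gan": 0.95}

def passes_gate(stage: str, val_ms_ssim: float) -> bool:
    """Allow advancing past `stage` only if the metric clears its gate."""
    return val_ms_ssim >= GATES[stage]
```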
### Workstream Production Models
Best checkpoints from the policy-compliant orchestrator production runs.
| File | Workstream | Description | Size |
|---|---|---|---|
| `workstreams/ws2_learned_world/ws2_best.pt` | WS2 - Learned World Model (Family D) | Learns world dynamics and physics | 59.4 MB |
| `workstreams/ws3_world_authoring/ws3_best.pt` | WS3 - World Authoring (Family B) | Procedural world generation from embeddings | 103.3 MB |
| `workstreams/ws4_world_remapper/ws4_best.pt` | WS4 - World Remapper (Family C) | Scene-to-scene transformation | 259.3 MB |
| `workstreams/ws4_world_remapper_v2/ws4_v2_best.pt` | WS4 v2 - World Remapper (optimized rerun) | Completed all 500 epochs | 259.3 MB |
| `workstreams/ws5_cinematic_renderer/ws5_frontier_best.pt` | WS5 - Cinematic Renderer (Family I) | Rich G-buffer rendering (19-channel input, 10.1M params) | 116.2 MB |
| `workstreams/ws6_runtime_optimization/ws6_best.pt` | WS6 - Runtime Optimization (Family G) | Inference speed optimization | 4.4 MB |
| `workstreams/ws6_runtime_optimization_v2/ws6_v2_best.pt` | WS6 v2 - Runtime Optimization (rerun) | Completed all 500 epochs | 13.7 MB |
| `workstreams/ws7_scene_to_world_v2/ws7_v2_best.pt` | WS7 v2 - Scene to World | Graph-based scene understanding | 43.7 MB |
### Distillation Students
Distilled student models compressed from the production teachers for optimal runtime performance.
| File | Description | Size |
|---|---|---|
| `distillation/ws2_student_best.pt` | WS2 Learned World Model distilled student (best val_loss: 0.000375) | 21.6 MB |
| `distillation/ws3_student_best.pt` | WS3 World Authoring distilled student (best val_loss: 0.000073) | 38.4 MB |
| `distillation/ws4_student_best.pt` | WS4 World Remapper distilled student (best val_loss: 0.049) | 90.6 MB |
| `distillation/ws5_student_best.pt` | WS5 Cinematic Renderer distilled student (best val_loss: 0.216) | 40.6 MB |
| `distillation/ws6_student_best.pt` | WS6 Runtime Optimization distilled student (best val_loss: 0.000016) | 1.76 MB |
*Note: final-epoch checkpoints are also available in the repository as `*_student_final.pt`.*
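A minimal sketch of output-matching distillation, the kind of teacher-to-student compression the orchestrator's Distillation stage performs. The README does not state the actual distillation loss; plain MSE against a frozen teacher's output is an assumption for illustration.

```python
import numpy as np

def distill_loss(student_out: np.ndarray, teacher_out: np.ndarray) -> float:
    """MSE between student output and frozen-teacher output (assumed loss)."""
    return float(np.mean((student_out - teacher_out) ** 2))

# The student is then trained to minimize this loss on the same inputs the
# teacher sees, yielding a much smaller deployable model.
```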
### Additional Models
| File | Description | Size |
|---|---|---|
| `npc_director_v2/best.pt` | NPC Director v2 - behavioral AI for NPC state management (99.57% state accuracy) | 15.0 MB |
| `animation_director/best.pt` | Animation Director - procedural animation control | 8.0 MB |
| `ws8_structure_generator/best_model.pt` | WS8 - Structure Generator | 32.4 MB |
| `onnx/npc_director.onnx` | NPC Director ONNX export | 5.1 MB |
## Training Infrastructure
All models were trained on a single NVIDIA B200 GPU (183 GB VRAM) on RunPod. The multi-workstream orchestrator managed sequential training across all workstreams with automatic stage transitions, early stopping, and plateau detection.
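The plateau detection and early stopping described above can be sketched as follows: stop when the validation loss has not improved by at least `min_delta` for `patience` consecutive evaluations. The parameter values are illustrative, not the orchestrator's actual settings.

```python
class PlateauDetector:
    """Early-stopping helper: track best validation loss and stale epochs."""

    def __init__(self, patience: int = 10, min_delta: float = 1e-4):
        self.patience = patience    # evaluations to wait without improvement
        self.min_delta = min_delta  # minimum change that counts as improvement
        self.best = float("inf")
        self.stale = 0

    def step(self, val_loss: float) -> bool:
        """Record one evaluation; return True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best, self.stale = val_loss, 0
        else:
            self.stale += 1
        return self.stale >= self.patience
```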
## Usage
```python
import torch
import onnxruntime as ort

# Load a PyTorch checkpoint. This yields the state dict only; the model
# class is defined in the training code and must be instantiated separately.
checkpoint = torch.load("v2_ultra/v2_ultra_global_best.pt", map_location="cpu")
model_state = checkpoint["model_state_dict"]

# For ONNX inference
session = ort.InferenceSession("onnx/uruk_v2_ultra_best.onnx")
```
## License
Apache 2.0