GRM / ref /Overview
mbagdasarova-nvidia's picture
Add GRM evaluation suite with scoring logic and benchmark registry
2a44234
raw
history blame
845 Bytes
Nvidia Game Ready Model Score (GRM) is an aggregated quality metric designed to assess LLM capabilites in gaming use cases.
General state-of-the-art language models are optimized for broad benchmarks such as math, code, and general knowledge. That does not reliably translate to in-game performance, and it does not reliably predict NPC quality, gameplay actions, or immersion.
With game model evaluation, game developers can accelerate AI integration pipelines by reducing time spent on model evaluation and narrowing model choice earlier. The overall score is the average of Roleplay, Actions, and General, while benchmarks inside each category are combined with weighted averaging using core weights of 1.0 and supplementary weights of 0.5.
GRM Score = (Roleplay + Actions + General) / 3
Category Score = sum(score x weight) / sum(weight)