| Field | Response |
|---|---|
| Intended Task/Domain: | Vision-to-action model designed to play video games directly from raw frames |
| Model Type: | Transformer |
| Intended Users: | Researchers, game developers, open source community, gamers. Potential applications include next-generation game AI, automating testing for video games, and generally advancing research in embodied AI. |
| Output: | Gamepad actions |
| Describe how the model works: | Image inputs are encoded with a vision transformer. A separate diffusion transformer, conditioned on the resulting image embeddings, then iteratively denoises an action tensor into gamepad actions. |
| Name the adversely impacted groups this has been tested to deliver comparable outcomes regardless of: | Not Applicable |
| Technical Limitations & Mitigation: | The model performs well on games played with a gamepad. It may not perform well on games played with a keyboard or mouse. |
| Verified to have met prescribed NVIDIA quality standards: | Yes |
| Performance Metrics: | Task success rate |
| Potential Known Risks: | The model may occasionally lose at certain games. |
| Licensing: | Governing Terms: NVIDIA License. Additional Information: Apache License for https://huggingface.co/google/siglip2-base-patch16-224. |
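The flow described under "Describe how the model works" can be sketched in miniature: encode a frame into an embedding, then iteratively denoise a noisy action tensor conditioned on that embedding. This is an illustrative toy, not the actual implementation; all dimensions, weights, and function names below are hypothetical stand-ins (simple projections replace the real vision transformer and diffusion transformer).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions, not taken from the model card
EMBED_DIM = 64    # vision-embedding size
ACTION_DIM = 16   # gamepad action tensor size (buttons + stick axes)
STEPS = 8         # number of diffusion denoising steps

def encode_frame(frame, w_enc):
    """Stand-in for the vision transformer: flatten the raw frame and
    project it to an embedding. A real ViT would patchify the image and
    run attention layers instead."""
    return np.tanh(frame.reshape(-1) @ w_enc)

def denoise_step(action, embedding, t, w_cond):
    """Stand-in for one diffusion-transformer step: predict a refined
    action estimate conditioned on the image embedding and timestep,
    then move the noisy action tensor toward it."""
    cond_input = np.concatenate([action, embedding, [t]])
    estimate = np.tanh(cond_input @ w_cond)
    return action + 0.5 * (estimate - action)

# Toy random weights; a trained model would learn these
frame = rng.normal(size=(8, 8, 3))  # raw game frame (toy resolution)
w_enc = rng.normal(size=(frame.size, EMBED_DIM)) * 0.1
w_cond = rng.normal(size=(ACTION_DIM + EMBED_DIM + 1, ACTION_DIM)) * 0.1

embedding = encode_frame(frame, w_enc)

# Start from pure noise and iteratively denoise into a gamepad action
action = rng.normal(size=ACTION_DIM)
for t in range(STEPS, 0, -1):
    action = denoise_step(action, embedding, t / STEPS, w_cond)

print(action.shape)
```

The key design point mirrored here is that conditioning enters only through the denoiser: the vision encoder runs once per frame, while the diffusion loop repeatedly refines the action tensor against that fixed embedding.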