---
license: apache-2.0
language:
- en
base_model:
- nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16
---

*[Banner image: Nemotron Slide]*

# NemoSlides, a Nemotron Specialized in Slide Generation

**NemoSlides** is a post-trained hybrid-architecture language model built on [NVIDIA-Nemotron-3-Nano-30B-A3B-BF16](https://huggingface.co/nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16) by NVIDIA Corporation. It underwent supervised fine-tuning (SFT) using [NeMo AutoModel](https://github.com/NVIDIA-NeMo/Automodel). **NemoSlides** is purpose-built to generate high-quality, aesthetic slides from a single instruction.

---

## Model Summary

| Property | Value |
|---|---|
| **Base Model** | [NVIDIA-Nemotron-3-Nano-30B-A3B-BF16](https://huggingface.co/nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16) |
| **Total Parameters** | 30B |
| **Active Parameters** | 3B |
| **Architecture** | Hybrid (Attention + SSM + MoE) |
| **Precision** | bf16 |
| **License** | Apache 2.0 |

---

## Evaluation Results

To evaluate output quality, we use [Gemini 3 Flash](https://deepmind.google/models/gemini/flash/) as a VLM judge. Our final model achieves a +48% improvement over the Nano baseline.

*[Figure: evaluation results]*

---

## QuickStart

### Installation

```bash
pip install transformers torch
```

### Using Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "trillionlabs/NemoSlides"

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Create a 9-slide Slidev deck for Apex Materials Group's board of directors reviewing FY24 capital allocation and dividend policy."},
]

input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=4096, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

## Deployment

We recommend deploying the model with the latest version of [vLLM](https://github.com/vllm-project/vllm).

```bash
wget https://huggingface.co/trillionlabs/NemoSlides/resolve/main/nano_v3_reasoning_parser.py

vllm serve trillionlabs/NemoSlides \
  --tensor-parallel-size 1 \
  --port 8000 \
  --trust-remote-code \
  --enable-auto-tool-choice \
  --tool-call-parser qwen3_coder \
  --reasoning-parser-plugin nano_v3_reasoning_parser.py \
  --reasoning-parser nano_v3
```

---

## Rendering Slides

We use [Slidev](https://sli.dev/) to render slides. Please check the official [repo](https://github.com/trillion-labs/nemoslides/tree/main/assets/renderer) to render the model's output into slides.

---

## License

This model is released under the Apache 2.0 License.

---

## Acknowledgement

This project was conducted as part of the NVIDIA Nemotron Developer Days Seoul 2026 Hackathon. We thank NVIDIA for the opportunity and support.
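As a practical note on wiring generation to rendering: the model's reply may wrap the Slidev deck in a fenced Markdown block alongside conversational text. The helper below is a minimal, hypothetical sketch (the fence-wrapping behavior is an assumption about the output format, not a documented contract) that strips such a wrapper before saving the deck for Slidev.

```python
import re


def extract_slidev_markdown(model_output: str) -> str:
    """Return the Slidev deck contained in raw model output.

    If the output wraps the deck in a ```markdown / ```md fence,
    return the fenced body; otherwise return the text unchanged.
    """
    match = re.search(r"```(?:markdown|md)?[ \t]*\n(.*?)\n```", model_output, re.DOTALL)
    return (match.group(1) if match else model_output).strip()


# Example: a response that wraps the deck in a fence.
raw = "Here is your deck:\n```md\n---\ntheme: default\n---\n\n# Title Slide\n```"
deck = extract_slidev_markdown(raw)

# Save the cleaned deck for Slidev.
with open("slides.md", "w", encoding="utf-8") as f:
    f.write(deck)
```

After saving `slides.md`, `npx slidev slides.md` previews the deck locally, per Slidev's getting-started documentation.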