---
license: apache-2.0
language:
- en
base_model:
- nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16
---

*[Banner image: Nemotron Slide]*

# NemoSlides, a Nemotron Specialized in Slide Generation

**NemoSlides** is a post-trained hybrid-architecture language model built on [NVIDIA-Nemotron-3-Nano-30B-A3B-BF16](https://huggingface.co/nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16) by NVIDIA Corporation. It underwent supervised fine-tuning (SFT) using [NeMo AutoModel](https://github.com/NVIDIA-NeMo/Automodel). **NemoSlides** is purpose-built to generate high-quality, aesthetic slides from a single instruction.

---

## Model Summary

| Property | Value |
|---|---|
| **Base Model** | [NVIDIA-Nemotron-3-Nano-30B-A3B-BF16](https://huggingface.co/nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16) |
| **Total Parameters** | 30B |
| **Active Parameters** | 3B |
| **Architecture** | Hybrid (Attention + SSM + MoE) |
| **Precision** | bf16 |
| **License** | Apache 2.0 |

---

## Evaluation Results

To evaluate output quality, we use [Gemini 3 Flash](https://deepmind.google/models/gemini/flash/) as a VLM judge. Our final model achieves a +48% improvement over the Nano baseline.

*[Figure: evaluation results]*

---

## QuickStart

### Installation

```bash
pip install transformers torch
```

### Using Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "trillionlabs/NemoSlides"

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Create a 9-slide Slidev deck for Apex Materials Group's board of directors reviewing FY24 capital allocation and dividend policy."},
]

input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=4096, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

## Deployment

We recommend deploying the model with the latest version of [vLLM](https://github.com/vllm-project/vllm).

```bash
wget https://huggingface.co/trillionlabs/NemoSlides/resolve/main/nano_v3_reasoning_parser.py

vllm serve trillionlabs/NemoSlides \
  --tensor-parallel-size 1 \
  --port 8000 \
  --trust-remote-code \
  --enable-auto-tool-choice \
  --tool-call-parser qwen3_coder \
  --reasoning-parser-plugin nano_v3_reasoning_parser.py \
  --reasoning-parser nano_v3
```

---

## Rendering Slides

We use [Slidev](https://sli.dev/) to render slides. Please check the official [repo](https://github.com/trillion-labs/nemoslides/tree/main/assets/renderer) to render the model's output into slides.

---

## License

This model is released under the Apache 2.0 License.

---

## Acknowledgement

This project was conducted as part of the NVIDIA Nemotron Developer Days Seoul 2026 Hackathon. We thank NVIDIA for the opportunity and support.
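As a practical note on wiring generation to rendering: the model's reply may wrap the Slidev deck in a fenced Markdown block alongside conversational text. The helper below is a minimal, hypothetical sketch (the fence-wrapping behavior is an assumption about the output format, not a documented contract) that strips such a wrapper before saving the deck for Slidev.

```python
import re


def extract_slidev_markdown(model_output: str) -> str:
    """Return the Slidev deck contained in raw model output.

    If the output wraps the deck in a ```markdown / ```md fence,
    return the fenced body; otherwise return the text unchanged.
    """
    match = re.search(r"```(?:markdown|md)?[ \t]*\n(.*?)\n```", model_output, re.DOTALL)
    return (match.group(1) if match else model_output).strip()


# Example: a response that wraps the deck in a fence.
raw = "Here is your deck:\n```md\n---\ntheme: default\n---\n\n# Title Slide\n```"
deck = extract_slidev_markdown(raw)

# Save the cleaned deck for Slidev.
with open("slides.md", "w", encoding="utf-8") as f:
    f.write(deck)
```

After saving `slides.md`, `npx slidev slides.md` previews the deck locally, per Slidev's getting-started documentation.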