Pass device=["cuda:0", "cuda:1"] or device=["cpu"]*4 on the model.predict or model.rank calls.
Specify dataset_id, e.g. dataset_id="lightonai/NanoBEIR-de" for the German benchmark.
Set output_scores=True to get similarity scores returned. This can be useful for some distillation losses!
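
As a rough illustration of what a device list such as ["cpu"]*4 expresses, the sketch below expands the list and splits a batch of input pairs into one chunk per entry. The helper shard_by_device is hypothetical and written only for this example; it is not part of any library, and how model.predict or model.rank actually distributes work is not described in the text above.

```python
from typing import List, Sequence, Tuple


def shard_by_device(items: Sequence, devices: List[str]) -> List[Tuple[str, list]]:
    """Hypothetical helper: split items into contiguous, near-equal chunks,
    one chunk per entry in the device list."""
    if not devices:
        raise ValueError("devices must contain at least one entry")
    base, extra = divmod(len(items), len(devices))
    shards, start = [], 0
    for i, dev in enumerate(devices):
        # The first `extra` entries take one additional item each.
        size = base + (1 if i < extra else 0)
        shards.append((dev, list(items[start:start + size])))
        start += size
    return shards


# List repetition: ["cpu"] * 4 is simply four "cpu" entries.
devices = ["cpu"] * 4
pairs = [("query", f"doc {i}") for i in range(10)]
shards = shard_by_device(pairs, devices)

for dev, chunk in shards:
    print(dev, len(chunk))
```

Repeating a device name in the list is what makes several entries target the same hardware, so ["cpu"]*4 and ["cuda:0", "cuda:1"] differ only in how many entries there are and where each one points.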