Correction: modeling_nemotron_h.py

#6
by JennBing - opened

line 625: B = B.repeat(1, 1, self.num_heads // self.n_groups, 1)
line 626: C = C.repeat(1, 1, self.num_heads // self.n_groups, 1)

Should be:
line 625: B = torch.repeat_interleave(B, self.num_heads // self.n_groups, dim=2)
line 626: C = torch.repeat_interleave(C, self.num_heads // self.n_groups, dim=2)
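
For anyone comparing the two calls, here is a minimal sketch of how they differ on a toy tensor. The sizes and the variable names `repeats`, `tiled`, and `interleaved` are made up for illustration and are not taken from the model's configuration. Both calls produce the same shape, but `repeat` tiles the whole group axis while `torch.repeat_interleave` repeats each group in place. Which ordering is correct depends on how heads are mapped to groups elsewhere in the model. The sketch assumes head `h` should read group `h // repeats`, which is the layout the proposed fix implies.

```python
import torch

# Hypothetical toy sizes, not the model's real configuration.
n_groups, num_heads, state_size = 2, 4, 1
repeats = num_heads // n_groups  # heads per group

# B has shape (batch, seq_len, n_groups, state_size); group g holds the value g.
B = torch.arange(n_groups, dtype=torch.float32).reshape(1, 1, n_groups, state_size)

# Tensor.repeat tiles the entire group axis: g0, g1, g0, g1.
tiled = B.repeat(1, 1, repeats, 1)

# repeat_interleave repeats each group consecutively: g0, g0, g1, g1.
interleaved = torch.repeat_interleave(B, repeats, dim=2)

tiled_groups = tiled[0, 0, :, 0].tolist()
interleaved_groups = interleaved[0, 0, :, 0].tolist()

print(tiled_groups)        # [0.0, 1.0, 0.0, 1.0]
print(interleaved_groups)  # [0.0, 0.0, 1.0, 1.0]
```

With `n_groups == 1` the two calls give identical results, so a difference would only show up in configurations with more than one group.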
