Qwen/Qwen-72B

Text Generation

Why are the logits cast to float before computing softmax?

#10 opened almost 2 years ago by

How did you guys pretrain the tokenizer using tiktoken?

#9 opened almost 2 years ago by

StephennFernandes

Can the model run on two GPUs of different models?

#8 opened about 2 years ago by

Adding Evaluation Results

#7 opened about 2 years ago by

leaderboard-pr-bot

How many English tokens was the model trained on?

#5 opened over 2 years ago by

_set_gradient_checkpointing() got an unexpected keyword argument 'enable'

#3 opened over 2 years ago by