What is the purpose of scale_emb?
#11 opened about 4 hours ago
by
nero2023
Slower than Qwen 2.5 on A100 40GB
#10 opened 8 months ago
by
ambivalent02
The `seen_tokens` attribute is deprecated and will be removed in v4.41. Use the `cache_position` model input instead.
#9 opened 11 months ago
by
ctranslate2-4you
Add link to paper
#8 opened 11 months ago
by
nielsr