elinas/llama-30b-int4

Text Generation

text-generation-inference

I'm getting 0.4 tokens/s on a 4090.

#2 opened almost 3 years ago by