TitanML/llama2-70b-chat-4bit-AWQ

Text Generation

text-generation-inference

4-bit precision

File size: 90 Bytes

6e0604d

{
    "zero_point": true,
    "q_group_size": 128,
    "w_bit": 4,
    "version": "GEMM"
}
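
The four keys above describe group-wise 4-bit weight quantization with a per-group zero point. The sketch below is not the code used to produce this checkpoint. It is a minimal pure-Python illustration of what such settings typically mean, assuming the common AWQ-style scheme in which every `q_group_size` consecutive weights share one scale and one integer zero point. The helper names (`quantize_group`, `dequantize_group`, `bits_per_weight`) are hypothetical, and the 16-bit scale width is an assumption rather than something stated in the config. The `version` field presumably selects a kernel layout and is ignored here.

```python
import json

# The quantization config exactly as it appears in this file.
CONFIG = json.loads("""
{
    "zero_point": true,
    "q_group_size": 128,
    "w_bit": 4,
    "version": "GEMM"
}
""")


def quantize_group(weights, w_bit, zero_point):
    """Quantize one group of floats; return (codes, scale, zero)."""
    if zero_point:
        # Asymmetric: map [min, max] onto the unsigned range [0, 2^w_bit - 1].
        q_min, q_max = 0, 2 ** w_bit - 1
        lo, hi = min(weights), max(weights)
        scale = max(hi - lo, 1e-8) / q_max
        zero = min(max(-round(lo / scale), q_min), q_max)
    else:
        # Symmetric: signed range centred on zero, no offset stored.
        q_min, q_max = -(2 ** (w_bit - 1)), 2 ** (w_bit - 1) - 1
        scale = max(max(abs(w) for w in weights), 1e-8) / q_max
        zero = 0
    codes = [min(max(round(w / scale) + zero, q_min), q_max) for w in weights]
    return codes, scale, zero


def dequantize_group(codes, scale, zero):
    """Recover approximate float weights from integer codes."""
    return [(c - zero) * scale for c in codes]


def bits_per_weight(config, scale_bits=16):
    """Average storage per weight, counting per-group metadata.

    Assumption (not stated in the config): each group stores one
    scale of `scale_bits` bits and, if enabled, one w_bit zero point.
    """
    meta_bits = scale_bits + (config["w_bit"] if config["zero_point"] else 0)
    return config["w_bit"] + meta_bits / config["q_group_size"]


if __name__ == "__main__":
    # One synthetic group of 128 weights spanning [-0.5, 0.5].
    group = [i / 127 - 0.5 for i in range(CONFIG["q_group_size"])]
    codes, scale, zero = quantize_group(
        group, CONFIG["w_bit"], CONFIG["zero_point"]
    )
    restored = dequantize_group(codes, scale, zero)
    worst = max(abs(a - b) for a, b in zip(group, restored))
    print("max reconstruction error:", worst)
    print("bits per weight:", bits_per_weight(CONFIG))
```

Under the stated storage assumption, a group size of 128 adds 20 bits of metadata per group, so the effective cost is 4.15625 bits per weight rather than exactly 4. Smaller groups would track the weight distribution more closely at the price of more metadata.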