aws-neuron / optimum-neuron-cache
Organization: AWS Inferentia and Trainium
License: apache-2.0
optimum-neuron-cache / inference-cache-config (16 kB)
5 contributors · History: 64 commits
Latest commit: e5f06c7 (verified) · Jingya (HF Staff) · "add pixart and remove deprecated" · 10 months ago
File                   Size       Last commit message                                                                            Updated
diffusion.json         2.31 kB    add pixart and remove deprecated                                                               10 months ago
gpt2.json              398 Bytes  Add more gpt2 configurations                                                                   about 2 years ago
granite.json           1.3 kB     Add configuration for granite models                                                           over 1 year ago
llama-variants.json    1.45 kB    Add DeepSeek distilled versions of LLama 8B                                                    about 1 year ago
llama.json             1.84 kB    Added TinyLlama as requested by Jim burtoft                                                    12 months ago
llama2-70b.json        287 Bytes  Create llama2-70b.json                                                                         almost 2 years ago
llama3-70b.json        584 Bytes  Add DeepSeek distilled model                                                                   about 1 year ago
llama3.1-70b.json      289 Bytes  Rename inference-cache-config/Llama3.1-70b.json to inference-cache-config/llama3.1-70b.json   over 1 year ago
mistral-variants.json  1.04 kB    Remove obsolete mistral variants                                                               over 1 year ago
mistral.json           1.87 kB    Update inference-cache-config/mistral.json                                                     over 1 year ago
mixtral.json           583 Bytes  Update inference-cache-config/mixtral.json                                                     over 1 year ago
phi4.json              556 Bytes  Add phi4 cached configurations                                                                 about 1 year ago
qwen2.5-large.json     849 Bytes  Update inference-cache-config/qwen2.5-large.json                                               about 1 year ago
qwen2.5.json           2.69 kB    Add DeepSeek distilled models                                                                  about 1 year ago