aws-neuron/optimum-neuron-cache
License: apache-2.0
optimum-neuron-cache/inference-cache-config (395ffdd, 16 kB)
5 contributors. History: 64 commits.
Latest commit: Jingya (HF Staff), "add pixart and remove deprecated" (e5f06c7, verified), 11 months ago
| File | Size | Last commit message | Last updated |
| --- | --- | --- | --- |
| diffusion.json | 2.31 kB | add pixart and remove deprecated | 11 months ago |
| gpt2.json | 398 Bytes | Add more gpt2 configurations | about 2 years ago |
| granite.json | 1.3 kB | Add configuration for granite models | over 1 year ago |
| llama-variants.json | 1.45 kB | Add DeepSeek distilled versions of LLama 8B | over 1 year ago |
| llama.json | 1.84 kB | Added TinyLlama as requested by Jim burtoft | about 1 year ago |
| llama2-70b.json | 287 Bytes | Create llama2-70b.json | almost 2 years ago |
| llama3-70b.json | 584 Bytes | Add DeepSeek distilled model | over 1 year ago |
| llama3.1-70b.json | 289 Bytes | Rename inference-cache-config/Llama3.1-70b.json to inference-cache-config/llama3.1-70b.json | over 1 year ago |
| mistral-variants.json | 1.04 kB | Remove obsolete mistral variants | over 1 year ago |
| mistral.json | 1.87 kB | Update inference-cache-config/mistral.json | over 1 year ago |
| mixtral.json | 583 Bytes | Update inference-cache-config/mixtral.json | over 1 year ago |
| phi4.json | 556 Bytes | Add phi4 cached configurations | about 1 year ago |
| qwen2.5-large.json | 849 Bytes | Update inference-cache-config/qwen2.5-large.json | over 1 year ago |
| qwen2.5.json | 2.69 kB | Add DeepSeek distilled models | over 1 year ago |
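
The files listed above can also be fetched programmatically instead of through the web page. The following is a minimal sketch, assuming the `huggingface_hub` package is installed and network access is available; it uses that package's `hf_hub_download` function. The helper names `config_path` and `load_config` are hypothetical and exist only for this illustration, and the sketch makes no assumption about the JSON schema inside each file, since the listing does not show file contents.

```python
import json
from typing import Any, Optional

# Both values are taken from the page header and directory listing.
REPO_ID = "aws-neuron/optimum-neuron-cache"
CONFIG_DIR = "inference-cache-config"


def config_path(filename: str) -> str:
    """Return the repo-relative path of a config file (hypothetical helper)."""
    if not filename.endswith(".json"):
        raise ValueError(f"expected a .json file name, got {filename!r}")
    return f"{CONFIG_DIR}/{filename}"


def load_config(filename: str, revision: Optional[str] = None) -> Any:
    """Download one config file from the Hub and parse it as JSON.

    Hypothetical helper: needs network access and huggingface_hub.
    """
    # Imported lazily so config_path() works without the dependency.
    from huggingface_hub import hf_hub_download

    local_path = hf_hub_download(
        repo_id=REPO_ID,
        filename=config_path(filename),
        revision=revision,  # None means the repo's default branch
    )
    with open(local_path, encoding="utf-8") as fh:
        return json.load(fh)
```

For example, `load_config("llama.json")` would download and parse `inference-cache-config/llama.json`. Passing a commit hash as `revision` pins the download to one version of the file, which may be useful because the listing shows these files being updated over time.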