miike-ai/LeanLlama-8B

16.1 GB

1 contributor

History: 4 commits

Add 128K validation results and chunked prefill usage example

fa4671c verified 5 months ago

| File | Size | Last commit | Updated |
|---|---|---|---|
| .gitattributes | 1.57 kB | Initial upload: LeanLlama-8B with KV cache compression | 5 months ago |
| README.md | 3.37 kB | Add 128K validation results and chunked prefill usage example | 5 months ago |
| chat_template.jinja | 4.61 kB | Initial upload: LeanLlama-8B with KV cache compression | 5 months ago |
| compression_config.json | 451 Bytes | Initial upload: LeanLlama-8B with KV cache compression | 5 months ago |
| config.json | 1.7 kB | Initial upload: LeanLlama-8B with KV cache compression | 5 months ago |
| generation_config.json | 183 Bytes | Initial upload: LeanLlama-8B with KV cache compression | 5 months ago |
| model.safetensors | 16.1 GB (xet) | Initial upload: LeanLlama-8B with KV cache compression | 5 months ago |
| modeling_lean_llama.py | 11.9 kB | Fix compression to handle all new tokens (chunked prefill support) | 5 months ago |
| tokenizer.json | 17.2 MB (xet) | Initial upload: LeanLlama-8B with KV cache compression | 5 months ago |
| tokenizer_config.json | 296 Bytes | Initial upload: LeanLlama-8B with KV cache compression | 5 months ago |