anthonym21/Mistral-7B-v0.3-CoDA-GQA-L
Tags: Text Generation · Transformers · Safetensors · PyTorch · English · mistral · attention · differential-attention · bounded-memory · kv-cache · landmark · coda-gqa-l · text-generation-inference
License: apache-2.0
Mistral-7B-v0.3-CoDA-GQA-L/generation_config.json (branch: main)
anthonym21 · Mistral 7B v0.3 + CoDA-GQA-L: two-phase trained (unbounded + bounded) · commit a1acae5 (verified) · 2 months ago
110 Bytes
{
  "_from_model_config": true,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "transformers_version": "5.2.0"
}
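As a minimal sketch of how this file's fields can be read programmatically, the snippet below parses the JSON above with the standard library and extracts the special-token ids (the config text is inlined here purely for illustration; in practice the file would be read from the repo):

```python
import json

# The generation_config.json contents shown above, inlined for illustration.
config_text = """
{
  "_from_model_config": true,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "transformers_version": "5.2.0"
}
"""

config = json.loads(config_text)

# bos_token_id=1 and eos_token_id=2 correspond to Mistral's standard
# <s> / </s> special tokens; _from_model_config=true indicates these
# values were inherited from the model's main config rather than set
# explicitly for generation.
bos, eos = config["bos_token_id"], config["eos_token_id"]
print(bos, eos)  # → 1 2
```

In normal use this file is not parsed by hand: `transformers.GenerationConfig.from_pretrained("anthonym21/Mistral-7B-v0.3-CoDA-GQA-L")` would load it directly from the Hub (assuming the repo is accessible).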