Mistral-7B-MHA / README.md
license: apache-2.0

Mistral-7B-v0.1 with the KV heads duplicated so that it uses standard multi-head attention (MHA) instead of grouped-query attention (GQA). Don't use this directly: it gives the same results as the original model, just slower. It is meant for merging experiments.
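
For anyone curious what "duplicating the KV heads" amounts to, here is a minimal sketch on a toy weight matrix. It is not the script used to produce this checkpoint. The function name `duplicate_kv_heads` and the toy dimensions are made up for illustration, and it assumes the usual layout where a `k_proj`/`v_proj` weight has shape `(n_kv_heads * head_dim, hidden)` and each KV head serves a consecutive group of query heads.

```python
import numpy as np


def duplicate_kv_heads(weight, n_kv_heads, n_rep):
    """Repeat each KV head's rows n_rep times (hypothetical helper).

    weight: array of shape (n_kv_heads * head_dim, hidden), assumed layout
            of a GQA k_proj or v_proj weight.
    Returns an array of shape (n_kv_heads * n_rep * head_dim, hidden).
    """
    out_features, hidden = weight.shape
    if out_features % n_kv_heads != 0:
        raise ValueError("weight rows must be divisible by n_kv_heads")
    head_dim = out_features // n_kv_heads

    # Split the rows into one block per KV head.
    per_head = weight.reshape(n_kv_heads, head_dim, hidden)
    # Repeat each block consecutively: head 0, head 0, ..., head 1, head 1, ...
    expanded = np.repeat(per_head, n_rep, axis=0)
    return expanded.reshape(n_kv_heads * n_rep * head_dim, hidden)


# Toy sizes for illustration only. Mistral-7B-v0.1 itself has 32 query
# heads and 8 KV heads, so n_rep would be 4 there.
n_heads, n_kv_heads, head_dim, hidden = 4, 2, 3, 5
rng = np.random.default_rng(0)
k_gqa = rng.standard_normal((n_kv_heads * head_dim, hidden))

k_mha = duplicate_kv_heads(k_gqa, n_kv_heads, n_heads // n_kv_heads)
print(k_gqa.shape, "->", k_mha.shape)
```

After this expansion every query head has its own (identical) copy of the KV head it previously shared, which is why the outputs match the original model while the K/V projections and KV cache get larger.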