metadata
license: apache-2.0
Mistral-7B-v0.1 with the KV heads duplicated so that it no longer uses GQA (grouped-query attention). Don't use this directly - it gives the same results as the original model, just slower. Meant for merging experiments.
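
For intuition, here is a minimal sketch of what duplicating KV heads could look like on a single projection weight. It is not the script used to build this model: the helper name `duplicate_kv_heads`, the toy sizes, and the consecutive (interleaved) repeat layout are all assumptions chosen for illustration. Mistral-7B-v0.1 itself has 32 query heads and 8 KV heads, so each KV head would be repeated 4 times.

```python
def duplicate_kv_heads(weight, num_kv_heads, n_rep):
    """Repeat each KV head's block of rows n_rep times, consecutively.

    weight is a list of rows (out_features x in_features) whose rows are
    grouped by head. Hypothetical helper, for illustration only.
    """
    if len(weight) % num_kv_heads != 0:
        raise ValueError("row count must be divisible by num_kv_heads")
    head_dim = len(weight) // num_kv_heads
    out = []
    for h in range(num_kv_heads):
        block = weight[h * head_dim:(h + 1) * head_dim]
        for _ in range(n_rep):
            # Copy the rows so the duplicated heads do not share storage.
            out.extend(list(row) for row in block)
    return out


# Toy k_proj: 2 KV heads, head_dim 2, hidden size 3 (assumed sizes).
k_proj = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
expanded = duplicate_kv_heads(k_proj, num_kv_heads=2, n_rep=2)
```

With this layout, head `i` of the expanded weight is a copy of original KV head `i // n_rep`, which is the same head a GQA model would share with query head `i`. That is why the outputs stay the same while the K/V projections and KV cache get larger.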