leonardlin committed · verified · Commit 7dc0e2a · Parent(s): 254853c

Update README.md

Files changed (1): README.md (+1 −1)
README.md CHANGED
@@ -18,7 +18,7 @@ datasets:
 
 # Shisa V1 7B V2.1
 
-This release is a bit of a meme model to celebrate the 2-year anniversary of the release of [Shisa 7B V1](https://huggingface.co/augmxnt/shisa-7b-v1), but I was genuinely curious to see how much the original [Mistral 7B v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1)-based model could be improved with our latest V2.1 training. How much of our improvements are due to better post-training vs. better base models?
+This release is a bit of a meme model to celebrate the 2-year anniversary of the release of [Shisa 7B V1](https://huggingface.co/augmxnt/shisa-7b-v1), but I was genuinely curious to see how much the original [Mistral 7B v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1)-based model could be improved with our latest [Shisa V2.1](https://huggingface.co/collections/shisa-ai/shisa-v21) training. How much of our improvements are due to better post-training vs. better base models?
 
 Beyond that curiosity, there is also *some* practical utility, as our [shisa-v1 tokenizer](https://github.com/shisa-ai/shisa-v2/blob/main/eval/tokenizer-efficiency/tokenizer-eval-ja.md) remains one of the most efficient tokenizers for Japanese text. (We've since abandoned tokenizer extension, as the amount of continued pre-training required to recover performance and, crucially, to resolve token leakage is not a good trade-off for us.)