h34v7 committed on
Commit 09718c4 · verified · 1 Parent(s): 01b90ef

Update README.md

Files changed (1):
  1. README.md +13 -4
README.md CHANGED
@@ -7,13 +7,22 @@ library_name: transformers
 tags:
 - mergekit
 - merge
-
 ---
-# .

-This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

 ## Merge Details
 ### Merge Method

 This model was merged using the [TIES](https://arxiv.org/abs/2306.01708) merge method using [ZeroAgency/Mistral-Small-3.1-24B-Instruct-2503-hf](https://huggingface.co/ZeroAgency/Mistral-Small-3.1-24B-Instruct-2503-hf) as a base.
@@ -47,4 +56,4 @@ parameters:
 dtype: bfloat16
 tokenizer:
   source: ZeroAgency/Mistral-Small-3.1-24B-Instruct-2503-hf
-```
 
 tags:
 - mergekit
 - merge
+license: apache-2.0
 ---
+# DXP-Zero-V1.0-24b-Small-Instruct

+So I was browsing for a Mistral finetune and found this base [model](https://huggingface.co/ZeroAgency/Mistral-Small-3.1-24B-Instruct-2503-hf) by [ZeroAgency](https://huggingface.co/ZeroAgency), and oh boy, it was perfect! Here are a few notable improvements I observed.
+
+Pros:
+- Longer outputs for storytelling or roleplay.
+- Dynamic output length: shorter prompts yield shorter responses, and longer prompts yield longer ones.
+- Less repetitive (though this depends on your prompt and sampler settings).
+- Tested at 49444/65536 tokens with no degradation. It actually learns the context better, which strongly shapes the output; the downside is that it picks up patterns from previous turns too quickly and treats them as the new standard.

 ## Merge Details
+
+This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
+
 ### Merge Method

 This model was merged using the [TIES](https://arxiv.org/abs/2306.01708) merge method using [ZeroAgency/Mistral-Small-3.1-24B-Instruct-2503-hf](https://huggingface.co/ZeroAgency/Mistral-Small-3.1-24B-Instruct-2503-hf) as a base.
 
 dtype: bfloat16
 tokenizer:
   source: ZeroAgency/Mistral-Small-3.1-24B-Instruct-2503-hf
+```
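The TIES method named in the Merge Method section works in three steps: trim each fine-tuned model's task vector to its largest-magnitude entries, elect a per-parameter sign, then average only the values that agree with the elected sign. Here is a minimal NumPy sketch of that idea on toy tensors; it is an illustration of the paper's procedure under assumed simplifications (a single flat weight array, a hypothetical `density` trim ratio), not mergekit's actual implementation.

```python
import numpy as np

def ties_merge(base, finetuned, density=0.5):
    """Toy TIES merge: trim task vectors, elect signs, disjoint-mean the survivors.

    base: np.ndarray of base-model weights
    finetuned: list of np.ndarray, each the same shape as `base`
    density: fraction of task-vector entries to keep, by magnitude (assumed knob)
    """
    task_vectors = [ft - base for ft in finetuned]

    # 1. Trim: zero out all but the top-`density` fraction of entries by magnitude.
    trimmed = []
    for tv in task_vectors:
        k = max(1, int(density * tv.size))
        threshold = np.sort(np.abs(tv), axis=None)[-k]
        trimmed.append(np.where(np.abs(tv) >= threshold, tv, 0.0))

    # 2. Elect sign: per parameter, the sign with the larger summed magnitude wins.
    elected = np.sign(sum(trimmed))

    # 3. Disjoint merge: average only entries whose sign matches the elected sign.
    agreeing = [np.where(np.sign(tv) == elected, tv, 0.0) for tv in trimmed]
    counts = sum((a != 0).astype(float) for a in agreeing)
    merged_tv = sum(agreeing) / np.maximum(counts, 1.0)

    return base + merged_tv
```

Note how a parameter where the two donors pull in opposite directions ends up unchanged: the conflicting updates cancel at the sign-election step instead of being averaged into a compromise value.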