DataSnake committed (verified) · Commit 8453c7d · Parent(s): 4c6b5a9

Update README.md
---
license: apache-2.0
language:
- en
base_model:
- LatitudeGames/Wayfarer-2-12B
tags:
- text adventure
- roleplay
- nvfp4
- tensorrt-llm
model_size: 12B
datasets:
- agentlans/distilled-roleplay
pipeline_tag: text-generation
---
![image/jpeg](Wayfarer-2-12B.jpg)

# Wayfarer-2-12B

Quantized NVFP4 weights of the [Wayfarer-2-12B](https://huggingface.co/LatitudeGames/Wayfarer-2-12B) model, for use with NVIDIA Blackwell GPUs.

## Quantization details

Quantized with TensorRT-Model-Optimizer 0.37.0.

Calibrated using the [distilled-roleplay](https://huggingface.co/datasets/agentlans/distilled-roleplay) dataset, tagged in the same ChatML format originally used to train the Wayfarer and Muse models. This was accomplished by adding the following code to the start of `hf_ptq.py`:

```python
from modelopt.torch.utils import dataset_utils

# Register distilled-roleplay as a calibration dataset, rendering each
# conversation into ChatML with the dataset's roles mapped to ChatML roles.
dataset_utils.SUPPORTED_DATASET_CONFIG["distilled-roleplay"] = {
    "config": {
        "path": "agentlans/distilled-roleplay",
        "split": ["train"],
    },
    "preprocess": lambda sample: "".join(
        f"<|im_start|>{ {'system': 'system', 'human': 'user', 'gpt': 'assistant'}[turn['from']] }\n"
        f"{turn['value'].strip()}<|im_end|>\n"
        for turn in sample["conversations"]
    ),
}
```
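As a quick sanity check, the role mapping in that preprocess function can be exercised standalone; the sample below is made up, but follows the dataset's `conversations` schema:

```python
# Standalone re-implementation of the preprocess lambda above, applied
# to an illustrative sample in the distilled-roleplay schema.
ROLE_MAP = {"system": "system", "human": "user", "gpt": "assistant"}

def preprocess(sample):
    return "".join(
        f"<|im_start|>{ROLE_MAP[turn['from']]}\n"
        f"{turn['value'].strip()}<|im_end|>\n"
        for turn in sample["conversations"]
    )

sample = {
    "conversations": [
        {"from": "system", "value": "You're a masterful storyteller."},
        {"from": "human", "value": "> You peer into the darkness."},
        {"from": "gpt", "value": "You have been eaten by a grue."},
    ]
}
print(preprocess(sample))
```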

## Inference

Tested on an RTX 5060 Ti 16GB with TensorRT-LLM, vLLM, and SGLang.

Recommended generation settings (a mix of what it says on the Wayfarer-2-12B model card, the default AI Dungeon settings for Wayfarer-2-12B, and the [AI Dungeon Model Guide](https://help.aidungeon.com/ai-models-and-their-differences) entry for the original Wayfarer-12B):
- Temperature: 1.2
- Top K: 50
- Top P: 0.9
- Min P: 0.025
- Repetition Penalty: 1.05
- Presence Penalty: 0.2
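These settings can be passed as an OpenAI-compatible request body to a server such as vLLM or SGLang. A rough sketch follows; the model id is a placeholder, and `top_k`, `min_p`, and `repetition_penalty` are sampling extensions beyond the base OpenAI schema that these servers accept:

```python
import json

# Illustrative only: the model id is a placeholder, and top_k / min_p /
# repetition_penalty are server-specific extensions, not base OpenAI fields.
payload = {
    "model": "wayfarer-2-12b-nvfp4",  # placeholder model id
    "temperature": 1.2,
    "top_k": 50,
    "top_p": 0.9,
    "min_p": 0.025,
    "repetition_penalty": 1.05,
    "presence_penalty": 0.2,
}
print(json.dumps(payload, indent=2))
```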

## Prompt Format

As mentioned above, the calibration data was tagged with the same ChatML format that had been used to finetune Latitude's 12B models:

```
<|im_start|>system
You're a masterful storyteller and gamemaster. Write in second person present tense (You are), crafting vivid, engaging narratives with authority and confidence.<|im_end|>
<|im_start|>user
> You peer into the darkness.<|im_end|>
<|im_start|>assistant
You have been eaten by a grue.<|im_end|>
```

As such, I would recommend using that format for inference.
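For inference code, a small helper (hypothetical, not part of any library) can assemble that format from OpenAI-style messages, leaving the assistant turn open for the model to continue:

```python
def to_chatml(messages, add_generation_prompt=True):
    """Render OpenAI-style messages into the ChatML format shown above."""
    prompt = "".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    )
    if add_generation_prompt:
        # Open the assistant turn so the model generates from here.
        prompt += "<|im_start|>assistant\n"
    return prompt

messages = [
    {"role": "system", "content": "You're a masterful storyteller and gamemaster."},
    {"role": "user", "content": "> You peer into the darkness."},
]
print(to_chatml(messages))
```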

## Credits

Wayfarer-2-12B was made by [Latitude Games](https://huggingface.co/LatitudeGames) with help from [Gryphe Padar](https://huggingface.co/Gryphe).