RWKV7-G1 "GooseOne" pure RNN reasoning model
These are BASE models (pretrained with web/code/synthetic + instruction/chat/reasoning data), suitable for post-training and fine-tuning (check https://huggingface.co/spaces/Jellyfish042/UncheatableEval to see their performance at language modeling).
More info & Gradio demo: https://rwkv.com/
For developers: https://github.com/BlinkDL/RWKV-LM
RWKV-7 pth => GGUF script: https://github.com/MollySophia/rwkv-mobile/blob/master/converter/convert_rwkv_pth_to_gguf.py
GGUF: https://huggingface.co/collections/shoumenchougou/rwkv7-gxx-gguf
Ollama GGUF: https://ollama.com/mollysama
Use rwkv pip package 0.8.29+ for RWKV-7 inference: https://pypi.org/project/rwkv/
Efficient inference project: https://github.com/BlinkDL/Albatross
RWKV APP: https://github.com/RWKV-APP/RWKV_APP (local inference on Android/iOS)
Always use the latest models (those with the newest date); they are better at everything.
Decoding suggestions (note: these are for the RWKV pip package, which applies temperature after top-p):
Chat: temp 1, topp 0.5, alpha_presence 2, alpha_frequency 0.1, alpha_decay 0.99
Creative (great for fiction etc.): temp 0.6, topp 0.6 ~ 0.8, alpha_presence 2, alpha_frequency 0.2, alpha_decay 0.99
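The "temperature after top-p" order matters: tokens are filtered by the top-p cutoff first, and only the survivors are reshaped by temperature. A minimal pure-Python sketch of that order, for intuition only (the package's actual sampler works on tensors and also applies the alpha presence/frequency penalties to the logits):

```python
import math

def top_p_then_temperature(logits, temperature=1.0, top_p=0.5):
    """Return sampling probabilities: top-p cutoff first, then temperature.

    Sketch of the ordering used by the RWKV pip package's sampler;
    this is not its actual implementation.
    """
    # softmax over the logits
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    probs = [e / z for e in exps]

    # top-p (nucleus) cutoff: keep the smallest set of tokens whose
    # cumulative probability reaches top_p; zero out the rest
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    cum, cutoff = 0.0, 0.0
    for i in order:
        cum += probs[i]
        if cum >= top_p:
            cutoff = probs[i]
            break
    probs = [p if p >= cutoff else 0.0 for p in probs]

    # temperature is applied AFTER the cutoff: p ** (1/T), then renormalize
    probs = [p ** (1.0 / temperature) for p in probs]
    z = sum(probs)
    return [p / z for p in probs]
```

With top_p 0.5 a sharply peaked distribution often collapses to a single candidate, which is why the chat preset pairs it with temp 1.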
There should not be any whitespace at the end of your input (so strip it), or you will upset the tokenizer and see non-English responses.
Chat prompt (note: it is better to replace all \n\n in USER_PROMPT with \n, since I use \n\n as the "chat round separator" in the pretraining data):
System: YOU_CAN_USE_SYSTEM_IF_NEEDED
User: PREVIOUS_STUFF
Assistant: PREVIOUS_STUFF
User: USER_PROMPT
Assistant:
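Putting the formatting rules above together, here is a small helper that builds the chat prompt, replaces \n\n inside each message with \n, and strips trailing whitespace. A sketch: the function name and the (user, assistant) history format are my own, not part of the rwkv package:

```python
def build_chat_prompt(rounds, user_prompt, system=None):
    """Build an RWKV chat prompt string.

    rounds: list of (user_text, assistant_text) tuples from previous turns.
    \n\n is the chat-round separator, so it must not appear inside a
    message; trailing whitespace would upset the tokenizer, so strip it.
    """
    clean = lambda s: s.replace("\n\n", "\n").strip()
    parts = []
    if system:
        parts.append(f"System: {clean(system)}")
    for u, a in rounds:
        parts.append(f"User: {clean(u)}")
        parts.append(f"Assistant: {clean(a)}")
    parts.append(f"User: {clean(user_prompt)}")
    parts.append("Assistant:")  # model generates from here; no trailing space
    return "\n\n".join(parts)
```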
Think prompt (for hard prompts):
User: USER_PROMPT
Assistant: <think
Fake think prompt (great result, highly recommended):
User: USER_PROMPT
Assistant: <think></think
Think prompt, alternative style, for G1c and newer models. Note the space between USER_PROMPT and "(think)":
User: USER_PROMPT (think)
Assistant: <think
Shorter think, same style:
User: USER_PROMPT (think a bit)
Assistant: <think
Longer think, same style:
User: USER_PROMPT (think a lot)
Assistant: <think
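The think-prompt variants above differ only in the suffix appended to the user prompt and in how the assistant turn is primed. A sketch that covers all of them (helper name and style keywords are my own):

```python
def build_think_prompt(user_prompt, style="think"):
    """Build a single-round think prompt in one of the styles above.

    style: "think"          -> prime with "<think"         (classic think)
           "fake"           -> prime with "<think></think" (fake think, recommended)
           "(think)"        -> G1c+ alternative style
           "(think a bit)"  -> shorter think, same style
           "(think a lot)"  -> longer think, same style
    """
    user_prompt = user_prompt.replace("\n\n", "\n").strip()
    if style == "fake":
        return f"User: {user_prompt}\n\nAssistant: <think></think"
    if style in ("(think)", "(think a bit)", "(think a lot)"):
        # note the single space before the "(think...)" marker
        return f"User: {user_prompt} {style}\n\nAssistant: <think"
    return f"User: {user_prompt}\n\nAssistant: <think"
```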
FIM prompt (for G1c and newer models, works for text & code & everything):
✿prefix✿When I was young, I only liked to✿suffix✿and that’s how I first got interested in AI research.✿middle✿
Better (recommended):
✿prefix✿✿suffix✿and that’s how I first got interested in AI research.✿middle✿When I was young, I only liked to
Note "✿" is always tokenized as a single token by the RWKV tokenizer, which is why I picked it.
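A small helper for the recommended FIM layout (the function name is my own): the prefix slot is left empty and the actual prefix is placed after the ✿middle✿ marker, so the model's continuation from there is the infilled middle:

```python
def build_fim_prompt(prefix, suffix):
    """Build the recommended RWKV FIM prompt (G1c and newer).

    Layout: ✿prefix✿✿suffix✿<suffix>✿middle✿<prefix>
    "✿" is a single token in the RWKV tokenizer. The text the model
    generates after <prefix> is the middle to insert between
    prefix and suffix.
    """
    return f"✿prefix✿✿suffix✿{suffix}✿middle✿{prefix}"
```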
Gxx = Data Version
G0x = less than 1 epoch, since training a large model for a full epoch is expensive :(
G0 G0a G0a2 G0a3 ... G0b ... = adding more (newer and better) data, so G0a has better quality (but less) data than G1
G1x = more than 1 epoch
G1 G1a G1a2 G1a3 ... G1b ... = adding more (newer and better) data, note G1a has better quality (and more) data than G0a