How to use from Docker Model Runner
# Gated model: log in with a HF token that has gated-access permission
hf auth login
docker model run hf.co/g4me/QWiki-Base-LR1e5

You need to agree to share your contact information to access this model

This repository is publicly accessible, but you have to accept the conditions to access its files and content.

Model Card for QWiki-Base-LR1e5

This model is a fine-tuned version of Qwen/Qwen3-1.7B-Base. It has been trained using TRL.

Quick start

from transformers import pipeline

question = "If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?"
generator = pipeline("text-generation", model="g4me/QWiki-Base-LR1e5", device="cuda")
output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
print(output["generated_text"])
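The quick start hardcodes device="cuda", which fails on machines without a GPU. A minimal sketch of a fallback, assuming PyTorch is the backend; `pick_device` and `build_messages` are hypothetical helper names introduced here for illustration, not part of the model card or of Transformers:

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch  # imported lazily so the helper also works without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def build_messages(question: str) -> list[dict[str, str]]:
    """Wrap a question in the single-turn chat format used in the quick start."""
    return [{"role": "user", "content": question}]
```

With these helpers, the pipeline call could become pipeline("text-generation", model="g4me/QWiki-Base-LR1e5", device=pick_device()), and the input could be built with build_messages(question). Generation on CPU will be considerably slower.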

Training procedure

This model was trained with supervised fine-tuning (SFT).
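For readers unfamiliar with the objective, supervised fine-tuning minimizes the negative log-likelihood of the reference tokens. The sketch below is a generic illustration in plain Python, not TRL's implementation. The card does not say which dataset was used or whether prompt tokens were excluded from the loss, so the mask and the probabilities here are purely hypothetical:

```python
import math


def sft_loss(token_log_probs, loss_mask):
    """Mean negative log-likelihood over the positions selected by loss_mask."""
    if len(token_log_probs) != len(loss_mask):
        raise ValueError("token_log_probs and loss_mask must have the same length")
    selected = [lp for lp, keep in zip(token_log_probs, loss_mask) if keep]
    if not selected:
        raise ValueError("loss_mask selects no tokens")
    return -sum(selected) / len(selected)


# Hypothetical probabilities a model assigned to each reference token.
probs = [0.9, 0.5, 0.25, 0.8]
log_probs = [math.log(p) for p in probs]

# Loss over every token in the sequence.
full_loss = sft_loss(log_probs, [1, 1, 1, 1])

# Loss over the last two tokens only (assumed here to be the completion).
completion_only_loss = sft_loss(log_probs, [0, 0, 1, 1])
```

Masking the prompt positions is one common design choice; training on the full sequence is another. Which one applies to this model is not stated.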

Framework versions

  • TRL: 0.29.0
  • Transformers: 5.2.0
  • PyTorch: 2.8.0a0+34c6371d24.nv25.8
  • Datasets: 4.6.0
  • Tokenizers: 0.22.2
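To compare a local environment against the versions listed above, a small check along these lines may help. It is a sketch using only the standard library; the distribution names (for example "torch" for PyTorch) are assumptions about how the packages are published, and `compare_versions` is a hypothetical helper:

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the list above; keys are assumed distribution names.
CARD_VERSIONS = {
    "trl": "0.29.0",
    "transformers": "5.2.0",
    "torch": "2.8.0a0+34c6371d24.nv25.8",
    "datasets": "4.6.0",
    "tokenizers": "0.22.2",
}


def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def compare_versions(expected, lookup=installed_version):
    """Map each package name to "match", "missing", or a "differs" note."""
    report = {}
    for name, want in expected.items():
        have = lookup(name)
        if have is None:
            report[name] = "missing"
        elif have == want:
            report[name] = "match"
        else:
            report[name] = f"differs (installed {have})"
    return report


if __name__ == "__main__":
    for name, status in compare_versions(CARD_VERSIONS).items():
        print(f"{name}: {status}")
```

The PyTorch entry is a vendor pre-release build, so a mismatch there is expected on most machines and does not by itself mean the model will fail to load.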

Citations

Cite TRL as:

@software{vonwerra2020trl,
  title   = {{TRL: Transformers Reinforcement Learning}},
  author  = {von Werra, Leandro and Belkada, Younes and Tunstall, Lewis and Beeching, Edward and Thrush, Tristan and Lambert, Nathan and Huang, Shengyi and Rasul, Kashif and Gallouédec, Quentin},
  license = {Apache-2.0},
  url     = {https://github.com/huggingface/trl},
  year    = {2020}
}
Safetensors: model size 2B params, tensor type F32