Computron-Bots-1.7B-R1-GGUF
Computron-Bots-1.7B-R1 is a general-purpose, safe question-answering model fine-tuned from Qwen3-1.7B, designed for direct and efficient factual responses without complex reasoning chains. It provides straightforward, accurate answers across diverse topics, making it well suited to knowledge retrieval, information systems, and applications that need quick, reliable responses.
Model Files
| File Name | Size | Format | Description |
|---|---|---|---|
| Computron-Bots-1.7B-R1.F32.gguf | 6.89 GB | F32 | Full precision 32-bit floating point |
| Computron-Bots-1.7B-R1.F16.gguf | 3.45 GB | F16 | Half precision 16-bit floating point |
| Computron-Bots-1.7B-R1.BF16.gguf | 3.45 GB | BF16 | Brain floating point 16-bit |
| Computron-Bots-1.7B-R1.Q8_0.gguf | 1.83 GB | Q8_0 | 8-bit quantized |
| Computron-Bots-1.7B-R1.Q6_K.gguf | 1.42 GB | Q6_K | 6-bit quantized |
| Computron-Bots-1.7B-R1.Q5_K_M.gguf | 1.26 GB | Q5_K_M | 5-bit quantized, medium quality |
| Computron-Bots-1.7B-R1.Q5_K_S.gguf | 1.23 GB | Q5_K_S | 5-bit quantized, small quality |
| Computron-Bots-1.7B-R1.Q4_K_M.gguf | 1.11 GB | Q4_K_M | 4-bit quantized, medium quality |
| Computron-Bots-1.7B-R1.Q4_K_S.gguf | 1.06 GB | Q4_K_S | 4-bit quantized, small quality |
| Computron-Bots-1.7B-R1.Q3_K_L.gguf | 1 GB | Q3_K_L | 3-bit quantized, large quality |
| Computron-Bots-1.7B-R1.Q3_K_M.gguf | 940 MB | Q3_K_M | 3-bit quantized, medium quality |
| Computron-Bots-1.7B-R1.Q3_K_S.gguf | 867 MB | Q3_K_S | 3-bit quantized, small quality |
| Computron-Bots-1.7B-R1.Q2_K.gguf | 778 MB | Q2_K | 2-bit quantized |
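To fetch a single file from the table rather than the whole repository, one option is huggingface_hub's hf_hub_download. This is a minimal sketch; the Q4_K_M file is just an example pick from the table, not a recommendation.

```python
# Minimal sketch: download one quant from the Model Files table with huggingface_hub.
# The chosen filename (Q4_K_M) is an example; any file listed above works.
from huggingface_hub import hf_hub_download

model_path = hf_hub_download(
    repo_id="prithivMLmods/Computron-Bots-1.7B-R1-GGUF",
    filename="Computron-Bots-1.7B-R1.Q4_K_M.gguf",
)
print(model_path)  # local path to the downloaded GGUF file
```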
Quants Usage
The files above are sorted by size, not necessarily by quality; IQ-quants are often preferable over similarly sized non-IQ quants. ikawrakow has published a handy graph comparing some of the lower-quality quant types (lower is better).
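Since the table is ordered by size, a simple way to pick a quant is to take the largest file that fits your memory budget. The sketch below hard-codes a few sizes from the Model Files table; the 1.5 GB budget is purely an assumption for illustration.

```python
# Illustrative sketch: pick the largest quant from the table that fits a memory budget.
# Sizes are copied from the Model Files table; the 1.5 GB budget is an assumption.
quant_sizes_gb = {
    "Computron-Bots-1.7B-R1.Q8_0.gguf": 1.83,
    "Computron-Bots-1.7B-R1.Q6_K.gguf": 1.42,
    "Computron-Bots-1.7B-R1.Q5_K_M.gguf": 1.26,
    "Computron-Bots-1.7B-R1.Q4_K_M.gguf": 1.11,
    "Computron-Bots-1.7B-R1.Q3_K_M.gguf": 0.94,
    "Computron-Bots-1.7B-R1.Q2_K.gguf": 0.778,
}

budget_gb = 1.5  # assumed RAM/VRAM available for model weights
candidates = [f for f, size in quant_sizes_gb.items() if size <= budget_gb]
best = max(candidates, key=quant_sizes_gb.get)  # largest file that still fits
print(best)  # -> Computron-Bots-1.7B-R1.Q6_K.gguf
```

Note that actual memory use at runtime will be somewhat higher than the file size once the context (KV cache) is allocated.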
To run the model with llama-cpp-python:

```python
# !pip install llama-cpp-python
from llama_cpp import Llama

# Download a quant from the repo and load it; the Q4_K_M file is one example
# from the Model Files table above -- any of the listed GGUF files can be used.
llm = Llama.from_pretrained(
    repo_id="prithivMLmods/Computron-Bots-1.7B-R1-GGUF",
    filename="Computron-Bots-1.7B-R1.Q4_K_M.gguf",
)

# Ask a direct factual question via the chat-completion API.
llm.create_chat_completion(
    messages=[
        {
            "role": "user",
            "content": "What is the capital of France?"
        }
    ]
)
```
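For interactive use, the same chat-completion call can stream tokens and take a system message. The snippet below is a sketch of that pattern; the system prompt wording is an assumption, not part of the model card.

```python
# Sketch: streaming a response with an added system message.
# `llm` is the Llama instance created above; the system prompt text is illustrative.
stream = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "Answer factual questions directly and concisely."},
        {"role": "user", "content": "What is the capital of France?"},
    ],
    stream=True,  # yields incremental chunks instead of one full response
)

for chunk in stream:
    delta = chunk["choices"][0]["delta"]
    if "content" in delta:
        print(delta["content"], end="", flush=True)
```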