πŸš— Granite-3.1-2b-FourWheeler

This model is a fine-tuned version of IBM Granite 3.1 2B Instruct, trained on a custom Four Wheeler dataset.

It has been trained using Unsloth for faster and memory-efficient fine-tuning.

πŸ“‚ Included Files

| Filename | Type | Description |
| --- | --- | --- |
| `model.safetensors` | Safetensors | Full unquantized model weights (for Python/Transformers). |
| `granite-2b-q4_k_m.gguf` | GGUF (Q4_K_M) | **Recommended.** 4-bit quantized; fast and low-memory (approx. 1.5 GB). |
| `granite-2b-fp16.gguf` | GGUF (FP16) | Full-precision 16-bit conversion (not quantized); larger file (approx. 4.8 GB). |
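As a back-of-envelope check on the file sizes above, a GGUF file is roughly *parameters Γ— bits-per-weight / 8*, plus metadata and a few tensors kept at higher precision. The figures below are illustrative estimates (assuming ~2.5B parameters for Granite 3.1 2B and ~4.8 effective bits per weight for Q4_K_M), not exact file sizes:

```python
def gguf_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Rough GGUF file size: params * bits / 8, in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9

n_params = 2.5e9  # Granite 3.1 2B has roughly 2.5B parameters

# FP16 stores 16 bits per weight
print(f"FP16  : ~{gguf_size_gb(n_params, 16):.1f} GB")   # ~5.0 GB
# Q4_K_M averages roughly 4.8 bits per weight (mixed 4/6-bit blocks plus scales)
print(f"Q4_K_M: ~{gguf_size_gb(n_params, 4.8):.1f} GB")  # ~1.5 GB
```

These estimates line up with the ~1.5 GB and ~4.8 GB files listed above once container overhead is accounted for.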

πŸ’» How to Use (GGUF / Llama.cpp)

You can use the .gguf files with LM Studio, Ollama, or llama.cpp.

CLI Command:

```
./llama-cli -m granite-2b-q4_k_m.gguf -p "User: Which is the best 4-wheeler for off-roading?\nAssistant:" -cnv
```
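The examples in this card use a plain `User:` / `Assistant:` prompt layout. A small helper (hypothetical, just mirroring the format shown in the examples) keeps that formatting consistent across multi-turn conversations:

```python
def build_prompt(user_message, history=None):
    """Format a conversation in the plain User/Assistant layout used in this card.

    history is a list of (user, assistant) turns; the prompt ends with
    'Assistant:' so the model continues from there.
    """
    lines = []
    for user_turn, assistant_turn in history or []:
        lines.append(f"User: {user_turn}")
        lines.append(f"Assistant: {assistant_turn}")
    lines.append(f"User: {user_message}")
    lines.append("Assistant:")
    return "\n".join(lines)

print(build_prompt("Which is the best 4-wheeler for off-roading?"))
# User: Which is the best 4-wheeler for off-roading?
# Assistant:
```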

🐍 How to Use (Python / Transformers)

To use the full model in Python:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Prithwiraj731/Granite-3.1-2b-FourWheeler"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

prompt = "User: Tell me about the engine specifications of a sedan car.\nAssistant:"

# Send inputs to whichever device device_map placed the model on
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
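Note that `generate` returns the prompt tokens followed by the new tokens, so decoding `outputs[0]` prints the prompt as well as the reply. To show only the reply, slice off the prompt length first; the idea in plain Python (dummy token IDs stand in for real tensors):

```python
prompt_ids = [101, 7592, 102]           # stand-ins for the tokenized prompt
output_ids = prompt_ids + [2023, 2003]  # generate() returns prompt + new tokens

new_token_ids = output_ids[len(prompt_ids):]  # keep only the generated part
print(new_token_ids)  # β†’ [2023, 2003]

# With real tensors the equivalent is:
#   outputs[0][inputs["input_ids"].shape[-1]:]
```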

πŸ”§ Training Details

- **Base Model:** ibm-granite/granite-3.1-2b-instruct
- **Framework:** Unsloth (PyTorch)
- **Quantization:** Q4_K_M & FP16 GGUF
- **Fine-tuning type:** LoRA (Low-Rank Adaptation)
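LoRA freezes the base weight matrix W and learns only a low-rank update, giving an effective weight W' = W + (Ξ±/r)Β·BΒ·A, where A is rΓ—d_in and B is d_outΓ—r with r much smaller than the layer dimensions. A minimal NumPy sketch of the idea (illustrative only, not the Unsloth training code):

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r, alpha = 64, 64, 8, 16   # r << d: far fewer trainable params

W = rng.normal(size=(d_out, d_in))      # frozen base weight
A = rng.normal(size=(r, d_in)) * 0.01   # trainable rank-r factor
B = np.zeros((d_out, r))                # zero-init so the update starts at 0

W_adapted = W + (alpha / r) * (B @ A)   # effective weight after adaptation

# Trainable parameters: r*(d_in + d_out) vs d_in*d_out for full fine-tuning
print(r * (d_in + d_out), "vs", d_in * d_out)  # β†’ 1024 vs 4096
```

Because B starts at zero, the adapted model initially behaves exactly like the base model, and training only ever touches the small A and B factors.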

Finetuned with ❀️ using Unsloth.