Roy

Model Overview

Roy is a fine-tuned large language model based on
mistralai/Mistral-7B-Instruct-v0.2.

The model was trained using QLoRA with a resumable streaming pipeline and later merged into the base model to produce a single standalone checkpoint (no LoRA adapter required at inference time).

This model is optimized for:

  • Instruction following
  • Conversational responses
  • General reasoning and explanation tasks

Base Model

  • Base: Mistral-7B-Instruct-v0.2
  • Architecture: Decoder-only Transformer
  • Parameters: ~7B
  • Context Length: 2048 tokens

Training Dataset

The model was trained on a custom tokenized dataset, processed as follows.

Dataset Processing

  • Fixed padding and truncation
  • Removed malformed / corrupted samples
  • Validated against NaN and overflow issues
  • Optimized for streaming-based training
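A per-sample cleaning pass along these lines would implement the steps above. This is a hypothetical sketch, not the actual pipeline code: the `clean_sample` helper, the `MAX_LEN` value, and the pad id are assumptions for illustration.

```python
MAX_LEN = 2048  # assumed; matches the context length listed above
PAD_ID = 0      # assumed pad token id

def clean_sample(input_ids, max_len=MAX_LEN, pad_id=PAD_ID):
    """Hypothetical cleaner: drop malformed rows, then apply
    fixed truncation and right-padding to a constant length."""
    # Reject malformed / corrupted samples (empty, non-int, or negative ids)
    if not input_ids or any(not isinstance(t, int) or t < 0 for t in input_ids):
        return None
    ids = input_ids[:max_len]                     # truncation
    ids = ids + [pad_id] * (max_len - len(ids))   # fixed padding
    return ids
```

Samples returning `None` would be dropped before streaming; NaN/overflow checks would apply to any float fields (e.g. losses) in the same spirit.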

Training Method

  • Fine-tuning method: QLoRA
  • Quantization: 4-bit (NF4)
  • Optimizer: AdamW
  • Learning rate: 2e-4
  • LoRA rank (r): 32
  • Target modules:
    q_proj, k_proj, v_proj, o_proj,
    gate_proj, up_proj, down_proj
  • Gradient checkpointing: Enabled
  • Training style: Streaming + resumable
  • Checkpointing: Hugging Face Hub (HF-only)
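A configuration along these lines would reproduce the listed settings with transformers and peft. This is a sketch, not the actual training script: `lora_alpha`, `lora_dropout`, and the compute dtype are assumptions not stated above.

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization for the frozen base weights (QLoRA)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,  # assumed
)

# LoRA adapter over the attention and MLP projections listed above
lora_config = LoraConfig(
    r=32,
    lora_alpha=32,    # assumed; not stated in the card
    lora_dropout=0.05,  # assumed
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
```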

After training, the LoRA adapter was merged into the base model weights to create this final model.
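The merge step folds the low-rank update into the frozen weights, which is why inference needs no adapter. A toy NumPy sketch of the arithmetic (sizes and scaling are illustrative; the real run used r=32):

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, alpha = 8, 2, 16                   # toy sizes; real run used r=32
W = rng.standard_normal((d, d))          # frozen base weight
A = rng.standard_normal((r, d))          # LoRA down-projection
B = rng.standard_normal((d, r)) * 0.01   # LoRA up-projection (trained)

# Merging: W' = W + (alpha / r) * B @ A
W_merged = W + (alpha / r) * (B @ A)

# The merged weight reproduces base-plus-adapter outputs exactly
x = rng.standard_normal(d)
base_plus_adapter = W @ x + (alpha / r) * (B @ (A @ x))
assert np.allclose(W_merged @ x, base_plus_adapter)
```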


Inference

This model can be used directly without any LoRA adapter.
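Prompts should follow the Mistral-Instruct `[INST] ... [/INST]` convention used in the chat loop below. A hypothetical pure-Python helper for multi-turn formatting (the tokenizer adds the leading BOS token itself, so none is included here):

```python
def format_mistral_prompt(messages):
    """Build a Mistral-Instruct prompt string from alternating
    user/assistant messages. Hypothetical helper for illustration."""
    prompt = ""
    for m in messages:
        if m["role"] == "user":
            prompt += f"[INST] {m['content']} [/INST]"
        else:  # assistant turns close with the EOS marker
            prompt += f" {m['content']}</s>"
    return prompt
```

For a single turn this reduces to the `f"[INST] {user_input} [/INST]"` string used in the example below.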

Example (Transformers)

!pip uninstall -y transformers peft accelerate torch safetensors numpy
!pip install numpy==1.26.4
!pip install torch==2.2.2
!pip install transformers==4.41.2
!pip install peft==0.11.1
!pip install accelerate==0.30.1
!pip install safetensors==0.4.3

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# -----------------------------
# CONFIG
# -----------------------------
MODEL_ID = "souvik18/Roy"
DTYPE = torch.float16   # use float16 for GPU

# -----------------------------
# LOAD TOKENIZER & MODEL
# -----------------------------
print("🔹 Loading tokenizer...")
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
tokenizer.pad_token = tokenizer.eos_token

print("🔹 Loading model...")
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=DTYPE,
    device_map="auto"
)
model.eval()

print("\n✅ Model loaded successfully")
print("Type 'exit' or 'quit' to stop\n")

# -----------------------------
# CHAT LOOP
# -----------------------------
while True:
    user_input = input("🧑 You: ").strip()

    if user_input.lower() in ["exit", "quit"]:
        print("👋 Bye!")
        break

    prompt = f"[INST] {user_input} [/INST]"

    inputs = tokenizer(
        prompt,
        return_tensors="pt"
    ).to(model.device)

    with torch.no_grad():
        output = model.generate(
            **inputs,
            max_new_tokens=200,
            temperature=0.7,
            top_p=0.9,
            do_sample=True,
            repetition_penalty=1.1,
            eos_token_id=tokenizer.eos_token_id,
        )

    # Decode only the newly generated tokens, skipping the echoed prompt
    response = tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:],
        skip_special_tokens=True,
    )
    print(f"\nRoy: {response}\n")