souvik18/Roy


souvik18 committed on Dec 17, 2025

Commit 73bb5df · verified · 1 Parent(s): e9a4e09

Create README.md

Files changed (1)

README.md +114 -0


	@@ -0,0 +1,114 @@

+---
+language: en
+license: apache-2.0
+base_model: mistralai/Mistral-7B-Instruct-v0.2
+datasets:
+- souvik18/mistral_tokenized_2048_fixed_v2
+pipeline_tag: text-generation
+library_name: transformers
+tags:
+- mistral
+- lora
+- qlora
+- instruction-tuning
+- causal-lm
+metrics:
+- accuracy
+---
+# Roy
+## Model Overview
+**Roy** is a fine-tuned large language model based on
+[`mistralai/Mistral-7B-Instruct-v0.2`](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2).
+A LoRA adapter was trained with **QLoRA** in a resumable streaming pipeline, then **merged into the base model** to produce a **single standalone checkpoint** (no LoRA adapter is required at inference time).
+This model is intended for:
+- Instruction following
+- Conversational responses
+- General reasoning and explanation tasks
+---
+## Base Model
+- **Base:** Mistral-7B-Instruct-v0.2
+- **Architecture:** Decoder-only Transformer
+- **Parameters:** ~7B
+- **Context Length:** 32k tokens for the base model; fine-tuning used sequences of 2048 tokens
+---
+## Training Dataset
+The model was trained on a custom tokenized dataset:
+- **Dataset name:** `mistral_tokenized_2048_fixed_v2`
+- **Dataset repository:**
+  https://huggingface.co/datasets/souvik18/mistral_tokenized_2048_fixed_v2
+- **Owner:** souvik18
+- **Format:** Pre-tokenized `input_ids`
+- **Sequence length:** 2048
+- **Tokenizer:** Mistral tokenizer
+- **Dataset size:** ~10.7M tokens
+### Dataset Processing
+- Fixed padding and truncation
+- Removed malformed / corrupted samples
+- Validated against NaN and overflow issues
+- Optimized for streaming-based training
+---
+## Training Method
+- **Fine-tuning method:** QLoRA
+- **Quantization:** 4-bit (NF4)
+- **Optimizer:** AdamW
+- **Learning rate:** 2e-4
+- **LoRA rank (r):** 32
+- **Target modules:**
+  `q_proj`, `k_proj`, `v_proj`, `o_proj`,
+  `gate_proj`, `up_proj`, `down_proj`
+- **Gradient checkpointing:** Enabled
+- **Training style:** Streaming + resumable
+- **Checkpointing:** Checkpoints stored only on the Hugging Face Hub
+After training, the LoRA adapter was **merged into the base model weights** to create this final model.
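As a rough sanity check on the settings above, the size of the adapter can be estimated from the LoRA rank and the shapes of the targeted layers. The sketch below is illustrative only: the layer dimensions are taken from the published Mistral-7B configuration rather than from this card, and it assumes standard LoRA (one `A` and one `B` matrix per targeted module, no bias terms) applied to every decoder layer.

```python
# Dimensions from the published Mistral-7B configuration (an assumption here:
# the card itself does not list them).
HIDDEN = 4096          # hidden_size
INTERMEDIATE = 14336   # intermediate_size (MLP width)
KV_DIM = 1024          # num_key_value_heads (8) * head_dim (128)
NUM_LAYERS = 32        # num_hidden_layers
RANK = 32              # LoRA rank (r) stated in the card

# (in_features, out_features) for each target module named in the card.
TARGET_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params_per_layer(rank: int = RANK) -> int:
    """Each targeted module gains A (rank x in) and B (out x rank)."""
    return sum(rank * (n_in + n_out) for n_in, n_out in TARGET_SHAPES.values())


def lora_params_total(rank: int = RANK, num_layers: int = NUM_LAYERS) -> int:
    """Adapter parameters summed over all decoder layers."""
    return num_layers * lora_params_per_layer(rank)


if __name__ == "__main__":
    print(f"{lora_params_total():,} trainable adapter parameters")
```

Under these assumptions the adapter holds about 84M trainable parameters, a little over 1% of the base model's size, which is why only the adapter needs gradients during QLoRA training.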
+---
+## Inference
+This model can be used **directly** without any LoRA adapter.
+### Example (Transformers)
+```python
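+from transformers import AutoTokenizer, AutoModelForCausalLM
+import torch
+
+model_id = "souvik18/Roy"
+
+# Load the merged checkpoint in half precision.
+# device_map="auto" requires the `accelerate` package.
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+model = AutoModelForCausalLM.from_pretrained(
+    model_id,
+    torch_dtype=torch.float16,
+    device_map="auto"
+)
+
+# Mistral-Instruct prompt format: the instruction sits between [INST] and [/INST].
+prompt = "[INST] Explain Newton's laws in simple words [/INST]"
+inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+
+# temperature and top_p only take effect because do_sample=True.
+with torch.no_grad():
+    output = model.generate(
+        **inputs,
+        max_new_tokens=200,
+        temperature=0.7,
+        top_p=0.9,
+        do_sample=True
+    )
+
+# The decoded text contains the prompt followed by the generated answer.
+print(tokenizer.decode(output[0], skip_special_tokens=True))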