---
language: en
license: apache-2.0
base_model: mistralai/Mistral-7B-Instruct-v0.2
datasets:
- souvik18/mistral_tokenized_2048_fixed_v2
pipeline_tag: text-generation
library_name: transformers
tags:
- mistral
- lora
- qlora
- instruction-tuning
- causal-lm
metrics:
- accuracy
---

# Roy

## Model Overview

**Roy** is a fine-tuned large language model based on
[`mistralai/Mistral-7B-Instruct-v0.2`](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2).

The model was trained with **QLoRA** through a resumable streaming pipeline; the resulting LoRA adapter was then **merged into the base model** to produce a **single standalone checkpoint** (no LoRA adapter required at inference time).

This model is optimized for:
- Instruction following
- Conversational responses
- General reasoning and explanation tasks

---

## Base Model

- **Base:** Mistral-7B-Instruct-v0.2
- **Architecture:** Decoder-only Transformer
- **Parameters:** ~7B
- **Context Length:** 2048 tokens

---

## Training Dataset

The model was trained on a custom pre-tokenized dataset:

- **Dataset name:** `mistral_tokenized_2048_fixed_v2`
- **Dataset repository:** https://huggingface.co/datasets/souvik18/mistral_tokenized_2048_fixed_v2
- **Owner:** souvik18
- **Format:** Pre-tokenized `input_ids`
- **Sequence length:** 2048
- **Tokenizer:** Mistral tokenizer
- **Dataset size:** ~10.7M tokens

### Dataset Processing
- Fixed padding and truncation
- Removed malformed / corrupted samples
- Validated against NaN and overflow issues
- Optimized for streaming-based training

---
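The padding/truncation fix listed above can be illustrated with a minimal sketch. This is plain Python, not the actual preprocessing script (which is not published); the pad id `0` is a placeholder, not necessarily the tokenizer's real pad token:

```python
def pad_or_truncate(input_ids, max_len=2048, pad_id=0):
    """Clamp a token-id sequence to exactly max_len tokens.

    Sequences longer than max_len are truncated; shorter ones are
    right-padded with pad_id (a placeholder id in this sketch).
    """
    if len(input_ids) >= max_len:
        return input_ids[:max_len]
    return input_ids + [pad_id] * (max_len - len(input_ids))

# Small max_len for readability; real sequences use max_len=2048.
print(pad_or_truncate([5, 6, 7], max_len=5))           # -> [5, 6, 7, 0, 0]
print(pad_or_truncate([1, 2, 3, 4, 5, 6], max_len=5))  # -> [1, 2, 3, 4, 5]
```

Every stored sample then has a fixed length of 2048, which keeps streaming batches shape-stable.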

## Training Method

- **Fine-tuning method:** QLoRA
- **Quantization:** 4-bit (NF4)
- **Optimizer:** AdamW
- **Learning rate:** 2e-4
- **LoRA rank (r):** 32
- **Target modules:** `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`
- **Gradient checkpointing:** Enabled
- **Training style:** Streaming + resumable
- **Checkpointing:** Hugging Face Hub (HF-only)

After training, the LoRA adapter was **merged into the base model weights** to create this final model.

---
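The recipe above can be sketched with `transformers`, `peft`, and `bitsandbytes`. The actual training script is not published, so treat this as an illustration of the listed hyperparameters under common QLoRA conventions, not the author's code:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 quantization, matching the settings listed above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.2",
    quantization_config=bnb_config,
    device_map="auto",
)
# Prepares the quantized model for training and enables
# gradient checkpointing by default.
base = prepare_model_for_kbit_training(base)

lora_config = LoraConfig(
    r=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)

# After training, the adapter is folded back into the base weights
# (typically after reloading the base model in fp16) to produce the
# standalone checkpoint published here:
# merged = model.merge_and_unload()
# merged.save_pretrained("Roy")
```

Merging removes the adapter indirection entirely, which is why inference below needs no `peft` dependency.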

## Inference

This model can be used **directly**, without any LoRA adapter.

### Example (Transformers)

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "souvik18/Roy"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

prompt = "[INST] Explain Newton's laws in simple words [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    output = model.generate(
        **inputs,
        max_new_tokens=200,
        temperature=0.7,
        top_p=0.9,
        do_sample=True,
    )

print(tokenizer.decode(output[0], skip_special_tokens=True))
```
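
For multi-turn use, the `[INST]` prompt above generalizes to alternating user/assistant turns. A minimal helper is sketched below; it follows the common Mistral-Instruct convention (`[INST] … [/INST]` around user turns, `</s>` after completed assistant turns) but omits the `<s>` BOS token, which the tokenizer adds automatically:

```python
def build_mistral_prompt(turns):
    """Format (user, assistant) turn pairs into a Mistral-style prompt.

    turns: list of (user_msg, assistant_msg_or_None) pairs; leave the
    final assistant slot as None so the model generates it.
    """
    parts = []
    for user, assistant in turns:
        parts.append(f"[INST] {user} [/INST]")
        if assistant is not None:
            parts.append(f" {assistant}</s>")
    return "".join(parts)

print(build_mistral_prompt([("Explain Newton's laws in simple words", None)]))
# -> [INST] Explain Newton's laws in simple words [/INST]
```

Feeding the returned string to `tokenizer(...)` as in the example above keeps the chat history in the format the base model was instruction-tuned on.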