---
base_model: HuggingFaceTB/SmolLM3-3B-Base
library_name: peft
tags:
- base_model:adapter:HuggingFaceTB/SmolLM3-3B-Base
- lora
- sft
- transformers
- trl
license: mit
datasets:
- teknium/OpenHermes-2.5
language:
- en
---


# Model Card: SmolLM3-Chat-v1-adapter

This repository contains the **LoRA (Low-Rank Adaptation)** weights for **SmolLM3-Chat-v1**.

This adapter was trained to give the [SmolLM3-3B-Base](https://huggingface.co/HuggingFaceTB/SmolLM3-3B-Base) model a casual, witty, and "internet-native" personality. It moves away from robotic assistant responses in favor of a more human-like vibe.

## 🔗 Related Models
*   **Merged Version (Float16):** [SmolLM3-Chat-v1](https://huggingface.co/igidn/SmolLM3-Chat-v1)
*   **Base Model:** [HuggingFaceTB/SmolLM3-3B-Base](https://huggingface.co/HuggingFaceTB/SmolLM3-3B-Base)
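
If you only need inference and don't want to deal with PEFT, the merged float16 checkpoint above loads like any ordinary `transformers` model. A minimal sketch (no quantization; see the usage section below for the full 4-bit adapter workflow):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# The merged checkpoint already contains the adapter weights,
# so no PeftModel attachment step is required.
MERGED_ID = "igidn/SmolLM3-Chat-v1"

tokenizer = AutoTokenizer.from_pretrained(MERGED_ID)
model = AutoModelForCausalLM.from_pretrained(
    MERGED_ID,
    torch_dtype=torch.float16,
    device_map="auto",
)
```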

## ⚠️ System Instructions (Important)
**Less is more.**

This model relies on a specific "vibe" learned during training. Over-prompting it with complex system instructions (e.g., *"You are a helpful assistant who is polite, follows rules X, Y, Z..."*) will degrade the output quality.

**Recommended System Prompt:** none. Simply leave it empty for the most raw, casual experience.
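
In code, that just means passing only the conversation turns with no system message; a short sketch (the one-line system prompt below is a hypothetical example, not a trained-in prompt):

```python
# Recommended: no system turn at all; the adapter's trained "vibe" takes its place.
messages = [
    {"role": "user", "content": "Haiiii"},
]

# If you must steer it, keep the system prompt to one short sentence;
# long rule lists degrade output quality.
messages_with_system = [
    {"role": "system", "content": "Keep it casual."},  # hypothetical example
    {"role": "user", "content": "Haiiii"},
]
```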

## 💻 Usage (4-Bit Loading)

This script demonstrates how to load the base model in 4-bit and attach the adapter.

```python
import torch
from threading import Thread
from peft import PeftModel
from transformers import (
    AutoModelForCausalLM, 
    AutoTokenizer, 
    BitsAndBytesConfig, 
    TextIteratorStreamer
)

# 1. Define IDs
ADAPTER_ID = "igidn/SmolLM3-Chat-v1-adapter"
BASE_MODEL_ID = "HuggingFaceTB/SmolLM3-3B-Base"

# 2. Quantization Config (4-bit)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

# 3. Load Base Model
tokenizer = AutoTokenizer.from_pretrained(ADAPTER_ID) # Load tokenizer from adapter to get special tokens
model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL_ID,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True
)

# 4. Attach Adapter
model = PeftModel.from_pretrained(model, ADAPTER_ID)

# 5. Define Conversation
messages = [
    {"role": "user", "content": "Haiiii"}
]

# 6. Apply Chat Template
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

inputs = tokenizer([prompt], return_tensors="pt").to(model.device)

# 7. Streamer & Generation
streamer = TextIteratorStreamer(tokenizer, timeout=10.0, skip_prompt=True, skip_special_tokens=True)

# --- CRITICAL GENERATION CONFIG ---
generate_kwargs = dict(
    **inputs,
    streamer=streamer,
    max_new_tokens=512,
    do_sample=True,
    
    # Core Vibe Parameters
    temperature=0.8,
    top_p=0.85,
    
    # Stability Parameters (Prevents looping)
    repetition_penalty=1.15,
    no_repeat_ngram_size=3,
    
    pad_token_id=tokenizer.eos_token_id
)

thread = Thread(target=model.generate, kwargs=generate_kwargs)
thread.start()

print("Assistant: ", end="")
for new_text in streamer:
    print(new_text, end="", flush=True)
```
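
For deployment you may prefer to bake the adapter into the base weights once and skip PEFT at inference time. A hedged sketch using PEFT's `merge_and_unload` (the base is loaded in float16 here because merging into 4-bit quantized weights is lossy at best; the output path is an arbitrary example):

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model unquantized so the LoRA deltas can be folded in cleanly.
base = AutoModelForCausalLM.from_pretrained(
    "HuggingFaceTB/SmolLM3-3B-Base",
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "igidn/SmolLM3-Chat-v1-adapter")

# Fold the adapter into the base weights and drop the PEFT wrappers.
merged = model.merge_and_unload()

# Save model + tokenizer together; "./SmolLM3-Chat-v1-merged" is an example path.
merged.save_pretrained("./SmolLM3-Chat-v1-merged")
tokenizer = AutoTokenizer.from_pretrained("igidn/SmolLM3-Chat-v1-adapter")
tokenizer.save_pretrained("./SmolLM3-Chat-v1-merged")
```

This is essentially how a merged checkpoint like [SmolLM3-Chat-v1](https://huggingface.co/igidn/SmolLM3-Chat-v1) can be produced, though the exact export steps used for that repo aren't documented here.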

## 📊 Training Details

The model was trained for 2 epochs with TRL's `SFTTrainer`.

### Dataset
*   **OpenHermes-2.5 (5k subset):** Logic and general helpfulness.
*   **Custom Dataset (15k):** Casual chat, roleplay, and human-like interaction patterns.
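
A hedged sketch of how a mixture like this might be assembled with the `datasets` library; the custom file name is hypothetical, and the real preprocessing isn't documented here:

```python
from datasets import load_dataset, concatenate_datasets

# 5k-example subset of OpenHermes-2.5 for logic and general helpfulness.
hermes = load_dataset("teknium/OpenHermes-2.5", split="train")
hermes = hermes.shuffle(seed=42).select(range(5_000))

# Hypothetical placeholder for the 15k-example custom casual-chat dataset.
custom = load_dataset("json", data_files="custom_chat_15k.jsonl", split="train")

# Note: both sets must first be mapped to a shared schema
# (e.g. a "messages" column) before they can be concatenated.
train_dataset = concatenate_datasets([hermes, custom]).shuffle(seed=42)
```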

### Metrics
| Metric | Value |
| :--- | :--- |
| **Final Loss** | 1.41 |
| **Final Token Accuracy** | ~65.9% |

## 🛠️ Hyperparameters
*   **Rank (r):** 32
*   **Alpha:** 64
*   **Dropout:** 0.05
*   **Target Modules:** All linear projections plus the embedding and output head (`q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`, `embed_tokens`, `lm_head`)
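
In PEFT terms, these settings correspond roughly to the `LoraConfig` below; `task_type` and anything else not listed above are assumptions rather than confirmed training values:

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=32,              # rank
    lora_alpha=64,     # alpha
    lora_dropout=0.05, # dropout
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
        "embed_tokens", "lm_head",
    ],
    task_type="CAUSAL_LM",  # assumed; the standard choice for SFT on causal LMs
)
```

Passing such a config to `SFTTrainer` via its `peft_config` argument is the usual TRL pattern.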

*Created with <3 by me*