basmala12 committed 9adc488 (verified) · 1 parent: 787ae53

Update README.md

Files changed (1):
  1. README.md (+52, -39)
README.md CHANGED
@@ -1,58 +1,71 @@
- smollm_finetuning5 – Fine-Tuned SmolLM2 Model

- smollm_finetuning5 is a fine-tuned variant of the SmolLM2-1.7B model.
- The aim of this work was to adapt the base model using a lightweight instruction-tuning approach to improve coherence, reasoning, and general instruction-following on short prompts.

- The model is provided as a merged .safetensors checkpoint, meaning the LoRA adapters were fused into the base weights after training for easier deployment.

- Dataset Used

- The model was trained on the argilla/synthetic-concise-reasoning-sft-filtered dataset.
- This dataset includes:

- instruction–response pairs
- short reasoning sequences
- synthetic tasks designed to promote clear step-by-step thinking
- concise explanations and simplified reasoning samples

- The dataset is filtered to remove excessively long or noisy examples, making it suitable for fine-tuning compact models that benefit from clean and simplified instructions.

- Training Method

- Fine-tuning was conducted using LoRA (Low-Rank Adaptation) to reduce hardware requirements and allow efficient experimentation.
- Key training characteristics:

- Base model: SmolAI / SmolLM2-1.7B
- Method: LoRA fine-tuning
- Adapters were merged after training
- Precision: FP32 safetensors
- Training epochs: 3
- Learning rate: 2e-4
- Chat template included in the repository (chat_template.jinja)

- The fine-tuning process focused on improving instruction clarity and response structure rather than domain specialization.

- Model Files

- The repository contains all necessary files to load the model with standard tooling:

- model.safetensors (merged model weights)
- config.json
- generation_config.json
- tokenizer.json, vocab.json, merges.txt
- special_tokens_map.json
- chat_template.jinja
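
Since the adapters are already fused into the checkpoint, the files listed above load with stock `transformers`; no `peft` step is needed at inference time. A minimal loading sketch, assuming the Hub repo id is `basmala12/smollm_finetuning5` (inferred from the commit author and model name, not stated in the diff) and that the tokenizer picks up the bundled `chat_template.jinja`:

```python
# Minimal inference sketch for the merged checkpoint.
# Assumption: the Hub repo id is "basmala12/smollm_finetuning5".
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "basmala12/smollm_finetuning5"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

# The repository's chat_template.jinja drives apply_chat_template().
messages = [{"role": "user", "content": "Explain LoRA in two sentences."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)

output_ids = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```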
 
+ ---
+ library_name: transformers
+ pipeline_tag: text-generation
+ base_model: SmolAI/SmolLM2-1.7B
+ license: apache-2.0
+ language:
+ - en
+ tags:
+ - smollm2
+ - finetuned
+ - reasoning
+ - concise
+ model_type: causal-lm
+ ---
 
+ # smollm_finetuning5: Fine-Tuned SmolLM2-1.7B for Concise Instruction Reasoning

+ *smollm_finetuning5* is a fine-tuned version of *SmolAI/SmolLM2-1.7B*, trained on synthetic instruction–response samples and concise reasoning data. The model is optimized to produce short, structured, and clear answers while improving general instruction-following behavior.

+ The goal of this fine-tuning was to enhance reasoning clarity and response consistency in a compact 1.7B-parameter model.
+ ---

+ ## Features

+ - Fine-tuned for concise and structured responses
+ - Improved instruction-following capabilities
+ - Handles short reasoning and explanation tasks
+ - Lightweight and efficient (1.7B parameters)
+ - Suitable for general-purpose educational and reasoning use cases

+ ---
+ ## Intended Use

+ ### Recommended
+ - General question–answer interactions
+ - Explanation of simple topics
+ - Short reasoning steps
+ - Instruction–response tasks

+ ### Not Recommended
+ - High-stakes or decision-critical applications
+ - Domain-specific or specialized factual tasks
+ - Situations requiring verified accuracy

+ ---
 
+ ## Training Data

+ The model was fine-tuned on *argilla/synthetic-concise-reasoning-sft-filtered*, which provides:

+ - Instruction–answer pairs
+ - Synthetic reasoning prompts
+ - Concise explanation samples

+ The dataset consists of simplified synthetic examples designed to enhance clarity, reasoning, and instruction handling.

+ ---
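
The dataset is public on the Hugging Face Hub, so its schema can be checked before training. A short inspection sketch with the `datasets` library (the `train` split name is an assumption):

```python
# Inspect the SFT dataset; assumes it exposes a "train" split.
from datasets import load_dataset

ds = load_dataset("argilla/synthetic-concise-reasoning-sft-filtered", split="train")
print(ds)     # column names and row count
print(ds[0])  # one instruction-response example
```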
+ ## Training Details

+ - Base Model: SmolAI/SmolLM2-1.7B
+ - Fine-Tuning Method: LoRA adapters (merged into the final weights)
+ - Epochs: 3
+ - Learning Rate: 2e-4
+ - Loss: Causal language modeling
+ - Output Format: FP32 safetensors

+ ---
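
For readers who want to reproduce a comparable run, the details above map onto a standard PEFT workflow, sketched here with TRL's `SFTTrainer`. Only the dataset, the 3 epochs, the 2e-4 learning rate, and the merge step come from the card; the trainer choice, LoRA rank/alpha, and target modules are assumptions. Note that the SmolLM2-1.7B base is published on the Hub as `HuggingFaceTB/SmolLM2-1.7B`, although the card writes `SmolAI/SmolLM2-1.7B`.

```python
# Illustrative LoRA SFT + merge; not the exact script behind this checkpoint.
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("argilla/synthetic-concise-reasoning-sft-filtered", split="train")

peft_config = LoraConfig(  # rank/alpha/target modules assumed, not from the card
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

args = SFTConfig(
    output_dir="smollm_finetuning5",
    num_train_epochs=3,    # from the card
    learning_rate=2e-4,    # from the card
)

trainer = SFTTrainer(
    model="HuggingFaceTB/SmolLM2-1.7B",  # Hub id of the SmolLM2 base
    train_dataset=dataset,
    args=args,
    peft_config=peft_config,
)
trainer.train()

# Fuse the LoRA adapters into the base weights and save a plain
# .safetensors checkpoint, matching the merged format described above.
merged = trainer.model.merge_and_unload()
merged.save_pretrained("smollm_finetuning5-merged", safe_serialization=True)
```

Merging trades per-task adapter flexibility for a single self-contained checkpoint, which is why the repository ships one model.safetensors rather than separate adapter files.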