pankajmathur committed
Commit 6cd7c2b · verified · 1 Parent(s): df75442

Update README.md

Files changed (1):
  1. README.md +0 -14
README.md CHANGED
```diff
@@ -20,23 +20,9 @@ This model is a merged version of [mistralai/Devstral-Small-2507](https://huggin
 ## Model Details
 
 - **Base Model:** [mistralai/Devstral-Small-2507](https://huggingface.co/mistralai/Devstral-Small-2507)
-- **LoRA Adapter:** [pankajmathur/Devstral-Small-2507-sft-v1-adapter](https://huggingface.co/pankajmathur/Devstral-Small-2507-sft-v1-adapter)
-- **Training Dataset:** [pankajmathur/OpenThoughts-Agent-v1-SFT](https://huggingface.co/datasets/pankajmathur/OpenThoughts-Agent-v1-SFT)
 - **Parameters:** ~24B
 - **Precision:** bfloat16
 
-## Training Configuration
-
-The LoRA adapter was trained with the following configuration:
-- **LoRA Rank (r):** 32
-- **LoRA Alpha:** 16
-- **LoRA Dropout:** 0.05
-- **Target Modules:** q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
-- **Sequence Length:** 8192
-- **Learning Rate:** 0.0001
-- **Optimizer:** AdamW 8-bit
-- **Epochs:** 3
-
 ## Usage
 
 
```
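The Training Configuration removed in this commit maps cleanly onto standard LoRA fine-tuning hyperparameters. A minimal sketch of how those same values could be laid out as keyword arguments (in the style of `peft`'s `LoraConfig` and a Hugging Face trainer setup — the variable names and the split between the two dicts are assumptions, not part of the commit):

```python
# Hyperparameters from the removed Training Configuration section,
# arranged as LoraConfig-style and trainer-style keyword arguments.
# The grouping and names are illustrative assumptions; only the values
# come from the README diff.
lora_kwargs = dict(
    r=32,                  # LoRA Rank (r)
    lora_alpha=16,         # LoRA Alpha
    lora_dropout=0.05,     # LoRA Dropout
    target_modules=[       # Target Modules
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)

train_kwargs = dict(
    max_seq_length=8192,   # Sequence Length
    learning_rate=1e-4,    # Learning Rate (0.0001)
    optim="adamw_8bit",    # Optimizer: AdamW 8-bit
    num_train_epochs=3,    # Epochs
)
```

With these removed from the README, the adapter and dataset links now live only in this commit's history.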
 
 