murataksit34 committed on
Commit da28775 · verified · 1 Parent(s): c3f189f

Fix README with correct datasets/splits and curriculum notes

Files changed (1)
  1. README.md +9 -9
README.md CHANGED
@@ -4,7 +4,7 @@ language:
  license: apache-2.0
  library_name: transformers
  tags:
- - llama
+ - qwen3
  - english
  - data-science
  - data-mining
@@ -14,18 +14,18 @@ datasets:
  - zero9tech/data-scientist-insight-dialog-en-16.5k
  ---
 
- # Llama-3.1-8B-Data-Science-V2 Decision Architect
+ # Llama-3.1-8B-Data-Science-V2 - Decision Architect
 
  ## Training Setup
- Domain SFT on .
+ Domain SFT on zero9tech/data-scientist-insight-dialog-en-16.5k.
 
  ## Dataset Snapshot
- - Total records:
- - Split: · ·
- - Hard gate:
- - Soft gate:
- - Self-BLEU (final):
- - Distinct-2 (final):
+ - Total records: 16,463
+ - Split: train 14,021 ; validation 801 ; test 1,641
+ - Hard gate: PASS
+ - Soft gate: PASS
+ - Self-BLEU (final): 0.5574
+ - Distinct-2 (final): 0.0768
 
  ## Copyright
  Copyright (c) Zero9 Tech
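
The Dataset Snapshot figures in this diff can be sanity-checked with a few lines of Python. The sketch below confirms that the three split sizes add up to the reported total, and shows one common way to compute a Distinct-2 score. The README does not say how Distinct-2 was measured, so the helper names (`splits_match_total`, `distinct_n`), the lowercasing, and the whitespace tokenization are all assumptions for illustration, not the authors' pipeline.

```python
from collections import Counter

# Split sizes and total as listed in the README's Dataset Snapshot.
SPLITS = {"train": 14_021, "validation": 801, "test": 1_641}
TOTAL_RECORDS = 16_463


def splits_match_total(splits: dict[str, int], total: int) -> bool:
    """Return True when the split sizes add up to the reported total."""
    return sum(splits.values()) == total


def distinct_n(texts: list[str], n: int = 2) -> float:
    """Ratio of unique n-grams to total n-grams across all texts.

    Assumption: lowercased whitespace tokens. The README does not state
    the tokenizer or normalization behind its Distinct-2 figure.
    """
    ngrams: Counter = Counter()
    for text in texts:
        tokens = text.lower().split()
        # zip over n shifted views of the token list yields its n-grams.
        ngrams.update(zip(*(tokens[i:] for i in range(n))))
    total = sum(ngrams.values())
    return len(ngrams) / total if total else 0.0


if __name__ == "__main__":
    print(splits_match_total(SPLITS, TOTAL_RECORDS))
    print(distinct_n(["the model fits the data", "the model fits well"]))
```

A lower Distinct-2 means more repeated bigrams across the corpus. The two toy sentences here are made up for the demonstration and say nothing about the actual dataset.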