Raymond-dev-546730 committed
Commit 11a4806 (verified) · 1 Parent(s): 9a5583e

Update Training/Training_Documentation.txt

Training/Training_Documentation.txt CHANGED
@@ -13,9 +13,12 @@ Training Dataset: Custom curated dataset for materials analysis
 Dataset Specifications
 ---------------------
 
-Total Token Count: 6,441,671
+Total Token Count: 6,292,692
 Total Sample Count: 6,000
-Average Tokens/Sample: 1,073.61
+Average Tokens/Sample: 1048.78
+Max Token Count: 1,289
+Min Token Count: 922
+Tokens Counted Using: tiktoken (cl100k_base encoding)
 Dataset Creation: Generated using DeepSeekV3 API
 
 Training Configuration
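
The updated statistics (total, average, max, and min token counts under the cl100k_base encoding) could be reproduced along the following lines. This is a minimal sketch, not the author's actual script: the function names `count_tokens` and `summarize` are hypothetical, and the commit does not show how the dataset samples are stored or loaded. The only third-party call assumed is `tiktoken.get_encoding`, imported lazily so the summary logic runs without it.

```python
def count_tokens(texts, encoding_name="cl100k_base"):
    """Return one token count per sample, using tiktoken.

    Assumption: tiktoken is installed and each sample is a plain string.
    """
    import tiktoken  # third-party; imported lazily

    enc = tiktoken.get_encoding(encoding_name)
    return [len(enc.encode(text)) for text in texts]


def summarize(counts):
    """Aggregate per-sample token counts into the fields listed in the diff."""
    total = sum(counts)
    return {
        "total_tokens": total,
        "samples": len(counts),
        "avg_tokens_per_sample": round(total / len(counts), 2),
        "max_tokens": max(counts),
        "min_tokens": min(counts),
    }


# Consistency check on the figures in the diff: total / samples = average.
assert round(6_292_692 / 6_000, 2) == 1048.78   # new values
assert round(6_441_671 / 6_000, 2) == 1073.61   # old values
```

Usage would look like `summarize(count_tokens(samples))`, where `samples` is a list of the 6,000 dataset strings. Both the old and the new averages agree with their respective totals, so the change is consistent with the tokens having been recounted rather than with an arithmetic correction.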