Text Generation
PEFT
Safetensors
English
qlora
lora
structured-output
mt628754 committed on
Commit 4ffcf39 · verified · 1 Parent(s): 1067908

Update README.md

Files changed (1)
  1. README.md +4 -2
README.md CHANGED
@@ -28,8 +28,8 @@ The base model must be loaded separately.
 This adapter is trained to improve **structured output accuracy**
 (JSON / YAML / XML / TOML / CSV).
 
-Loss is applied only to the final assistant output,
-while intermediate reasoning (Chain-of-Thought) is masked.
+Chain-of-Thought reasoning was removed from training data,
+and loss is applied directly to the final structured output.
 
 ## Training Configuration
 
@@ -63,5 +63,7 @@ model = PeftModel.from_pretrained(model, adapter)
 
 Training data: u-10bei/structured_data_with_cot_dataset_512_v2, u-10bei/structured_data_with_cot_dataset_512_v4, u-10bei/structured_data_with_cot_dataset_512_v5
 
+Data preprocessing: Combined the above three versions with removal of unparseable outputs, deduplication, and removal of Chain-of-Thought reasoning from assistant responses.
+
 Dataset License: MIT License. This dataset is used and distributed under the terms of the MIT License.
 Compliance: Users must comply with the MIT license (including copyright notice) and the base model's original terms of use.
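
The preprocessing described in this change (combining the dataset versions, removing Chain-of-Thought text, dropping unparseable outputs, deduplicating) could look roughly like the sketch below. This is not the author's script. The record fields `user`, `assistant`, and `format`, and the `Output:` marker separating reasoning from the final answer, are assumptions made for illustration; only JSON and XML are validated here, using the standard library.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical delimiter between the reasoning and the final structured answer.
COT_MARKER = "Output:"


def strip_cot(text: str) -> str:
    """Keep only what follows the last marker; leave text as-is if it is absent."""
    _, sep, tail = text.rpartition(COT_MARKER)
    return tail.strip() if sep else text.strip()


def is_parseable(text: str, fmt: str) -> bool:
    """Validate JSON and XML; other formats pass unchecked in this sketch."""
    try:
        if fmt == "json":
            json.loads(text)
        elif fmt == "xml":
            ET.fromstring(text)
    except (json.JSONDecodeError, ET.ParseError):
        return False
    return True


def preprocess(*datasets):
    """Merge datasets, strip CoT, drop unparseable outputs, and deduplicate."""
    seen, cleaned = set(), []
    for record in (r for ds in datasets for r in ds):
        output = strip_cot(record["assistant"])
        if not is_parseable(output, record["format"]):
            continue  # the final answer does not parse in its declared format
        key = (record["user"], output)
        if key in seen:
            continue  # same prompt and same final output already kept
        seen.add(key)
        cleaned.append({**record, "assistant": output})
    return cleaned
```

Because the reasoning text is removed before training, no label masking is needed afterwards: every remaining assistant token belongs to the structured output, which matches the updated README wording.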