Commit 59417e1 (verified), committed by a6188466
1 Parent(s): 60eb995

adapter: retrain after next-week anchoring dataset fix and update README with test results

Files changed (3)
  1. README.md +10 -10
  2. adapter_config.json +1 -1
  3. adapter_model.safetensors +1 -1
README.md CHANGED
@@ -61,7 +61,7 @@ Trained on [a6188466/mini-date-converter-dsl-dataset](https://huggingface.co/dat
  {
  "num_train_epochs": 15,
  "train_batch_size": 1,
- "learning_rate": 5e-06,
+ "learning_rate": 1e-05,
  "lr_scheduler_type": "cosine",
  "warmup_ratio": 0.2,
  "bf16": true,
@@ -73,13 +73,13 @@ Trained on [a6188466/mini-date-converter-dsl-dataset](https://huggingface.co/dat
 
  ```json
  {
- "mean_token_accuracy": 0.9563815355300903,
+ "mean_token_accuracy": 0.9686286966005961,
  "total_flos": 2946032837130240.0,
- "train_loss": 0.6184668624488107,
- "train_runtime": 487.3092,
- "train_samples_per_second": 4.217,
- "train_steps_per_second": 4.217,
- "final_learning_rate": 1.0269727355813331e-09
+ "train_loss": 0.4723904400846384,
+ "train_runtime": 468.4155,
+ "train_samples_per_second": 4.387,
+ "train_steps_per_second": 4.387,
+ "final_learning_rate": 2.0539454711626663e-09
  }
  ```
 
@@ -88,9 +88,9 @@ Trained on [a6188466/mini-date-converter-dsl-dataset](https://huggingface.co/dat
  - **Eval set:** Natural language queries similar in structure and intent to the training examples
  - **Metric:** Functional equivalence — two DSL expressions are considered correct if they evaluate to the same result
  - **Results:**
- - **90%** on held-out [test set](https://huggingface.co/datasets/a6188466/mini-date-converter-dsl-dataset) (**45/50** passed)
- - **Test, variation 1:** **334/336** passed (**99.40%**)
- - **Test, variation 2:** **308/336** passed (**91.67%**)
+ - **92%** on held-out [test set](https://huggingface.co/datasets/a6188466/mini-date-converter-dsl-dataset) (**46/50** passed)
+ - **Test, variation 1:** **335/336** passed (**99.70%**)
+ - **Test, variation 2:** **333/336** passed (**99.11%**)
 
  The two variations target different generalization behaviors:
  - **Variation 1:** Alternate phrasings of the template-based date expression "the `nth` `weekday` of `month`" (e.g., "the second Tuesday in March")
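
Review note: the percentages in the README diff above can be recomputed from the pass counts. The sketch below does only that arithmetic; the `results` dictionary and the `pass_rate` helper are names made up for this note, and the counts are copied from the diff rather than re-measured.

```python
# (passed, total) pairs copied from the README diff: (before, after) the retrain.
results = {
    "held-out test set": ((45, 50), (46, 50)),
    "variation 1": ((334, 336), (335, 336)),
    "variation 2": ((308, 336), (333, 336)),
}


def pass_rate(passed: int, total: int) -> float:
    """Return the pass rate as a percentage, rounded to two decimals."""
    return round(100.0 * passed / total, 2)


# Map each eval split to its (before, after) pass rate in percent.
summary = {
    name: (pass_rate(*before), pass_rate(*after))
    for name, (before, after) in results.items()
}

for name, (before, after) in summary.items():
    print(f"{name}: {before:.2f}% -> {after:.2f}% ({after - before:+.2f} points)")
```

The recomputed values agree with the figures reported in the README. The largest movement is on variation 2, while the held-out set and variation 1 each gain a single passing example.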
adapter_config.json CHANGED
@@ -24,9 +24,9 @@
  "rank_pattern": {},
  "revision": null,
  "target_modules": [
- "qkv_proj",
  "down_proj",
  "gate_up_proj",
+ "qkv_proj",
  "o_proj"
  ],
  "task_type": "CAUSAL_LM",
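
Review note: this hunk only moves `"qkv_proj"` within `target_modules`; no module is added or removed. Assuming the adapter loader treats the list as an unordered collection of module names (an assumption worth confirming against the library version in use), the change is cosmetic. A minimal check, with both lists copied from the diff:

```python
# target_modules before and after the commit, copied from the diff.
old_modules = ["qkv_proj", "down_proj", "gate_up_proj", "o_proj"]
new_modules = ["down_proj", "gate_up_proj", "qkv_proj", "o_proj"]

# Same members regardless of order means the adapted layers are unchanged.
same_members = set(old_modules) == set(new_modules)
# Names whose position changed between the two lists.
moved = [m for i, m in enumerate(new_modules) if old_modules[i] != m]

print("same members:", same_members)
print("repositioned entries:", moved)
```

If the member sets ever differ in a future commit, that would be a real change to which layers are adapted and would deserve a mention in the commit message.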
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:7c7c57e27aeeb21202c9c83a43b992c07c15ffe21485a87182aa12e74c0b9934
+ oid sha256:f1e29176e41c21d02ab7907d74f92135113b7927af02ec6f31e8c5c572ae8934
  size 92309112
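
Review note: the file above is a Git LFS pointer, so only the `oid` changes while `size` stays at 92309112 bytes, which is consistent with retrained weights of the same shape. The sketch below shows one way to check downloaded weights against the pointer. `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, not part of any library, and the byte string in the demo stands in for a real file.

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.strip().split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data: bytes, pointer: dict) -> bool:
    """Return True if the bytes have the size and SHA-256 digest the pointer records."""
    if pointer["algo"] != "sha256":
        raise ValueError(f"unsupported hash algorithm: {pointer['algo']}")
    return (
        len(data) == pointer["size"]
        and hashlib.sha256(data).hexdigest() == pointer["digest"]
    )


# The new pointer, copied from the diff.
NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:f1e29176e41c21d02ab7907d74f92135113b7927af02ec6f31e8c5c572ae8934
size 92309112
"""
pointer = parse_lfs_pointer(NEW_POINTER)

# Demo with stand-in bytes: build a pointer for them and verify the round trip.
demo = b"stand-in for adapter weights"
demo_pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    f"oid sha256:{hashlib.sha256(demo).hexdigest()}\n"
    f"size {len(demo)}\n"
)
demo_ok = matches_pointer(demo, demo_pointer)
```

For the real file, one would read `adapter_model.safetensors` in binary mode and pass its bytes to `matches_pointer(data, pointer)`; hashing in chunks would be preferable for much larger files.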