adapter: retrain after next-week anchoring dataset fix and update README with test results

Files changed (3)

README.md CHANGED

@@ -61,7 +61,7 @@ Trained on [a6188466/mini-date-converter-dsl-dataset](https://huggingface.co/dat
 {
   "num_train_epochs": 15,
   "train_batch_size": 1,
-  "learning_rate": 5e-06,
+  "learning_rate": 1e-05,
   "lr_scheduler_type": "cosine",
   "warmup_ratio": 0.2,
   "bf16": true,
@@ -73,13 +73,13 @@ Trained on [a6188466/mini-date-converter-dsl-dataset](https://huggingface.co/dat
 ```json
 {
-  "mean_token_accuracy": 0.9563815355300903,
+  "mean_token_accuracy": 0.9686286966005961,
   "total_flos": 2946032837130240.0,
-  "train_loss": 0.6184668624488107,
-  "train_runtime": 487.3092,
-  "train_samples_per_second": 4.217,
-  "train_steps_per_second": 4.217,
-  "final_learning_rate": 1.0269727355813331e-09
+  "train_loss": 0.4723904400846384,
+  "train_runtime": 468.4155,
+  "train_samples_per_second": 4.387,
+  "train_steps_per_second": 4.387,
+  "final_learning_rate": 2.0539454711626663e-09
 }
 ```
@@ -88,9 +88,9 @@ Trained on [a6188466/mini-date-converter-dsl-dataset](https://huggingface.co/dat
 - **Eval set:** Natural language queries similar in structure and intent to the training examples
 - **Metric:** Functional equivalence — two DSL expressions are considered correct if they evaluate to the same result
 - **Results:**
-  - **90%** on held-out [test set](https://huggingface.co/datasets/a6188466/mini-date-converter-dsl-dataset) (**45/50** passed)
-  - **Test, variation 1:** **334/336** passed (**99.40%**)
-  - **Test, variation 2:** **308/336** passed (**91.67%**)
+  - **92%** on held-out [test set](https://huggingface.co/datasets/a6188466/mini-date-converter-dsl-dataset) (**46/50** passed)
+  - **Test, variation 1:** **335/336** passed (**99.70%**)
+  - **Test, variation 2:** **333/336** passed (**99.11%**)
 The two variations target different generalization behaviors:
 - **Variation 1:** Alternate phrasings of the template-based date expression "the `nth` `weekday` of `month`" (e.g., "the second Tuesday in March")
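
The functional-equivalence metric named in the README hunk can be sketched as below. This is a minimal illustration, not the evaluation script behind the reported numbers: `toy_evaluate` is a hypothetical stand-in for the real DSL interpreter (which this commit does not show), and `functionally_equivalent` and `pass_rate` are names introduced here.

```python
from datetime import date, timedelta
from typing import Callable, Iterable, Tuple

# An evaluator maps (expression, anchor date) to a concrete date.
Evaluator = Callable[[str, date], date]


def functionally_equivalent(pred: str, ref: str, evaluate: Evaluator, anchor: date) -> bool:
    """Two expressions count as equal if they evaluate to the same result."""
    try:
        return evaluate(pred, anchor) == evaluate(ref, anchor)
    except Exception:
        # An expression that cannot be evaluated is scored as a failure.
        return False


def pass_rate(pairs: Iterable[Tuple[str, str]], evaluate: Evaluator, anchor: date):
    """Return (passed, total, percentage rounded to two decimals)."""
    pairs = list(pairs)
    passed = sum(functionally_equivalent(p, r, evaluate, anchor) for p, r in pairs)
    pct = round(100.0 * passed / len(pairs), 2) if pairs else 0.0
    return passed, len(pairs), pct


def toy_evaluate(expr: str, anchor: date) -> date:
    """Hypothetical evaluator: treats the expression as a signed day offset."""
    return anchor + timedelta(days=int(expr))


if __name__ == "__main__":
    anchor = date(2024, 3, 1)
    # "+7" and "7" differ as strings but evaluate to the same date.
    pairs = [("+7", "7"), ("-1", "-1"), ("3", "4"), ("oops", "1")]
    print(pass_rate(pairs, toy_evaluate, anchor))
```

Scoring on evaluated results rather than string match means a prediction that is textually different from the reference can still pass. The percentages in the hunk follow the same arithmetic: 46/50 is 92%, 335/336 rounds to 99.70%, and 333/336 rounds to 99.11%.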

adapter_config.json CHANGED

@@ -24,9 +24,9 @@
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
-    "qkv_proj",
     "down_proj",
     "gate_up_proj",
+    "qkv_proj",
     "o_proj"
   ],
   "task_type": "CAUSAL_LM",

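The `target_modules` hunk moves `qkv_proj` within the list without adding or removing an entry. One quick way to confirm that two configs name the same modules is to compare the lists as sets. The sketch below assumes list order carries no meaning for how the adapter is applied, which is worth checking against the PEFT version in use; `same_target_modules` is a helper name introduced here.

```python
import json

# The two lists as they appear on each side of the hunk.
OLD = json.loads('{"target_modules": ["qkv_proj", "down_proj", "gate_up_proj", "o_proj"]}')
NEW = json.loads('{"target_modules": ["down_proj", "gate_up_proj", "qkv_proj", "o_proj"]}')


def same_target_modules(a: dict, b: dict) -> bool:
    """Compare target_modules while ignoring list order."""
    return set(a["target_modules"]) == set(b["target_modules"])


if __name__ == "__main__":
    print(OLD["target_modules"] == NEW["target_modules"])  # ordered comparison
    print(same_target_modules(OLD, NEW))                   # order-insensitive comparison
```
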
adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:7c7c57e27aeeb21202c9c83a43b992c07c15ffe21485a87182aa12e74c0b9934
+oid sha256:f1e29176e41c21d02ab7907d74f92135113b7927af02ec6f31e8c5c572ae8934
 size 92309112
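
The `adapter_model.safetensors` change is a Git LFS pointer update: the `oid` differs while `size` stays at 92309112 bytes, which is what one would expect from retrained weights with unchanged tensor shapes. A downloaded file can be checked against such a pointer roughly as follows. This is a sketch; `parse_lfs_pointer` and `matches_pointer` are helper names introduced here, and the local file path is an assumption.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "oid": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer: dict) -> bool:
    """Check a local file's byte size and hash against a parsed pointer."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False
    digest = hashlib.new(pointer["algo"])
    with path.open("rb") as fh:
        # Read in 1 MiB chunks so large weight files are not loaded at once.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == pointer["oid"]


if __name__ == "__main__":
    pointer = parse_lfs_pointer(
        "version https://git-lfs.github.com/spec/v1\n"
        "oid sha256:f1e29176e41c21d02ab7907d74f92135113b7927af02ec6f31e8c5c572ae8934\n"
        "size 92309112\n"
    )
    local = Path("adapter_model.safetensors")  # assumed local download location
    if local.exists():
        print(matches_pointer(local, pointer))
```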