adapter: retrain after next-week anchoring dataset fix and update README with test results
- README.md +10 -10
- adapter_config.json +1 -1
- adapter_model.safetensors +1 -1
README.md
CHANGED
@@ -61,7 +61,7 @@ Trained on [a6188466/mini-date-converter-dsl-dataset](https://huggingface.co/dat
 {
 "num_train_epochs": 15,
 "train_batch_size": 1,
-"learning_rate":
+"learning_rate": 1e-05,
 "lr_scheduler_type": "cosine",
 "warmup_ratio": 0.2,
 "bf16": true,
@@ -73,13 +73,13 @@ Trained on [a6188466/mini-date-converter-dsl-dataset](https://huggingface.co/dat
 
 ```json
 {
-"mean_token_accuracy": 0.
+"mean_token_accuracy": 0.9686286966005961,
 "total_flos": 2946032837130240.0,
-"train_loss": 0.
-"train_runtime":
-"train_samples_per_second": 4.
-"train_steps_per_second": 4.
-"final_learning_rate":
+"train_loss": 0.4723904400846384,
+"train_runtime": 468.4155,
+"train_samples_per_second": 4.387,
+"train_steps_per_second": 4.387,
+"final_learning_rate": 2.0539454711626663e-09
 }
 ```
 
@@ -88,9 +88,9 @@ Trained on [a6188466/mini-date-converter-dsl-dataset](https://huggingface.co/dat
 - **Eval set:** Natural language queries similar in structure and intent to the training examples
 - **Metric:** Functional equivalence — two DSL expressions are considered correct if they evaluate to the same result
 - **Results:**
-- **
-- **Test, variation 1:** **
-- **Test, variation 2:** **
+- **92%** on held-out [test set](https://huggingface.co/datasets/a6188466/mini-date-converter-dsl-dataset) (**46/50** passed)
+- **Test, variation 1:** **335/336** passed (**99.70%**)
+- **Test, variation 2:** **333/336** passed (**99.11%**)
 
 The two variations target different generalization behaviors:
 - **Variation 1:** Alternate phrasings of the template-based date expression "the `nth` `weekday` of `month`" (e.g., "the second Tuesday in March")
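Review note: the functional-equivalence metric and the reported pass rates in the README hunk can be illustrated with a short Python sketch. The real DSL evaluator is not part of this commit, so `toy_evaluate` is a hypothetical stand-in (an ISO date with an optional `+N` day offset), and `functionally_equivalent` and `pass_rate` are illustrative names, not the project's own code.

```python
from datetime import date, timedelta
from typing import Callable


def toy_evaluate(expr: str) -> date:
    """Hypothetical stand-in for the DSL evaluator: 'YYYY-MM-DD' or 'YYYY-MM-DD + N'."""
    base, _, offset = expr.partition("+")
    return date.fromisoformat(base.strip()) + timedelta(days=int(offset or 0))


def functionally_equivalent(predicted: str, reference: str,
                            evaluate: Callable[[str], date]) -> bool:
    """True when both expressions evaluate to the same result."""
    try:
        return evaluate(predicted) == evaluate(reference)
    except Exception:
        # Assumption: a prediction that fails to evaluate counts as incorrect.
        return False


def pass_rate(passed: int, total: int) -> str:
    """Format a pass count as a percentage with two decimals."""
    return f"{100 * passed / total:.2f}%"


# Two textually different expressions that denote the same date.
print(functionally_equivalent("2024-03-05 + 7", "2024-03-12", toy_evaluate))
# Pass counts taken from the README hunk above.
for passed, total in [(46, 50), (335, 336), (333, 336)]:
    print(f"{passed}/{total} = {pass_rate(passed, total)}")
```

Comparing evaluated results rather than strings means a prediction is not penalized for a different but equivalent spelling of the same date expression.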
adapter_config.json
CHANGED
@@ -24,9 +24,9 @@
 "rank_pattern": {},
 "revision": null,
 "target_modules": [
-"qkv_proj",
 "down_proj",
 "gate_up_proj",
+"qkv_proj",
 "o_proj"
 ],
 "task_type": "CAUSAL_LM",
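Review note: this hunk only moves `"qkv_proj"` within `target_modules`. Assuming the list is an unordered set of module names (an assumption about how the config is consumed, not something this diff states), a small sketch can confirm that the two versions are equivalent. `config_diff_ignoring_module_order` is a hypothetical helper name.

```python
def config_diff_ignoring_module_order(old: dict, new: dict) -> dict:
    """Return {key: (old_value, new_value)} for keys that differ,
    treating target_modules as an unordered collection."""

    def normalize(cfg: dict) -> dict:
        cfg = dict(cfg)  # shallow copy so the caller's dict is untouched
        if isinstance(cfg.get("target_modules"), list):
            cfg["target_modules"] = sorted(cfg["target_modules"])
        return cfg

    a, b = normalize(old), normalize(new)
    return {k: (a.get(k), b.get(k))
            for k in a.keys() | b.keys()
            if a.get(k) != b.get(k)}


# Only the fields visible in the hunk above are reproduced here.
old_cfg = {"target_modules": ["qkv_proj", "down_proj", "gate_up_proj", "o_proj"],
           "task_type": "CAUSAL_LM"}
new_cfg = {"target_modules": ["down_proj", "gate_up_proj", "qkv_proj", "o_proj"],
           "task_type": "CAUSAL_LM"}
print(config_diff_ignoring_module_order(old_cfg, new_cfg))
```

An empty result indicates the change is a reordering only, under the stated assumption.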
adapter_model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:f1e29176e41c21d02ab7907d74f92135113b7927af02ec6f31e8c5c572ae8934
 size 92309112
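Review note: the changed file is a Git LFS pointer, so the diff shows only the new object hash, with the size unchanged. A minimal sketch of checking downloaded weights against such a pointer follows; `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and the sketch holds the whole file in memory for simplicity.

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, _, digest = fields["oid"].partition(":")
    return {"version": fields["version"], "algo": algo,
            "digest": digest, "size": int(fields["size"])}


def matches_pointer(data: bytes, pointer: dict) -> bool:
    """True when the bytes have the size and SHA-256 digest the pointer records."""
    return (pointer["algo"] == "sha256"
            and len(data) == pointer["size"]
            and hashlib.sha256(data).hexdigest() == pointer["digest"])


# Pointer contents from the new side of the hunk above.
pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:f1e29176e41c21d02ab7907d74f92135113b7927af02ec6f31e8c5c572ae8934\n"
    "size 92309112\n"
)
print(pointer["algo"], pointer["size"])
```

For a file of this size, hashing in chunks with `hashlib.sha256().update(...)` would avoid loading all 92309112 bytes at once.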