Update README.md
Keeping this in mind:
- The exact answer is always important and is always only a few tokens. Hence, we do not mask the labels or input tokens for the answer value.
- Rarely, we ignore the rationale labels entirely, so that the model is pushed to learn only what leads to the best answer.
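Concretely, the masking scheme above could be sketched as follows. This is a minimal illustration, not the repo's actual code: the function name, index arguments, and probability values are all hypothetical placeholders.

```python
import random

IGNORE_INDEX = -100  # conventional "ignored" label id for cross-entropy loss


def mask_labels(input_ids, rationale_idx, p_token_mask=0.4,
                p_drop_rationale=0.1, rng=random):
    """Build per-token labels for one tokenized example.

    Individual rationale labels are dropped at random, and (rarely) the
    entire rationale is ignored so that only the answer drives learning.
    Answer tokens are never masked. Probabilities here are placeholders.
    """
    labels = list(input_ids)
    # Rarely, ignore the whole rationale in the loss
    drop_all = rng.random() < p_drop_rationale
    for i in rationale_idx:
        if drop_all or rng.random() < p_token_mask:
            labels[i] = IGNORE_INDEX
    # Positions outside rationale_idx (including the answer tokens) keep
    # their labels, so the exact answer is always supervised.
    return labels
```

Setting a label to `-100` is the standard way to exclude a token from the cross-entropy loss in PyTorch-style training loops.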
## Results
I trained StableLM-3B-4e1t repeatedly on [TinyCoT](https://huggingface.co/datasets/euclaise/TinyCoT), along with 1000 examples from [reddit-instruct-curated](https://huggingface.co/datasets/euclaise/reddit-instruct-curated) and 1000 examples from [oasst2-curated](https://huggingface.co/datasets/sablo/oasst2_curated).
I trained once with ReMask (ReMask-CoT for CoT examples), once with Masked Thought (with partial label-masking), and once with SFT.