Update README.md #1
by katek - opened

README.md CHANGED
```diff
@@ -566,8 +566,7 @@ language:
 
 # Refact-1.6B
 
-Finally, the model we started training with our blog post
-[Applying Recent Innovations](https://refact.ai/blog/2023/applying-recent-innovations-to-train-model/) is ready 🎉
+Finally, the model we started training with our [blog post](https://refact.ai/blog/2023/applying-recent-innovations-to-train-model/) is ready 🎉
 
 After fine-tuning on generated data, it beats Replit 3b, Stability Code 3b and many other models. It almost beats
 StarCoder ten times the size!
@@ -614,7 +613,7 @@ Filtering is the key to success of this model:
 The text to code proportion was 50:50, model trained for 1.2T tokens.
 
 We don't release the base model, because its Fill-in-the-Middle (FIM) capability likes to repeat itself too much, so
-its practical use is limited. But if you still want it, write us a message on
+its practical use is limited. But if you still want it, write us a message on Discord.
 
 
 # Finetuning
@@ -633,7 +632,7 @@ The former is likely finished, so the model tries to come up with a suggestion t
 You are likely to have half-written code as you work on it, there is no single addition that can repair it
 fully.
 
-In practice, model needs to have a tendency to stop after a couple of lines added, and sometimes don't write
+In practice, model needs to have a tendency to stop after a couple of lines are added, and sometimes don't write
 anything at all. We found that just giving it empty completions, single line completions, multiline
 completions that end with a smaller text indent or at least a newline -- makes it much more usable. This data
 was used as the rest 85% of the finetune dataset.
```
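The finetune-data shaping described in the last hunk (keeping empty completions, single-line completions, and multiline completions that end at a smaller indent or at a newline) can be sketched as a simple filter. This is only an illustrative sketch; the function name, the indent comparison, and the exact keep/drop rules are assumptions, not the actual Refact training pipeline.

```python
def keep_completion(completion: str) -> bool:
    """Hypothetical filter in the spirit of the README's description:
    keep empty, single-line, and 'block-closing' multiline completions."""
    if completion == "":
        # Empty completion: teaches the model it is sometimes best to write nothing.
        return True
    lines = completion.splitlines()
    if len(lines) == 1:
        # Single-line completion: the common, safe case.
        return True
    # Multiline: keep only if it ends at a smaller indent than it started
    # (i.e. the completion closes a block) or at least ends with a newline.
    first_indent = len(lines[0]) - len(lines[0].lstrip())
    last_indent = len(lines[-1]) - len(lines[-1].lstrip())
    return last_indent < first_indent or completion.endswith("\n")
```

A filter like this biases the finetune targets toward completions that stop after a couple of lines instead of rambling on, which is the usability property the README says the 85% of the dataset was built for.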