Update blog.html
blog.html
CHANGED
|
@@ -268,6 +268,15 @@
|
|
| 268 |
<div class="container">
|
| 269 |
|
| 270 |
<div class="blog-grid">
|
| 271 |
<a href="Glint-1.3 Is Live And It Is Just A Transformer Doing Its Best.html" class="blog-card">
|
| 272 |
<div class="blog-meta">
|
| 273 |
<span class="blog-date">2026-05-12</span>
|
| 268 |
<div class="container">
|
| 269 |
|
| 270 |
<div class="blog-grid">
|
| 271 |
+
<a href="Weird Training Idea: Embeddings First, Everything Else Later.html" class="blog-card">
|
| 272 |
+
<div class="blog-meta">
|
| 273 |
+
<span class="blog-date">2026-05-13</span>
|
| 274 |
+
<span class="blog-tag">Team Updates</span>
|
| 275 |
+
</div>
|
| 276 |
+
<h2>Weird Training Idea: Embeddings First, Everything Else Later</h2>
|
| 277 |
+
<p>I have a weird idea. It might be stupid. It might be brilliant. I do not know yet. The idea is simple. Instead of training the entire model at once, train the embedding layers first. Then train the rest of the model to utilize those embeddings. Two stages. Two objectives. One hope.</p>
|
| 278 |
+
<span class="blog-read-more">Read more</span>
|
| 279 |
+
</a>
|
| 280 |
<a href="Glint-1.3 Is Live And It Is Just A Transformer Doing Its Best.html" class="blog-card">
|
| 281 |
<div class="blog-meta">
|
| 282 |
<span class="blog-date">2026-05-12</span>
|