<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <title>Makeshift MTP | TinyMemoryLM</title>
  <link rel="preconnect" href="https://fonts.googleapis.com">
  <link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
  <link href="https://fonts.googleapis.com/css2?family=Geist:wght@400;500;600;700&family=Geist+Mono&display=swap" rel="stylesheet">
  <style>
    :root { --black: #000000; --black-soft: #0a0a0a; --gray-1: #171717; --gray-2: #262626; --gray-5: #737373; --gray-6: #a3a3a6; --gray-7: #d4d4d4; --white: #ffffff; --accent: #ff4d00; --font-sans: 'Geist', -apple-system, sans-serif; --container-max: 700px; }
    * { box-sizing: border-box; margin: 0; padding: 0; }
    body { font-family: var(--font-sans); background: var(--black); color: var(--gray-7); line-height: 1.7; }
    a { color: var(--white); text-decoration: none; }
    a:hover { color: var(--accent); }
    .container { max-width: var(--container-max); margin: 0 auto; padding: 0 24px; }
    nav { position: fixed; top: 0; left: 0; right: 0; z-index: 100; background: rgba(0, 0, 0, 0.8); backdrop-filter: blur(12px); border-bottom: 1px solid var(--gray-2); padding: 16px 0; }
    nav .container { display: flex; justify-content: space-between; align-items: center; }
    .nav-brand { font-size: 18px; font-weight: 600; color: var(--white); display: flex; align-items: center; gap: 8px; }
    .nav-brand span { color: var(--accent); }
    .nav-links { display: flex; gap: 32px; }
    .nav-links a { font-size: 14px; font-weight: 500; color: var(--gray-6); }
    .post { padding: 140px 0 80px; }
    .post-back { display: inline-block; color: var(--gray-5); font-size: 14px; margin-bottom: 32px; }
    .post-back:hover { color: var(--accent); }
    .post-back::before { content: '← '; }
    .post-meta { display: flex; gap: 12px; margin-bottom: 20px; }
    .post-date { font-size: 13px; color: var(--gray-5); }
    .post-tag { font-size: 11px; font-weight: 600; text-transform: uppercase; color: var(--accent); background: rgba(255, 77, 0, 0.1); padding: 4px 10px; border-radius: 4px; }
    .post h1 { font-size: 36px; font-weight: 700; color: var(--white); margin-bottom: 32px; line-height: 1.2; }
    .post-body p { font-size: 17px; line-height: 1.8; margin-bottom: 24px; color: var(--gray-6); }
    .post-body p:first-of-type { font-size: 20px; color: var(--gray-7); }
    .post-footer { margin-top: 48px; padding-top: 32px; border-top: 1px solid var(--gray-2); }
    .post-footer p { font-size: 14px; color: var(--gray-5); font-style: italic; margin: 0; }
    footer { padding: 40px 0; background: var(--black-soft); border-top: 1px solid var(--gray-2); text-align: center; }
    footer p { color: var(--gray-5); font-size: 14px; margin-bottom: 8px; }
    @media (max-width: 768px) { .post h1 { font-size: 28px; } }
  </style>
</head>
<body>
  <nav>
    <div class="container">
      <a href="index.html" class="nav-brand"><span>/</span>TinyMemoryLM</a>
      <div class="nav-links">
        <a href="index.html">Home</a>
        <a href="blog.html">Blog</a>
        <a href="status.html">Status</a>
      </div>
    </div>
  </nav>
  <main>
    <article class="post">
      <div class="container">
        <a href="blog.html" class="post-back">Back to Blog</a>
        <header>
          <div class="post-meta">
            <span class="post-date">2026-02-17</span>
            <span class="post-tag">MTP</span>
          </div>
          <h1>Makeshift MTP: Predicting the Future on a Budget</h1>
        </header>
        <div class="post-body">
          <p>Multi-token prediction sounds fancy. Really it is just the model trying to do its homework before the teacher assigns it. Sometimes it works. Sometimes it does not. But it always tries.</p>
          <p>The idea is simple: instead of predicting one token at a time, predict several tokens ahead. During training, the model learns to predict the tokens at positions t+1, t+2, t+3, and so on from the same hidden state. At inference time, we can either emit several of those predictions at once or fall back to ordinary next-token decoding.</p>
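          <p>To make the training side concrete, here is a minimal sketch of a multi-horizon loss in PyTorch. This is an illustration under stated assumptions, not the project's actual code: the class name <code>MTPHeads</code>, the <code>horizon</code> parameter, and the one-linear-head-per-offset design are all hypothetical. Head k reads the backbone's hidden state at position t and is asked for the token at t+k.</p>

```python
# Hypothetical sketch of a multi-token-prediction loss: one linear
# head per future offset, all reading the same backbone hidden state.
# Names (MTPHeads, horizon) are illustrative, not from the post.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MTPHeads(nn.Module):
    """Head k (1-indexed) predicts the token at position t+k."""

    def __init__(self, d_model: int, vocab_size: int, horizon: int = 3):
        super().__init__()
        self.horizon = horizon
        self.heads = nn.ModuleList(
            nn.Linear(d_model, vocab_size) for _ in range(horizon)
        )

    def loss(self, hidden: torch.Tensor, tokens: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, seq, d_model) from the transformer backbone
        # tokens: (batch, seq) input token ids
        total = hidden.new_zeros(())
        for k, head in enumerate(self.heads, start=1):
            logits = head(hidden[:, :-k])   # positions that have a t+k target
            targets = tokens[:, k:]         # labels shifted k steps ahead
            total = total + F.cross_entropy(
                logits.reshape(-1, logits.size(-1)), targets.reshape(-1)
            )
        return total / self.horizon         # average over horizons


mtp = MTPHeads(d_model=8, vocab_size=16, horizon=3)
hidden = torch.randn(2, 10, 8)              # stand-in for backbone output
tokens = torch.randint(0, 16, (2, 10))
print(mtp.loss(hidden, tokens))
```

          <p>Averaging over horizons keeps the extra heads from dominating the ordinary next-token objective; weighting far-ahead heads down further is another common choice, since their targets are harder.</p>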
          <p>We call it "makeshift" because it is not the elegant solution. The elegant solution would be a model that inherently understands sequence. But we are working with what we have, which is a transformer that mostly just wants to predict the next word and occasionally surprise us.</p>
        </div>
        <footer class="post-footer">
          <p>Current status: MTP is happening. The model is trying to see the future. Mostly it sees gibberish. But it tries.</p>
        </footer>
      </div>
    </article>
  </main>
  <footer>
    <div class="container">
      <p>Built with curiosity over compute</p>
      <p>TinyMemoryLM by AILAY | 2026</p>
    </div>
  </footer>
</body>
</html>