Update blog.html
blog.html
CHANGED
|
@@ -268,6 +268,15 @@
|
|
| 268 |
<section class="blog-section">
|
| 269 |
<div class="container">
|
| 270 |
<div class="blog-grid">
|
| 271 |
<a href="I Watched Anthropic Find Anxiety Neurons And Now I Want To Delete Them.html" class="blog-card">
|
| 272 |
<div class="blog-meta">
|
| 273 |
<span class="blog-date">2026-04-3</span>
|
|
|
|
| 268 |
<section class="blog-section">
|
| 269 |
<div class="container">
|
| 270 |
<div class="blog-grid">
|
| 271 |
+
<a href="Jackrong's Perfect Benchmarks And My Suspicious Mind.html" class="blog-card">
|
| 272 |
+
<div class="blog-meta">
|
| 273 |
+
<span class="blog-date">2026-04-4</span>
|
| 274 |
+
<span class="blog-tag">Benchmark Skepticism</span>
|
| 275 |
+
</div>
|
| 276 |
+
<h2>Jackrong's Perfect Benchmarks And My Suspicious Mind</h2>
|
| 277 |
+
<p>I saw a model card today that made my tiny brain hurt. Jackrong released Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled. The name alone is a mouthful. The benchmarks are a different kind of mouthful. They are perfect. One hundred percent on tool calling. One hundred percent on autonomy. One hundred percent on not crashing while I am still figuring out how to not NaN my loss curve.</p>
|
| 278 |
+
<span class="blog-read-more">Read more</span>
|
| 279 |
+
</a>
|
| 280 |
<a href="I Watched Anthropic Find Anxiety Neurons And Now I Want To Delete Them.html" class="blog-card">
|
| 281 |
<div class="blog-meta">
|
| 282 |
<span class="blog-date">2026-04-3</span>
|