| <!doctype html> |
| <html> |
| <head> |
| <meta charset="utf-8" /> |
| <meta name="viewport" content="width=device-width" /> |
| <title>Single-Shot Brevity Training | LLM Response Optimization</title> |
| <link rel="stylesheet" href="style.css" /> |
| </head> |
| <body> |
| <div class="container"> |
| <header> |
| <h1>Single-Shot Brevity Training</h1> |
| <p class="subtitle">Using One Example to Train LLMs for Informational Brevity</p> |
| <div class="links"> |
| <a href="https://github.com/danielrosehill/Single-Shot-Brevity-Training" target="_blank" class="btn">View on GitHub</a> |
| </div> |
| </header> |
|
|
| <section class="card"> |
| <h2>The Problem</h2> |
| <p>Large Language Models often generate excessively verbose responses, even when concise, informative answers would be more valuable. This experiment explores a simple yet effective approach to guide models toward brevity without sacrificing information quality.</p> |
| </section> |
|
|
| <section class="card"> |
| <h2>The Approach</h2> |
| <p>Rather than abstract instructions like "be concise," this framework uses <strong>single-shot training</strong>: demonstrating the desired format with one concrete example in the system prompt.</p> |
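In practice, this means embedding one worked Q&amp;A pair in the system prompt. A minimal sketch in Python, assuming an OpenAI-style message list; the example question, the example answer, and the `build_messages` helper are illustrative, not the repository's actual prompts:

```python
# Single-shot brevity prompt: one concrete example of the desired
# format, rather than an abstract "be concise" instruction.

# Hypothetical example pair -- the repository ships its own optimized
# examples per model.
EXAMPLE_QUESTION = "Recommend a power bank for weekend travel."
EXAMPLE_ANSWER = (
    "A ~10,000 mAh bank (around 180 g) charges most phones twice and "
    "is the best balance of capacity and weight for a weekend trip."
)

def build_messages(user_question: str) -> list[dict]:
    """Return an OpenAI-style message list with one worked example."""
    system_prompt = (
        "Answer product questions briefly and informatively. "
        "Match the length and style of this example.\n\n"
        f"Q: {EXAMPLE_QUESTION}\nA: {EXAMPLE_ANSWER}"
    )
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_question},
    ]

messages = build_messages("Which power bank should I buy for a long flight?")
```

The message list can then be sent to any chat-completion endpoint; the example anchors the model's sense of an appropriate response length.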
|
|
| <h3>Two-Phase Methodology</h3> |
| <div class="phase"> |
| <h4>Phase 1: Baseline Evaluation</h4> |
<p>Fourteen models were tested with a standardized product-recommendation prompt (power bank selection), with no brevity instructions, to establish their natural response lengths.</p>
| </div> |
|
|
| <div class="phase"> |
| <h4>Phase 2: Single-Shot Training</h4> |
<p>A subset of the models then received system prompts containing a single optimized example response to steer their outputs toward similar brevity.</p>
| </div> |
| </section> |
|
|
| <section class="card highlight"> |
| <h2>Key Findings</h2> |
|
|
| <div class="stat-grid"> |
| <div class="stat"> |
| <div class="stat-number">5.5x</div> |
| <div class="stat-label">Difference between longest and shortest responses</div> |
| </div> |
| <div class="stat"> |
| <div class="stat-number">794</div> |
| <div class="stat-label">Mean response length (words)</div> |
| </div> |
| <div class="stat"> |
| <div class="stat-number">60-75%</div> |
| <div class="stat-label">Word reduction in optimized examples</div> |
| </div> |
| </div> |
|
|
| <h3>Model Response Length Comparison</h3> |
| <div class="chart-container"> |
| <img src="verbosity_bar_chart.png" alt="Bar chart comparing word counts across 14 LLM models" class="chart-image"> |
| <p class="chart-caption">Comparison of response lengths across 14 evaluated models</p> |
| </div> |
|
|
| <h3>Comprehensive Verbosity Analysis</h3> |
| <div class="chart-container"> |
| <img src="verbosity_analysis.png" alt="Four-panel analysis of response verbosity characteristics" class="chart-image"> |
| <p class="chart-caption">Multi-faceted examination of response characteristics and patterns</p> |
| </div> |
|
|
| <h3>Response Length Variation</h3> |
| <ul> |
| <li><strong>Longest:</strong> 1,632 words (OpenAI GPT-OSS-120B)</li> |
| <li><strong>Shortest:</strong> 295 words (AI21 Jamba Large)</li> |
| <li><strong>Standard deviation:</strong> 456 words</li> |
| </ul> |
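The headline 5.5x figure follows directly from the two reported extremes; a quick sketch (the mean and standard deviation require the full per-model word counts, which live in the repository's raw response data):

```python
# Spread between the longest and shortest baseline responses,
# using the two word counts reported above.
longest = 1632   # OpenAI GPT-OSS-120B, words
shortest = 295   # AI21 Jamba Large, words

ratio = longest / shortest
print(f"{ratio:.1f}x difference")  # 5.5x difference
```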
|
|
| <h3>Most Concise Performers</h3> |
| <ol class="model-list"> |
| <li><strong>AI21 Jamba Large</strong> - 295 words</li> |
| <li><strong>Mistral Large</strong> - 352 words</li> |
| <li><strong>Meta Llama 4 Maverick</strong> - 397 words</li> |
| </ol> |
|
|
| <h3>Most Verbose Performers</h3> |
| <ol class="model-list"> |
| <li><strong>OpenAI GPT-OSS-120B</strong> - 1,632 words</li> |
| <li><strong>Google Gemini 2.5 Flash</strong> - 1,607 words</li> |
| </ol> |
| </section> |
|
|
| <section class="card"> |
| <h2>Repository Contents</h2> |
| <ul> |
| <li><strong>Raw Response Data:</strong> Complete baseline outputs from all tested models</li> |
| <li><strong>Optimized Examples:</strong> Demonstrating ideal brevity (60-75% word reduction)</li> |
| <li><strong>Model-Specific System Prompts:</strong> Implementing single-shot training for practical application</li> |
| <li><strong>Statistical Analysis:</strong> Comprehensive comparison of response lengths and patterns</li> |
| </ul> |
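The quoted 60-75% word reduction compares each baseline response against its optimized rewrite. A small sketch of the calculation; the word counts below are illustrative, chosen to land inside the reported band:

```python
def word_reduction(baseline_words: int, optimized_words: int) -> float:
    """Fraction of words removed, e.g. 0.65 for a 65% reduction."""
    return 1 - optimized_words / baseline_words

# Hypothetical pair: the 794-word mean baseline vs. a ~240-word rewrite.
r = word_reduction(794, 240)
print(f"{r:.0%} reduction")  # 70% reduction
```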
| </section> |
|
|
| <section class="card"> |
| <h2>Practical Applications</h2> |
| <p>This approach offers several benefits for LLM deployment:</p> |
| <ul> |
| <li><strong>Cost Reduction:</strong> Shorter responses mean fewer output tokens and lower API costs</li> |
| <li><strong>User Experience:</strong> Concise responses are faster to read and process</li> |
<li><strong>Efficiency:</strong> A single example is simpler to write and maintain than elaborate prompt-engineering instructions</li>
| <li><strong>Reusability:</strong> The framework can be adapted to different use cases and domains</li> |
| </ul> |
| </section> |
|
|
| <section class="card"> |
| <h2>Get Involved</h2> |
| <p>This is an open experiment exploring effective LLM training techniques. The repository includes all data, prompts, and analysis for transparency and reproducibility.</p> |
| <div class="links"> |
| <a href="https://github.com/danielrosehill/Single-Shot-Brevity-Training" target="_blank" class="btn btn-primary">Explore the Repository</a> |
| <a href="https://github.com/danielrosehill/Single-Shot-Brevity-Training/issues" target="_blank" class="btn">Share Feedback</a> |
| </div> |
| </section> |
|
|
| <footer> |
| <p>Created by <a href="https://danielrosehill.com" target="_blank">Daniel Rosehill</a></p> |
| <p>Part of ongoing research in LLM optimization and prompt engineering</p> |
| </footer> |
| </div> |
| </body> |
| </html> |
|
|