<!doctype html>
<html>
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width" />
<title>Single-Shot Brevity Training | LLM Response Optimization</title>
<link rel="stylesheet" href="style.css" />
</head>
<body>
<div class="container">
<header>
<h1>Single-Shot Brevity Training</h1>
<p class="subtitle">Using One Example to Train LLMs for Informational Brevity</p>
<div class="links">
<a href="https://github.com/danielrosehill/Single-Shot-Brevity-Training" target="_blank" class="btn">View on GitHub</a>
</div>
</header>
<section class="card">
<h2>The Problem</h2>
<p>Large Language Models often generate excessively verbose responses, even when concise, informative answers would be more valuable. This experiment explores a simple yet effective approach to guide models toward brevity without sacrificing information quality.</p>
</section>
<section class="card">
<h2>The Approach</h2>
<p>Rather than abstract instructions like "be concise," this framework uses <strong>single-shot training</strong>: demonstrating the desired format with one concrete example in the system prompt.</p>
<h3>Two-Phase Methodology</h3>
<div class="phase">
<h4>Phase 1: Baseline Evaluation</h4>
<p>Tested 14 models using a standardized product recommendation prompt (power bank selection) without any brevity instructions to establish natural response lengths.</p>
</div>
<div class="phase">
<h4>Phase 2: Single-Shot Training</h4>
<p>Selected models received system prompts containing one optimized example response, guiding their outputs toward similar brevity.</p>
</div>
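The single-shot setup described above can be sketched in Python. The helper name, prompt wording, and example pair below are illustrative assumptions, not taken from the repository's actual system prompts:

```python
def build_brevity_system_prompt(example_question: str, example_answer: str) -> str:
    """Assemble a system prompt that demonstrates the desired brevity
    with a single concrete question/answer example (single-shot)."""
    return (
        "Answer user questions with concise, information-dense responses.\n"
        "Match the length and style of this example:\n\n"
        f"Question: {example_question}\n"
        f"Answer: {example_answer}"
    )

# Hypothetical example pair for illustration only:
prompt = build_brevity_system_prompt(
    "Which power bank should I buy for weekend trips?",
    "A 20,000 mAh bank with USB-C Power Delivery covers two full phone "
    "charges; pick one under 500 g if weight matters.",
)
```

The resulting string would be passed as the system message in whatever chat API the deployment uses; the demonstration example does the work that abstract "be concise" instructions fail to do.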
</section>
<section class="card highlight">
<h2>Key Findings</h2>
<div class="stat-grid">
<div class="stat">
<div class="stat-number">5.5x</div>
<div class="stat-label">Difference between longest and shortest responses</div>
</div>
<div class="stat">
<div class="stat-number">794</div>
<div class="stat-label">Mean response length (words)</div>
</div>
<div class="stat">
<div class="stat-number">60-75%</div>
<div class="stat-label">Word reduction in optimized examples</div>
</div>
</div>
<h3>Model Response Length Comparison</h3>
<div class="chart-container">
<img src="verbosity_bar_chart.png" alt="Bar chart comparing word counts across 14 LLM models" class="chart-image" />
<p class="chart-caption">Comparison of response lengths across 14 evaluated models</p>
</div>
<h3>Comprehensive Verbosity Analysis</h3>
<div class="chart-container">
<img src="verbosity_analysis.png" alt="Four-panel analysis of response verbosity characteristics" class="chart-image" />
<p class="chart-caption">Multi-faceted examination of response characteristics and patterns</p>
</div>
<h3>Response Length Variation</h3>
<ul>
<li><strong>Longest:</strong> 1,632 words (OpenAI GPT-OSS-120B)</li>
<li><strong>Shortest:</strong> 295 words (AI21 Jamba Large)</li>
<li><strong>Standard deviation:</strong> 456 words</li>
</ul>
<h3>Most Concise Performers</h3>
<ol class="model-list">
<li><strong>AI21 Jamba Large</strong> - 295 words</li>
<li><strong>Mistral Large</strong> - 352 words</li>
<li><strong>Meta Llama 4 Maverick</strong> - 397 words</li>
</ol>
<h3>Most Verbose Performers</h3>
<ol class="model-list">
<li><strong>OpenAI GPT-OSS-120B</strong> - 1,632 words</li>
<li><strong>Google Gemini 2.5 Flash</strong> - 1,607 words</li>
</ol>
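The word-reduction figure reported above is a simple before/after ratio. A minimal sketch of the computation, using stand-in word counts rather than the experiment's data:

```python
def word_reduction(baseline: str, optimized: str) -> float:
    """Fraction of words removed going from the baseline response
    to the optimized (brevity-trained) response."""
    base_words = len(baseline.split())
    opt_words = len(optimized.split())
    return 1 - opt_words / base_words

# Stand-in responses: a 100-word baseline cut to a 30-word optimized answer.
baseline = " ".join(["word"] * 100)
optimized = " ".join(["word"] * 30)
print(f"{word_reduction(baseline, optimized):.0%}")  # prints "70%"
```

A reduction in this range (60-75%) is what the optimized examples in the repository target relative to each model's baseline output.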
</section>
<section class="card">
<h2>Repository Contents</h2>
<ul>
<li><strong>Raw Response Data:</strong> Complete baseline outputs from all tested models</li>
<li><strong>Optimized Examples:</strong> Demonstrating ideal brevity (60-75% word reduction)</li>
<li><strong>Model-Specific System Prompts:</strong> Implementing single-shot training for practical application</li>
<li><strong>Statistical Analysis:</strong> Comprehensive comparison of response lengths and patterns</li>
</ul>
</section>
<section class="card">
<h2>Practical Applications</h2>
<p>This approach offers several benefits for LLM deployment:</p>
<ul>
<li><strong>Cost Reduction:</strong> Shorter responses mean fewer output tokens and lower API costs</li>
<li><strong>User Experience:</strong> Concise responses are faster to read and process</li>
<li><strong>Simplicity:</strong> One concrete example is easier to write and maintain than complex prompt engineering</li>
<li><strong>Reusability:</strong> The framework can be adapted to different use cases and domains</li>
</ul>
</section>
<section class="card">
<h2>Get Involved</h2>
<p>This is an open experiment exploring effective LLM training techniques. The repository includes all data, prompts, and analysis for transparency and reproducibility.</p>
<div class="links">
<a href="https://github.com/danielrosehill/Single-Shot-Brevity-Training" target="_blank" class="btn btn-primary">Explore the Repository</a>
<a href="https://github.com/danielrosehill/Single-Shot-Brevity-Training/issues" target="_blank" class="btn">Share Feedback</a>
</div>
</section>
<footer>
<p>Created by <a href="https://danielrosehill.com" target="_blank">Daniel Rosehill</a></p>
<p>Part of ongoing research in LLM optimization and prompt engineering</p>
</footer>
</div>
</body>
</html>