Luigi committed on
Commit
c4c1bc0
·
1 Parent(s): 686676d

fix: reserve tokens for system prompt and generation in extraction windows


Window size was using full n_ctx (4096), causing overflow when system
prompt and generation tokens were added. Now reserves 1500 tokens
(~200 for prompt + ~1024 for output + safety margin).

Max window tokens: 4096 - 1500 = 2596 tokens

Fixes: ValueError: Requested tokens (4796) exceed context window of 4096
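The fix can be sketched end-to-end as below. `Tokenizer` is a hypothetical stand-in for the real token counter, and `build_windows` is an illustrative wrapper around the windowing loop, not the actual `summarize_advanced` code; only the variable names (`extraction_n_ctx`, `max_window_tokens`, `tokenizer.count`) mirror the diff.

```python
class Tokenizer:
    """Hypothetical stand-in: counts whitespace-separated words as a crude token proxy."""
    def count(self, text: str) -> int:
        return len(text.split())


def build_windows(transcript: str, extraction_n_ctx: int = 4096) -> list[str]:
    tokenizer = Tokenizer()
    # Reserve ~200 tokens for the system prompt and ~1024 for generation,
    # plus safety margin, so window + prompt + output never exceeds n_ctx.
    max_window_tokens = extraction_n_ctx - 1500  # 4096 - 1500 = 2596

    lines = [l.strip() for l in transcript.split('\n') if l.strip()]
    windows: list[str] = []
    current_window: list[str] = []
    current_tokens = 0
    for line in lines:
        line_tokens = tokenizer.count(line)
        # Flush the current window before it would exceed the reserved budget
        if current_tokens + line_tokens > max_window_tokens and current_window:
            windows.append('\n'.join(current_window))
            current_window, current_tokens = [], 0
        current_window.append(line)
        current_tokens += line_tokens
    if current_window:
        windows.append('\n'.join(current_window))
    return windows
```

With the old comparison against the full `extraction_n_ctx`, a window of up to 4096 tokens plus ~200 prompt and ~1024 generation tokens could request ~4796 tokens, which is exactly the overflow the traceback reports.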

Files changed (1)
  1. app.py +4 -1
app.py CHANGED
@@ -1482,6 +1482,9 @@ def summarize_advanced(
     # In production, this would be more sophisticated
     lines = [l.strip() for l in transcript.split('\n') if l.strip()]
 
+    # Reserve tokens for system prompt (~200) and output (~1024)
+    max_window_tokens = extraction_n_ctx - 1500  # Safe buffer for prompts and generation
+
     # Simple windowing: split into chunks based on token count
     windows = []
     current_window = []
@@ -1491,7 +1494,7 @@ def summarize_advanced(
     for line_num, line in enumerate(lines):
         line_tokens = tokenizer.count(line)
 
-        if current_tokens + line_tokens > extraction_n_ctx and current_window:
+        if current_tokens + line_tokens > max_window_tokens and current_window:
             # Create window
             window_content = '\n'.join(current_window)
             windows.append(Window(