
# DynamicCache Error Fix - Quick Summary

## Problem

```
ERROR: Local model error: 'DynamicCache' object has no attribute 'seen_tokens'
```

**Result:** Quality Score of 0.00 for all transcripts and no analysis extracted.


## Root Cause

A version incompatibility in the transformers library's caching mechanism: during generation, the code reads a `seen_tokens` attribute that the installed `DynamicCache` implementation does not provide, so generation fails with an `AttributeError`.
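
Illustratively, the mismatch looks like this (a minimal sketch, not code from this repo; `DynamicCache` is the real transformers class, the rest is illustration):

```python
from transformers import DynamicCache

cache = DynamicCache()
if hasattr(cache, "seen_tokens"):
    # Some transformers releases exposed the cache length as an attribute.
    length = cache.seen_tokens
else:
    # Others expose the same information only via a method; code that
    # still reads `.seen_tokens` here raises the AttributeError above.
    length = cache.get_seq_length()
print(length)  # 0 for an empty cache
```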


## ✅ Fixes Applied

### 1. Code Fix (llm.py)

Added a `use_cache=False` parameter to disable the problematic caching:

```python
outputs = query_llm_local.model.generate(
    **inputs,
    max_new_tokens=max_tokens,
    temperature=temperature,
    do_sample=temperature > 0,
    pad_token_id=query_llm_local.tokenizer.eos_token_id,
    use_cache=False,  # ← Fixes the DynamicCache error
)
```

**Trade-off:** generation is roughly 10-20% slower, because the model recomputes attention instead of reusing the key/value cache, but it no longer hits the error.

### 2. Enhanced Error Handling

- Clearer error messages with specific guidance
- Automatic detection of DynamicCache issues (see the sketch below)
- Recommendations for next steps
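
As a rough illustration of that strategy (a sketch only; the actual code in `llm.py` may differ), generation can be wrapped so a DynamicCache failure is detected and retried without the cache:

```python
def generate_safely(model, inputs, **gen_kwargs):
    """Hypothetical wrapper illustrating the fallback; not the exact llm.py code."""
    try:
        return model.generate(**inputs, **gen_kwargs)
    except AttributeError as exc:
        if "seen_tokens" in str(exc):
            # Known transformers cache incompatibility: retry without the
            # KV cache and point the user at the permanent fix.
            print(
                "DynamicCache incompatibility detected; retrying with "
                "use_cache=False. Consider: pip install --upgrade transformers"
            )
            gen_kwargs["use_cache"] = False
            return model.generate(**inputs, **gen_kwargs)
        raise
```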

### 3. Diagnostic Tool

Created `fix_local_model.py` to diagnose and resolve issues automatically.
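
The script's contents aren't reproduced here, but a diagnostic of this kind plausibly boils down to checks like the following (a hypothetical outline, not the shipped script):

```python
import importlib

def diagnose():
    """Return a human-readable verdict on the local-model setup."""
    try:
        transformers = importlib.import_module("transformers")
    except ImportError:
        return "transformers is not installed: pip install transformers"
    version = transformers.__version__
    if tuple(int(p) for p in version.split(".")[:2]) < (4, 36):
        return f"transformers {version} is too old: pip install --upgrade transformers"
    return f"transformers {version} looks OK; try a test generation next"

print(diagnose())
```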


## 🚀 Recommended Actions (Pick One)

### Option A: Upgrade Transformers (Quick Fix)

```bash
pip install --upgrade transformers
python -c "import transformers; print(transformers.__version__)"
```

Expected: version 4.36.0 or higher.
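
If you want the application itself to catch a stale install early, a guard like this could sit near the imports (an optional sketch; the 4.36.0 threshold comes from the expectation above):

```python
from packaging.version import Version
import transformers

# Warn early if transformers predates the DynamicCache fixes.
if Version(transformers.__version__) < Version("4.36.0"):
    print(
        f"Warning: transformers {transformers.__version__} is older than "
        "4.36.0; upgrade to avoid DynamicCache incompatibilities."
    )
```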

### Option B: Use the HuggingFace API (Easiest)

```bash
# Get a token from: https://huggingface.co/settings/tokens
export HUGGINGFACE_TOKEN='hf_your_token_here'
export USE_HF_API=True
```
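
Under the hood, the hosted path presumably amounts to something like this (a sketch using `huggingface_hub`'s `InferenceClient`; the model name is illustrative, and the app's actual routing may differ):

```python
import os
from huggingface_hub import InferenceClient

client = InferenceClient(
    model="mistralai/Mistral-7B-Instruct-v0.2",  # illustrative model choice
    token=os.environ["HUGGINGFACE_TOKEN"],
)
# Same call shape as the local path: prompt in, text out.
print(client.text_generation("Test", max_new_tokens=10))
```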

### Option C: Use LMStudio (Best for Offline)

1. Download: https://lmstudio.ai/
2. Install it and start the local server
3. Set the environment:

```bash
export USE_LMSTUDIO=True
export LMSTUDIO_URL=http://localhost:1234
```
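
LMStudio serves an OpenAI-compatible HTTP API, so the local-server path likely reduces to a request like this (a sketch; only the endpoint shape is LMStudio's, the rest is illustrative):

```python
import os
import requests

base_url = os.environ.get("LMSTUDIO_URL", "http://localhost:1234")
response = requests.post(
    f"{base_url}/v1/chat/completions",
    json={
        "messages": [{"role": "user", "content": "Test"}],
        "max_tokens": 10,
    },
    timeout=60,
)
print(response.json()["choices"][0]["message"]["content"])
```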

### Option D: Run the Diagnostic

```bash
python fix_local_model.py
```

Automatically detects issues and guides you through the fixes.


## Verification

After applying any fix, test:

```bash
python -c "from llm import query_llm_local; print(query_llm_local('Test', max_tokens=10))"
```

- **Success:** returns text (not an error message)
- **Still failing:** try Option B or C above
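
For a slightly more explicit check (assuming, as the one-liner implies, that `query_llm_local` returns the error text as a string rather than raising):

```python
from llm import query_llm_local

result = query_llm_local("Test", max_tokens=10)
if "error" in result.lower():
    print("FAIL - try Option B or C:", result)
else:
    print("PASS:", result)
```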


## Files Modified/Created

✅ Modified:

- `llm.py` - added `use_cache=False` and better error handling
- `requirements.txt` - added version compatibility notes

✅ Created:

- `fix_local_model.py` - diagnostic and fix script
- `TROUBLESHOOTING_DYNAMIC_CACHE.md` - comprehensive guide (13 KB)
- `DYNAMIC_CACHE_FIX_SUMMARY.md` - this quick reference

## Next Steps

1. Choose a solution (A, B, C, or D above)
2. Apply the fix
3. Restart your application
4. Process a test transcript
5. Verify the Quality Score is above 0.00

If issues persist, see `TROUBLESHOOTING_DYNAMIC_CACHE.md` for detailed guidance.


## Quick Reference

| Issue | Fix |
|---|---|
| Quality Score 0.00 | The LLM is failing - apply the fixes above |
| DynamicCache error | `use_cache=False` (already applied) + upgrade transformers |
| Slow processing | Use the HF API (Option B) for speed |
| Offline required | Use LMStudio (Option C) |
| Not sure what to do | Run the diagnostic (Option D) |

## Support

- Full troubleshooting: see `TROUBLESHOOTING_DYNAMIC_CACHE.md`
- Run the diagnostic: `python fix_local_model.py`
- Check enhancements: see `ENHANCEMENTS.md`

✅ The code fix is already applied - you just need to upgrade dependencies or switch backends!