DynamicCache Error Fix - Quick Summary
Problem
ERROR: Local model error: 'DynamicCache' object has no attribute 'seen_tokens'
Result: Quality Score 0.00 for all transcripts, no analysis extracted.
Root Cause
A version incompatibility in the transformers library's caching mechanism (DynamicCache) caused model generation to fail.
Fixes Applied
1. Code Fix (llm.py)
Added the use_cache=False parameter to disable the problematic caching:
outputs = query_llm_local.model.generate(
**inputs,
max_new_tokens=max_tokens,
temperature=temperature,
do_sample=temperature > 0,
pad_token_id=query_llm_local.tokenizer.eos_token_id,
use_cache=False  # fixes the DynamicCache error
)
Trade-off: ~10-20% slower generation, but error-free.
2. Enhanced Error Handling
- Better error messages with specific guidance
- Automatic detection of DynamicCache issues
- Recommendations for next steps
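As an illustration of the detection logic (this is a sketch, not the exact code in llm.py; the function name and message wording are assumptions), a small classifier can map the raw exception to actionable guidance:

```python
def classify_generation_error(exc: Exception) -> str:
    """Turn a model.generate() exception into an actionable message.

    Illustrative sketch only; the real handling in llm.py may differ.
    """
    msg = str(exc)
    if "DynamicCache" in msg and "seen_tokens" in msg:
        # Known transformers version incompatibility.
        return ("DynamicCache incompatibility: upgrade transformers to >= 4.36.0 "
                "or keep use_cache=False in model.generate().")
    return f"Local model error: {msg}"
```

A helper like this lets the app surface "upgrade transformers" directly instead of a bare AttributeError.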
3. Diagnostic Tool
Created fix_local_model.py to diagnose and resolve issues automatically.
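A minimal sketch of the kind of check such a diagnostic can perform (the 4.36.0 threshold comes from this guide; the helper name is illustrative):

```python
def transformers_is_compatible(version: str, minimum=(4, 36, 0)) -> bool:
    """Check whether an installed transformers version string meets the
    minimum this guide recommends for avoiding the DynamicCache error."""
    # Compare only the numeric major.minor.patch components.
    parts = tuple(int(p) for p in version.split(".")[:3])
    return parts >= minimum
```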
Recommended Actions (Pick One)

Option A: Upgrade Transformers (Quick Fix)
pip install --upgrade transformers
python -c "import transformers; print(transformers.__version__)"
Expected: Version 4.36.0 or higher
Option B: Use HuggingFace API (Easiest)
# Get token from: https://huggingface.co/settings/tokens
export HUGGINGFACE_TOKEN='hf_your_token_here'
export USE_HF_API=True
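For illustration, a request to the HF Inference API can be assembled like this (a sketch only: the helper name is an assumption, the token falls back to the HUGGINGFACE_TOKEN env var set above, and only the URL scheme and Bearer-token header follow the public API):

```python
import os

HF_API_BASE = "https://api-inference.huggingface.co/models"

def build_hf_request(prompt: str, model_id: str, token=None, max_tokens: int = 256):
    """Assemble URL, headers, and JSON payload for a text-generation call."""
    token = token or os.environ.get("HUGGINGFACE_TOKEN", "")
    url = f"{HF_API_BASE}/{model_id}"
    headers = {"Authorization": f"Bearer {token}"}
    payload = {"inputs": prompt, "parameters": {"max_new_tokens": max_tokens}}
    return url, headers, payload
```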
Option C: Use LMStudio (Best for Offline)
- Download: https://lmstudio.ai/
- Install and start server
- Set environment:
export USE_LMSTUDIO=True
export LMSTUDIO_URL=http://localhost:1234
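LMStudio serves an OpenAI-compatible HTTP API, so querying it needs only the standard library. The sketch below is an illustrative helper, not code from this project; it assumes the local server from this option is running with a model loaded:

```python
import json
import os
import urllib.request

def build_chat_payload(prompt: str, max_tokens: int = 256) -> dict:
    """Build an OpenAI-style chat-completion payload for LMStudio."""
    return {
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

def query_lmstudio(prompt: str, max_tokens: int = 256) -> str:
    """POST to LMStudio's /v1/chat/completions endpoint."""
    base = os.environ.get("LMSTUDIO_URL", "http://localhost:1234")
    req = urllib.request.Request(
        f"{base}/v1/chat/completions",
        data=json.dumps(build_chat_payload(prompt, max_tokens)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]
```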
Option D: Run Diagnostic
python fix_local_model.py
Automatically detects issues and guides you through the fixes.
Verification
After applying any fix, test:
python -c "from llm import query_llm_local; print(query_llm_local('Test', max_tokens=10))"
Success: Returns text (not an error message).
Still failing: Try Option B or C above.
Files Modified/Created
Modified:
- llm.py - Added use_cache=False and better error handling
- requirements.txt - Added version compatibility notes
Created:
- fix_local_model.py - Diagnostic and fix script
- TROUBLESHOOTING_DYNAMIC_CACHE.md - Comprehensive guide (13KB)
- DYNAMIC_CACHE_FIX_SUMMARY.md - This quick reference
Next Steps
- Choose a solution (A, B, C, or D above)
- Apply the fix
- Restart your application
- Process a test transcript
- Verify Quality Score > 0.00
If issues persist, see TROUBLESHOOTING_DYNAMIC_CACHE.md for detailed guidance.
Quick Reference
| Issue | Fix |
|---|---|
| Quality Score 0.00 | LLM is failing - apply fixes above |
| DynamicCache error | use_cache=False (already applied) + upgrade transformers |
| Slow processing | Use HF API (Option B) for speed |
| Offline required | Use LMStudio (Option C) |
| Not sure what to do | Run diagnostic (Option D) |
Support
- Full troubleshooting: See TROUBLESHOOTING_DYNAMIC_CACHE.md
- Run diagnostic: python fix_local_model.py
- Check enhancements: See ENHANCEMENTS.md
The code fix is already applied - you just need to upgrade dependencies or switch backends!