
# DynamicCache Error Fix - Quick Summary

## Problem

```
ERROR: Local model error: 'DynamicCache' object has no attribute 'seen_tokens'
```

**Result:** Quality Score of 0.00 for all transcripts and no analysis extracted.


## Root Cause

A version incompatibility in the transformers library's caching mechanism: during generation, the code reads a `seen_tokens` attribute that the installed `DynamicCache` implementation does not provide, so generation fails with an `AttributeError`.
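
Illustratively, the mismatch looks like this (a minimal sketch, not code from this repo; `DynamicCache` is the real transformers class, the rest is illustration):

```python
from transformers import DynamicCache

cache = DynamicCache()
if hasattr(cache, "seen_tokens"):
    # Some transformers releases exposed the cache length as an attribute.
    length = cache.seen_tokens
else:
    # Others expose the same information only via a method; code that
    # still reads `.seen_tokens` here raises the AttributeError above.
    length = cache.get_seq_length()
print(length)  # 0 for an empty cache
```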


## ✅ Fixes Applied

### 1. Code Fix (llm.py)

Added a `use_cache=False` parameter to disable the problematic caching:

```python
outputs = query_llm_local.model.generate(
    **inputs,
    max_new_tokens=max_tokens,
    temperature=temperature,
    do_sample=temperature > 0,
    pad_token_id=query_llm_local.tokenizer.eos_token_id,
    use_cache=False,  # ← Fixes the DynamicCache error
)
```

**Trade-off:** generation is roughly 10-20% slower, because the model recomputes attention instead of reusing the key/value cache, but it no longer hits the error.

### 2. Enhanced Error Handling

- Clearer error messages with specific guidance
- Automatic detection of DynamicCache issues (see the sketch below)
- Recommendations for next steps
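
As a rough illustration of that strategy (a sketch only; the actual code in `llm.py` may differ), generation can be wrapped so a DynamicCache failure is detected and retried without the cache:

```python
def generate_safely(model, inputs, **gen_kwargs):
    """Hypothetical wrapper illustrating the fallback; not the exact llm.py code."""
    try:
        return model.generate(**inputs, **gen_kwargs)
    except AttributeError as exc:
        if "seen_tokens" in str(exc):
            # Known transformers cache incompatibility: retry without the
            # KV cache and point the user at the permanent fix.
            print(
                "DynamicCache incompatibility detected; retrying with "
                "use_cache=False. Consider: pip install --upgrade transformers"
            )
            gen_kwargs["use_cache"] = False
            return model.generate(**inputs, **gen_kwargs)
        raise
```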

### 3. Diagnostic Tool

Created `fix_local_model.py` to diagnose and resolve issues automatically.
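
The script's contents aren't reproduced here, but a diagnostic of this kind plausibly boils down to checks like the following (a hypothetical outline, not the shipped script):

```python
import importlib

def diagnose():
    """Return a human-readable verdict on the local-model setup."""
    try:
        transformers = importlib.import_module("transformers")
    except ImportError:
        return "transformers is not installed: pip install transformers"
    version = transformers.__version__
    if tuple(int(p) for p in version.split(".")[:2]) < (4, 36):
        return f"transformers {version} is too old: pip install --upgrade transformers"
    return f"transformers {version} looks OK; try a test generation next"

print(diagnose())
```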


## 🚀 Recommended Actions (Pick One)

### Option A: Upgrade Transformers (Quick Fix)

```bash
pip install --upgrade transformers
python -c "import transformers; print(transformers.__version__)"
```

Expected: version 4.36.0 or higher.
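
If you want the application itself to catch a stale install early, a guard like this could sit near the imports (an optional sketch; the 4.36.0 threshold comes from the expectation above):

```python
from packaging.version import Version
import transformers

# Warn early if transformers predates the DynamicCache fixes.
if Version(transformers.__version__) < Version("4.36.0"):
    print(
        f"Warning: transformers {transformers.__version__} is older than "
        "4.36.0; upgrade to avoid DynamicCache incompatibilities."
    )
```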

### Option B: Use the HuggingFace API (Easiest)

```bash
# Get a token from: https://huggingface.co/settings/tokens
export HUGGINGFACE_TOKEN='hf_your_token_here'
export USE_HF_API=True
```
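
Under the hood, the hosted path presumably amounts to something like this (a sketch using `huggingface_hub`'s `InferenceClient`; the model name is illustrative, and the app's actual routing may differ):

```python
import os
from huggingface_hub import InferenceClient

client = InferenceClient(
    model="mistralai/Mistral-7B-Instruct-v0.2",  # illustrative model choice
    token=os.environ["HUGGINGFACE_TOKEN"],
)
# Same call shape as the local path: prompt in, text out.
print(client.text_generation("Test", max_new_tokens=10))
```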

### Option C: Use LMStudio (Best for Offline)

1. Download: https://lmstudio.ai/
2. Install it and start the local server
3. Set the environment:

```bash
export USE_LMSTUDIO=True
export LMSTUDIO_URL=http://localhost:1234
```
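
LMStudio serves an OpenAI-compatible HTTP API, so the local-server path likely reduces to a request like this (a sketch; only the endpoint shape is LMStudio's, the rest is illustrative):

```python
import os
import requests

base_url = os.environ.get("LMSTUDIO_URL", "http://localhost:1234")
response = requests.post(
    f"{base_url}/v1/chat/completions",
    json={
        "messages": [{"role": "user", "content": "Test"}],
        "max_tokens": 10,
    },
    timeout=60,
)
print(response.json()["choices"][0]["message"]["content"])
```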

### Option D: Run the Diagnostic

```bash
python fix_local_model.py
```

Automatically detects issues and guides you through the fixes.


## Verification

After applying any fix, test:

```bash
python -c "from llm import query_llm_local; print(query_llm_local('Test', max_tokens=10))"
```

- **Success:** returns text (not an error message)
- **Still failing:** try Option B or C above
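
For a slightly more explicit check (assuming, as the one-liner implies, that `query_llm_local` returns the error text as a string rather than raising):

```python
from llm import query_llm_local

result = query_llm_local("Test", max_tokens=10)
if "error" in result.lower():
    print("FAIL - try Option B or C:", result)
else:
    print("PASS:", result)
```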


## Files Modified/Created

✅ Modified:

- `llm.py` - added `use_cache=False` and better error handling
- `requirements.txt` - added version compatibility notes

✅ Created:

- `fix_local_model.py` - diagnostic and fix script
- `TROUBLESHOOTING_DYNAMIC_CACHE.md` - comprehensive guide (13 KB)
- `DYNAMIC_CACHE_FIX_SUMMARY.md` - this quick reference

## Next Steps

1. Choose a solution (A, B, C, or D above)
2. Apply the fix
3. Restart your application
4. Process a test transcript
5. Verify the Quality Score is above 0.00

If issues persist, see `TROUBLESHOOTING_DYNAMIC_CACHE.md` for detailed guidance.


## Quick Reference

| Issue | Fix |
|---|---|
| Quality Score 0.00 | The LLM is failing - apply the fixes above |
| DynamicCache error | `use_cache=False` (already applied) + upgrade transformers |
| Slow processing | Use the HF API (Option B) for speed |
| Offline required | Use LMStudio (Option C) |
| Not sure what to do | Run the diagnostic (Option D) |

## Support

- Full troubleshooting: see `TROUBLESHOOTING_DYNAMIC_CACHE.md`
- Run the diagnostic: `python fix_local_model.py`
- Check enhancements: see `ENHANCEMENTS.md`

✅ The code fix is already applied - you just need to upgrade dependencies or switch backends!