Upgrade to Llama 3.1 8B-Instruct for better long-form content 6ea58d5 david167 committed on Aug 20, 2025
Speed optimizations: Switch to Mistral-7B + optimize generation params fac0be2 david167 committed on Aug 19, 2025
COMPLETE API REBUILD: ZERO TRUNCATION PRINCIPLE - Intelligent extraction, generous tokens, never cut content 7822d6f david167 committed on Aug 13, 2025
ULTRA CONSERVATIVE EXTRACTION: Find JSON array boundaries properly, extensive logging, no aggressive cutting d82dc35 david167 committed on Aug 13, 2025
FIX TRUNCATION: Improved response extraction logic, conservative cutting, detailed logging - NO MORE TRUNCATION f52c60e david167 committed on Aug 13, 2025
FIX TUPLE ISSUE: Return single string output instead of tuple - eliminates ('content', '') wrapper 1644c5e david167 committed on Aug 13, 2025
DIRECT CONTENT API: Return just the generated content, no API wrappers or tuples - perfect for client parsing 0460f5e david167 committed on Aug 13, 2025
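The tuple fix in 1644c5e and 0460f5e amounts to the following pattern. In Gradio, a function that returns a tuple is treated as producing multiple outputs, so a client calling the API sees a wrapped `('content', '')` value; returning a bare string maps cleanly onto a single Textbox output. A hypothetical stand-in `run_model` replaces the real model call:

```python
def run_model(prompt: str) -> str:
    # Stand-in for the actual model call (hypothetical).
    return f"generated for: {prompt}"

def generate_content(prompt: str) -> str:
    """Return the generated text as a bare string.

    Returning a tuple such as (content, "") makes Gradio treat the
    result as multiple outputs, so API clients receive a wrapped
    ('content', '') value instead of the content itself.
    """
    return run_model(prompt)  # single str -> single output, no wrapper
```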
ULTRA SIMPLE FIX: Remove ALL JSON components, use only Textbox inputs/outputs, no state anywhere 02333f2 david167 committed on Aug 13, 2025
BULLETPROOF API: Remove ALL State components, use JSON inputs instead, proper input/output matching, ZERO GRADIO ERRORS caf4bcb david167 committed on Aug 13, 2025
SIMPLE WORKING API: Fix Gradio interface issues, use simple Interface instead of Blocks, proper API structure 657d622 david167 committed on Aug 13, 2025
Fix Gradio interface: Remove chatbot format issues, add proper API endpoint structure b2df124 david167 committed on Aug 13, 2025
ELEGANT API REWRITE: Clean architecture, smart token allocation, proper JSON extraction - eliminate placeholder generation 0cdc4eb david167 committed on Aug 13, 2025
MAXIMUM TOKEN SETTINGS: Use 131k context, 16k max_new_tokens, 2k min_tokens for CoT - eliminate all truncation 14f445d david167 committed on Aug 13, 2025
Aggressive fix for CoT truncation: increase min_new_tokens to 1500, suppress EOS token for CoT requests, cap max_new_tokens b394386 david167 committed on Aug 13, 2025
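The anti-truncation generation settings from 14f445d and b394386 can be sketched as a kwargs builder for `model.generate`. Both `min_new_tokens` and `suppress_tokens` are real Hugging Face `GenerationConfig` parameters; the sampling settings and the hardcoded EOS id (128009, assumed to be Llama 3.1's `<|eot_id|>`, which would normally come from the tokenizer) are illustrative assumptions:

```python
# Assumed Llama 3.1 <|eot_id|> token id; in real code, read it
# from the tokenizer instead of hardcoding.
EOS_TOKEN_ID = 128009

def build_generation_kwargs(is_cot: bool) -> dict:
    """Generation settings mirroring the commits above: a generous
    token cap always, plus a forced minimum length and a suppressed
    EOS token for chain-of-thought requests so the model cannot
    stop short mid-reasoning."""
    kwargs = {
        "max_new_tokens": 16384,  # "16k max_new_tokens"
        "do_sample": True,        # sampling settings are assumptions
        "temperature": 0.7,
    }
    if is_cot:
        kwargs["min_new_tokens"] = 1500             # don't stop short on CoT
        kwargs["suppress_tokens"] = [EOS_TOKEN_ID]  # keep EOS off the table
    return kwargs
```

Suppressing EOS entirely is a blunt instrument: generation then runs to `max_new_tokens` every time, which trades latency for a guarantee against early stopping.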
Fix CoT truncation: increase min_new_tokens to 1000, add generation logging, improve truncated JSON handling 2e7d584 david167 committed on Aug 13, 2025
Improve Chain of Thinking support: increase min_new_tokens to 500 for CoT requests, improve JSON bracket tracking for nested objects 04a4f80 david167 committed on Aug 13, 2025
Fix response truncation - improve extraction logic to find actual content start 678e0f9 david167 committed on Aug 12, 2025
Simplify API - remove all templates, just prompt-in response-out 6bf8feb david167 committed on Aug 12, 2025
Add 'list' template for better summarization with specific content extraction 19607d6 david167 committed on Aug 12, 2025
Fix JSON templates - use instructional format instead of literal examples 07655f2 david167 committed on Aug 7, 2025
Fix response extraction - prevent truncation at beginning of JSON responses 7f68863 david167 committed on Aug 7, 2025
Fix JSON array generation - add explicit array requirements and improve JSON parsing 8860e75 david167 committed on Aug 7, 2025
Increase max_new_tokens to 8192 for unlimited length responses 1ba70a2 david167 committed on Aug 7, 2025
Improve model generation parameters and add logging - fix response truncation issues 342694d david167 committed on Aug 7, 2025
Fix JSON response templates for better prompt generation - simplified templates for more reliable JSON parsing 4ad994e david167 committed on Aug 7, 2025
Fix font visibility: Add dark text colors for better contrast 02ad4bf david167 committed on Aug 7, 2025
Fix UI: Reduce dialog size and prevent input focus layout shifts 6849ba1 david167 committed on Aug 7, 2025
Fix requirements.txt - Add missing transformers and ML dependencies 0f21de6 david167 committed on Aug 7, 2025
Major update: Add NFL training data generation and improve model handling 992eedb david167 committed on Aug 7, 2025
Update UI for full-width display on big screens with responsive design a7cf970 david167 committed on Aug 6, 2025
DEBUG: Show complete raw model output and prompt to identify clipping source 0c60639 david167 committed on Aug 6, 2025
COMPLETE REWRITE: Clean ChatGPT-style interface with proper response handling fcef7cd david167 committed on Aug 6, 2025
TEMPORARY: Show full model response for debugging clipping issue 8106bb9 david167 committed on Aug 6, 2025
Fix response truncation: disable early stopping, increase token limits to 4096, add debugging logs 4185c2a david167 committed on Aug 6, 2025
Fix response clipping: use robust assistant header detection instead of prompt length 0d85e38 david167 committed on Aug 6, 2025
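The header-detection fix in 0d85e38 can be sketched like this. Slicing the decoded output by `len(prompt)` is fragile because the tokenizer round-trip can alter the text slightly, shifting character offsets and clipping the reply; searching for the Llama 3.1 assistant header (`<|start_header_id|>assistant<|end_header_id|>`, a real special-token sequence in that chat template) is robust to such drift. The helper name is hypothetical:

```python
ASSISTANT_HEADER = "<|start_header_id|>assistant<|end_header_id|>"

def extract_response(decoded: str) -> str:
    """Slice the assistant's reply out of the fully decoded output.

    rfind locates the *last* assistant header, so multi-turn prompts
    that already contain earlier assistant messages are handled too.
    """
    idx = decoded.rfind(ASSISTANT_HEADER)
    if idx == -1:
        return decoded.strip()  # fall back to the whole output
    reply = decoded[idx + len(ASSISTANT_HEADER):]
    return reply.replace("<|eot_id|>", "").strip()
```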
Replace with simplified raw chat interface for prompt testing 625d819 david167 committed on Aug 6, 2025
Force all CUDA operations to cuda:0 and use device_map to prevent multi-GPU distribution 01a04bc david167 committed on Aug 6, 2025
Fix multi-GPU device placement error: disable device_map auto and ensure tensors on same device 0331461 david167 committed on Aug 6, 2025
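The device-placement fixes in 01a04bc and 0331461 boil down to the loading kwargs below. `device_map={"": 0}` is the standard Hugging Face/Accelerate idiom for pinning every module to a single device (cuda:0), whereas `device_map="auto"` may shard the model across GPUs and then trip the "Expected all tensors to be on the same device" error when inputs land elsewhere. The dtype choice is an assumption:

```python
def build_load_kwargs() -> dict:
    """kwargs for AutoModelForCausalLM.from_pretrained.

    device_map={"": 0} maps the empty module prefix (i.e. the whole
    model) to device 0, so nothing gets distributed across GPUs.
    """
    return {
        "device_map": {"": 0},      # pin every layer to cuda:0
        "torch_dtype": "bfloat16",  # assumed dtype; string form is accepted
    }
```

Input tensors must then also be moved with `.to("cuda:0")` before calling `generate`, so prompt and weights share a device.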
Switch to Llama-3.1-8B-Instruct: update model loading, prompts, and generation parameters 8b5e9db david167 committed on Aug 6, 2025
Switch back to Llama-3.1-8B-Instruct model: update prompts, generation params, and UI descriptions e6b5afc david167 committed on Aug 6, 2025