Upgrade to Llama 3.1 8B-Instruct for better long-form content 6ea58d5 david167 committed on Aug 20, 2025
Speed optimizations: Switch to Mistral-7B + optimize generation params fac0be2 david167 committed on Aug 19, 2025
Major update: Add NFL training data generation and improve model handling 992eedb david167 committed on Aug 7, 2025
Force all CUDA operations to cuda:0 and use device_map to prevent multi-GPU distribution 01a04bc david167 committed on Aug 6, 2025
Fix multi-GPU device placement error: disable device_map auto and ensure tensors on same device 0331461 david167 committed on Aug 6, 2025
Switch to Llama-3.1-8B-Instruct: update model loading, prompts, and generation parameters 8b5e9db david167 committed on Aug 6, 2025
Switch to FLAN-T5-Large: uses standard HF storage, excellent for question generation 203ee8d david167 committed on Aug 6, 2025
Fix XetHub issue: use Meta-Llama-3.1-8B-Instruct (official HF storage) 444b4d9 david167 committed on Aug 6, 2025
Fix download errors and warnings: retry logic + clean startup logs de72460 david167 committed on Aug 6, 2025
Fix version conflict: use PyTorch 2.5.0 + TorchVision 0.20.0 exact match 67f9bcb david167 committed on Aug 6, 2025
Use PyTorch 2.5.1 with safetensors to avoid CVE-2025-32434 - practical fix c6d6f6c david167 committed on Aug 6, 2025
Fix PyTorch CVE-2025-32434: upgrade to v2.6+, use safetensors, restore Llama 3.1 adea437 david167 committed on Aug 6, 2025
Deploy temporary working model: DialoGPT while awaiting Llama 3.1 access 23353f8 david167 committed on Aug 6, 2025
Fix permissions error: proper cache directory and HF token auth for Llama 3.1 bd1b0d6 david167 committed on Aug 6, 2025
Major optimization: Replace llama-cpp-python with transformers to eliminate compilation 4656a02 david167 committed on Aug 6, 2025
Initial setup: Question Generation API with DeepHermes reasoning model 0bf99b7 david167 committed on Aug 6, 2025