repurpose: HelpingAI/HELVETE-3B (nsfw-3b-q4_k_m.gguf) 6a6a4e3 verified chmielvu committed 17 days ago
repurpose: mradermacher/Gemma-3-Prompt-Coder-270m-it-Uncensored-GGUF (Gemma-3-Prompt-Coder-270m-it-Uncensored.Q4_K_S.gguf) 0e20523 verified chmielvu committed 17 days ago
repurpose: HelpingAI/HELVETE-3B (nsfw-3b-q4_k_m.gguf) 3adc25f verified chmielvu committed 17 days ago
fix: bypass from_pretrained path bug with hf_hub_download c642bd7 verified chmielvu committed 23 days ago
fix: separate draft_model loading to avoid path construction bug d68f000 verified chmielvu committed 23 days ago
fix: correct model filenames for SmolLM3 Q4_K_S and SmolLM2 draft model b61af84 verified chmielvu committed 23 days ago
feat: add production refinements (Phase 1-3) 4454066 verified chmielvu Claude Sonnet 4.5 committed 23 days ago
Fix: Gradio 6.0 compatible ChatInterface with streaming 785124e verified chmielvu committed 26 days ago
Fix: Gradio 6.0 compatibility - remove deprecated theme param and show_copy_button 2aa6a26 verified chmielvu committed 26 days ago
Major fix: Switch to transformers (no llama-cpp build). Use Qwen2.5-3B for fast CPU inference 371ac0a verified chmielvu committed 26 days ago
Fix: Switch to standard llama-cpp-python package (remove editable install) 5371df2 verified chmielvu committed 26 days ago
Fix: Use editable git install for llama-cpp-python, remove version pins c840812 verified chmielvu committed 26 days ago
Fix: Use prebuilt llama-cpp-python wheels, pin dependencies, extend timeout da3ea1b verified chmielvu committed 27 days ago