feat: model answers from own knowledge first, web results expand after (parallel execution) a79cc83 verified thagnitti committed 4 days ago
perf+fix: drop runPythonScore call (saves 5-12s), keep source-weight+relevance scoring, penalise news language for general answers 4a17383 verified thagnitti committed 4 days ago
feat: Jaccard semantic deduplication — filters near-duplicate facts across Wikipedia/DuckDuckGo/Wikidata 53ae79c verified thagnitti committed 4 days ago
fix: strong sentence quality filter (citation artifacts, fragments, meta-language) + combined perplexity+relevance ranking 050bf3a verified thagnitti committed 4 days ago
fix: modelDriveAnswer uses distilgpt2 only for scoring, returns real web sentences — no hallucination e2c140a verified thagnitti committed 4 days ago
fix: QA prompt format (Context/Q/A) so distilgpt2 answers instead of hallucinating 73e0ac3 verified thagnitti committed 4 days ago
fix: modelDriveAnswer no longer times out — nSamples 1, maxTokens 50/40, pool 20, coherence 0.05 e04deb7 verified thagnitti committed 4 days ago
feat: migrate HST Crystalline v8 search engine to io (200+ sources, distilgpt2 refinement) f22896b verified thagnitti committed 4 days ago
feat: replace 6.2GB model with distilgpt2 (82M, INT8, instant CPU inference) d3ae500 verified thagnitti committed 4 days ago
feat: add HST knowledge distillation script (never overwrites original 6.2GB checkpoint) 70942cb verified thagnitti committed 4 days ago
perf: INT8 dynamic quantization (5.8GB→~1.5GB, ~2-3x faster CPU inference) + diag logging 48adf8f verified thagnitti committed 4 days ago
diag: log both decode methods side-by-side to find root cause of garbled output ea9e903 verified thagnitti committed 5 days ago
fix: pass attention_mask to model.generate() — matches Colab **inp exactly f691750 verified thagnitti committed 5 days ago
fix: decode full sequence like Colab to eliminate BPE boundary garbage 60a42a6 verified thagnitti committed 5 days ago
fix: match Colab Cell 10 params + extend CPU timeouts to 8 min 80c4b37 verified thagnitti committed 5 days ago
cleanup: strip dead code (web-scraping, fine-tuning, scoring) 9b2f353 verified thagnitti committed 5 days ago
perf: greedy decoding (temp=0.1), 60 tokens — fit within CPU budget, no more timeouts 6b0e0f3 verified thagnitti committed 5 days ago
fix: place io_logo.png in attached_assets/ where @assets alias resolves 6661020 verified thagnitti committed 5 days ago
feat: AI-only mode — remove all web search, model answers every query directly c9ff16e verified thagnitti committed 5 days ago
perf: greedy decoding + inference_mode for coherent fast answers on CIF path 8ca39a4 verified thagnitti committed 5 days ago
fix: ThreadingHTTPServer so health checks never block during inference + all CPU threads for generation 94c87ce verified thagnitti committed 5 days ago
fix: load model in fp16 + free checkpoint dict immediately to stay under 16GB RAM limit c93591b verified thagnitti committed 5 days ago
fix: increase inference timeout 30s->180s, reduce nSamples 3->1 and maxTokens 400->100 for CPU feasibility 47ceb69 verified thagnitti committed 5 days ago
fix: auto-download correct 6.2GB checkpoint if cached file is stale/wrong size 6a4b040 verified thagnitti committed 5 days ago
fix: pass attention_mask explicitly to suppress generation warning 85c8c95 verified thagnitti committed 5 days ago
fix: complete model server HTTP handler, remove duplicate daemon, extend health timeout to 900s 93b88f0 verified thagnitti committed 5 days ago
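
Note on commit 53ae79c: the log says near-duplicate facts are filtered with Jaccard similarity but does not show how. The sketch below is one minimal way such a filter could work, assuming word-level token sets and a similarity cutoff. The function names (`tokenize`, `jaccard`, `dedupe`), the 0.7 threshold, and the choice of Python are illustrative assumptions, not taken from the repository. Jaccard similarity on tokens measures lexical overlap, so this catches rewordings that share most words rather than true semantic paraphrases.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity |a ∩ b| / |a ∪ b| of two token sets."""
    if not a and not b:
        return 1.0  # two empty sets are treated as identical
    return len(a & b) / len(a | b)


def dedupe(facts, threshold=0.7):
    """Keep each fact unless it is too similar to one already kept.

    The threshold is a hypothetical value chosen for illustration.
    Earlier items win, so order the input by source priority first.
    """
    kept = []
    kept_tokens = []
    for fact in facts:
        tokens = tokenize(fact)
        if any(jaccard(tokens, seen) >= threshold for seen in kept_tokens):
            continue  # near-duplicate of a fact we already have
        kept.append(fact)
        kept_tokens.append(tokens)
    return kept


facts = [
    "Paris is the capital of France.",
    "Paris is the capital city of France.",
    "Berlin is the capital of Germany.",
]
unique_facts = dedupe(facts)
```

Here the second sentence shares 6 of 7 distinct tokens with the first (similarity about 0.86) and is dropped, while the third shares only 4 of 8 (0.5) and is kept. The comparison is quadratic in the number of facts, which is acceptable for a pool of a few dozen candidate sentences.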