
Merge Request: SQLite Persistence, Latency Fallbacks, Thread-Safe KV Cache Lock, and Mobile UI Enhancements

📌 Overview

This Merge Request delivers critical enhancements to the Kisan-Sathi application to ensure thread safety, robust offline latency management, layout responsiveness, and clean system state resets.


🚀 Key Enhancements

1. Latency & Timeout Fallbacks for local GGUF (src/llm.py)

To prevent the application from hanging on "Thinking..." when running local inference on slower CPU architectures (e.g., without AVX-512):

  • Model Loading Limit: If the model loader (init_llama_cpp()) takes more than 35 seconds to load the GGUF file from disk, it automatically falls back to the mock backend.
  • Prefill Latency Limit: If the initial prompt prefill (first token generation) takes more than 30 seconds, llama_cpp is disabled for subsequent prompts, falling back to the mock backend.
  • Optimized Thresholds: Raised the prefill and model-loading timeouts from 10s and 15s to 30s and 35s respectively, giving slow CPUs enough headroom to finish processing prompts without timing out prematurely.
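
The two limits above could be sketched roughly as follows. This is a minimal illustration, not the code in src/llm.py: `load_with_timeout` and `first_token_within` are hypothetical helper names, and the loader is passed in as a plain callable so the sketch runs without llama.cpp installed.

```python
import concurrent.futures
import time

LOAD_TIMEOUT_S = 35.0     # model-loading limit from the description above
PREFILL_TIMEOUT_S = 30.0  # first-token limit from the description above


def load_with_timeout(loader, timeout_s=LOAD_TIMEOUT_S):
    """Run loader() in a worker thread and return (model, backend_name).

    Hypothetical helper: if loading exceeds timeout_s, give up waiting and
    report the mock backend instead.
    """
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(loader)
    try:
        return future.result(timeout=timeout_s), "llama_cpp"
    except concurrent.futures.TimeoutError:
        return None, "mock"
    finally:
        # Do not block on a loader that is still running in the background.
        pool.shutdown(wait=False)


def first_token_within(stream, timeout_s=PREFILL_TIMEOUT_S):
    """Pull the first token from an iterator and report whether it was on time.

    The check happens after the fact, so a False result is a signal to
    disable the backend for subsequent prompts, not to abort this one.
    """
    start = time.monotonic()
    first = next(stream, None)
    elapsed = time.monotonic() - start
    return first, elapsed <= timeout_s
```

A caller would switch to the mock backend when `load_with_timeout` returns `"mock"`, or when `first_token_within` reports `False` for a prompt.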

2. Thread-Safety Lock to Prevent KV Cache Corruption (src/llm.py)

FastAPI serves requests concurrently, which previously allowed simultaneous calls into the shared in-process llama.cpp model:

  • The Issue: Simultaneous inference requests corrupted the model's KV cache, resulting in repetitive token loops (आपका2आपका2) or garbled strings (नमस्ते4आपका).
  • The Fix: Implemented a global threading.Lock (_model_lock) inside generate_llama_cpp(). Access to the local GGUF model is serialized: subsequent calls wait until the streaming response of the active call completes.
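
A minimal sketch of the locking pattern described above, assuming the model exposes a streaming generator. Only the `_model_lock` name comes from this description; `generate_serialized` and the `stream_fn` parameter are hypothetical stand-ins for the real generate_llama_cpp() body.

```python
import threading

_model_lock = threading.Lock()  # single lock shared by every request


def generate_serialized(stream_fn, prompt):
    """Yield tokens from stream_fn(prompt) while holding the model lock.

    The lock is taken when the first token is requested and released only
    when the stream is exhausted or the generator is closed, so a second
    caller waits until the active streaming response has finished.
    """
    with _model_lock:
        for token in stream_fn(prompt):
            yield token
```

A plain `threading.Lock` (rather than `RLock`) suits this case because it may be released from a different thread than the one that acquired it, which can happen when a server iterates a synchronous generator on a thread pool.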

3. Structured Digital Ledger Parser Fallback (src/llm.py & src/ledger.py)

  • The Issue: When llama_cpp timed out during a ledger transaction parse query, it fell back to the mock chat backend. This returned a generic Hindi welcome message instead of valid JSON, crashing the ledger table.
  • The Fix: Configured generate_mock to intercept structured extraction requests (checking for "precise data extractor" or "json format" system prompts) and parse fields like item, quantity, price, and type using regex rules to return a valid JSON payload.
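
The interception could look roughly like the sketch below. The two marker strings come from the description above; the function names, the regex rules, the unit list, and the `income`/`expense` labels are illustrative assumptions, not the actual rules in generate_mock.

```python
import json
import re

# Marker phrases named in the description above.
EXTRACTION_MARKERS = ("precise data extractor", "json format")
# Hypothetical unit list, for illustration only.
UNITS = r"(?:kg|quintal|litre|bag)s?"


def is_extraction_request(system_prompt):
    """Return True when the system prompt asks for structured extraction."""
    lowered = system_prompt.lower()
    return any(marker in lowered for marker in EXTRACTION_MARKERS)


def mock_extract(user_text):
    """Parse ledger fields with regexes and always return valid JSON."""
    text = user_text.lower()
    qty = re.search(r"(\d+(?:\.\d+)?)\s*" + UNITS + r"\b", text)
    price = re.search(r"(?:\brs\.?|₹)\s*(\d+(?:\.\d+)?)", text)
    item = re.search(UNITS + r"\s+(?:of\s+)?([a-z]+)", text)
    if re.search(r"\b(?:sold|sell|received)\b", text):
        tx_type = "income"    # assumed label
    elif re.search(r"\b(?:bought|buy|paid)\b", text):
        tx_type = "expense"   # assumed label
    else:
        tx_type = None
    payload = {
        "item": item.group(1) if item else None,
        "quantity": float(qty.group(1)) if qty else None,
        "price": float(price.group(1)) if price else None,
        "type": tx_type,
    }
    return json.dumps(payload, ensure_ascii=False)
```

Fields that cannot be matched come back as `null`, so the ledger code always receives parseable JSON rather than a chat-style reply.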

4. "Start from Scratch" Reset Feature (src/db.py, app.py, assets/index.html)

Added a global reset capability to clear user registrations and start clean:

  • Database Backend: Implemented db.clear_all_data() to clear the profile, calendars, tasks, and ledger tables. It also deletes the local JSON/CSV backup files so that data is not auto-restored.
  • FastAPI Server: Exposed a new @app.post("/api/reset") endpoint.
  • Frontend UI: Added a localized 🔄 Reset / सब साफ करें button inside header controls, resetting state, clearing chat views, and displaying the onboarding panel.
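
The database side of the reset might be sketched as below. The table names are assumed from the list above, and the signature (an explicit connection plus a list of backup paths) is a hypothetical simplification of db.clear_all_data().

```python
import sqlite3
from pathlib import Path

# Table names assumed from the description above; the real schema may differ.
TABLES = ("profile", "calendars", "tasks", "ledger")


def clear_all_data(conn, backup_files=()):
    """Empty every application table, then remove leftover backup files."""
    with conn:  # one transaction: commit on success, roll back on error
        for table in TABLES:
            conn.execute(f"DELETE FROM {table}")
    for path in backup_files:
        # Remove old JSON/CSV backups so nothing is restored on next start.
        Path(path).unlink(missing_ok=True)
```

Wrapping the deletes in a single transaction means a failure partway through leaves all tables untouched rather than half-cleared.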

5. Mobile Responsive UI CSS (assets/index.html)

  • Added media queries (@media (max-width: 600px)) to adapt layouts for mobile views.
  • Stacks header controls vertically, adjusts chat message widths to 95%, downsizes font/card metrics, and scales buttons/inputs for touch layouts.

6. Proactive Nudge Markdown Parser (assets/index.html)

  • Fixed rendering of raw markdown asterisks (**) inside the seasonal nudge banner by routing the API response through the client-side parseMarkdown() function.

7. SQLite Database Migration & Cleanups

  • Migrated local storage from JSON/CSV files to a native SQLite database (kisan.db), ensuring reliable offline persistence.
  • Added support for individual transaction deletions and CSV exports.
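
As an illustration of the deletion and export support, here is a small sketch against a hypothetical `ledger` table. The column names mirror the fields mentioned in section 3, but the schema and both function names are assumptions rather than the contents of src/db.py.

```python
import csv
import io
import sqlite3


def delete_transaction(conn, tx_id):
    """Delete one ledger row by id; return True if a row was removed."""
    with conn:
        cur = conn.execute("DELETE FROM ledger WHERE id = ?", (tx_id,))
    return cur.rowcount == 1


def export_ledger_csv(conn):
    """Return the whole ledger as CSV text, header row first."""
    cur = conn.execute(
        "SELECT id, item, quantity, price, type FROM ledger ORDER BY id"
    )
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow([col[0] for col in cur.description])  # column names
    writer.writerows(cur)
    return buf.getvalue()
```

The parameterized `?` placeholder keeps the transaction id out of the SQL string itself.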

🛠️ Verification & Tests

All 6 core component unit tests pass successfully:

.venv\Scripts\python -m pytest
  • Output: 6 passed in 8.85s