VendorLens / PROMPTS_USED.md

Prompts Used: Development Log

Questions and prompts I used during development, mostly to sanity-check approaches or look up specific API behaviour. Architecture decisions and debugging are in AI_NOTES.md.


Architecture

Question:

fastapi background tasks vs celery for a 30 second async job - do i need celery or is backgroundtasks fine for a single instance?

Went with FastAPI's built-in BackgroundTasks. No Redis, no worker process, and it keeps Docker to two containers. The only limitation is that tasks don't survive a server restart, which is acceptable for this use case.


LLM Pipeline

Question:

if i ground a gemini call with scraped webpage text will it use the actual text or still make things up about pricing

Testing confirmed: grounding with real scraped content makes a clear difference on pricing accuracy. Two-pass approach (identify → scrape → compare) is the right design.

Follow-up:

gemini keeps wrapping json output in json markdown fences even when i tell it not to, how to extract reliably

Try a raw JSON parse first; if that fails, strip the markdown fences with a regex and parse again. Added to the _extract_json() utility in llm.py.
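A rough sketch of that extraction logic, using only the standard library. This is not the actual `_extract_json()` from llm.py; the function name and the exact regex are assumptions. The fence marker is built at runtime only so that this example does not contain a literal fence inside its own code block.

```python
import json
import re

# Three backticks, assembled at runtime (see note above).
FENCE = "`" * 3

# One fenced block, with or without a "json" language tag after the opening fence.
_FENCE_RE = re.compile(
    FENCE + r"(?:json)?\s*(.*?)\s*" + FENCE,
    re.DOTALL | re.IGNORECASE,
)


def extract_json(text: str):
    """Parse JSON from model output that may be wrapped in markdown fences."""
    stripped = text.strip()
    # 1. Raw parse first: nothing to strip when the model behaved.
    try:
        return json.loads(stripped)
    except json.JSONDecodeError:
        pass
    # 2. Otherwise take the body of the first fenced block and parse that.
    match = _FENCE_RE.search(stripped)
    if match is None:
        raise ValueError("no parseable JSON found in model output")
    return json.loads(match.group(1))
```

If the fenced body is itself invalid JSON, the final `json.loads` raises, so the caller can retry or mark the job as failed.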


Scraper

Question:

beautifulsoup - how to target specific sections of a page by class/id keyword instead of getting everything

Check class and id attributes of each element against a keyword list (pricing, plan, cost, feature, etc.), collect matching sections first, fall back to full page text if none found. Tested on Mailgun, Supabase, Pinecone; the targeted approach cuts a lot of noise.
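The matching rule can be sketched as below. To keep the example dependency-free it uses the standard library's `html.parser` instead of BeautifulSoup, so it shows the idea rather than the project's scraper. The class and function names are hypothetical, and it assumes reasonably well-formed HTML (every non-void tag is closed).

```python
from html.parser import HTMLParser

KEYWORDS = ("pricing", "plan", "cost", "feature")  # from the list above; extend as needed
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class SectionCollector(HTMLParser):
    """Collects text inside elements whose class or id contains a keyword."""

    def __init__(self):
        super().__init__()
        self.depth = 0        # nesting depth inside the current matching element
        self.matched = []     # text from matching sections
        self.everything = []  # all page text, used as the fallback

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return  # void tags never get an end tag, so they must not affect depth
        if self.depth:
            self.depth += 1
            return
        attr_map = dict(attrs)
        label = f"{attr_map.get('class') or ''} {attr_map.get('id') or ''}".lower()
        if any(word in label for word in KEYWORDS):
            self.depth = 1

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        self.everything.append(text)
        if self.depth:
            self.matched.append(text)


def extract_relevant_text(html: str) -> str:
    parser = SectionCollector()
    parser.feed(html)
    parser.close()
    # Prefer targeted sections; fall back to the full page text if none matched.
    return "\n".join(parser.matched or parser.everything)
```

With BeautifulSoup the same check would typically be a filter function passed to `find_all`, but the fallback logic stays the same.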


SQLAlchemy

Question:

sqlalchemy json column - i'm doing dict update in place then commit() but the change doesn't save, assignment works though

On a plain JSON column, SQLAlchemy's change tracking only detects assignment to the attribute, not mutations inside the stored object. In-place dict changes are invisible to it, so commit() issues no UPDATE. Fix: call flag_modified(instance, "field_name") before committing. This was the actual bug in the exclude/include feature: it looked like it worked in memory, then was gone on the next request.
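A small reproduction of that behaviour, assuming SQLAlchemy 1.4 or newer and an in-memory SQLite database. The `Report` model and its `data` column are hypothetical and only stand in for whatever the project actually stores.

```python
from sqlalchemy import JSON, Column, Integer, create_engine
from sqlalchemy.orm import Session, declarative_base
from sqlalchemy.orm.attributes import flag_modified

Base = declarative_base()


class Report(Base):  # hypothetical model, not the project's schema
    __tablename__ = "reports"
    id = Column(Integer, primary_key=True)
    data = Column(JSON, nullable=False)


engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)


def excluded_in_db() -> list:
    """Read the stored value back through a fresh session."""
    with Session(engine) as fresh:
        return fresh.get(Report, 1).data["excluded"]


with Session(engine) as session:
    session.add(Report(id=1, data={"excluded": []}))
    session.commit()

# In-place mutation only: the session sees no change, so nothing is written.
with Session(engine) as session:
    report = session.get(Report, 1)
    report.data["excluded"].append("vendor-a")
    session.commit()
after_plain_mutation = excluded_in_db()

# Same mutation, but the attribute is explicitly marked as modified.
with Session(engine) as session:
    report = session.get(Report, 1)
    report.data["excluded"].append("vendor-a")
    flag_modified(report, "data")
    session.commit()
after_flag_modified = excluded_in_db()

print(after_plain_mutation)  # []
print(after_flag_modified)   # ['vendor-a']
```

Reassigning a new dict to `report.data` would also persist, which matches the "assignment works though" observation in the question.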


Docker

Question:

docker-compose depends_on - how to make frontend actually wait for backend to be ready not just started

depends_on: condition: service_healthy + a healthcheck on the backend container. Plain depends_on only waits for the container process, not the app. Without this, nginx starts and immediately tries to proxy requests to a FastAPI that's still booting.


Frontend Polling

Question:

react setinterval polling - start on mount, stop when status is done/error, cleanup on unmount

useEffect + setInterval with cleanup. Store interval ID in a ref, clear in the effect cleanup and whenever status hits a terminal state. No library needed for something this simple.


PDF Export

Question:

jspdf is huge, can i load it only when user clicks export instead of on page load

Dynamic import() inside the async click handler: const { default: jsPDF } = await import('jspdf'). Downloads on demand, not upfront. Combined with Vite manual chunk splitting (react, router, icons into separate chunks), main bundle went from ~220KB to ~50KB.


Gemini API Key Verification

Before wiring up the full pipeline, verified the key works with a raw HTTP call:

import json
import urllib.request

# Replace YOUR_KEY with a real API key before running.
url = "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent?key=YOUR_KEY"
# Minimal request body: a single text prompt.
body = json.dumps({"contents": [{"parts": [{"text": "Reply with exactly: GEMINI_OK"}]}]}).encode()
req = urllib.request.Request(url, data=body, headers={"Content-Type": "application/json"})
# Prints the raw JSON response bytes; the reply text is nested inside.
print(urllib.request.urlopen(req).read())

Got back GEMINI_OK. Then moved on to wiring up the SDK properly.
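The raw call returns a JSON document rather than bare text, so a small helper to pull out the reply is handy when checking by hand. A sketch, assuming the usual generateContent response shape (candidates → content → parts → text). The `first_text` helper is hypothetical and the sample payload is hand-written for illustration, not a captured response.

```python
import json


def first_text(raw: bytes) -> str:
    """Return the text of the first part of the first candidate."""
    payload = json.loads(raw)
    return payload["candidates"][0]["content"]["parts"][0]["text"]


# Hand-written sample in the expected response shape (not real API output).
sample = json.dumps(
    {"candidates": [{"content": {"parts": [{"text": "GEMINI_OK"}], "role": "model"}}]}
).encode()

print(first_text(sample))  # GEMINI_OK
```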


Notes

  • All of the above were clarification questions or API lookups, not code generation for core logic
  • The two-phase pipeline design, scraper targeting approach, and SQLAlchemy flag_modified fix all came from testing and debugging, not from any of these prompts
  • Gemini JSON wrapping issue was discovered from actual API responses during testing, not anticipated upfront