arjun10g/RAG-PSYCH
arjun10g committed 26 days ago

Commit d98125d
Parent: 863c0ab

Tighten per-IP rate limits to keep daily spend per visitor under $1

Replace the loose '30/minute' cap with a layered '5/minute;20/hour;30/day'
on both /query and /ui/query. Same key (remote IP) on all three windows;
slowapi enforces them independently and the most restrictive wins.
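
To illustrate how the three windows interact, here is a minimal sketch of a layered limiter using simple fixed windows. It is a simplified model for reasoning about the limit string, not slowapi's actual implementation; `LayeredLimiter`, `parse_limits`, and `WINDOW_SECONDS` are hypothetical names introduced only for this example.

```python
from collections import defaultdict

# Window lengths in seconds for the units used in the commit's limit string.
WINDOW_SECONDS = {"minute": 60, "hour": 3600, "day": 86400}


def parse_limits(spec: str) -> list[tuple[int, int]]:
    """Turn '5/minute;20/hour;30/day' into [(5, 60), (20, 3600), (30, 86400)]."""
    limits = []
    for part in spec.split(";"):
        count, unit = part.strip().split("/")
        limits.append((int(count), WINDOW_SECONDS[unit]))
    return limits


class LayeredLimiter:
    """Hypothetical fixed-window limiter: a request passes only if every window has room."""

    def __init__(self, spec: str) -> None:
        self.limits = parse_limits(spec)
        # (key, window_seconds, window_index) -> hits counted in that window
        self.hits: dict[tuple[str, int, int], int] = defaultdict(int)

    def allow(self, key: str, now: float) -> bool:
        buckets = [(key, window, int(now // window)) for _, window in self.limits]
        # Reject if any single window is already full (the most restrictive wins).
        for (count, _), bucket in zip(self.limits, buckets):
            if self.hits[bucket] >= count:
                return False
        # Otherwise count the request against every window.
        for bucket in buckets:
            self.hits[bucket] += 1
        return True


if __name__ == "__main__":
    limiter = LayeredLimiter("5/minute;20/hour;30/day")
    # One client (same key, standing in for the remote IP) sending a request every 12 seconds for a day.
    allowed = sum(limiter.allow("203.0.113.7", float(t)) for t in range(0, 86400, 12))
    print(allowed)
```

In this model the per-minute window throttles bursts, the hourly window stops a client after 20 requests in the first hour, and the daily window ends the run at 30 in total, however the requests are spaced.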

Worst-case spend per IP per day: 30 queries x ~$0.011 (max_tokens=2048
on claude-haiku-4-5) = ~$0.33. Comfortably under the $1 ceiling and
still leaves room for a legitimate evaluator to try a handful of queries
in a single sitting.
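
The arithmetic above can be checked with a short script. The per-query cost is the approximate figure quoted in this message; the comparison against the old cap assumes a client sustaining 30 requests every minute for a full day with no other limit in place, which is an illustrative upper bound rather than an observed number. `worst_case_daily_spend` is a hypothetical helper.

```python
COST_PER_QUERY_USD = 0.011   # approximate worst-case cost per query, from the commit message
DAILY_CEILING_USD = 1.00     # target ceiling per visitor per day


def worst_case_daily_spend(queries_per_day: int) -> float:
    """Upper bound on daily spend for one IP, given its daily query cap."""
    return queries_per_day * COST_PER_QUERY_USD


new_cap = worst_case_daily_spend(30)            # '30/day' window
old_cap = worst_case_daily_spend(30 * 60 * 24)  # '30/minute' sustained all day (assumption)

print(f"new worst case: ${new_cap:.2f}")
print(f"headroom under ceiling: ${DAILY_CEILING_USD - new_cap:.2f}")
print(f"old theoretical worst case: ${old_cap:.2f}")
```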

Files changed (1)

api/main.py +2 -2

@@ -159,7 +159,7 @@ def health() -> JSONResponse:
 @app.post("/query", response_model=QueryResponse)
-@limiter.limit("30/minute")
+@limiter.limit("5/minute;20/hour;30/day")
 def query(request: Request, body: QueryRequest) -> QueryResponse:
     """Run the RAG pipeline end-to-end. See module docstring for guarantees."""
     qhash = hash_query(body.query)
@@ -302,7 +302,7 @@ def _corpus_stats() -> dict[str, Any]:
 @app.post("/ui/query", response_class=HTMLResponse)
-@limiter.limit("30/minute")
+@limiter.limit("5/minute;20/hour;30/day")
 def ui_query(
     request: Request,
     query: str = Form(..., min_length=1, max_length=2000),