# Lars Urlaubs-Deals: Technical Documentation

## Overview
This project is an AI-powered vacation deal finder specializing in dog-friendly accommodations across multiple platforms (Airbnb, Booking.com).

It is designed for:
- resilient scraping under rate limiting,
- unified scoring and filtering,
- transparent run diagnostics via observability and health data,
- practical dashboard usability for day-to-day deal checks.

## Recent Improvements (2026-02)
The latest enhancement cycle added:
- **Run-level observability** with run IDs, KPI counters, and structured events.
- **Central deal validation** before ranking.
- **Robust price alerts** (dedupe, cooldown, per-deal threshold overrides).
- **CLI scheduler mode** for periodic background searches.
- **Dashboard UX upgrades** (client-side filters, better empty/error states, price trend chips).
- **Currency normalization to EUR** for fair cross-market ranking.

## Core Features

### 1. Smart Scrapers (Multi-Strategy)
Both Airbnb and Booking.com scrapers follow a tiered strategy:
- **Strategy 1: Local Curl/HTTP** (fastest, cheapest)
- **Strategy 2: Firecrawl Cloud** (reliable rendered fallback)
- **Strategy 3: Static fallback data** (keeps UI functional if everything else fails)

Strategy attempts are instrumented with source/strategy duration and success metrics.
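
The tiered fallback with per-strategy metrics might look roughly like the sketch below. It is not the project's actual scraper code: `run_tiered`, the strategy names, and the two stand-in fetch functions are hypothetical, and a real strategy would perform network requests.

```python
import time
from typing import Callable, Dict, List, Tuple

Deal = Dict[str, object]
Strategy = Tuple[str, Callable[[], List[Deal]]]


def run_tiered(strategies: List[Strategy]) -> Tuple[List[Deal], List[dict]]:
    """Try each strategy in order and stop at the first non-empty result.

    Returns the deals plus one metrics record per attempted strategy.
    """
    metrics = []
    for name, fetch in strategies:
        started = time.perf_counter()
        try:
            deals = fetch()
            ok = bool(deals)
        except Exception:
            # Any failure (timeout, throttling, parse error) moves on to the next tier.
            deals, ok = [], False
        metrics.append({
            "strategy": name,
            "success": ok,
            "duration_s": time.perf_counter() - started,
        })
        if ok:
            return deals, metrics
    return [], metrics


def failing_http() -> List[Deal]:
    # Stand-in for the local HTTP strategy being throttled.
    raise ConnectionError("simulated throttling")


def static_fallback() -> List[Deal]:
    # Stand-in for the static fallback data.
    return [{"name": "Sample cabin", "source": "static"}]


deals, metrics = run_tiered([
    ("local_http", failing_http),
    ("static_fallback", static_fallback),
])
```

Here the first strategy fails, so `deals` comes from the static fallback and `metrics` holds one record for each of the two attempts.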

### 2. Rate Limit Bypass
- **User-Agent rotation**
- **Adaptive delays** that increase under pressure
- **Exponential backoff** for repeated throttling
- **Optional session warming** for more realistic request patterns
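
As an illustration of the backoff and User-Agent rotation ideas, here is a minimal sketch. The function names, the base and cap values, the jitter range, and the User-Agent strings are all assumptions made for the example, not values taken from the project.

```python
import random
from typing import Optional

# Placeholder User-Agent strings, for illustration only.
USER_AGENTS = [
    "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 13_0) ExampleBrowser/2.0",
]


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0,
                  rng: Optional[random.Random] = None) -> float:
    """Seconds to wait before retry number `attempt` (0-based).

    The ceiling doubles with every attempt (base * 2**attempt) up to `cap`.
    Jitter picks a value in the upper half of that range so that parallel
    clients do not retry in lockstep.
    """
    rng = rng or random.Random()
    ceiling = min(cap, base * (2 ** attempt))
    return rng.uniform(ceiling / 2, ceiling)


def pick_user_agent(rng: Optional[random.Random] = None) -> str:
    """Choose a User-Agent at random for the next request."""
    rng = rng or random.Random()
    return rng.choice(USER_AGENTS)
```

A caller would sleep for `backoff_delay(attempt)` after each throttled response and reset `attempt` to zero after a success.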

### 3. Central Validation Pipeline
All raw deals are validated before ranking:
- required fields (name, location, source, url)
- numeric sanity checks (price/rating/reviews)
- budget boundaries
- pet-friendly enforcement when pets are requested

Validation output is returned in API/agent results (`valid_count`, `rejected_count`, reasons).
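
The checks listed above could be sketched as follows. Only `valid_count`, `rejected_count`, and the required field names come from this document. The function names, the reason strings, the `pet_friendly` key, the shape of `reasons`, and the rating upper bound of 10 are assumptions for illustration.

```python
from collections import Counter
from typing import List, Optional

REQUIRED_FIELDS = ("name", "location", "source", "url")
MAX_RATING = 10.0  # assumption: wide enough for both 5-point and 10-point scales


def rejection_reasons(deal: dict, max_price: Optional[float] = None,
                      pets_requested: bool = False) -> List[str]:
    """Return every reason this deal fails validation (empty list if valid)."""
    reasons = [f"missing_{field}" for field in REQUIRED_FIELDS if not deal.get(field)]

    price = deal.get("price")
    if not isinstance(price, (int, float)) or price <= 0:
        reasons.append("invalid_price")
    elif max_price is not None and price > max_price:
        reasons.append("over_budget")

    rating = deal.get("rating")
    if rating is not None and not (
        isinstance(rating, (int, float)) and 0 <= rating <= MAX_RATING
    ):
        reasons.append("invalid_rating")

    reviews = deal.get("reviews")
    if reviews is not None and not (isinstance(reviews, int) and reviews >= 0):
        reasons.append("invalid_reviews")

    if pets_requested and not deal.get("pet_friendly"):
        reasons.append("not_pet_friendly")
    return reasons


def validate_deals(deals: List[dict], max_price: Optional[float] = None,
                   pets_requested: bool = False) -> dict:
    """Split deals into valid and rejected, counting rejection reasons."""
    valid, reason_counts = [], Counter()
    for deal in deals:
        reasons = rejection_reasons(deal, max_price, pets_requested)
        if reasons:
            reason_counts.update(reasons)
        else:
            valid.append(deal)
    return {
        "valid": valid,
        "valid_count": len(valid),
        "rejected_count": len(deals) - len(valid),
        "reasons": dict(reason_counts),
    }
```

Collecting all reasons per deal, rather than stopping at the first failure, keeps the rejection counters useful for diagnostics.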

### 4. Observability & KPI Tracking
A lightweight observability layer tracks each search run:
- unique run ID
- lifecycle events (`run_started`, source cache hits/misses, errors, `run_finished`)
- per-run counters (cache hits, misses, valid deals, triggered alerts, etc.)
- run summaries retained for health diagnostics

`/health` includes an observability snapshot with active/recent runs.
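
A minimal in-memory tracker along these lines is sketched below. The `RunTracker` class, its method names, and the snapshot keys are hypothetical. Only the event names `run_started` and `run_finished` are taken from the list above.

```python
import time
import uuid
from collections import Counter, deque


class RunTracker:
    """Tracks active runs and keeps a bounded list of recent run summaries."""

    def __init__(self, max_summaries: int = 20):
        self.active = {}
        self.recent = deque(maxlen=max_summaries)  # oldest summaries drop off

    def start_run(self) -> str:
        run_id = uuid.uuid4().hex
        self.active[run_id] = {
            "started_at": time.time(),
            "events": [],
            "counters": Counter(),
        }
        self.event(run_id, "run_started")
        return run_id

    def event(self, run_id: str, name: str, **fields) -> None:
        """Append a structured event to the run."""
        self.active[run_id]["events"].append({"event": name, "ts": time.time(), **fields})

    def incr(self, run_id: str, counter: str, amount: int = 1) -> None:
        """Increase a per-run KPI counter."""
        self.active[run_id]["counters"][counter] += amount

    def finish_run(self, run_id: str) -> dict:
        self.event(run_id, "run_finished")
        run = self.active.pop(run_id)
        summary = {
            "run_id": run_id,
            "duration_s": time.time() - run["started_at"],
            "counters": dict(run["counters"]),
            "event_count": len(run["events"]),
        }
        self.recent.append(summary)
        return summary

    def snapshot(self) -> dict:
        """Data a health endpoint could expose."""
        return {"active_runs": list(self.active), "recent_runs": list(self.recent)}
```

Usage would be `run_id = tracker.start_run()`, followed by `tracker.incr(run_id, "cache_hits")` as the search proceeds, and `tracker.finish_run(run_id)` at the end.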

### 5. Price Alert System
Price alerts are persisted and now include robustness controls:
- configurable drop threshold (global + per-deal override)
- dedupe window for repeated identical updates
- cooldown window to suppress repeated alerts for the same deal at the same price
- capped history size per property
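
One way the four controls could interact is sketched below. The `PriceAlerts` class, its defaults, and the exact semantics (for example, measuring a drop against the previous recorded price) are assumptions for illustration. Persistence to disk is left out.

```python
from typing import Dict, List, Tuple


class PriceAlerts:
    """In-memory sketch of threshold, dedupe, cooldown, and capped history."""

    def __init__(self, drop_threshold: float = 0.10, dedupe_s: float = 300,
                 cooldown_s: float = 3600, max_history: int = 50):
        self.drop_threshold = drop_threshold
        self.dedupe_s = dedupe_s
        self.cooldown_s = cooldown_s
        self.max_history = max_history  # assumed to be at least 1
        self.overrides: Dict[str, float] = {}              # per-deal thresholds
        self.history: Dict[str, List[Tuple[float, float]]] = {}
        self.last_alert: Dict[str, Tuple[float, float]] = {}

    def record(self, deal_id: str, price: float, now: float) -> bool:
        """Store a price observation; return True if an alert should fire."""
        history = self.history.setdefault(deal_id, [])
        if history:
            prev_price, prev_ts = history[-1]
            # Dedupe: drop an identical update that arrives within the window.
            if price == prev_price and now - prev_ts < self.dedupe_s:
                return False
        history.append((price, now))
        del history[:-self.max_history]  # keep only the newest entries
        if len(history) < 2:
            return False

        prev_price = history[-2][0]
        threshold = self.overrides.get(deal_id, self.drop_threshold)
        if prev_price <= 0 or (prev_price - price) / prev_price < threshold:
            return False

        # Cooldown: do not alert again for the same price too soon.
        last = self.last_alert.get(deal_id)
        if last and last[0] == price and now - last[1] < self.cooldown_s:
            return False
        self.last_alert[deal_id] = (price, now)
        return True
```

Passing `now` explicitly keeps the logic deterministic in tests. Production code would normally supply `time.time()`.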

### 6. Intelligent Caching
- Local JSON cache (`.search_cache.json`) with TTL
- repeated searches with the same parameters are served from the cache until the TTL expires
- cache metrics included in observability
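
A TTL-based JSON file cache can be sketched as below. Only the file name `.search_cache.json` comes from this document. The `SearchCache` class, the hashed key scheme, the one-hour default TTL, and the entry layout are assumptions.

```python
import hashlib
import json
import time
from pathlib import Path
from typing import Optional


class SearchCache:
    """JSON file cache keyed by a hash of the search parameters."""

    def __init__(self, path: str = ".search_cache.json", ttl_s: float = 3600):
        self.path = Path(path)
        self.ttl_s = ttl_s

    @staticmethod
    def key(params: dict) -> str:
        # Sorting keys makes the cache key independent of dict ordering.
        canonical = json.dumps(params, sort_keys=True)
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def _load(self) -> dict:
        try:
            return json.loads(self.path.read_text(encoding="utf-8"))
        except (FileNotFoundError, json.JSONDecodeError):
            return {}  # a missing or corrupt file is treated as an empty cache

    def get(self, params: dict, now: Optional[float] = None) -> Optional[list]:
        """Return cached deals, or None on a miss or an expired entry."""
        now = time.time() if now is None else now
        entry = self._load().get(self.key(params))
        if entry is None or now - entry["stored_at"] > self.ttl_s:
            return None
        return entry["deals"]

    def set(self, params: dict, deals: list, now: Optional[float] = None) -> None:
        """Store deals under the key derived from the search parameters."""
        now = time.time() if now is None else now
        data = self._load()
        data[self.key(params)] = {"stored_at": now, "deals": deals}
        self.path.write_text(json.dumps(data), encoding="utf-8")
```

A `None` result from `get` is the natural place to increment a cache-miss counter, and any other result a cache-hit counter.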

### 7. Deal Ranking and Currency Normalization
Deals are scored on price, rating, and review count, then adjusted by pet-friendliness and weather multipliers.

All ranking prices are normalized to **EUR** using built-in FX rates (or optional custom per-deal `fx_rate_to_eur`), while preserving original currency/price in output metadata.
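
The normalization step might be sketched like this. The `fx_rate_to_eur` override is named in this document. The rate table holds made-up placeholder values rather than real market rates, and the output keys `price_eur`, `original_price`, and `original_currency` are hypothetical.

```python
# Placeholder rates for illustration only; not real market data.
FX_TO_EUR = {"EUR": 1.0, "USD": 0.9, "GBP": 1.2}


def normalize_to_eur(deal: dict) -> dict:
    """Return a copy of the deal with an EUR price added for ranking.

    A per-deal `fx_rate_to_eur` takes precedence over the built-in table.
    The original price and currency are preserved alongside the EUR value.
    """
    currency = str(deal.get("currency", "EUR")).upper()
    rate = deal.get("fx_rate_to_eur") or FX_TO_EUR.get(currency)
    if rate is None:
        raise ValueError(f"no FX rate available for {currency}")

    normalized = dict(deal)
    normalized["original_price"] = deal["price"]
    normalized["original_currency"] = currency
    normalized["price_eur"] = round(deal["price"] * rate, 2)
    return normalized
```

Ranking would then compare `price_eur` across sources, while the dashboard can still show the price in the currency the platform reported.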

### 8. Dashboard UX
The web dashboard includes:
- source tabs and sorting modes,
- client-side filters (minimum rating, max EUR/night, pet-only),
- explicit empty-state messaging ("no source results" vs "filtered out"),
- improved fetch error handling,
- price trend badges when previous price context is available.

## Scheduler Mode (CLI)
You can run periodic searches from CLI:

```bash
python main.py \
  --cities "Amsterdam,Rotterdam" \
  --checkin 2026-03-01 \
  --checkout 2026-03-05 \
  --schedule-minutes 30 \
  --max-runs 6
```

- `--schedule-minutes 0` keeps one-shot behavior (default).
- `--max-runs 0` means unlimited scheduled cycles.
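
The flag semantics above can be sketched with `argparse` and a simple loop. The flag names match the example command. The helper names, the `required=True` settings, and the default of `0` for `--max-runs` are assumptions, and the real `main.py` may be organized differently.

```python
import argparse
import time
from typing import Callable


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Vacation deal search (sketch)")
    parser.add_argument("--cities", required=True, help="comma-separated city names")
    parser.add_argument("--checkin", required=True, help="YYYY-MM-DD")
    parser.add_argument("--checkout", required=True, help="YYYY-MM-DD")
    parser.add_argument("--schedule-minutes", type=int, default=0,
                        help="0 runs the search once")
    parser.add_argument("--max-runs", type=int, default=0,
                        help="0 means unlimited scheduled cycles")
    return parser


def run_schedule(search: Callable[[], None], schedule_minutes: int, max_runs: int,
                 sleep: Callable[[float], None] = time.sleep) -> int:
    """Run `search` once, or repeatedly every `schedule_minutes` minutes.

    Returns the number of completed runs. With max_runs == 0 in scheduled
    mode the loop never ends on its own.
    """
    if schedule_minutes <= 0:
        search()
        return 1
    runs = 0
    while max_runs == 0 or runs < max_runs:
        search()
        runs += 1
        if max_runs == 0 or runs < max_runs:
            sleep(schedule_minutes * 60)  # no sleep after the final run
    return runs
```

Injecting `sleep` as a parameter lets a test replace it with a recording stub instead of actually waiting.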

## Tech Stack
- **Backend:** Python, FastAPI, Uvicorn
- **Scraping:** httpx, BeautifulSoup4, Firecrawl API
- **Frontend:** Responsive HTML/JS dashboard (Tailwind-style utility classes)
- **Persistence:** Local JSON files for cache and alerts

## Testing
Primary regression coverage for the new features includes:
- `tests/test_price_alerts.py` (dedupe/cooldown/override + agent integration)
- `tests/test_agent_validation.py` (pet filter + validation counters)
- `tests/test_currency_normalization.py` (EUR normalization + custom FX override)
- `tests/test_scheduler_cli.py` (scheduler CLI argument parsing)
- `tests/test_caching.py` (regression check that existing cache behavior is unchanged)

Example run:

```bash
PYTHONPATH=. pytest -q \
  tests/test_price_alerts.py \
  tests/test_agent_validation.py \
  tests/test_currency_normalization.py \
  tests/test_scheduler_cli.py
```

## Deployment Notes
- Local web mode: `uvicorn api:app --reload`
- Health check endpoint: `/health`
- Search endpoint: `/search`

## Repository / Distribution
- Main source repository and deployment references remain unchanged.