title: Roam
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false

roam

An AI-powered travel itinerary generator. Give it a destination, trip duration, transport mode, and your interests, and it returns a day-by-day itinerary with stops ordered to minimize travel time.

Most "AI" travel apps are LLM wrappers: prompt GPT, display output. Roam builds the actual ML stack underneath.


How it works

User Input (destination, days, transport, goals)
         │
         ▼
 ┌─────────────────┐
 │  RAG Retrieval  │  ← FAISS vector search over scraped Wikivoyage + Reddit content
 │                 │     sentence-transformers (all-MiniLM-L6-v2) embeddings
 └────────┬────────┘
          │
          ▼
 ┌─────────────────┐
 │   POI Fetcher   │  ← Overpass API (OpenStreetMap): local places only, chains filtered
 └────────┬────────┘
          │
          ▼
 ┌─────────────────┐
 │ Preference      │  ← LightGBM LambdaRank model trained on (goal, POI, relevance) triplets
 │ Ranker          │     8 features: semantic similarity + category match signals
 └────────┬────────┘
          │
          ▼
 ┌─────────────────┐
 │ VRP Optimizer   │  ← OR-Tools TSP with time windows: minimizes daily travel time
 │                 │     Assigns POIs across days, respects 10-hour daily budget
 └────────┬────────┘
          │
          ▼
 ┌─────────────────┐
 │  LLM Synthesis  │  ← Groq (Llama 3.3 70B) generates natural language itinerary
 │                 │     from optimized route + retrieved travel context
 └────────┬────────┘
          │
          ▼
  Mobile App (React Native + MapLibre)

ML Components

1. RAG Pipeline (backend/ml/rag/)

Retrieval-Augmented Generation over real travel content, not just prompting an LLM blind.

  • Scrapes Wikivoyage travel guides + Reddit trip reports per destination
  • Chunks text into overlapping 512-word windows
  • Embeds with sentence-transformers/all-MiniLM-L6-v2 (384-dim, runs locally)
  • Stores in FAISS flat index (cosine similarity via inner product on normalized vectors)
  • At query time, retrieves top-5 semantically relevant chunks to ground the LLM
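The chunking and retrieval steps above can be sketched in plain Python. This is a minimal illustration, not the code in backend/ml/rag/: the function names and the 64-word overlap are assumptions (the README only says the windows overlap), and a brute-force search stands in for the FAISS flat index. It does show why an inner product over unit-normalized vectors gives cosine similarity.

```python
import math


def chunk_words(text: str, size: int = 512, overlap: int = 64) -> list[str]:
    """Split text into overlapping windows of `size` words.

    The 64-word overlap is an assumed value, not taken from the project.
    """
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this window already reaches the end of the text
    return chunks


def normalize(vec: list[float]) -> list[float]:
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def top_k(query: list[float], vectors: list[list[float]], k: int = 5) -> list[tuple[int, float]]:
    """Return (index, score) pairs for the k highest inner products.

    Both sides are unit-normalized, so the inner product is the cosine similarity.
    Brute force here; a FAISS flat inner-product index does the same comparison.
    """
    q = normalize(query)
    scores = [
        (i, sum(a * b for a, b in zip(q, normalize(v))))
        for i, v in enumerate(vectors)
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:k]
```

In the real pipeline the vectors would be the 384-dimensional all-MiniLM-L6-v2 embeddings of each chunk, and `k` would be 5.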

2. Learning-to-Rank (backend/ml/ranker/)

A trained model that scores POIs against user goals, rather than relying on keyword matching.

  • Features: cosine similarity between goal embedding and POI description, category match signals (food/nature/history/nightlife), name specificity, tag richness
  • Model: LightGBM with the lambdarank objective, a learning-to-rank approach common in production search systems that optimizes NDCG
  • Training data: synthetic (goal, POI list, relevance scores) scenarios covering 5 travel styles
  • Feedback hook: stubbed for online learning; thumbs up/down signals can trigger incremental retraining
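Since the ranker is NDCG-optimized, it may help to see what that metric measures. The sketch below computes NDCG@k for a single goal in plain Python, using the common exponential gain 2^rel - 1. The function names are illustrative and not taken from backend/ml/ranker/.

```python
import math


def dcg(relevances: list[float], k: int) -> float:
    """Discounted cumulative gain over the first k items (gain = 2^rel - 1)."""
    return sum(
        (2 ** rel - 1) / math.log2(rank + 2)
        for rank, rel in enumerate(relevances[:k])
    )


def ndcg(scores: list[float], relevances: list[float], k: int = 10) -> float:
    """NDCG@k for one query: order items by model score, compare with the ideal order."""
    by_score = sorted(zip(scores, relevances), key=lambda pair: pair[0], reverse=True)
    ranked = [rel for _, rel in by_score]
    ideal = dcg(sorted(relevances, reverse=True), k)
    return dcg(ranked, k) / ideal if ideal > 0 else 0.0
```

A model that places the most relevant POIs first scores 1.0. Pushing a relevant POI down the list lowers the score, and mistakes near the top cost more than mistakes further down.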

3. VRP Route Optimizer (backend/ml/optimizer/)

Formulates itinerary generation as a constrained Vehicle Routing Problem, not just sorting by distance.

  • Builds an N×N travel-time matrix (OpenRouteService API, with a Haversine fallback)
  • Solves TSP per day using OR-Tools with time windows (opening hours) and visit duration constraints
  • Greedy day assignment: spreads ranked POIs across trip days respecting 10-hour daily budget
  • Returns estimated arrival times per stop
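Two of the steps above, the Haversine fallback and the greedy day assignment, are simple enough to sketch without any dependencies. This is an illustration under stated assumptions, not the code in backend/ml/optimizer/: the average speeds are made-up placeholders, the function names are hypothetical, and the day assignment counts only visit durations against the 10-hour (600-minute) budget. The per-day TSP with time windows is left to OR-Tools and is not shown.

```python
import math

EARTH_RADIUS_KM = 6371.0
# Assumed average speeds in km/h; the values Roam actually uses are not stated.
SPEED_KMH = {"walking": 5.0, "cycling": 15.0, "driving": 40.0}


def haversine_km(a: tuple[float, float], b: tuple[float, float]) -> float:
    """Great-circle distance between two (lat, lon) points in kilometres."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def travel_minutes(points: list[tuple[float, float]], transport: str = "walking") -> list[list[float]]:
    """N x N matrix of straight-line travel times in minutes."""
    speed = SPEED_KMH[transport]
    return [[haversine_km(p, q) / speed * 60 for q in points] for p in points]


def assign_days(visit_minutes: list[int], days: int, budget: int = 600) -> list[list[int]]:
    """Greedily spread POIs (already sorted best-first) across days.

    Each POI goes to the currently least-loaded day. If it does not fit there,
    it fits nowhere and is dropped. Travel time is ignored for brevity.
    """
    plan: list[list[int]] = [[] for _ in range(days)]
    used = [0] * days
    for poi, minutes in enumerate(visit_minutes):
        day = min(range(days), key=used.__getitem__)
        if used[day] + minutes <= budget:
            plan[day].append(poi)
            used[day] += minutes
    return plan
```

Straight-line times underestimate real routes, which is why a sketch like this only makes sense as a fallback when the routing API is unavailable.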

Stack

| Layer | Tech |
| --- | --- |
| Mobile | React Native (Expo) + MapLibre |
| Backend | Python + FastAPI |
| Embeddings | sentence-transformers (all-MiniLM-L6-v2) |
| Vector Store | FAISS |
| Ranking | LightGBM LambdaRank |
| Route Optimization | Google OR-Tools (TSP/VRP) |
| LLM | Groq API (Llama 3.3 70B) |
| POI Data | OpenStreetMap / Overpass API |
| Routing | OpenRouteService |
| Geocoding | Nominatim |

Everything except Groq is free and open source. Groq has a free tier.


Project Structure

roam/
├── backend/
│   ├── api/
│   │   └── routes.py           # FastAPI endpoints; wires the full pipeline
│   ├── ml/
│   │   ├── rag/
│   │   │   ├── scraper.py      # Wikivoyage + Reddit scraper
│   │   │   ├── chunker.py      # Overlapping text chunker
│   │   │   ├── embedder.py     # sentence-transformers encoding
│   │   │   ├── vector_store.py # FAISS index build/save/load
│   │   │   ├── retriever.py    # Query-time retrieval
│   │   │   └── build_pipeline.py # One-shot index builder
│   │   ├── ranker/
│   │   │   ├── features.py     # Feature extraction (embeddings + metadata)
│   │   │   ├── model.py        # LightGBM LambdaRank model
│   │   │   ├── trainer.py      # Training on synthetic data
│   │   │   └── scorer.py       # Runtime scoring + feedback hook
│   │   └── optimizer/
│   │       ├── distance.py     # Travel time matrix (ORS + Haversine fallback)
│   │       ├── vrp.py          # OR-Tools TSP solver with time windows
│   │       └── scheduler.py    # Day assignment + route optimization
│   ├── services/
│   │   ├── overpass.py         # OSM POI fetcher (chains filtered)
│   │   ├── nominatim.py        # Geocoding
│   │   └── groq_client.py      # LLM synthesis
│   └── main.py
├── mobile/
│   ├── app/
│   │   ├── index.tsx           # Home screen (trip input)
│   │   └── itinerary.tsx       # Results screen (list + map view)
│   └── services/
│       └── api.ts              # Typed API client
└── data/                       # FAISS index + trained model (gitignored)

Setup

Backend

cd roam
python3 -m venv .venv && source .venv/bin/activate
pip install -r backend/requirements.txt

# Add your Groq API key (free at console.groq.com)
cp backend/.env.example backend/.env
# Edit backend/.env and set GROQ_API_KEY

# Build RAG index (scrapes + embeds ~8 cities, takes ~5 min)
python -m backend.ml.rag.build_pipeline

# Train the ranker
python -m backend.ml.ranker.trainer

# Start the API
uvicorn backend.main:app --reload

Mobile

cd mobile
npm install
cp .env.example .env
npx expo start

Scan the QR code with Expo Go (iOS / Android). Your phone and your Mac must be on the same Wi-Fi network.


API

POST /api/itinerary

{
  "destination": "Tokyo",
  "days": 3,
  "transport": "walking",
  "goals": ["food", "history", "hidden gems"]
}

Returns a structured day-by-day itinerary with stops, arrival times, descriptions, and coordinates.

POST /api/feedback

{
  "poi_id": 12345,
  "relevant": true
}

Logs positive/negative signals for future ranker retraining.
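Both endpoints can be called from Python with the standard library alone. The sketch below assumes the backend is running locally on uvicorn's default address (127.0.0.1:8000); `build_post` and `send` are hypothetical helper names, and the response schema is not shown because this README does not specify it.

```python
import json
import urllib.request

# Assumed base URL: uvicorn listens on 127.0.0.1:8000 unless configured otherwise.
BASE_URL = "http://127.0.0.1:8000"


def build_post(path: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request for one of the API endpoints."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)


itinerary_request = build_post(
    "/api/itinerary",
    {
        "destination": "Tokyo",
        "days": 3,
        "transport": "walking",
        "goals": ["food", "history", "hidden gems"],
    },
)
feedback_request = build_post("/api/feedback", {"poi_id": 12345, "relevant": True})

# With the backend running:
#   itinerary = send(itinerary_request)
#   send(feedback_request)
```

Building the request separately from sending it keeps the payloads easy to inspect or test without a live server.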