conversational-agent / documentation.md
samuelsaettler
Align README + documentation.md with reference template
3682b71

Documentation

Apartment Predictor (Saved Regression Model + LLM Workflow)

This file documents what was built, tested, and learned in this exercise. It follows the structure of the reference template from zhaw-iwi/ai-applications-prediction-and-nlp/documentation.md.


1. Project Summary

Short description of your app:

The app turns a free-text German apartment wish (e.g. "Ich suche eine 3.5-Zimmer-Wohnung mit etwa 85 m² in Winterthur.") into an estimated monthly rent in CHF for the Canton of Zürich. An OpenAI LLM extracts the structured fields, a saved scikit-learn GradientBoostingRegressor predicts the rent, and a second LLM call returns a short German explanation including one uncertainty note. The app is deployed as a Gradio Space on Hugging Face.


2. Files Used

File Purpose
app.py Final deployable Gradio app (LLM → model → LLM pipeline)
model_gbm.pkl Saved scikit-learn GradientBoostingRegressor (12 features)
municipality_lookup.csv Zürich municipality features used for prediction
requirements.txt Python dependencies
README.md Hugging Face Space metadata + project overview
documentation.md Written documentation for the submission

3. Numeric Prediction Part

3.1 Reused Model

Which saved model did you use? model_gbm.pkl – the GradientBoostingRegressor trained earlier in ai-applications/end-of-module-block-1/train_model.ipynb (5-fold CV R² ≈ 0.73, RMSE ≈ 559 CHF).

What does the model predict? The monthly gross rent in CHF for an apartment located in a municipality of the Canton of Zürich.

Which input features are used for prediction?

The model uses 12 features in this exact order:

  1. rooms
  2. area (m²)
  3. pop – municipality population
  4. pop_dens – population density
  5. frg_pct – percentage of foreign residents
  6. emp – number of employees
  7. tax_income – average taxable income (CHF)
  8. room_per_m2 – engineered: area / rooms
  9. luxurious – binary flag
  10. furnished – binary flag
  11. zurich_city – 1 if municipality is the City of Zürich
  12. distance_to_zurich_center – Haversine distance to Zürich centre (km)

3.2 Prediction Logic

The LLM returns rooms, area_m2, town, plus the optional flags luxurious and furnished. The Python function predict_apartment_price looks up the municipality row in municipality_lookup.csv to pull the BFS socioeconomic features (pop, pop_dens, frg_pct, emp, tax_income) and the precomputed zurich_city / distance_to_zurich_center columns. room_per_m2 is computed on the fly. The 12-column DataFrame is passed to model.predict(...) and the result is rounded to the nearest CHF.


4. LLM Extraction Part

4.1 Goal

Convert a free-form German sentence into a strict JSON object containing the values the regression model needs (rooms, area_m2, town) plus two optional binary flags (luxurious, furnished).

4.2 Prompt Design

  • System prompt in German, naming the role ("Du bist ein Extraktionshelfer für eine Schweizer Wohnungs-App.").
  • Strict JSON only is required – no Markdown, no explanation.
  • Required keys are spelled out exactly: rooms, area_m2, town.
  • Optional keys with default false: luxurious, furnished.
  • The user prompt is the raw German wish.
  • The OpenAI call uses response_format={"type": "json_object"} and temperature=0 so the output is deterministic and parseable.
Du bist ein Extraktionshelfer für eine Schweizer Wohnungs-App.
Lies den deutschen Text und gib AUSSCHLIESSLICH ein JSON-Objekt zurück.

Pflichtfelder:
- "rooms" (Zahl, z.B. 3.5)
- "area_m2" (Zahl in Quadratmetern, z.B. 85)
- "town" (Schweizer Gemeindename im Kanton Zürich, z.B. "Winterthur")

Optionale Felder (sonst false):
- "luxurious"
- "furnished"

4.3 Expected Output Format

{"rooms": 3.5, "area_m2": 85, "town": "Winterthur", "luxurious": false, "furnished": false}

4.4 Validation

parse_json_response enforces three checks before any value is used:

  1. The response is non-empty.
  2. json.loads succeeds (otherwise the raw text is shown in the error).
  3. All required keys are present.

extract_preferences then verifies that rooms and area_m2 are not None, that town is non-empty, and that match_town resolves the town to a canonical bfs_name (case-insensitive exact match first, then a substring match against the BFS list). Any failure raises a ValueError that surfaces in the German error message in the UI – there is no silent regex fallback.


5. LLM Explanation Part

5.1 Goal

Produce a short, plain German explanation of the model's prediction. The LLM must not recompute the price – it only describes the result the regression model already produced.

5.2 Prompt Design

  • System prompt tells the model it is explaining a rent estimate from a machine-learning model, in German.
  • The user message contains a JSON payload with the structured preferences and the predicted rent in CHF.
  • The model is instructed to return JSON with one key, answer, containing 2–4 short German sentences.
  • The answer must reference the user's rooms, area, and town and must include exactly one uncertainty / limitation note (condition, micro location, floor, year of renovation, …).
  • No Markdown formatting.

5.3 Expected Output Format

{"answer": "Für eine 3.5-Zimmer-Wohnung mit 85 m² in Winterthur schätzt das Modell rund 2100 CHF pro Monat. Die Schätzung basiert auf Wohnfläche und Ortsmerkmalen wie Steuerkraft und Distanz zur Stadt Zürich. Eine Unsicherheit ist, dass Zustand und Stockwerk nicht im Modell enthalten sind."}

The answer string is the text shown in the Erklärung (LLM) textbox.


6. End-to-End Pipeline

  1. The user enters a German apartment description in the textbox.
  2. extract_preferences calls the LLM and returns a validated dict {rooms, area_m2, town, luxurious, furnished}.
  3. Python validates the values with parse_json_response and match_town – any failure raises a clear German error.
  4. predict_apartment_price joins the BFS lookup, builds the 12-feature row, and calls model.predict(...).
  5. generate_explanation calls the LLM again with the preferences and the prediction; the JSON answer field is extracted.
  6. The Gradio app returns the structured preferences (JSON), the rounded CHF prediction, and the explanation text.

If any step fails, the error message is shown in the Erklärung field and the prediction is left empty – nothing is silently filled in.


7. Test Cases

# Test Input Extracted Output Correct? Prediction Returned? Explanation Returned? Notes
1 Ich suche eine 3.5-Zimmer-Wohnung mit 85 m2 in Winterthur. Yes Yes (~CHF 2,100) Yes Baseline case from the assignment
2 Ich brauche eine möblierte 2-Zimmer-Wohnung mit 55 m2 in Kloten. Yes (furnished=true) Yes Yes Tests optional flag detection
3 Ich hätte gerne eine luxuriöse 4.5-Zimmer-Wohnung mit 140 m2 in Küsnacht (ZH). Yes (luxurious=true) Yes (~CHF 4,500) Yes Tests luxury flag and a town with parentheses
4 Eine 5-Zimmer-Wohnung mit 130 m2 in Zürich wäre ideal. Yes Yes Yes Tests zurich_city=1 path
5 Ich suche etwas in Bern. Pipeline raises a German error No Error message shown Out-of-canton town → friendly failure, no silent fallback

Local sanity check (calling predict_apartment_price directly, no LLM):

3.5 rooms / 85 m²  / Winterthur            → CHF 2,103
4.5 rooms / 140 m² / Küsnacht (ZH) luxury  → CHF 4,462

8. Errors and Problems

Problem: First test runs returned a 132-byte model_gbm.pkl and pickle.load failed. Cause: The copy of the file in apartment-price-prediction/ was a Git LFS pointer, not the real model. Fix: Use the actual 1.4 MB model from ai-applications/end-of-module-block-1/model_gbm.pkl.

Problem: First push to Hugging Face was rejected with "contains binary files. Please use Xet to store binary files." Cause: model_gbm.pkl was committed as a regular blob and the HF pre-receive hook enforces Xet/LFS for .pkl files. Fix: Reset the commit, upload the model with hf upload --repo-type space saettsam/conversational-agent model_gbm.pkl model_gbm.pkl (uses Xet), pull the new commit, then push the rest of the files normally.

Problem: Town names like Küsnacht (ZH) or Zürich (umlaut) did not match user input. Cause: Strict, case-sensitive equality on the BFS list. Fix: match_town lower-cases both sides and falls back to a substring match against the canonical bfs_name list.

Problem: Missing OPENAI_API_KEY on the Space crashed the app on the first user interaction with an opaque traceback. Cause: The OpenAI client was being created at import time. Fix: Lazy get_openai_client() raises a clear German error message that surfaces directly in the UI textbox.


9. Deployment Notes

9.1 Files included

  • app.py
  • model_gbm.pkl (uploaded via Xet)
  • municipality_lookup.csv
  • requirements.txt
  • README.md
  • documentation.md
  • .gitattributes

9.2 Secrets / Environment Variables

Configured in Settings → Variables and secrets of the Space:

  • OPENAI_API_KEY (required)
  • OPENAI_MODEL (optional, defaults to gpt-4o-mini)

9.3 Deployment Result

The Space builds with the standard Gradio template. The model file (~1.4 MB) lives in Xet storage and loads on cold start. After the secret is set, end-to-end latency is roughly 0.5 s for extraction, negligible for the local model prediction, and ~1 s for the explanation – about 1.5–2 s per German request in total.

9.4 Screenshots

Two screenshots from the running app, each showing a different German input, the Extrahierte Eingaben (LLM) JSON, the Geschätzte Monatsmiete (CHF) number, and the Erklärung (LLM) text:

Beispiel 1

Beispiel 1: A first German apartment wish is entered. The LLM extracts the structured JSON (rooms, area_m2, town, plus the optional luxurious / furnished flags), the GradientBoostingRegressor returns a CHF rent estimate, and the second LLM call produces the short German explanation visible in the Erklärung (LLM) textbox – including one uncertainty note about features not contained in the model.

Beispiel 2

Beispiel 2: A second German apartment wish with different rooms, area, and town is entered. Again the extracted JSON, the predicted monthly rent, and the German explanation are all visible at the same time, demonstrating that the end-to-end pipeline (LLM extraction → model prediction → LLM explanation) works for multiple inputs.


10. Reflection

Combining a regression model with an LLM gives a friendly natural-language front end without giving up the deterministic numerics – the model still owns the price. The system is most fragile when the user names a town outside the canton or omits a required value; strict JSON mode plus an explicit match_town check keeps those failures visible instead of producing a confidently wrong prediction. German input matters because the BFS dataset uses Swiss spellings (Zürich, Küsnacht (ZH)) that an English prompt drifts away from. The biggest missing inputs are condition, year of renovation, floor / elevator, and balcony – features that humans weigh heavily but the training data did not capture. Next iteration: add confidence intervals from a quantile regressor and an optional clarifying question when the LLM returns null for area_m2 or rooms.


11. Responsible Use Note

The prediction is a rough indication, not a market quote. The model was trained on a snapshot of public listings and only sees twelve structured features – condition, micro location, balcony, floor, elevator and many other rent drivers are not represented. The LLM may also misread the user text (e.g. confuse "etwa 85 m²" with another number); that is why every prediction is shown together with the extracted JSON, so the user can verify what the model actually saw. The app is intended for educational and exploratory use only and must not be used as the sole basis for any rental decision.