---
title: PixelPilotAI
emoji: πŸ“·
colorFrom: purple
colorTo: blue
sdk: docker
pinned: false
---

# Photo Editing Recommendation Agent

Recommend global photo edits by retrieving similar expert-edited examples (MIT–Adobe FiveK), aggregating Expert A recipes, and applying them deterministically.

This repo is structured so **dataset β†’ embeddings β†’ vector DB** and the **inference API + LLM** can be developed in parallel and merged cleanly.

## Project layout (merge-ready)

```
PhotoEditor/
β”œβ”€β”€ .env                       # Copy from .env.example; set FIVEK_SUBSET_SIZE, Azure Search, etc.
β”œβ”€β”€ .env.example
β”œβ”€β”€ requirements.txt
β”œβ”€β”€ photo_editor/              # Core package (shared by pipeline and future API)
β”‚   β”œβ”€β”€ config/                # Settings from env (paths, Azure, subset size)
β”‚   β”œβ”€β”€ dataset/               # FiveK paths, subset selection (filesAdobe.txt)
β”‚   β”œβ”€β”€ lrcat/                 # Lightroom catalog: Expert A recipe extraction
β”‚   β”œβ”€β”€ images/                # DNG β†’ RGB (rawpy, neutral development)
β”‚   β”œβ”€β”€ embeddings/            # CLIP image embeddings (index + query)
β”‚   └── vector_store/          # Azure AI Search index (upload + search)
β”œβ”€β”€ scripts/
β”‚   └── build_vector_index.py  # Build vector index: subset β†’ embed β†’ push to Azure
β”œβ”€β”€ fivek_dataset/             # MIT–Adobe FiveK (file lists, raw_photos/, fivek.lrcat)
β”œβ”€β”€ LLM.py                     # Existing Azure GPT-4o explanation layer (to be wired to RAG)
└── api/                       # (Future) FastAPI: /analyze-image, /apply-edits, /edit-and-explain
```

- **Inference merge**: The API will use `photo_editor.vector_store.AzureSearchVectorStore` for retrieval, `photo_editor.embeddings` for query embedding, and `LLM.py` (or a moved `photo_editor.llm`) for explanations. Apply-edits will use a separate editing engine (OpenCV/Pillow) consuming `EditRecipe` from `photo_editor.lrcat.schema`.

## Dataset β†’ Vector DB (this slice)

1. **Subset**: First `FIVEK_SUBSET_SIZE` images from `fivek_dataset/filesAdobe.txt` (default 500; set in `.env`).
2. **Edits**: Expert A only; recipes read from `fivek.lrcat` (virtual copy "Copy 1").
3. **Embeddings**: Original DNG β†’ neutral development β†’ RGB β†’ CLIP (`openai/clip-vit-base-patch32`).
4. **Vector DB**: Azure AI Search index (created if missing); each document = `id`, `image_id`, `embedding`, `recipe` (JSON).

### Setup

```bash
cp .env.example .env
# Edit .env: FIVEK_SUBSET_SIZE (e.g. 500), AZURE_SEARCH_*, optional paths
pip install -r requirements.txt
```

### Build the index

From the project root:

```bash
PYTHONPATH=. python scripts/build_vector_index.py
```

- Requires the FiveK `raw_photos` folder (DNGs + `fivek.lrcat`) under `fivek_dataset/`.
- If Azure Search is not configured in `.env`, the script still runs and skips the upload (it prints a reminder).

## How to run things

All commands below assume you are in the **project root** (`PhotoEditor/`) and have:

- created and edited `.env` (see the config table below), and
- installed dependencies:

```bash
pip install -r requirements.txt
```

## Deploy (Streamlit Cloud + Hugging Face Spaces)

For a cloud deploy, keep the repo minimal and include only runtime files:

- `app.py`
- `photo_editor/`
- `requirements.txt`
- `.streamlit/config.toml`
- `.env.example` (template only, no secrets)

Do not commit local artifacts or large datasets (`fivek_dataset/`, `renders/`, generated images/html/json, `.env`).

### Streamlit Community Cloud

1. Push this repo to GitHub.
2. In Streamlit Cloud, create a new app from the repo.
3. Set the app file path to `app.py`.
4. Add the required secrets in the app settings (same keys as in `.env.example`, e.g. `AZURE_SEARCH_*`, `AZURE_OPENAI_*`).
5. Deploy.

### Hugging Face Spaces (Streamlit SDK)

1. Create a new Space and choose the **Streamlit** SDK.
2. Point it to this repository (or push these files to the Space repo).
3. Ensure `app.py` is at the repo root and `requirements.txt` is present.
4. Add secrets in the Space Settings (same variables as `.env.example`).
5. Launch the Space.
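Those per-platform secrets map 1:1 to keys in `.env.example`, so they can also be pushed programmatically. Below is a minimal sketch of that kind of sync, assuming `huggingface_hub`'s `HfApi.add_space_secret` and an `HF_TOKEN` environment variable; the helper names are hypothetical and the repo's actual sync script may differ:

```python
import os


def parse_env_file(path=".env"):
    """Parse simple KEY=VALUE lines from a .env file, skipping comments and blanks."""
    secrets = {}
    with open(path) as fh:
        for raw in fh:
            line = raw.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            secrets[key.strip()] = value.strip().strip('"')
    return secrets


def sync_secrets(space_id, env_path=".env"):
    """Push every KEY=VALUE pair from env_path to the Space as a secret."""
    # Deferred import: .env parsing stays usable without huggingface_hub installed.
    from huggingface_hub import HfApi

    api = HfApi(token=os.environ["HF_TOKEN"])
    for key, value in parse_env_file(env_path).items():
        api.add_space_secret(repo_id=space_id, key=key, value=value)
```

The deferred import is a deliberate choice: parsing can be tested offline, while the network-touching call only happens when a sync is actually requested.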
Optional automation: sync supported secrets from your local `.env` directly to your Space:

```bash
pip install huggingface_hub
HF_TOKEN=hf_xxx python scripts/sync_hf_secrets.py --space-id <user/space-name>
```

### Hugging Face Spaces (Docker SDK)

This repo now includes a production-ready `Dockerfile` that serves the app on port `7860`.

1. Create a new Space and choose the **Docker** SDK.
2. Push this repository to that Space.
3. In the Space Settings, add secrets (or sync them later with `scripts/sync_hf_secrets.py`).
4. Build and launch the Space.

Local Docker test:

```bash
docker build -t lumigrade-ai .
docker run --rm -p 7860:7860 --env-file .env lumigrade-ai
```

### 1. Run the Streamlit UI (full app)

Interactive app to upload an image (JPEG/PNG) or point to a DNG on disk, then run the full pipeline and see **original vs edited** plus the suggested edit parameters.

```bash
streamlit run app.py
```

This will:

- Check the Azure Search + Azure OpenAI config from `.env`.
- For each run: retrieve similar experts β†’ call the LLM for a summary + suggested edits β†’ apply the edits (locally or via an external API) β†’ show before/after.

### 2. Run the full pipeline from the terminal

Run the same pipeline as the UI, but from the CLI for a single image:

```bash
python scripts/run_pipeline.py <image> [--out output.jpg] [--api] [-v]
```

Examples:

```bash
# Run pipeline locally, save to result.jpg, print summary + suggested edits
python scripts/run_pipeline.py photo.jpg --out result.jpg -v

# Run pipeline but use an external editing API (requires EDITING_API_URL in .env)
python scripts/run_pipeline.py photo.jpg --out result.jpg --api -v
```

What `-v` prints:

- πŸ“‹ **Summary** of what the LLM thinks should be done.
- πŸ“ **Suggested edits**: the numeric recipe (exposure, contrast, temp, etc.) coming from Azure OpenAI for that image.
- πŸ“Ž **Expert used**: which FiveK expert image/recipe was used as reference.
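The suggested edits are plain numeric parameters. To illustrate the kind of deterministic math an apply step performs, here is a toy sketch for exposure (in stops) and contrast on normalized RGB values; it is illustrative only, not the repo's actual editing engine:

```python
def apply_exposure_contrast(rgb, exposure=0.0, contrast=0.0):
    """Apply exposure (stops), then a simple contrast curve, to [0, 1] RGB values.

    Illustrative only: Lightroom's real tone mapping is considerably more involved.
    """
    gain = 2.0 ** exposure                   # +1 stop doubles linear light
    out = []
    for v in rgb:
        v = min(1.0, max(0.0, v * gain))     # exposure, clipped to [0, 1]
        v = 0.5 + (v - 0.5) * (1.0 + contrast)  # contrast pivots around middle grey
        out.append(min(1.0, max(0.0, v)))
    return out
```

For example, `apply_exposure_contrast([0.25], exposure=1.0)` doubles the linear value to `0.5`, while a positive `contrast` pushes values away from the `0.5` pivot.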
### 3. Just retrieve similar experts (no LLM / no edits)

If you only want to see which FiveK images are closest to a given photo and inspect their stored recipes:

```bash
python scripts/query_similar.py <image> [--top-k 50] [--top-n 5]
```

Examples:

```bash
# Show the best 5 expert matches (default top-k=50 search space)
python scripts/query_similar.py photo.jpg --top-n 5

# Show only the single best match
python scripts/query_similar.py photo.jpg --top-n 1
```

Output:

- Ranks (`1.`, `2.`, …), image_ids, rerank scores.
- The stored **Expert A recipe** JSON for each match.

### 4. Get the exact Expert A recipe for a FiveK image

Given a FiveK `image_id` (with or without extension), extract the Expert A recipe directly from the Lightroom catalog:

```bash
python scripts/get_recipe_for_image.py <image_id> [-o recipe.json]
```

Examples:

```bash
# Print the recipe as JSON
python scripts/get_recipe_for_image.py a0001-jmac_DSC1459

# Save the recipe to a file
python scripts/get_recipe_for_image.py a0001-jmac_DSC1459 -o my_recipe.json
```

### 5. Apply a custom (LLM) recipe to a FiveK image

If you already have a JSON recipe (for example, something you crafted or got from the LLM) and want to apply it to a FiveK RAW image using the same rendering pipeline:

```bash
python scripts/apply_llm_recipe.py <image_id> <recipe.json> [--out path.jpg]
```

Example:

```bash
python scripts/apply_llm_recipe.py a0059-JI2E5556 llm_recipe_a0059.json --out renders/a0059-JI2E5556_LLM.jpg
```

This will:

- Load the DNG for `<image_id>`.
- Use `dng_to_rgb_normalized` to bake in exposure/brightness from the recipe.
- Apply the rest of the recipe (contrast, temperature, etc.) on top of the original Expert A baseline.
- Save the rendered JPEG.

## Config (.env)

| Variable | Description |
|----------|-------------|
| `FIVEK_SUBSET_SIZE` | Number of images to index (default 500). |
| `FIVEK_LRCAT_PATH` | Path to `fivek.lrcat` (default: `fivek_dataset/raw_photos/fivek.lrcat`). |
| `FIVEK_RAW_PHOTOS_DIR` | Root of the range folders (e.g. `HQa1to700`, …). |
| `AZURE_SEARCH_ENDPOINT` | Azure AI Search endpoint URL. |
| `AZURE_SEARCH_KEY` | Azure AI Search admin key. |
| `AZURE_SEARCH_INDEX_NAME` | Index name (default `fivek-vectors`). |

## License / data

See `fivek_dataset/LICENSE.txt` and related notices for the MIT–Adobe FiveK dataset.
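To sanity-check the configuration from the table above before running any script, a short sketch can merge the documented defaults with the environment and report missing Azure keys. Variable names and defaults are taken from the config table; the helper itself is hypothetical (loading `.env` into the environment is assumed to be done by the shell or a tool like `python-dotenv`):

```python
import os

# Keys from the config table; these have documented defaults.
DEFAULTS = {
    "FIVEK_SUBSET_SIZE": "500",
    "FIVEK_LRCAT_PATH": "fivek_dataset/raw_photos/fivek.lrcat",
    "AZURE_SEARCH_INDEX_NAME": "fivek-vectors",
}
# Azure keys with no defaults: the index build skips upload if these are unset.
AZURE_KEYS = ("AZURE_SEARCH_ENDPOINT", "AZURE_SEARCH_KEY")


def load_config(env=None):
    """Merge documented defaults with the environment; report missing Azure keys."""
    env = os.environ if env is None else env
    config = {key: env.get(key, default) for key, default in DEFAULTS.items()}
    missing = [key for key in AZURE_KEYS if not env.get(key)]
    return config, missing
```

A non-empty `missing` list mirrors the behaviour described earlier: the pipeline still runs, but uploads to Azure Search are skipped.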