---
title: PixelPilotAI
emoji: 📷
colorFrom: purple
colorTo: blue
sdk: docker
pinned: false
---

# Photo Editing Recommendation Agent

Recommend global photo edits by retrieving similar expert-edited examples (MIT–Adobe FiveK), aggregating Expert A recipes, and applying them deterministically. This repo is structured so **dataset → embeddings → vector DB** and the **inference API + LLM** can be developed in parallel and merged cleanly.

## Project layout (merge-ready)

```
PhotoEditor/
├── .env                    # Copy from .env.example; set FIVEK_SUBSET_SIZE, Azure Search, etc.
├── .env.example
├── requirements.txt
├── photo_editor/            # Core package (shared by pipeline and future API)
│   ├── config/              # Settings from env (paths, Azure, subset size)
│   ├── dataset/             # FiveK paths, subset selection (filesAdobe.txt)
│   ├── lrcat/               # Lightroom catalog: Expert A recipe extraction
│   ├── images/              # DNG → RGB (rawpy, neutral development)
│   ├── embeddings/          # CLIP image embeddings (index + query)
│   └── vector_store/        # Azure AI Search index (upload + search)
├── scripts/
│   └── build_vector_index.py   # Build vector index: subset → embed → push to Azure
├── fivek_dataset/           # MIT–Adobe FiveK (file lists, raw_photos/, fivek.lrcat)
├── LLM.py                   # Existing Azure GPT-4o explanation layer (to be wired to RAG)
└── api/                     # (Future) FastAPI: /analyze-image, /apply-edits, /edit-and-explain
```

- **Inference merge**: The API will use `photo_editor.vector_store.AzureSearchVectorStore` for retrieval, `photo_editor.embeddings` for query embedding, and `LLM.py` (or a moved `photo_editor.llm`) for explanations. Apply-edits will use a separate editing engine (OpenCV/Pillow) consuming `EditRecipe` from `photo_editor.lrcat.schema`.
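
The merge plan above amounts to one composition step. The sketch below is hypothetical glue: the four callables stand in for the real `photo_editor` components named in the bullet, and their names and signatures here are assumptions, not the actual API.

```python
# Hypothetical shape of the planned /edit-and-explain flow. embed_image,
# search, explain, and apply_recipe are stand-ins for photo_editor.embeddings,
# photo_editor.vector_store.AzureSearchVectorStore, LLM.py, and the editing
# engine respectively; the real signatures may differ.
def edit_and_explain(image_bytes, embed_image, search, explain, apply_recipe):
    """Compose retrieval -> LLM explanation -> deterministic edit."""
    query_vec = embed_image(image_bytes)        # CLIP embedding of the upload
    matches = search(query_vec, top_k=5)        # nearest Expert A examples
    recipe = matches[0]["recipe"]               # best match's stored recipe
    summary = explain(recipe, matches)          # GPT-4o natural-language summary
    edited = apply_recipe(image_bytes, recipe)  # OpenCV/Pillow editing engine
    return {"summary": summary, "recipe": recipe, "edited": edited}
```

Keeping each stage injectable like this is what lets the dataset slice and the API slice develop in parallel.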

## Dataset → Vector DB (this slice)

1. **Subset**: First `FIVEK_SUBSET_SIZE` images from `fivek_dataset/filesAdobe.txt` (default 500; set in `.env`).
2. **Edits**: Expert A only; recipes read from `fivek.lrcat` (virtual copy "Copy 1").
3. **Embeddings**: Original DNG → neutral development → RGB → CLIP (`openai/clip-vit-base-patch32`).
4. **Vector DB**: Azure AI Search index (created if missing); each document = `id`, `image_id`, `embedding`, `recipe` (JSON).
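
One index document built per step 4 might look like the following sketch; the field names come from the list above, but the exact index schema and key rules are assumptions, not the real definition.

```python
# Sketch of one Azure AI Search document (id, image_id, embedding, recipe).
import json

def make_document(image_id: str, embedding: list, recipe: dict) -> dict:
    return {
        "id": image_id,                # document key (assumed equal to image_id)
        "image_id": image_id,
        "embedding": embedding,        # 512-dim vector for CLIP ViT-B/32
        "recipe": json.dumps(recipe),  # Expert A recipe stored as a JSON string
    }

doc = make_document("a0001-jmac_DSC1459", [0.0] * 512, {"Exposure2012": 0.35})
```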

### Setup

```bash
cp .env.example .env
# Edit .env: FIVEK_SUBSET_SIZE (e.g. 500), AZURE_SEARCH_*, optional paths
pip install -r requirements.txt
```

### Build the index

From the project root:

```bash
PYTHONPATH=. python scripts/build_vector_index.py
```

- Requires the FiveK `raw_photos` folder (DNGs + `fivek.lrcat`) under `fivek_dataset/`.
- If Azure Search is not configured in `.env`, the script still runs and skips upload (prints a reminder).

## How to run things

All commands below assume you are in the **project root** (`PhotoEditor/`) and have:

- created and edited `.env` (see config table below), and
- installed dependencies:

```bash
pip install -r requirements.txt
```

## Deploy (Streamlit Cloud + Hugging Face Spaces)

For cloud deployment, keep the repo minimal and include only runtime files:

- `app.py`
- `photo_editor/`
- `requirements.txt`
- `.streamlit/config.toml`
- `.env.example` (template only, no secrets)

Do not commit local artifacts or large datasets (`fivek_dataset/`, `renders/`, generated images/html/json, `.env`).

### Streamlit Community Cloud

1. Push this repo to GitHub.
2. In Streamlit Cloud, create a new app from the repo.
3. Set the app file path to `app.py`.
4. Add required secrets in the app settings (same keys as in `.env.example`, e.g. `AZURE_SEARCH_*`, `AZURE_OPENAI_*`).
5. Deploy.

### Hugging Face Spaces (Streamlit SDK)

1. Create a new Space and choose **Streamlit** SDK.
2. Point it to this repository (or push these files to the Space repo).
3. Ensure `app.py` is at repo root and `requirements.txt` is present.
4. Add secrets in Space Settings (same variables as `.env.example`).
5. Launch the Space.

Optional automation: sync supported secrets from local `.env` directly to your Space:

```bash
pip install huggingface_hub
HF_TOKEN=hf_xxx python scripts/sync_hf_secrets.py --space-id <username/space-name>
```

### Hugging Face Spaces (Docker SDK)

This repo includes a production-ready `Dockerfile` that serves the app on port `7860`.

1. Create a new Space and choose **Docker** SDK.
2. Push this repository to that Space.
3. In Space Settings, add secrets (or sync them later with `scripts/sync_hf_secrets.py`).
4. Build and launch the Space.

Local Docker test:

```bash
docker build -t lumigrade-ai .
docker run --rm -p 7860:7860 --env-file .env lumigrade-ai
```

### 1. Run the Streamlit UI (full app)

An interactive app for uploading an image (JPEG/PNG) or pointing to a DNG on disk, then running the full pipeline to see **original vs edited** plus the suggested edit parameters.

```bash
streamlit run app.py
```

This will:
- Check Azure Search + Azure OpenAI config from `.env`.
- For each run: retrieve similar experts → call LLM for summary + suggested edits → apply edits (locally or via external API) → show before/after.

### 2. Run the full pipeline from the terminal

Run the same pipeline as the UI, but from the CLI for a single image:

```bash
python scripts/run_pipeline.py <image_path> [--out output.jpg] [--api] [-v]
```

Examples:

```bash
# Run pipeline locally, save to result.jpg, print summary + suggested edits
python scripts/run_pipeline.py photo.jpg --out result.jpg -v

# Run pipeline but use an external editing API (requires EDITING_API_URL in .env)
python scripts/run_pipeline.py photo.jpg --out result.jpg --api -v
```

What `-v` prints:
- 📋 **Summary** of what the LLM thinks should be done.
- 📝 **Suggested edits**: the numeric recipe (exposure, contrast, temperature, etc.) returned by Azure OpenAI for that image.
- 📎 **Expert used**: which FiveK expert image/recipe was used as reference.

### 3. Just retrieve similar experts (no LLM / no edits)

If you only want to see which FiveK images are closest to a given photo and inspect their stored recipes:

```bash
python scripts/query_similar.py <image_path> [--top-k 50] [--top-n 5]
```

Examples:

```bash
# Show the best 5 expert matches (default top-k=50 search space)
python scripts/query_similar.py photo.jpg --top-n 5

# Show only the single best match
python scripts/query_similar.py photo.jpg --top-n 1
```

Output:
- Ranks (`1.`, `2.`, …), image_ids, rerank scores.
- The stored **Expert A recipe** JSON for each match.
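
The ranking itself happens server-side in Azure AI Search; as a rough illustration of the underlying idea, nearest-neighbour search over CLIP vectors reduces to cosine similarity:

```python
# Minimal cosine-similarity ranking, illustrative only: the real search is
# performed by the Azure AI Search index, not client-side.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def rank(query, candidates, top_n=5):
    """Return (image_id, score) pairs for the top_n closest candidates."""
    scored = [(image_id, cosine(query, vec)) for image_id, vec in candidates]
    return sorted(scored, key=lambda t: t[1], reverse=True)[:top_n]
```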

### 4. Get the exact Expert A recipe for a FiveK image

Given a FiveK `image_id` (with or without extension), extract the Expert A recipe directly from the Lightroom catalog:

```bash
python scripts/get_recipe_for_image.py <image_name> [-o recipe.json]
```

Examples:

```bash
# Print the recipe as JSON
python scripts/get_recipe_for_image.py a0001-jmac_DSC1459

# Save the recipe to a file
python scripts/get_recipe_for_image.py a0001-jmac_DSC1459 -o my_recipe.json
```

### 5. Apply a custom (LLM) recipe to a FiveK image

If you already have a JSON recipe (for example, something you crafted or got from the LLM) and want to apply it to a FiveK RAW image using the same rendering pipeline:

```bash
python scripts/apply_llm_recipe.py <image_id> <recipe.json> [--out path.jpg]
```

Example:

```bash
python scripts/apply_llm_recipe.py a0059-JI2E5556 llm_recipe_a0059.json --out renders/a0059-JI2E5556_LLM.jpg
```

This will:
- Load the DNG for `<image_id>`.
- Use `dng_to_rgb_normalized` to bake in exposure/brightness from the recipe.
- Apply the rest of the recipe (contrast, temperature, etc.) on top of the original Expert A baseline.
- Save the rendered JPEG.
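
As a rough illustration of one such adjustment (not the actual `photo_editor` rendering engine, whose parameter scaling may differ), a linear contrast adjustment around mid-gray on an 8-bit channel could look like:

```python
# Hedged sketch: linear contrast around mid-gray (128) on 8-bit values,
# with the contrast parameter assumed to be in roughly [-100, 100].
def apply_contrast(pixels, contrast):
    gain = 1.0 + contrast / 100.0
    # Scale each value's distance from mid-gray, then clamp to [0, 255].
    return [max(0, min(255, round((p - 128) * gain + 128))) for p in pixels]
```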

## Config (.env)

| Variable | Description |
|--------|-------------|
| `FIVEK_SUBSET_SIZE` | Number of images to index (default 500). |
| `FIVEK_LRCAT_PATH` | Path to `fivek.lrcat` (default: `fivek_dataset/raw_photos/fivek.lrcat`). |
| `FIVEK_RAW_PHOTOS_DIR` | Root of range folders (e.g. `HQa1to700`, …). |
| `AZURE_SEARCH_ENDPOINT` | Azure AI Search endpoint URL. |
| `AZURE_SEARCH_KEY` | Azure AI Search admin key. |
| `AZURE_SEARCH_INDEX_NAME` | Index name (default `fivek-vectors`). |
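
A filled-in `.env` might look like the following (placeholder values only; the `FIVEK_RAW_PHOTOS_DIR` path is an assumption, and real keys must never be committed):

```bash
FIVEK_SUBSET_SIZE=500
FIVEK_LRCAT_PATH=fivek_dataset/raw_photos/fivek.lrcat
FIVEK_RAW_PHOTOS_DIR=fivek_dataset/raw_photos
AZURE_SEARCH_ENDPOINT=https://<your-service>.search.windows.net
AZURE_SEARCH_KEY=<admin-key>
AZURE_SEARCH_INDEX_NAME=fivek-vectors
```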

## License / data

See `fivek_dataset/LICENSE.txt` and related notices for the MIT–Adobe FiveK dataset.