FatemehT committed
Commit 76b0572 · 1 Parent(s): aabc413

Configure Git LFS and remove binary files from direct git storage

Files changed (5)
  1. .gitattributes +9 -0
  2. .pre-commit-config.yaml +0 -0
  3. Dockerfile +25 -0
  4. README copy.md +300 -0
  5. requirements.txt +2 -2
.gitattributes ADDED
@@ -0,0 +1,9 @@
+ *.mp4 filter=lfs diff=lfs merge=lfs -text
+ normalized_outputs/*.png filter=lfs diff=lfs merge=lfs -text
+ *.jpeg filter=lfs diff=lfs merge=lfs -text
+ *.gif filter=lfs diff=lfs merge=lfs -text
+ *.bmp filter=lfs diff=lfs merge=lfs -text
+ *.tiff filter=lfs diff=lfs merge=lfs -text
+ *.webp filter=lfs diff=lfs merge=lfs -text
+ *.png filter=lfs diff=lfs merge=lfs -text
+ *.jpg filter=lfs diff=lfs merge=lfs -text
.pre-commit-config.yaml ADDED
File without changes
Dockerfile ADDED
@@ -0,0 +1,25 @@
+ FROM python:3.11-slim
+
+ WORKDIR /app
+
+ # Install system packages required by scientific Python stack and OpenCV
+ RUN apt-get update \
+     && apt-get install -y --no-install-recommends \
+     build-essential \
+     git \
+     curl \
+     libgl1 \
+     libglib2.0-0 \
+     && rm -rf /var/lib/apt/lists/*
+
+ ENV STREAMLIT_BROWSER_GATHER_USAGE_STATS=false \
+     PYTHONUNBUFFERED=1
+
+ COPY requirements.txt .
+ RUN pip install --no-cache-dir -r requirements.txt
+
+ COPY . .
+
+ EXPOSE 8501
+
+ CMD ["streamlit", "run", "angioPySegmentation.py", "--server.fileWatcherType", "none"]
README copy.md ADDED
@@ -0,0 +1,300 @@
+ # tomoro-evals
+
+ [![Static Badge](https://img.shields.io/badge/User%20Guide-Documentation-blue)](https://tomoro-ai.github.io/tomoro-evals/)
+
+
+ # How to run the project
+ ## Create virtual environment and activate it
+ ```bash
+ uv venv
+ source .venv/bin/activate
+ uv pip install -e .
+ ```
+
+ ## Database Setup (Required for ETL Pipeline)
+
+ Before running the ETL pipeline (`main()` function) in the notebooks, you need to set up PostgreSQL:
+
+ ### Prerequisites
+ 1. **Install PostgreSQL** (if not already installed):
+ ```bash
+ brew install postgresql@14
+ ```
+
+ 2. **Start PostgreSQL service**:
+ ```bash
+ brew services start postgresql@14
+ ```
+
+ 3. **Create database and user**:
+ ```bash
+ psql -d postgres -c "CREATE DATABASE cii;"
+ psql -d postgres -c "CREATE USER app WITH PASSWORD 'password';"
+ psql -d postgres -c "GRANT ALL PRIVILEGES ON DATABASE cii TO app;"
+ ```
+
+ 4. **Create database tables**:
+ ```bash
+ psql -d cii -U app -h localhost -f cronjob/customer_transactions/schema.sql
+ ```
+
+ 5. **Verify setup**:
+ ```bash
+ psql -d cii -U app -h localhost -c "\dt"
+ ```
+
+ The ETL pipeline expects:
+ - Database name: `cii`
+ - Username: `app`
+ - Password: `password`
+ - Host: `localhost`
+ - Port: `5432`
+
+ These settings are configured in the `DSN` variable in the notebook.
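The five settings above amount to a single libpq-style connection string. As a rough sketch of how a notebook might assemble it (the `build_dsn` helper is hypothetical; only the values come from the list above):

```python
import os

# Hypothetical helper; the notebook may simply hardcode an equivalent DSN string.
def build_dsn(dbname="cii", user="app", password="password",
              host="localhost", port=5432):
    """Assemble a libpq-style DSN from the settings listed above."""
    return f"dbname={dbname} user={user} password={password} host={host} port={port}"

# Allow an environment override, falling back to the documented defaults.
DSN = os.environ.get("CII_PG_DSN", build_dsn())
print(DSN)
```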
+
+ ### Alternative: Run PostgreSQL with Docker (Recommended for Isolation)
+ If you prefer not to install PostgreSQL locally, you can run it in a Docker container that auto-loads the schema.
+
+ #### 1. Start a fresh container
+ ```bash
+ docker rm -f pg-cii 2>/dev/null || true
+ docker volume rm pgdata 2>/dev/null || true
+
+ docker run -d \
+   --name pg-cii \
+   -e POSTGRES_USER=app \
+   -e POSTGRES_PASSWORD=password \
+   -e POSTGRES_DB=cii \
+   -p 5432:5432 \
+   -v pgdata:/var/lib/postgresql/data \
+   -v $(pwd)/cronjob/customer_transactions/schema.sql:/docker-entrypoint-initdb.d/001-schema.sql:ro \
+   postgres:16
+ ```
+ The `schema.sql` file is executed only the first time the named volume `pgdata` is initialized.
+
+ #### 2. Check container & logs
+ ```bash
+ docker ps --filter name=pg-cii
+ docker logs pg-cii | tail -n 30
+ ```
+
+ #### 3. Inspect tables
+ ```bash
+ docker exec -it pg-cii psql -U app -d cii -c "\dt"
+ ```
+
+ #### 4. Set DSN (current shell)
+ ```bash
+ export CII_PG_DSN="dbname=cii user=app password=password host=localhost port=5432"
+ ```
+ If using a notebook:
+ ```python
+ import os
+ os.environ["CII_PG_DSN"] = "dbname=cii user=app password=password host=localhost port=5432"
+ ```
+
+ #### 5. Rebuild after changing `schema.sql`
+ ```bash
+ docker rm -f pg-cii && docker volume rm pgdata && \
+ docker run -d --name pg-cii \
+   -e POSTGRES_USER=app -e POSTGRES_PASSWORD=password -e POSTGRES_DB=cii \
+   -p 5432:5432 \
+   -v pgdata:/var/lib/postgresql/data \
+   -v $(pwd)/cronjob/customer_transactions/schema.sql:/docker-entrypoint-initdb.d/001-schema.sql:ro \
+   postgres:16
+ ```
+
+ #### 6. Stop / start later
+ ```bash
+ docker stop pg-cii
+ docker start pg-cii
+ ```
+
+ #### 7. Apply schema manually (if needed on an existing container)
+ ```bash
+ cat cronjob/customer_transactions/schema.sql | docker exec -i pg-cii psql -U app -d cii
+ ```
+
+ #### 8. Simple backup / restore
+ ```bash
+ # Backup
+ docker exec -t pg-cii pg_dump -U app -d cii > backup.sql
+ # Restore (fresh volume)
+ docker rm -f pg-cii && docker volume rm pgdata
+ # start container again (see step 1, omit schema bind if restoring)
+ cat backup.sql | docker exec -i pg-cii psql -U app -d cii
+ ```
+
+ ### Using uv with Docker Postgres
+ All Python commands can run inside the uv-managed environment while PostgreSQL runs in Docker.
+ ```bash
+ uv sync  # install dependencies
+ export CII_PG_DSN="dbname=cii user=app password=password host=localhost port=5432"
+ uv run python cronjob/customer_transactions/agent_run.py
+ ```
+ Add a script alias in `pyproject.toml` (optional):
+ ```toml
+ [tool.uv.scripts]
+ agent = "python cronjob/customer_transactions/agent_run.py"
+ ```
+ Then:
+ ```bash
+ uv run agent
+ ```
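`CII_PG_DSN` is a space-separated `key=value` string, and notebook code sometimes needs its individual fields. A naive stdlib-only sketch (the `parse_dsn` helper is not part of the repo, and quoted values containing spaces are deliberately not handled):

```python
def parse_dsn(dsn: str) -> dict:
    """Split a libpq-style DSN like "dbname=cii user=app ..." into a dict."""
    params = {}
    for token in dsn.split():
        key, _, value = token.partition("=")
        params[key] = value
    return params

params = parse_dsn("dbname=cii user=app password=password host=localhost port=5432")
print(params["dbname"], params["port"])  # -> cii 5432
```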
+
+ # Usage
+ The commands below assume an activated virtual environment. If you haven't activated your environment and you are using `uv`, prefix the commands with `uv run`.
+
+ ## Online Usage
+ ### run reranking evaluation for langfuse traces
+ ```bash
+ uv run langfuse_trace_evaluation.py
+ ```
+
+ ## Offline Usage
+ Evals Hub may be used offline, for development purposes, or as part of a CI/CD pipeline. You can use the `evals-hub` CLI tool to run benchmarks offline. The main entry point is the `run-benchmark` command.
+
+ <details>
+ <summary><b>View options for the evals-hub command</b></summary>
+
+ ```bash
+ evals-hub --help
+ ```
+
+ ```
+ Usage: evals-hub COMMAND
+
+ ╭─ Commands ─────────────────────────────────────╮
+ │ run-benchmark                                  │
+ │ --help -h   Display this message and exit.     │
+ │ --version   Display application version.       │
+ ╰────────────────────────────────────────────────╯
+ ╭─ Parameters ───────────────────────────────────╮
+ │ * --config  [required]                         │
+ ╰────────────────────────────────────────────────╯
+ ```
+ </details>
+
+
+ <details>
+ <summary><b>View options for the evals-hub run-benchmark command</b></summary>
+
+ ```bash
+ evals-hub run-benchmark --help
+ ```
+ ```
+ Usage: evals-hub run-benchmark [ARGS] [OPTIONS]
+
+ ╭─ Parameters ──────────────────────────────────────────────────────────────────────────╮
+ │ * TASK-NAME --task-name              [choices: retrieval, reranking, classification,  │
+ │                                      nli] [required]                                  │
+ │ * DATASET.NAME --dataset.name        [required]                                       │
+ │   DATASET.SPLIT --dataset.split                                                       │
+ │   DATASET.HF-SUBSET --dataset.hf-subset                                               │
+ │ * MODEL.CHECKPOINT --model.checkpoint  [required]                                     │
+ │   METRICS.MAP --metrics.map          Identifier for MAP metric                        │
+ │   METRICS.MRR --metrics.mrr          Identifier for MRR metric                        │
+ │   METRICS.NDCG --metrics.ndcg        Identifier for NDCG metric                       │
+ │   METRICS.RECALL --metrics.recall    Identifier for Recall metric                     │
+ │   METRICS.PRECISION --metrics.precision  Identifier for Precision metric              │
+ │   METRICS.MICRO-AVG-F1 --metrics.micro-avg-f1  Identifier for micro average F1 metric │
+ │   METRICS.MACRO-AVG-F1 --metrics.macro-avg-f1  Identifier for macro average F1 metric │
+ │   METRICS.ACCURACY --metrics.accuracy  Identifier for accuracy metric                 │
+ │   EVALUATION.TOP-K --evaluation.top-k  [default: 10]                                  │
+ │   EVALUATION.BATCH-SIZE --evaluation.batch-size  [default: 16]                        │
+ │   EVALUATION.SEED --evaluation.seed  [default: 42]                                    │
+ │   EVALUATION.MAX-LENGTH --evaluation.max-length                                       │
+ │   EVALUATION.SAMPLES-PER-LABEL --evaluation.samples-per-label                         │
+ │   EVALUATION.N-EXPERIMENTS --evaluation.n-experiments  [default: 10]                  │
+ │ * OUTPUT.RESULTS-FILE --output.results-file  [required]                               │
+ ╰───────────────────────────────────────────────────────────────────────────────────────╯
+ ```
+ </details>
+
+ Benchmarks can be run in a couple of different ways:
+ - options defined in a YAML config file
+ - options directly from the command line
+ - options defined in a YAML config file which are overridden by the command line
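For the first route, a config file mirrors the `run-benchmark` parameters shown in the help output above. The sketch below is hypothetical: key spelling and nesting are inferred from the parameter names, and the dataset/checkpoint values are placeholders, so check the real `reranking_config.yaml` in the repo before relying on it.

```yaml
# Hypothetical config sketch; keys inferred from the run-benchmark parameters.
task-name: reranking
dataset:
  name: some-org/some-reranking-dataset   # placeholder
  split: test
model:
  checkpoint: some-org/some-cross-encoder  # placeholder
evaluation:
  top-k: 10
  batch-size: 16
  seed: 42
output:
  results-file: results/reranking.json
```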
+
+ **Benchmark configured entirely from a YAML file**
+
+ ## run reranking
+ ```bash
+ evals-hub run-benchmark --config reranking_config.yaml
+ ```
+ ## run nli
+ ```bash
+ evals-hub run-benchmark --config nli_config.yaml
+ ```
+
+ ## run classification
+ ```bash
+ evals-hub run-benchmark --config classification_config.yaml
+ ```
+
+ ## run patent landscape evaluation
+ ```bash
+ evals-hub run-benchmark --config pl_eval_config.yaml
+ ```
+
+ ## Troubleshooting SSL errors
+ ### SSL errors when connecting to a huggingface dataset
+ Set this environment variable for the Python `requests` library in `.env`:
+ ```bash
+ REQUESTS_CA_BUNDLE=/etc/ssl/certs/ca-certificates.crt
+ ```
+ SSL certificates may need to be imported if you have not done so before.
+
+ # Development Setup
+ ## Install the git hook scripts
+ ```bash
+ pre-commit install
+ ```
+ ## Run tests
+ ```bash
+ uv run pytest -v
+ ```
+
+ ## SQL Code Quality
+ ### Lint all SQL files in a directory:
+ ```bash
+ uv run sqlfluff lint --dialect postgres cronjob/
+ ```
+
+ ### Format/fix SQL files:
+ ```bash
+ uv run sqlfluff format --dialect postgres cronjob/
+ ```
+
+ ## Serve documentation locally
+ ```bash
+ uv run mkdocs serve -f docs/mkdocs.yml
+ ```
+ Then open http://127.0.0.1:8000/ in your browser.
+
+ ## Refresh & upgrade the lockfile
+ ```bash
+ uv sync --upgrade
+ ```
+
+ ### Integration tests
+ By default, integration tests are ignored in the pytest configuration because evaluation runs take a long time and require GPU resources. However, it is sometimes useful to run the evaluation to verify that results are correct against public benchmarks:
+ ```bash
+ uv run pytest tests/integration
+ ```
+
+ ## To run pre-commit hooks locally
+ ```bash
+ source .venv/bin/activate
+ pre-commit run --all-files
+ ```
+ ## Show outdated packages
+ ```bash
+ uv tree --outdated --depth 1
+ ```
requirements.txt CHANGED
@@ -23,7 +23,7 @@ streamlit-drawable-canvas==0.9.3
  streamlit-plotly-events==0.0.6
  tifffile>=2023.7.10
  timm>=0.9.6
- torch>=2.8.0
- torchvision>=0.23.0
+ torch==2.4.0
+ torchvision==0.19.0
  tqdm>=4.61.1
  pooch