Andrej Janchevski committed on
Commit ce3d8f2 · 1 Parent(s): 15144da

build(deps): add transitive research deps and rust-accelerated transfer

The COINs research code transitively imports a chunk of packages that
were missing from requirements.txt, silently filled in by manual
pip-installs in the local mamba env, but absent from the deployment
image. Each one surfaced as a ModuleNotFoundError at gunicorn boot:

- seaborn: graph_analysis/metrics.py top-level import
- karateclub, gensim, python-louvain: community detection paths
- PyMetis, PyGSP: graph partitioning + signal processing
- compress-pickle: graph_data/serialization.py
- python-Levenshtein, six, decorator: karateclub transitive
- tensorboardX: research code logging hooks
- torch-scatter: torch_geometric op fallbacks (PyG wheel index)
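
The boot-time failures listed above could be caught before deployment with a small import check. This is a minimal sketch, not part of the commit: the helper name `missing_modules` is hypothetical, and the import names are assumptions about how each distribution exposes itself (for example python-louvain as `community`, PyMetis as `pymetis`).

```python
import importlib.util

# Assumed top-level import names for the packages named in this commit.
REQUIRED_MODULES = [
    "seaborn", "karateclub", "gensim", "community", "pymetis", "pygsp",
    "compress_pickle", "Levenshtein", "six", "decorator", "tensorboardX",
    "torch_scatter",
]


def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    # find_spec only locates the module; it does not execute the package.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    print("missing modules:", ", ".join(missing) if missing else "none")
```

Run inside the deployment image, a check like this would report every absent package in one pass instead of one ModuleNotFoundError per gunicorn restart.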

Also bumps numpy from 1.23 to 1.26.* (matches the verified-working
local env; the older pin clashed with karateclub's overly-strict and
runtime-incorrect numpy<1.23 cap), and adds hf_transfer to enable
HF_HUB_ENABLE_HF_TRANSFER=1 for ~3x faster cold-start downloads.
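
Installing hf_transfer does nothing by itself; the environment variable has to be set as well. A minimal sketch of one way to guard that, assuming the variable is set before huggingface_hub is first imported (the library is understood to read it at import time). The helper `enable_hf_transfer` is hypothetical and does not come from the repository.

```python
import importlib.util
import os


def enable_hf_transfer(environ=os.environ):
    """Opt in to hf_transfer only when the package is importable.

    Returns True if the package was found, False otherwise.
    """
    if importlib.util.find_spec("hf_transfer") is None:
        # Leave the variable unset so downloads use the default backend.
        return False
    # setdefault keeps any value the operator has already chosen.
    environ.setdefault("HF_HUB_ENABLE_HF_TRANSFER", "1")
    return True


if __name__ == "__main__":
    # Call this before the first `import huggingface_hub`.
    print("hf_transfer available:", enable_hf_transfer())
```

Setting the variable in the image or process environment, as the commit does, achieves the same effect; the guard only matters where hf_transfer might not be installed.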

Files changed (1)
  1. src/backend/requirements.txt +24 -2
src/backend/requirements.txt CHANGED
@@ -11,6 +11,8 @@ gunicorn>=21.2
 
 # Checkpoint download from Hugging Face Hub (replaces gdown / Google Drive)
 huggingface_hub>=0.25
+# Rust-accelerated file transfer; activated via HF_HUB_ENABLE_HF_TRANSFER=1
+hf_transfer>=0.1
 
 # PyTorch with CUDA 11.8 (falls back to CPU at runtime if no GPU present)
 --extra-index-url https://download.pytorch.org/whl/cu118
@@ -18,14 +20,19 @@ torch==2.0.1+cu118
 torchvision==0.15.2+cu118
 torchaudio==2.0.2+cu118
 
-# PyTorch Geometric (must match torch version)
+# PyTorch Geometric (must match torch version) and the scatter ops it falls
+# back to. torch-scatter wheel is published in the PyG index alongside.
+--find-links https://data.pyg.org/whl/torch-2.0.1+cu118.html
 torch-geometric==2.3.1
+torch-scatter==2.1.2+pt20cu118
 
 # Research shared deps
 pytorch-lightning==2.0.4
 hydra-core==1.3.2
 omegaconf==2.3.0
-numpy==1.23
+# Pin to 1.26 - matches the verified-working local env. Older 1.23 pin
+# clashes with karateclub's overly-strict (and runtime-incorrect) numpy<1.23.
+numpy==1.26.*
 pandas>=2.0
 scipy==1.11.0
 igraph==0.9.11
@@ -36,6 +43,21 @@ imageio==2.31.1
 torchmetrics==0.11.4
 tqdm==4.65.0
 scikit-learn>=1.0
+seaborn>=0.13
+tensorboardX>=2.6
+
+# Graph utilities used by COINs (community detection, partitioning, sampling).
+# karateclub itself is installed --no-deps in the Dockerfile because its
+# pyproject pins numpy<1.23 which we know is unnecessary in practice; its
+# real runtime deps are listed individually here.
+gensim>=4.4
+python-louvain>=0.16
+PyMetis>=2023.1
+PyGSP>=0.6
+compress-pickle>=2.1
+python-Levenshtein>=0.27
+six>=1.16
+decorator>=4.4
 
 # MultiProxAn graph generation
 Pillow>=9.5.0
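
The comments in the diff say that karateclub caps numpy at <1.23, which neither the old pin nor the new one satisfies. That is why it is installed with --no-deps. The sketch below illustrates the comparison with plain release tuples. It is a simplification under stated assumptions: real resolvers apply full PEP 440 rules, the helper names are hypothetical, and 1.26.4 is used only as an example of a release matching `1.26.*`.

```python
def release_tuple(version, width=3):
    """Parse a dotted numeric version into a zero-padded tuple of ints."""
    parts = [int(piece) for piece in version.split(".")]
    # Pad so that "1.23" and "1.23.0" compare as equal.
    return tuple(parts + [0] * (width - len(parts)))


def below_cap(version, cap):
    """Return True if `version` satisfies an exclusive upper bound `<cap`."""
    return release_tuple(version) < release_tuple(cap)


if __name__ == "__main__":
    for candidate in ("1.22.4", "1.23", "1.26.4"):
        print(candidate, "satisfies numpy<1.23:", below_cap(candidate, "1.23"))
```

Only the 1.22.x candidate passes the cap, so a resolver honouring karateclub's metadata would reject both the previous `numpy==1.23` pin and the new `numpy==1.26.*` pin.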