---
license: agpl-3.0
pipeline_tag: image-segmentation
tags:
- medical
- biology
---
# VascX Fork
This repository is `zyf0717/vascx-fork`, a personal Hugging Face fork of the original [`Eyened/vascx`](https://huggingface.co/Eyened/vascx) release.
It was cloned from the upstream VascX repository on April 20, 2026, and all work in this fork builds on that baseline.
It now serves as a self-contained fork for running the VascX retinal fundus analysis pipeline, with:
- the VascX model weights tracked in Git LFS
- the Python package used to run preprocessing and inference from this repo
- fork-specific packaging and runtime fixes
- a root `config.yaml` for controlling overlay layers, colors, disc circles, and vessel-width sampling
This is not the canonical upstream repository. The upstream project remains `Eyened/vascx`.
## What Stays Compatible
- The Python package name is still `vascx_models`
- The model layout and output structure are kept compatible with the upstream VascX workflow
## What Changed In This Fork
- Repository identity is now `vascx-fork`
- The default conda environment name in `environment.yml` and `run.sh` is `vascx-fork`
- The legacy `setup.py` and installed `vascx` console script were removed
- Supported entrypoints are `./run.sh` and `python -m vascx_models`
- Overlay generation can now be configured from the root `config.yaml`
- Inference device selection is automatic by default and can be overridden explicitly
- Local helper scripts and docs were updated to point at this fork instead of the upstream Hub repo
- Generated outputs, caches, and other non-repository artifacts are excluded from version control
## Installation
1. Install Git LFS and enable it for your machine:
```bash
git lfs install
```
2. Create an environment. The included environment file uses the fork name:
```bash
conda env create -f environment.yml
conda activate vascx-fork
```
If you update `environment.yml` later, refresh the env with:
```bash
conda env update -f environment.yml --prune
```
If you are managing your own environment instead of using `environment.yml`, install `torch`, `torchvision`, `retinalysis-fundusprep`, and `retinalysis-inference` before running the package.
## Quick Start
Run the full pipeline:
```bash
./run.sh
```
`run.sh` activates the `vascx-fork` conda environment, defaults to the bundled sample images, and writes to a timestamped `output_YYYYMMDD_HHMMSS/` directory. You can override the main runtime inputs with environment variables:
```bash
INPUT_PATH=/path/to/images OUTPUT_PATH=/path/to/output N_JOBS=4 ./run.sh
DEVICE=cpu INPUT_PATH=/path/to/images OUTPUT_PATH=/path/to/output ./run.sh
./run.sh --sample-run
```
The standard Python entrypoint is:
```bash
python -m vascx_models run DATA_PATH OUTPUT_PATH
```
Both entrypoints auto-configure the local cache and model-release directories from the repository checkout.
`DATA_PATH` can be:
- a directory of fundus images
- a CSV file with a `path` column
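For the CSV form, a minimal sketch of building such a file from a folder of images is shown here. The helper name `write_image_list` and the extension list are illustrative assumptions, not part of `vascx_models`; the only detail taken from this README is the `path` column name.

```python
import csv
from pathlib import Path

# Assumption: these extensions are illustrative only; the formats the
# pipeline actually accepts are not listed in this README.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".tif", ".tiff"}


def write_image_list(image_dir, csv_path):
    """Write a CSV with a single `path` column listing images in image_dir.

    Hypothetical helper, not part of vascx_models. Returns the row count.
    """
    image_dir = Path(image_dir)
    paths = sorted(
        p for p in image_dir.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )
    with open(csv_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["path"])  # the column name this README asks for
        for p in paths:
            writer.writerow([str(p)])
    return len(paths)
```

The resulting file can then be passed as `DATA_PATH` in place of a directory.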
Typical examples:
```bash
./run.sh --sample-run
python -m vascx_models run /path/to/images /path/to/output
python -m vascx_models run /path/to/image_list.csv /path/to/output
python -m vascx_models run /path/to/preprocessed/images /path/to/output --no-preprocess
python -m vascx_models run /path/to/images /path/to/output --device auto
python -m vascx_models run /path/to/images /path/to/output --device cpu
python -m vascx_models run /path/to/images /path/to/output --no-disc --no-quality --no-fovea --no-overlay
python -m vascx_models run /path/to/images /path/to/output --no-vessels
```
## Device Selection
Inference device selection is automatic by default.
- `--device auto` is the default for `python -m vascx_models run`
- `DEVICE=auto` is the default for `./run.sh`
- Auto-selection priority is `cuda` first, then Apple Metal `mps`, then `cpu`
- The CLI logs detected availability as `cuda=...`, `mps=...`, `cpu=True`
- The CLI also logs the selected device for each run
- You can force a backend with `--device cuda`, `--device mps`, or `--device cpu`
- `./run.sh` forwards the `DEVICE` environment variable to the Python CLI
- If you request `cuda` or `mps` explicitly and that backend is unavailable, the run exits with a clear error instead of silently falling back
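The rules above can be sketched as a small pure function. This is an illustration of the described behavior, not the fork's actual implementation: `resolve_device` and its parameters are hypothetical names, and the availability flags are passed in so the sketch runs without PyTorch installed.

```python
def resolve_device(requested="auto", cuda_available=False, mps_available=False):
    """Pick an inference device following the priority described above.

    Hypothetical helper: the name and signature are assumptions.
    """
    available = {"cuda": cuda_available, "mps": mps_available, "cpu": True}

    if requested == "auto":
        # Priority order: cuda, then Apple Metal mps, then cpu.
        for name in ("cuda", "mps", "cpu"):
            if available[name]:
                return name

    if requested not in available:
        raise ValueError(f"unknown device: {requested!r}")

    if not available[requested]:
        # An explicit request never falls back silently to another backend.
        raise RuntimeError(f"requested device {requested!r} is not available")

    return requested
```

With PyTorch installed, the two flags could be filled from `torch.cuda.is_available()` and `torch.backends.mps.is_available()`.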
## Configuration
This fork adds a root-level `config.yaml` for overlay behavior, disc-circle generation, and vessel-width sampling.
If `config.yaml` exists in the current working directory, it is loaded first. Otherwise the repository-root `config.yaml` is used when present. You can also pass a specific file:
```bash
python -m vascx_models run DATA_PATH OUTPUT_PATH --config /path/to/config.yaml
```
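The lookup order can be sketched as follows. `find_config` is a hypothetical name, and the sketch assumes that an explicitly passed file takes precedence over both default locations, which this README implies but does not state outright.

```python
from pathlib import Path


def find_config(explicit=None, cwd=None, repo_root=None):
    """Return the config path to load, or None if no file is found.

    Hypothetical helper illustrating the documented lookup order.
    """
    if explicit is not None:
        # Assumption: a file passed via --config wins over the defaults.
        return Path(explicit)

    # The current working directory is checked first, then the repo root.
    for base in (cwd, repo_root):
        if base is None:
            continue
        candidate = Path(base) / "config.yaml"
        if candidate.is_file():
            return candidate

    return None  # the caller would fall back to built-in defaults
```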
The repository ships with this `config.yaml`:
```yaml
overlay:
  enabled: true
  layers:
    arteries: true
    veins: true
    disc: true
    fovea: true
    vessel_widths: true
  colours:
    artery: "#FF0000"
    vein: "#0000FF"
    vessel: "#00FF00"
    disc: "#FFFFFF"
    fovea: "#FFFF00"
    vessel_widths: "#00FF00"
  circles:
    - name: "2r"
      diameter: 2.0
      color: "#00FF00"
    - name: "3r"
      diameter: 3.0
      color: "#00FF00"
vessel_widths:
  inner_circle: "2r"
  outer_circle: "3r"
  samples_per_connection: 5
```
Notes:
- `overlay.enabled` sets the default overlay behavior when `--overlay/--no-overlay` is not passed
- `overlay.layers` controls which predictions are drawn
- `overlay.colors` and `overlay.colours` are both accepted
- `overlay.circles` controls how many disc circles are generated and their diameters
- Each circle entry requires `name` and `diameter`
- Circle `color` and `colour` are both accepted and default to black when omitted
- `overlay.layers.vessel_widths` controls whether sampled width segments are drawn in overlays
- `overlay.colors.vessel_widths` controls the measurement-line color in overlays
- `vessel_widths.inner_circle` and `vessel_widths.outer_circle` choose the circle pair used for sampling; when omitted, the two smallest valid circles are used
- `vessel_widths.samples_per_connection` sets how many evenly spaced interior points are measured along each simple vessel path between the two circles
- Colors can be written as `#RRGGBB` strings or RGB arrays such as `[255, 0, 0]`
- If no config file is found, the built-in defaults still use `2r` and `3r` circles with vessel-width sampling enabled
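As a rough illustration of the two accepted color forms and the `colors`/`colours` alias, the sketch below normalizes both to RGB tuples. The function names are hypothetical, and the choice to prefer `colors` when both spellings are present is an assumption rather than documented behavior.

```python
def parse_color(value):
    """Convert '#RRGGBB' or a 3-item RGB sequence to an (r, g, b) tuple.

    Hypothetical helper, not the fork's actual parser.
    """
    if isinstance(value, str):
        text = value.strip()
        if len(text) == 7 and text.startswith("#"):
            # int(..., 16) raises ValueError on non-hex digits.
            return tuple(int(text[i:i + 2], 16) for i in (1, 3, 5))
        raise ValueError(f"expected '#RRGGBB', got {value!r}")

    if isinstance(value, (list, tuple)) and len(value) == 3:
        channels = tuple(int(c) for c in value)
        if all(0 <= c <= 255 for c in channels):
            return channels

    raise ValueError(f"unsupported color value: {value!r}")


def overlay_colors(overlay_cfg):
    """Read overlay colors from either spelling of the key."""
    # Assumption: 'colors' is preferred if both spellings are present.
    raw = overlay_cfg.get("colors", overlay_cfg.get("colours", {}))
    return {name: parse_color(value) for name, value in raw.items()}
```

For example, `parse_color("#FF0000")` and `parse_color([255, 0, 0])` both yield `(255, 0, 0)`.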
## Outputs
With the default pipeline settings, `OUTPUT_PATH` contains:
```text
OUTPUT_PATH/
├── preprocessed_rgb/
├── vessels/
├── artery_vein/
├── disc/
├── disc_circles/
├── overlays/
├── bounds.csv
├── disc_geometry.csv
├── vessel_widths.csv
├── quality.csv
└── fovea.csv
```
`disc_circles/` contains one subdirectory per configured circle name.
`vessel_widths.csv` is written when both vessel/AV and disc outputs are available. Each row represents one sampled measurement along a retained artery or vein connection between the configured inner and outer circles, with these columns:
- `image_id`
- `inner_circle`, `outer_circle`
- `inner_circle_radius_px`, `outer_circle_radius_px`
- `connection_index`, `sample_index`
- `x`, `y`
- `width_px`
- `x_start`, `y_start`, `x_end`, `y_end`
- `vessel_type`
Current measurement behavior is intentionally conservative:
- only simple open skeleton paths that connect one inner-circle boundary point to one outer-circle boundary point are measured
- branched, looping, ambiguous, or zero-length annulus components are skipped
- if a sampled point on a retained connection fails width estimation, that entire connection is dropped from the CSV and overlay
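Given the columns listed above, a downstream summary of `vessel_widths.csv` might look like the sketch below. It uses only the standard library and the documented column names; `mean_width_by_type` is a hypothetical helper, not something shipped in this repository.

```python
import csv
from collections import defaultdict
from statistics import mean


def mean_width_by_type(csv_path):
    """Average `width_px` per (image_id, vessel_type) pair.

    Hypothetical helper for post-processing vessel_widths.csv.
    """
    widths = defaultdict(list)
    with open(csv_path, newline="") as handle:
        for row in csv.DictReader(handle):
            key = (row["image_id"], row["vessel_type"])
            widths[key].append(float(row["width_px"]))
    return {key: mean(values) for key, values in widths.items()}
```

Because connections with a failed sample are dropped entirely, every connection that appears in the file contributes its full set of samples to these averages.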
## Repository Contents
- `vascx_models/`: package source and CLI
- `artery_vein/`, `disc/`, `fovea/`, `vessels/`, `quality/`, `odfd/`, `discedge/`: model artifacts
- `config.yaml`: fork-specific configuration for overlays, disc circles, and vessel-width sampling
- `pytest.ini`: pytest marker definitions for slow and end-to-end tests
- `run.sh`: primary local runner
- `tests/`: pytest suite
- `notebooks/`: preprocessing and inference examples
## Testing
The test suite includes unit tests, CLI tests, and an opt-in real-model single-image end-to-end smoke test in `tests/test_e2e.py`.
Useful commands:
```bash
conda run -n vascx-fork pytest
KMP_DUPLICATE_LIB_OK=TRUE conda run -n vascx-fork pytest tests/test_e2e.py -q
KMP_DUPLICATE_LIB_OK=TRUE VASCX_RUN_E2E=1 conda run -n vascx-fork pytest tests/test_e2e.py -q -k cpu
```
Explicitly tested in this fork as of April 21, 2026:
- README and CLI/config behavior updates are covered by the regular pytest suite
- device resolution priority and explicit unavailable-device failures are covered by unit tests
- the real single-image end-to-end pipeline was run successfully on CPU with preprocessing enabled
- the end-to-end test is parameterized for `cpu`, `cuda`, and `mps`, but the `cuda` and `mps` cases were not executed in this workspace because those backends were unavailable
## Upstream Reference
Original upstream project:
- Hugging Face: <https://huggingface.co/Eyened/vascx>
- Paper: <https://arxiv.org/abs/2409.16016>
This forked repository:
- Hugging Face: <https://huggingface.co/zyf0717/vascx-fork>