---
license: agpl-3.0
pipeline_tag: image-segmentation
tags:
- medical
- biology
---

# VascX Fork

This repository is `zyf0717/vascx-fork`, a personal Hugging Face fork of the original [`Eyened/vascx`](https://huggingface.co/Eyened/vascx) release. It was cloned from the upstream VascX repository on April 20, 2026, and the work in this fork was continued from that cloned baseline.

It now serves as a self-contained fork for running the VascX retinal fundus analysis pipeline, with:

- the VascX model weights tracked in Git LFS
- the Python package used to run preprocessing and inference from this repo
- fork-specific packaging and runtime fixes
- a root `config.yaml` for controlling overlay layers, colors, disc circles, and vessel-width sampling

This is not the canonical upstream repository. The upstream project remains `Eyened/vascx`.

## What Stays Compatible

- The Python package name is still `vascx_models`
- The model layout and output structure are kept compatible with the upstream VascX workflow

## What Changed In This Fork

- Repository identity is now `vascx-fork`
- The default conda environment name in `environment.yml` and `run.sh` is `vascx-fork`
- The legacy `setup.py` and installed `vascx` console script were removed
- Supported entrypoints are `./run.sh` and `python -m vascx_models`
- Overlay generation can now be configured from the root `config.yaml`
- Inference device selection is automatic by default and can be overridden explicitly
- Local helper scripts and docs were updated to point at this fork instead of the upstream Hub repo
- Generated outputs, caches, and other non-repository artifacts are excluded from version control

## Installation

1. Install Git LFS and enable it for your machine:

```bash
git lfs install
```

2. Create an environment.
The included environment file uses the fork name:

```bash
conda env create -f environment.yml
conda activate vascx-fork
```

If you update `environment.yml` later, refresh the env with:

```bash
conda env update -f environment.yml --prune
```

If you are managing your own environment instead of using `environment.yml`, install `torch`, `torchvision`, `retinalysis-fundusprep`, and `retinalysis-inference` before running the package.

## Quick Start

Run the full pipeline:

```bash
./run.sh
```

`run.sh` activates the `vascx-fork` conda environment, defaults to the bundled sample images, and writes to a timestamped `output_YYYYMMDD_HHMMSS/` directory.

You can override the main runtime inputs with environment variables:

```bash
INPUT_PATH=/path/to/images OUTPUT_PATH=/path/to/output N_JOBS=4 ./run.sh
DEVICE=cpu INPUT_PATH=/path/to/images OUTPUT_PATH=/path/to/output ./run.sh
./run.sh --sample-run
```

The standard Python entrypoint is:

```bash
python -m vascx_models run DATA_PATH OUTPUT_PATH
```

Both entrypoints auto-configure the local cache and model-release directories from the repository checkout.

`DATA_PATH` can be:

- a directory of fundus images
- a CSV file with a `path` column

Typical examples:

```bash
./run.sh --sample-run
python -m vascx_models run /path/to/images /path/to/output
python -m vascx_models run /path/to/image_list.csv /path/to/output
python -m vascx_models run /path/to/preprocessed/images /path/to/output --no-preprocess
python -m vascx_models run /path/to/images /path/to/output --device auto
python -m vascx_models run /path/to/images /path/to/output --device cpu
python -m vascx_models run /path/to/images /path/to/output --no-disc --no-quality --no-fovea --no-overlay
python -m vascx_models run /path/to/images /path/to/output --no-vessels
```

## Device Selection

Inference device selection is automatic by default.
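
The selection rules in this section (prefer `cuda`, then `mps`, then `cpu`, and fail rather than fall back when an explicitly requested backend is missing) can be sketched roughly as follows. This is a minimal illustration, not the code in `vascx_models`: the function names `resolve_device` and `detect_backends` are hypothetical, and the availability probe assumes PyTorch's `torch.cuda.is_available()` and `torch.backends.mps.is_available()`.

```python
"""Sketch of the device-resolution rules; all names here are hypothetical."""


def resolve_device(requested: str, cuda_available: bool, mps_available: bool) -> str:
    """Return the backend to use, or raise if an explicit request is unavailable."""
    available = {"cuda": cuda_available, "mps": mps_available, "cpu": True}
    if requested == "auto":
        # Priority order: cuda first, then Apple Metal mps, then cpu.
        for name in ("cuda", "mps", "cpu"):
            if available[name]:
                return name
    if requested not in available:
        raise ValueError(f"unknown device: {requested!r}")
    if not available[requested]:
        # Explicit requests never fall back silently.
        raise RuntimeError(f"requested device {requested!r} is not available")
    return requested


def detect_backends() -> tuple[bool, bool]:
    """Probe PyTorch for CUDA and MPS; report both as absent if torch is missing."""
    try:
        import torch
    except ImportError:
        return False, False
    mps = getattr(torch.backends, "mps", None)
    return torch.cuda.is_available(), bool(mps and mps.is_available())


if __name__ == "__main__":
    cuda, mps = detect_backends()
    print(f"cuda={cuda}, mps={mps}, cpu=True")
    print("selected:", resolve_device("auto", cuda, mps))
```

Keeping the decision logic separate from the hardware probe means the priority order and the failure path can be unit-tested on a machine with no GPU.
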

- `--device auto` is the default for `python -m vascx_models run`
- `DEVICE=auto` is the default for `./run.sh`
- Auto-selection priority is `cuda` first, then Apple Metal `mps`, then `cpu`
- The CLI logs detected availability as `cuda=...`, `mps=...`, `cpu=True`
- The CLI also logs the selected device for each run
- You can force a backend with `--device cuda`, `--device mps`, or `--device cpu`
- `./run.sh` forwards the `DEVICE` environment variable to the Python CLI
- If you request `cuda` or `mps` explicitly and that backend is unavailable, the run exits with a clear error instead of silently falling back

## Configuration

This fork adds a root-level `config.yaml` for overlay behavior, disc-circle generation, and vessel-width sampling.

If `config.yaml` exists in the current working directory, it is loaded first. Otherwise the repository-root `config.yaml` is used when present. You can also pass a specific file:

```bash
python -m vascx_models run DATA_PATH OUTPUT_PATH --config /path/to/config.yaml
```

The repository ships with this `config.yaml`:

```yaml
overlay:
  enabled: true
  layers:
    arteries: true
    veins: true
    disc: true
    fovea: true
    vessel_widths: true
  colours:
    artery: "#FF0000"
    vein: "#0000FF"
    vessel: "#00FF00"
    disc: "#FFFFFF"
    fovea: "#FFFF00"
    vessel_widths: "#00FF00"
  circles:
    - name: "2r"
      diameter: 2.0
      color: "#00FF00"
    - name: "3r"
      diameter: 3.0
      color: "#00FF00"

vessel_widths:
  inner_circle: "2r"
  outer_circle: "3r"
  samples_per_connection: 5
```

Notes:

- `overlay.enabled` sets the default overlay behavior when `--overlay/--no-overlay` is not passed
- `overlay.layers` controls which predictions are drawn
- `overlay.colors` and `overlay.colours` are both accepted
- `overlay.circles` controls how many disc circles are generated and their diameters
- Each circle entry requires `name` and `diameter`
- Circle `color` and `colour` are both accepted and default to black when omitted
- `overlay.layers.vessel_widths` controls whether sampled width segments are drawn in overlays
- `overlay.colors.vessel_widths` controls the measurement-line color in overlays
- `vessel_widths.inner_circle` and `vessel_widths.outer_circle` choose the circle pair used for sampling; when omitted, the two smallest valid circles are used
- `vessel_widths.samples_per_connection` sets how many evenly spaced interior points are measured along each simple vessel path between the two circles
- Colors can be written as `#RRGGBB` strings or RGB arrays such as `[255, 0, 0]`
- If no config file is found, the built-in defaults still use `2r` and `3r` circles with vessel-width sampling enabled

## Outputs

With the default pipeline settings, `OUTPUT_PATH` contains:

```text
OUTPUT_PATH/
├── preprocessed_rgb/
├── vessels/
├── artery_vein/
├── disc/
├── disc_circles/
├── overlays/
├── bounds.csv
├── disc_geometry.csv
├── vessel_widths.csv
├── quality.csv
└── fovea.csv
```

`disc_circles/` contains one subdirectory per configured circle name.

`vessel_widths.csv` is written when both vessel/AV and disc outputs are available.
Each row represents one sampled measurement along a retained artery or vein connection between the configured inner and outer circles, with these columns:

- `image_id`
- `inner_circle`, `outer_circle`
- `inner_circle_radius_px`, `outer_circle_radius_px`
- `connection_index`, `sample_index`
- `x`, `y`
- `width_px`
- `x_start`, `y_start`, `x_end`, `y_end`
- `vessel_type`

Current measurement behavior is intentionally conservative:

- only simple open skeleton paths that connect one inner-circle boundary point to one outer-circle boundary point are measured
- branched, looping, ambiguous, or zero-length annulus components are skipped
- if a sampled point on a retained connection fails width estimation, that entire connection is dropped from the CSV and overlay

## Repository Contents

- `vascx_models/`: package source and CLI
- `artery_vein/`, `disc/`, `fovea/`, `vessels/`, `quality/`, `odfd/`, `discedge/`: model artifacts
- `config.yaml`: fork-specific overlay configuration
- `pytest.ini`: pytest marker definitions for slow and end-to-end tests
- `run.sh`: primary local runner
- `tests/`: pytest suite
- `notebooks/`: preprocessing and inference examples

## Testing

The test suite includes unit tests, CLI tests, and an opt-in real-model single-image end-to-end smoke test in `tests/test_e2e.py`.
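
One way an opt-in gate of this kind might work is an environment-variable check, sketched below. This is an assumption for illustration, not the actual contents of `tests/test_e2e.py`: the helper name `e2e_enabled` is hypothetical, and treating only the value `1` as "enabled" is a guess based on the `VASCX_RUN_E2E=1` usage in the commands in this section.

```python
import os
from typing import Mapping, Optional


def e2e_enabled(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Return True only when VASCX_RUN_E2E is set to "1" (assumed convention)."""
    # Accept an explicit mapping so the check can be tested without
    # mutating the real process environment.
    env = os.environ if environ is None else environ
    return env.get("VASCX_RUN_E2E", "") == "1"
```

In a pytest module, a helper like this could feed `pytest.mark.skipif(not e2e_enabled(), reason="set VASCX_RUN_E2E=1 to run")`, so the slow real-model test is skipped unless explicitly requested.
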

Useful commands:

```bash
conda run -n vascx-fork pytest
KMP_DUPLICATE_LIB_OK=TRUE conda run -n vascx-fork pytest tests/test_e2e.py -q
KMP_DUPLICATE_LIB_OK=TRUE VASCX_RUN_E2E=1 conda run -n vascx-fork pytest tests/test_e2e.py -q -k cpu
```

Explicitly tested in this fork as of April 21, 2026:

- README and CLI/config behavior updates are covered by the regular pytest suite
- device resolution priority and explicit unavailable-device failures are covered by unit tests
- the real single-image end-to-end pipeline was run successfully on CPU with preprocessing enabled
- the end-to-end test is parameterized for `cpu`, `cuda`, and `mps`, but actual `cuda` and `mps` execution were not exercised in this workspace because those backends were unavailable

## Upstream Reference

Original upstream project:

- Hugging Face:
- Paper:

This forked repository:

- Hugging Face: