---
license: apache-2.0
library_name: pytorch
tags:
  - wildfire
  - smoke-detection
  - object-detection
  - temporal
---

# Temporal Smoke Model (bbox-tube-temporal)

> **Latest release:** [`v0.2.0`](https://huggingface.co/pyronear/temporal-model/tree/v0.2.0). Pin this revision for reproducibility, or omit `revision=` to always get the latest. All releases: the **Files and versions** tab.

A temporal wildfire-**smoke** classifier for short sequences of camera frames. A YOLO detector proposes boxes, boxes are linked across frames into temporal **tubes**, each tube's image patches are classified by a DINOv2 ViT + transformer head, and a logistic calibrator turns the tube logits into a calibrated probability and a keep/discard decision.

This repo ships a single self-contained **`model.zip`**, versioned by HuggingFace revision/tag (`v`). Each release bundles everything needed to run:

| file | purpose |
|---|---|
| `manifest.yaml` | version + provenance (train git SHA, backbone, detector) |
| `yolo_weights.pt` | the companion YOLO detector |
| `classifier.ckpt` | the temporal ViT classifier |
| `config.yaml` | inference + decision config |
| `logistic_calibrator.json` | the calibrated decision head |

The model runs YOLO **itself**: you pass only raw frames, no detections.

## Usage

Install the inference package (`temporal_model.core`):

```bash
pip install "git+https://github.com/pyronear/temporal-model.git#subdirectory=core"
```

Download a versioned `model.zip` and run it on a **temporally ordered** sequence of frames:

```python
from pathlib import Path

from huggingface_hub import hf_hub_download
from temporal_model.core.model import BboxTubeTemporalModel

# 1. Download a specific release (pin the revision).
model_zip = hf_hub_download("pyronear/temporal-model", "model.zip", revision="v0.2.0")

# 2. Temporally-ordered frames. Filenames carry timestamps
#    (_.jpg); the order is the time order.
frame_paths = sorted(Path("my_sequence").glob("*.jpg"))

# 3. Load (device=None → auto cuda → mps → cpu) and predict.
#    hf_hub_download returns a str, so wrap it in Path().
model = BboxTubeTemporalModel.from_package(Path(model_zip), device=None)
out = model.predict_sequence(frame_paths)

print("is_smoke: ", out.is_positive)
print("trigger_frame_index:", out.trigger_frame_index)  # 0-based; None if no smoke

# Per-tube breakdown (logits, calibrated probabilities, bboxes, decision).
kept = out.details.get("tubes", {}).get("kept", [])
print("kept tubes: ", len(kept))
```

`predict_sequence(frame_paths)` returns a `TemporalModelOutput`:

- `is_positive` (`bool`): the smoke verdict.
- `trigger_frame_index` (`int | None`): 0-based frame where smoke first crosses the decision threshold (time-to-detection, in frames; `None` when no smoke).
- `details` (`dict`): per-tube logits, calibrated probabilities, bboxes, and the decision (`aggregation`, `threshold`, trigger tube).

## Served API (Docker)

The same model is also served as a FastAPI image with the `model.zip` baked in (auto-uses the GPU with `--gpus all`):

```bash
docker run --gpus all -p 8000:8000 \
  -e TEMPORAL_API_S3_BUCKET= \
  -e TEMPORAL_API_S3_ENDPOINT_URL= \
  pyronear/temporal-model-api:0.2.0

# POST /predict {"frames": ["", ...]}
# GET /health
```

## Provenance

Every `model.zip` manifest records how it was built: the training git SHA, the classifier backbone (`vit_small_patch14_dinov2.lvd142m`), and the exact companion detector (e.g. `pyronear/yolo11s_nimble-narwhal_v6.0.0`, verified by SHA-256). So a served model always traces back to its detector + training code.

Source & pipeline:
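
The bundle contents listed in the file table of this card can be checked before loading a downloaded `model.zip`. The following is a minimal sketch using only the Python standard library; the helper name `missing_bundle_files` and the constant `EXPECTED_FILES` are hypothetical and not part of `temporal_model.core`. The layout inside the archive is not documented here, so the sketch compares base names only.

```python
import zipfile
from pathlib import Path

# File names copied from the table in this card. How they are laid out inside
# the archive is an assumption, so only base names are compared.
EXPECTED_FILES = (
    "manifest.yaml",
    "yolo_weights.pt",
    "classifier.ckpt",
    "config.yaml",
    "logistic_calibrator.json",
)


def missing_bundle_files(model_zip: Path) -> list[str]:
    """Return the expected file names that are absent from the archive."""
    with zipfile.ZipFile(model_zip) as archive:
        # Strip any leading folder so a nested layout does not cause false alarms.
        present = {Path(name).name for name in archive.namelist()}
    return [name for name in EXPECTED_FILES if name not in present]
```

Calling `missing_bundle_files(Path(model_zip))` on the path returned by `hf_hub_download` would be expected to give an empty list for a complete release, assuming the file names in the table match the archive members.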