---
license: apache-2.0
inference: false
datasets:
- autoflow
---

# Perceiver IO optical flow model

This model is a Perceiver IO optical flow model pretrained on [AutoFlow](https://autoflow-google.github.io/).
It is weight-equivalent to the [deepmind/optical-flow-perceiver](https://huggingface.co/deepmind/optical-flow-perceiver)
model but based on implementation classes of the [perceiver-io](https://github.com/krasserm/perceiver-io) library. It
can be created from the `deepmind/optical-flow-perceiver` model with a library-specific [conversion utility](#model-conversion).
Both models generate equal output for the same input.

The content of the `deepmind/optical-flow-perceiver` [model card](https://huggingface.co/deepmind/optical-flow-perceiver)
also applies to this model, except for the [usage examples](#usage-examples). Refer to the linked card for further model
and training details.

## Model description

The model is specified in Appendix H (Table 16) of the [Perceiver IO paper](https://arxiv.org/abs/2107.14795).

## Intended use and limitations

The model can be used to predict the optical flow between a pair of images.

## Usage examples

To use this model you first need to [install](https://github.com/krasserm/perceiver-io/blob/main/README.md#installation)
the `perceiver-io` library with the `vision` extension:

```shell
pip install perceiver-io[vision]
```

The model can then be used with PyTorch.

### Image pair

The following example uses this image pair as input

<img src="https://martin-krasser.com/perceiver/flow/frame_0047.png" alt="image-1" width="500"/>
<img src="https://martin-krasser.com/perceiver/flow/frame_0048.png" alt="image-2" width="500"/>

and renders their optical flow as an HSV representation (`render=True`):

```python
import requests
from PIL import Image
from transformers import pipeline
from perceiver.model.vision import optical_flow  # register optical flow pipeline

frame_1 = Image.open(requests.get("https://martin-krasser.com/perceiver/flow/frame_0047.png", stream=True).raw)
frame_2 = Image.open(requests.get("https://martin-krasser.com/perceiver/flow/frame_0048.png", stream=True).raw)

optical_flow_pipeline = pipeline("optical-flow", model="krasserm/perceiver-io-optical-flow", device="cuda:0")
rendered_optical_flow = optical_flow_pipeline((frame_1, frame_2), render=True)

Image.fromarray(rendered_optical_flow).save("optical_flow.png")
```

The [rendered optical flow](https://martin-krasser.com/perceiver/flow/optical_flow.png) is

<img src="https://martin-krasser.com/perceiver/flow/optical_flow.png" alt="optical-flow" width="500"/>
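HSV rendering of optical flow conventionally maps flow direction to hue and flow magnitude to brightness, so that e.g. a region moving uniformly in one direction appears as a single solid color. The `render_flow_hsv` function below is a hypothetical, minimal NumPy sketch of this convention, not the library's actual rendering code (which may use a different color wheel or normalization):

```python
import numpy as np

def render_flow_hsv(flow: np.ndarray) -> np.ndarray:
    """Render an (H, W, 2) flow field as an RGB uint8 image: direction -> hue, magnitude -> value."""
    dx, dy = flow[..., 0], flow[..., 1]
    magnitude = np.sqrt(dx**2 + dy**2)
    hue = (np.arctan2(dy, dx) + np.pi) / (2 * np.pi)   # flow direction mapped to [0, 1]
    value = magnitude / max(magnitude.max(), 1e-8)     # flow magnitude, normalized to [0, 1]
    saturation = np.ones_like(hue)
    # vectorized HSV -> RGB conversion (standard six-sector formula)
    i = ((hue * 6).astype(int) % 6)[..., None]         # hue sector, shape (H, W, 1)
    f = hue * 6 - np.floor(hue * 6)
    v = value
    p = v * (1 - saturation)
    q = v * (1 - f * saturation)
    t = v * (1 - (1 - f) * saturation)
    choices = [np.stack([v, t, p], -1), np.stack([q, v, p], -1),
               np.stack([p, v, t], -1), np.stack([p, q, v], -1),
               np.stack([t, p, v], -1), np.stack([v, p, q], -1)]
    rgb = np.select([i == k for k in range(6)], choices)
    return (rgb * 255).astype(np.uint8)

# a constant rightward flow renders as a single uniform color
flow = np.zeros((4, 4, 2))
flow[..., 0] = 1.0
image = render_flow_hsv(flow)
print(image.shape)  # (4, 4, 3)
```

Pixels with zero motion come out black (zero value), and the normalization is per-image, so colors are comparable within one rendered frame but not across frames.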
### Video

To compute the optical flow of an entire video, the `optical-flow` pipeline can be used in combination with functions
from `video_utils`. The following code samples all frames from a [video snippet](https://martin-krasser.com/perceiver/flow/sintel_clip_cave_dragon_fight.mp4)
taken from the [Sintel animated short movie](https://durian.blender.org/), computes the optical flow per consecutive
frame pair and writes the rendered results back to an output video file.

```python
from transformers import pipeline
from perceiver.data.vision import video_utils
from perceiver.model.vision import optical_flow  # register optical flow pipeline

optical_flow_pipeline = pipeline("optical-flow", model="krasserm/perceiver-io-optical-flow", device="cuda:0")

# sample consecutive video frame pairs
frame_pairs = video_utils.read_video_frame_pairs("sintel_clip_cave_dragon_fight.mp4")

# create and render optical flow for all frame pairs
optical_flows = optical_flow_pipeline(frame_pairs, render=True, device="cuda:0")

# create video with rendered optical flows
video_utils.write_video("sintel_clip_cave_dragon_fight_output.mp4", optical_flows, fps=24)
```
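If you decode frames yourself instead of using `video_utils`, the pairing step amounts to zipping the frame sequence with itself shifted by one, which yields N-1 flow fields for N frames. A minimal sketch (`consecutive_pairs` is a hypothetical helper; frame decoding is elided, strings stand in for frames):

```python
def consecutive_pairs(frames):
    """Pair each frame with its successor: [f0, f1, f2] -> [(f0, f1), (f1, f2)]."""
    return list(zip(frames, frames[1:]))

pairs = consecutive_pairs(["f0", "f1", "f2", "f3"])
print(pairs)  # [('f0', 'f1'), ('f1', 'f2'), ('f2', 'f3')]
```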
A side-by-side comparison of the input and output videos:

![optical-flow-sbs](https://martin-krasser.com/perceiver/flow/sintel_clip_cave_dragon_fight_side_by_side_horizontal.gif)

## Model conversion

The `krasserm/perceiver-io-optical-flow` model has been created from the source `deepmind/optical-flow-perceiver` model
with:

```python
from perceiver.model.vision.optical_flow import convert_model

convert_model(
    save_dir="krasserm/perceiver-io-optical-flow",
    source_repo_id="deepmind/optical-flow-perceiver",
    push_to_hub=True,
)
```

## Citation

```bibtex
@article{jaegle2021perceiver,
  title={Perceiver IO: A General Architecture for Structured Inputs \& Outputs},
  author={Jaegle, Andrew and Borgeaud, Sebastian and Alayrac, Jean-Baptiste and Doersch, Carl and Ionescu, Catalin and Ding, David and Koppula, Skanda and Zoran, Daniel and Brock, Andrew and Shelhamer, Evan and others},
  journal={arXiv preprint arXiv:2107.14795},
  year={2021}
}
```