Update README.md
README.md CHANGED
@@ -25,9 +25,9 @@ license_link: LICENSE
 **CinemaCLIP** is a [MobileCLIP-S1](https://huggingface.co/apple/MobileCLIP-S1-OpenCLIP) fine-tune specialized for understanding the visual language of cinema at a frame level. It is a hybrid CLIP model with 23 classifier heads that represent a comprehensive taxonomy built with domain experts. For more info, see our [launch blog post](https://www.ozu.ai/cinemaclip).
 
 This repository ships three serialized forms of the same model:
-- **Torch** (`model.safetensors`)
-- **CoreML** (`ImageEncoder.mlmodel`, `ImageEncoder.mlpackage` and `TextEncoder.mlpackage`)
-- **ONNX** (`ImageEncoder.onnx`, `TextEncoder.onnx`, plus `_fp16` variants)
+- **Torch** (`model.safetensors`): load via the `cinemaclip` Python package.
+- **CoreML** (`ImageEncoder.mlmodel`, `ImageEncoder.mlpackage` and `TextEncoder.mlpackage`): on-device Apple Neural Engine inference.
+- **ONNX** (`ImageEncoder.onnx`, `TextEncoder.onnx`, plus `_fp16` variants): cross-platform inference.
 
 ## Install
 
@@ -167,9 +167,12 @@ The `shot.lighting.direction` head ships in the classifier heads but has been ex
 
 ```bibtex
 @misc{cinemaclip2026,
-title
-author
-year
-
+title = {CinemaCLIP: A hybrid CLIP model and taxonomy for the visual language of cinema},
+author = {Somani, Rahul and Marini, Anton and Stewart, Damian},
+year = {2026},
+publisher = {HuggingFace},
+doi = {10.57967/hf/8539},
+howpublished = {\url{https://huggingface.co/OZU-Technology/CinemaCLIP}},
+note = {Model weights and taxonomy}
 }
 ```