---
license: cc-by-sa-4.0
library_name: pytorch
pipeline_tag: image-classification
base_model: facebook/dinov3-vits16-pretrain-lvd1689m
tags:
- image-classification
- computer-vision
- dinov3
- pytorch
- safetensors
- prototype-learning
- hard-example-mining
- feedback-routing
- experimental
metrics:
- accuracy
- f1
- precision
- recall
---
# ProtoMorph-DINO
**Feedback-Gated Prototype Morphing for Hard-Case Image Classification**
ProtoMorph-DINO is an experimental image classification head designed to run on top of a frozen DINOv3 vision backbone.
This model card is for the Hugging Face repository:
```text
shiowo/DINO-Protomorph
```
This repository currently contains an initial research scaffold and a custom ProtoMorph head checkpoint. Evaluation results are **pending** because the repository was created before full training and benchmarking.
This project is independent and is not affiliated with Meta AI, Hugging Face, or the official DINOv3 project.
---
## Architecture
```text
Image
  ↓
Frozen DINOv3
  ↓
Patch map z0
  ↓
ProtoMorph block 1
  ↓
Layer Memory Attention
  ↓
ProtoMorph block 2
  ↓
Layer Memory Attention
  ↓
Main logits
  ↓
Hard-case gate
  ├── easy: return main logits
  └── hard:
        feedback from top-2 probabilities
        modulate DINO patch map
        run Delta-RBF hard expert
        fuse logits
```
---
## Model Summary
ProtoMorph-DINO explores whether a frozen foundation vision backbone can be improved with a custom hard-case refinement head.
For easy images, the model returns the main classifier output directly. For difficult or ambiguous images, the model activates a feedback branch. The feedback branch uses the top-2 predicted probabilities to modulate the DINO patch map, sends the modified representation through a Delta-RBF hard expert, and fuses the refined logits with the main logits.
The main research question is whether feedback-guided hard-case refinement can improve classification performance over simpler frozen-backbone heads such as a linear probe or MLP classifier.
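The routing described above can be sketched in plain Python. This is a minimal illustration, not the repository's implementation: the rule that an input is hard when *any* uncertainty signal crosses its threshold, the `hard_expert` callable, and the fusion weight `alpha` are all assumptions made for the example. The default thresholds mirror the `hard_*_threshold` values in the config example in this card.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def is_hard_case(probs, pmax_thr=0.65, margin_thr=0.15, entropy_thr=1.35):
    """Assumed rule: hard if ANY uncertainty signal crosses its threshold."""
    top1, top2 = sorted(probs, reverse=True)[:2]
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)  # in nats
    return top1 < pmax_thr or (top1 - top2) < margin_thr or entropy > entropy_thr


def route(main_logits, hard_expert, alpha=0.5):
    """Return (logits, was_hard).

    hard_expert: hypothetical callable taking the top-2 class indices and
                 their probabilities, returning refined logits.
    alpha:       hypothetical fusion weight between main and refined logits.
    """
    probs = softmax(main_logits)
    if not is_hard_case(probs):
        return main_logits, False  # easy path: main logits unchanged
    top2_idx = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)[:2]
    refined = hard_expert(top2_idx, [probs[i] for i in top2_idx])
    fused = [(1 - alpha) * m + alpha * r for m, r in zip(main_logits, refined)]
    return fused, True
```

In the real head, the hard branch would modulate the DINO patch map and run the Delta-RBF expert; here that whole step is collapsed into the `hard_expert` stand-in so the control flow stays visible.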
---
## Current Status
**Status: research scaffold / pre-training setup**
Unless a later release says otherwise, assume the current checkpoint is randomly initialized or intended only for smoke testing.
Predictions are **not meaningful** until the ProtoMorph head is trained on a real dataset.
---
## Results
**Evaluation results: Pending**
No benchmark results are reported yet because the repository is being prepared before training and evaluation.
| Metric | Value |
|---|---:|
| Accuracy | Pending |
| F1 | Pending |
| Precision | Pending |
| Recall | Pending |
| Confusion-pair improvement | Pending |
| Hard-case routing benefit | Pending |
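Once predictions exist, the first four rows of the table could be filled in with something like the sketch below. It assumes integer class labels and macro averaging across classes; the card does not say which averaging will be reported, so treat that choice as a placeholder.

```python
def macro_metrics(y_true, y_pred, num_classes):
    """Accuracy plus macro-averaged precision, recall and F1 from label lists."""
    tp = [0] * num_classes
    fp = [0] * num_classes
    fn = [0] * num_classes
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed

    def safe_div(a, b):
        return a / b if b else 0.0

    precision = [safe_div(tp[c], tp[c] + fp[c]) for c in range(num_classes)]
    recall = [safe_div(tp[c], tp[c] + fn[c]) for c in range(num_classes)]
    f1 = [safe_div(2 * p * r, p + r) for p, r in zip(precision, recall)]
    return {
        "accuracy": safe_div(sum(tp), len(y_true)),
        "precision": sum(precision) / num_classes,
        "recall": sum(recall) / num_classes,
        "f1": sum(f1) / num_classes,
    }
```

Running the same function on the outputs of each baseline keeps the comparison consistent.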
Recommended future baselines:
| Baseline | Purpose |
|---|---|
| DINOv3 + Linear Probe | Minimal frozen-backbone baseline |
| DINOv3 + MLP Head | Strong simple head baseline |
| CLIP + Linear Probe | Popular vision-language comparison |
| ConvNeXt | Strong CNN-style baseline |
| ViT | Standard transformer baseline |
---
## Intended Use
This model is intended for:
- image classification research
- hard-example routing experiments
- prototype learning experiments
- frozen-backbone classifier research
- fine-grained classification experiments
- educational computer vision experiments
This model is **not** intended for safety-critical use.
Do not use this model for medical, legal, financial, biometric, security-critical, or production decisions without independent validation.
---
## Model Files
Recommended repository layout:
```text
.
├── README.md
├── LICENSE-WEIGHTS.md
├── config.json
├── labels.txt
├── checkpoints/
│   ├── config.json
│   ├── labels.txt
│   └── protomorph_head.safetensors
├── infer.py
├── scripts/
│   └── upload_to_hf.py
└── src/
    └── protomorph/
```
The main weight file is:
```text
checkpoints/protomorph_head.safetensors
```
This file contains only the custom ProtoMorph classification head.
DINOv3 backbone weights are **not** included in this repository.
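To confirm that a checkpoint really holds only head tensors, the safetensors header can be read without loading any weights. The sketch below relies only on the published file layout (an 8-byte little-endian header length followed by a JSON header); the function name is made up for this example, and the tensor names it prints depend on how the head was saved.

```python
import json
import struct


def read_safetensors_header(path):
    """Return {tensor_name: (dtype, shape)} from a .safetensors file header."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # u64, little-endian
        header = json.loads(f.read(header_len).decode("utf-8"))
    header.pop("__metadata__", None)  # optional free-form metadata entry
    return {name: (info["dtype"], info["shape"]) for name, info in header.items()}
```

Calling `read_safetensors_header("checkpoints/protomorph_head.safetensors")` and scanning the keys is a quick way to check that no backbone parameters were saved by accident.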
---
## Backbone
Default backbone:
```text
facebook/dinov3-vits16-pretrain-lvd1689m
```
The backbone is used as a frozen visual feature extractor.
For RTX 3090-class GPUs, ViT-S/16 is a practical starting point because it keeps VRAM usage manageable while still producing useful patch embeddings.
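As a rough size check, the patch map for one image can be worked out from the values in the config example in this card (`image_size` 512, `patch_size` 16, `embed_dim` 384, bf16 at 2 bytes per value). The helper below is illustrative only; it counts patch tokens and ignores any class or register tokens and all intermediate activations.

```python
def patch_grid(image_size=512, patch_size=16, embed_dim=384, bytes_per_value=2):
    """Patch-token count and patch-map size for one square input image."""
    if image_size % patch_size:
        raise ValueError("image_size must be a multiple of patch_size")
    side = image_size // patch_size
    tokens = side * side
    return {
        "grid": (side, side),
        "tokens": tokens,
        "bytes": tokens * embed_dim * bytes_per_value,
    }


print(patch_grid())  # {'grid': (32, 32), 'tokens': 1024, 'bytes': 786432}
```

That is a 32 x 32 grid of 1,024 patch tokens, or 768 KiB of patch embeddings per image in bf16.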
---
## Installation
Recommended environment:
```text
Python 3.11
PyTorch 2.4.0
CUDA 12.4 PyTorch wheel
```
Install PyTorch:
```bash
pip install torch==2.4.0 torchvision==0.19.0 torchaudio==2.4.0 --index-url https://download.pytorch.org/whl/cu124
```
Install dependencies:
```bash
pip install -r requirements-core.txt
```
---
## RunPod Environment Variables
This project supports the RunPod environment variable names shown below:
```text
hf_key=hf_your_huggingface_write_token_here
hf_repo=shiowo/DINO-Protomorph
```
Standard Hugging Face names are also supported:
```text
HF_TOKEN=hf_your_huggingface_write_token_here
HF_REPO_ID=shiowo/DINO-Protomorph
```
Never commit your real Hugging Face token to the repository.
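A script that accepts both naming schemes might resolve them as in the sketch below. The precedence (standard names first, RunPod names as fallback) is an assumption for illustration; check `scripts/upload_to_hf.py` for the behavior the project actually uses.

```python
import os


def resolve_hf_settings(env=None):
    """Pick the token and repo id from either naming scheme.

    Assumption: HF_TOKEN / HF_REPO_ID win when both schemes are set.
    """
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN") or env.get("hf_key")
    repo_id = env.get("HF_REPO_ID") or env.get("hf_repo")
    if not token:
        raise RuntimeError("Set HF_TOKEN or hf_key before uploading")
    if not repo_id:
        raise RuntimeError("Set HF_REPO_ID or hf_repo before uploading")
    return token, repo_id
```

Reading the token from the environment at run time keeps it out of the repository and out of shell history.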
---
## Inference
Run inference from the command line:
```bash
python infer.py \
  --image examples/sample_image.jpg \
  --config checkpoints/config.json \
  --checkpoint checkpoints/protomorph_head.safetensors \
  --labels checkpoints/labels.txt \
  --topk 5
```
For smoke testing only:
```bash
python infer.py --image examples/sample_image.jpg --allow-random-head
```
If the head is untrained, the output is only useful for checking that the pipeline runs.
---
## Upload to Hugging Face from RunPod
After setting `hf_key` and `hf_repo` in RunPod, run:
```bash
cd /workspace/protomorph_dinov3_runpod
source .venv/bin/activate
python scripts/upload_to_hf.py
```
Or use the helper script:
```bash
bash runpod/upload_to_hf.sh
```
Dry run before upload:
```bash
python scripts/upload_to_hf.py --dry-run
```
---
## Config Example
```json
{
  "dino_model_name": "facebook/dinov3-vits16-pretrain-lvd1689m",
  "num_classes": 10,
  "embed_dim": 384,
  "patch_size": 16,
  "proto_count": 64,
  "memory_tokens": 16,
  "rbf_count": 128,
  "num_heads": 8,
  "dropout": 0.0,
  "hard_pmax_threshold": 0.65,
  "hard_margin_threshold": 0.15,
  "hard_entropy_threshold": 1.35,
  "image_size": 512,
  "use_bf16_autocast": true,
  "normalize_patch_tokens": true
}
```
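A few of these fields constrain each other, so a quick consistency check before training can catch typos. The sketch below is a hypothetical helper, not part of the repository. It assumes the entropy threshold is measured in nats (so it cannot usefully exceed `ln(num_classes)`) and that `num_heads` splits `embed_dim` evenly.

```python
import math


def check_config(cfg):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if cfg["image_size"] % cfg["patch_size"]:
        problems.append("image_size is not a multiple of patch_size")
    if cfg["embed_dim"] % cfg["num_heads"]:
        problems.append("embed_dim is not divisible by num_heads")
    for key in ("hard_pmax_threshold", "hard_margin_threshold"):
        if not 0.0 <= cfg[key] <= 1.0:
            problems.append(f"{key} should lie in [0, 1]")
    max_entropy = math.log(cfg["num_classes"])  # entropy of a uniform distribution, in nats
    if not 0.0 <= cfg["hard_entropy_threshold"] <= max_entropy:
        problems.append("hard_entropy_threshold should lie in [0, ln(num_classes)]")
    if not 0.0 <= cfg["dropout"] < 1.0:
        problems.append("dropout should lie in [0, 1)")
    return problems
```

The example config above passes all of these checks: 512 is a multiple of 16, 384 is divisible by 8, and 1.35 is below ln(10) ≈ 2.30.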
---
## Limitations
Known limitations:
- The architecture is experimental.
- Evaluation results are pending.
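- The hard-case gate thresholds (`hard_pmax_threshold`, `hard_margin_threshold`, `hard_entropy_threshold`) need tuning for each dataset.
- The Delta-RBF hard expert may overfit small datasets.
- Inference is slower for samples routed to the hard branch, because they run the feedback pass and the Delta-RBF expert in addition to the main head.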
- The model should be compared against simple baselines before claiming improvement.
- This repository does not include DINOv3 weights.
- The custom head may not generalize outside the dataset it was trained on.
---
## License
The ProtoMorph head weights in this repository are released under:
```text
Creative Commons Attribution-ShareAlike 4.0 International
CC BY-SA 4.0
```
You may use, share, and adapt these weights, including commercially, provided that you give appropriate credit and distribute adapted versions under CC BY-SA 4.0 or a compatible license.
This license applies only to the ProtoMorph head weights and related files released in this repository.
It does not apply to:
- DINOv3
- PyTorch
- Hugging Face Transformers
- third-party datasets
- third-party model weights
- upstream dependencies
DINOv3 is not redistributed in this repository. Users are responsible for obtaining DINOv3 separately and complying with its license.
---
## Attribution
If you use this model or build on it, please credit:
```text
ProtoMorph-DINO: Feedback-Gated Prototype Morphing for Hard-Case Image Classification
Author: shiowo
Repository: https://huggingface.co/shiowo/DINO-Protomorph
```
BibTeX:
```bibtex
@software{protomorph_dino_2026,
  title  = {ProtoMorph-DINO: Feedback-Gated Prototype Morphing for Hard-Case Image Classification},
  author = {shiowo},
  year   = {2026},
  url    = {https://huggingface.co/shiowo/DINO-Protomorph}
}
```
---
## Disclaimer
This is a research prototype.
The model is provided for experimentation and educational use. It should not be used in production or high-stakes environments without independent validation, dataset auditing, robustness testing, and bias evaluation.