---
license: cc-by-sa-4.0
library_name: pytorch
pipeline_tag: image-classification
base_model: facebook/dinov3-vits16-pretrain-lvd1689m
tags:
- image-classification
- computer-vision
- dinov3
- pytorch
- safetensors
- prototype-learning
- hard-example-mining
- feedback-routing
- experimental
metrics:
- accuracy
- f1
- precision
- recall
---
# ProtoMorph-DINO
**Feedback-Gated Prototype Morphing for Hard-Case Image Classification**
ProtoMorph-DINO is an experimental image classification head designed to run on top of a frozen DINOv3 vision backbone.
This model card is for the Hugging Face repository:
```text
shiowo/DINO-Protomorph
```
This repository currently contains an initial research scaffold and a custom ProtoMorph head checkpoint. Evaluation results are **pending** because the repository was created before full training and benchmarking.
This project is independent and is not affiliated with Meta AI, Hugging Face, or the official DINOv3 project.
---
## Architecture
```text
Image
Frozen DINOv3
Patch map z0
ProtoMorph block 1
Layer Memory Attention
ProtoMorph block 2
Layer Memory Attention
Main logits
Hard-case gate
├── easy: return main logits
└── hard:
    feedback from top-2 probabilities
    modulate DINO patch map
    run Delta-RBF hard expert
    fuse logits
```
---
## Model Summary
ProtoMorph-DINO explores whether a frozen foundation vision backbone can be improved with a custom hard-case refinement head.
For easy images, the model returns the main classifier output directly. For difficult or ambiguous images, the model activates a feedback branch. The feedback branch uses the top-2 predicted probabilities to modulate the DINO patch map, sends the modified representation through a Delta-RBF hard expert, and fuses the refined logits with the main logits.
The main research question is whether feedback-guided hard-case refinement can improve classification performance over simpler frozen-backbone heads such as a linear probe or MLP classifier.
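
The routing decision described above can be sketched in plain Python. This is a minimal illustration, not the repository's implementation: the function name `is_hard_case`, the rule that any single triggered condition marks a sample as hard, and the use of natural-log entropy are all assumptions. The default thresholds are copied from the example config in this card.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def is_hard_case(logits, pmax_thr=0.65, margin_thr=0.15, entropy_thr=1.35):
    """Return (is_hard, stats) for one sample's main logits.

    Assumption: a sample is routed to the hard branch if ANY of the
    three conditions fires (low top-1 probability, small top-2 margin,
    or high entropy in nats).
    """
    probs = sorted(softmax(logits), reverse=True)
    p1, p2 = probs[0], probs[1]
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    hard = (p1 < pmax_thr) or (p1 - p2 < margin_thr) or (entropy > entropy_thr)
    return hard, {"pmax": p1, "margin": p1 - p2, "entropy": entropy}


confident = [8.0] + [0.0] * 9        # one dominant class out of 10
ambiguous = [2.0, 1.9] + [0.0] * 8   # two near-tied classes
print(is_hard_case(confident)[0])
print(is_hard_case(ambiguous)[0])
```

In a real head the same statistics would be computed on batched tensors, and only the samples flagged as hard would pay for the feedback branch.
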
---
## Current Status
**Status: research scaffold / pre-training setup**
Unless a later release states otherwise, treat the current checkpoint as randomly initialized and suitable only for smoke testing.
Predictions are **not meaningful** until the ProtoMorph head is trained on a real dataset.
---
## Results
**Evaluation results: Pending**
No benchmark results are reported yet because the repository is being prepared before training and evaluation.
| Metric | Value |
|---|---:|
| Accuracy | Pending |
| F1 | Pending |
| Precision | Pending |
| Recall | Pending |
| Confusion-pair improvement | Pending |
| Hard-case routing benefit | Pending |
Recommended future baselines:
| Baseline | Purpose |
|---|---|
| DINOv3 + Linear Probe | Minimal frozen-backbone baseline |
| DINOv3 + MLP Head | Strong simple head baseline |
| CLIP + Linear Probe | Popular vision-language comparison |
| ConvNeXt | Strong CNN-style baseline |
| ViT | Standard transformer baseline |
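
Once a trained head produces predictions, the four standard metrics listed under Results could be computed as in the sketch below. It is illustrative only: the function name `macro_metrics` and the choice of macro averaging are assumptions, since this card does not specify an averaging scheme, and the labels are toy values rather than model output.

```python
def macro_metrics(y_true, y_pred, num_classes):
    """Accuracy plus macro-averaged precision, recall, and F1."""
    tp = [0] * num_classes
    fp = [0] * num_classes
    fn = [0] * num_classes
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed

    def safe_div(a, b):
        # Classes with no predictions or no samples contribute 0.0.
        return a / b if b else 0.0

    precision = [safe_div(tp[c], tp[c] + fp[c]) for c in range(num_classes)]
    recall = [safe_div(tp[c], tp[c] + fn[c]) for c in range(num_classes)]
    f1 = [safe_div(2 * p * r, p + r) for p, r in zip(precision, recall)]
    return {
        "accuracy": safe_div(sum(tp), len(y_true)),
        "precision": sum(precision) / num_classes,
        "recall": sum(recall) / num_classes,
        "f1": sum(f1) / num_classes,
    }


# Toy labels, not real model output.
y_true = [0, 0, 1, 1]
y_pred = [0, 1, 1, 1]
print(macro_metrics(y_true, y_pred, num_classes=2))
```

Running the same function on ProtoMorph-DINO and on each baseline with an identical test split keeps the comparison fair.
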
---
## Intended Use
This model is intended for:
- image classification research
- hard-example routing experiments
- prototype learning experiments
- frozen-backbone classifier research
- fine-grained classification experiments
- educational computer vision experiments
This model is **not** intended for safety-critical use.
Do not use this model for medical, legal, financial, biometric, security-critical, or production decisions without independent validation.
---
## Model Files
Recommended repository layout:
```text
.
├── README.md
├── LICENSE-WEIGHTS.md
├── config.json
├── labels.txt
├── checkpoints/
│   ├── config.json
│   ├── labels.txt
│   └── protomorph_head.safetensors
├── infer.py
├── scripts/
│   └── upload_to_hf.py
└── src/
    └── protomorph/
```
The main weight file is:
```text
checkpoints/protomorph_head.safetensors
```
This file contains only the custom ProtoMorph classification head.
DINOv3 backbone weights are **not** included in this repository.
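
To check which tensors a head checkpoint contains without loading PyTorch, the safetensors header can be read directly: the file starts with an 8-byte little-endian length, followed by a JSON header of that many bytes. The sketch below is a minimal example. The helper name `list_safetensors_tensors` is hypothetical, and the demo writes a tiny stand-in file with a made-up tensor name instead of reading the real checkpoint.

```python
import json
import struct
import tempfile
from pathlib import Path


def list_safetensors_tensors(path):
    """Return {tensor_name: (dtype, shape)} from a safetensors header."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # u64, little-endian
        header = json.loads(f.read(header_len).decode("utf-8"))
    header.pop("__metadata__", None)  # optional free-form metadata entry
    return {
        name: (info["dtype"], tuple(info["shape"]))
        for name, info in header.items()
    }


# Stand-in file with one hypothetical tensor holding two float32 values.
header = {"demo.weight": {"dtype": "F32", "shape": [2], "data_offsets": [0, 8]}}
header_bytes = json.dumps(header).encode("utf-8")
payload = struct.pack("<2f", 0.0, 1.0)
demo_path = Path(tempfile.mkdtemp()) / "demo.safetensors"
demo_path.write_bytes(struct.pack("<Q", len(header_bytes)) + header_bytes + payload)

print(list_safetensors_tensors(demo_path))
```

For the real file, pass `checkpoints/protomorph_head.safetensors` instead of the stand-in path. No backbone tensors should appear in the listing, since only the head is stored.
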
---
## Backbone
Default backbone:
```text
facebook/dinov3-vits16-pretrain-lvd1689m
```
The backbone is used as a frozen visual feature extractor.
On RTX 3090-class GPUs (24 GB of VRAM), ViT-S/16 is a practical starting point because it keeps memory usage manageable while still producing useful patch embeddings.
---
## Installation
Recommended environment:
```text
Python 3.11
PyTorch 2.4.0
CUDA 12.4 PyTorch wheel
```
Install PyTorch:
```bash
pip install torch==2.4.0 torchvision==0.19.0 torchaudio==2.4.0 --index-url https://download.pytorch.org/whl/cu124
```
Install dependencies:
```bash
pip install -r requirements-core.txt
```
---
## RunPod Environment Variables
This project supports the RunPod environment variable names shown below:
```text
hf_key=hf_your_huggingface_write_token_here
hf_repo=shiowo/DINO-Protomorph
```
Standard Hugging Face names are also supported:
```text
HF_TOKEN=hf_your_huggingface_write_token_here
HF_REPO_ID=shiowo/DINO-Protomorph
```
Never commit your real Hugging Face token to the repository.
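
A script that accepts both naming schemes might resolve them as in the sketch below. The helper name `resolve_hf_settings` and the precedence order (standard names first, RunPod names as fallback) are assumptions, not necessarily what `scripts/upload_to_hf.py` does.

```python
import os


def resolve_hf_settings(env=None):
    """Return (token, repo_id) from either naming scheme, or raise if missing."""
    env = os.environ if env is None else env
    # Assumed precedence: standard Hugging Face names win over RunPod names.
    token = env.get("HF_TOKEN") or env.get("hf_key")
    repo_id = env.get("HF_REPO_ID") or env.get("hf_repo")
    if not token:
        raise RuntimeError("Set HF_TOKEN or hf_key before uploading.")
    if not repo_id:
        raise RuntimeError("Set HF_REPO_ID or hf_repo before uploading.")
    return token, repo_id


# Example with an explicit mapping, so no real token is needed.
token, repo_id = resolve_hf_settings(
    {"hf_key": "hf_placeholder", "hf_repo": "shiowo/DINO-Protomorph"}
)
print(repo_id)
```

Reading the token from the environment at run time, rather than from a file in the repository, is what keeps it out of version control.
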
---
## Inference
Run inference from the command line:
```bash
python infer.py \
  --image examples/sample_image.jpg \
  --config checkpoints/config.json \
  --checkpoint checkpoints/protomorph_head.safetensors \
  --labels checkpoints/labels.txt \
  --topk 5
```
For smoke testing only:
```bash
python infer.py --image examples/sample_image.jpg --allow-random-head
```
If the head is untrained, the output is only useful for checking that the pipeline runs.
---
## Upload to Hugging Face from RunPod
After setting `hf_key` and `hf_repo` in RunPod, run:
```bash
cd /workspace/protomorph_dinov3_runpod
source .venv/bin/activate
python scripts/upload_to_hf.py
```
Or use the helper script:
```bash
bash runpod/upload_to_hf.sh
```
Dry run before upload:
```bash
python scripts/upload_to_hf.py --dry-run
```
---
## Config Example
```json
{
  "dino_model_name": "facebook/dinov3-vits16-pretrain-lvd1689m",
  "num_classes": 10,
  "embed_dim": 384,
  "patch_size": 16,
  "proto_count": 64,
  "memory_tokens": 16,
  "rbf_count": 128,
  "num_heads": 8,
  "dropout": 0.0,
  "hard_pmax_threshold": 0.65,
  "hard_margin_threshold": 0.15,
  "hard_entropy_threshold": 1.35,
  "image_size": 512,
  "use_bf16_autocast": true,
  "normalize_patch_tokens": true
}
```
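
Several values in this config are coupled: `embed_dim` must split evenly across `num_heads` for standard multi-head attention, and `image_size` together with `patch_size` fixes the size of the patch map. The sketch below checks those relations on a subset of the example fields. `check_config` is a hypothetical helper, and the square-image, exact-divisibility, and natural-log entropy assumptions are illustrative rather than taken from the repository code.

```python
import json
import math

# Subset of the example config above, inlined so the sketch is self-contained.
EXAMPLE = """
{"embed_dim": 384, "patch_size": 16, "num_heads": 8, "num_classes": 10,
 "image_size": 512, "hard_entropy_threshold": 1.35}
"""


def check_config(cfg):
    """Validate coupled fields and return derived sizes."""
    if cfg["embed_dim"] % cfg["num_heads"] != 0:
        raise ValueError("embed_dim must be divisible by num_heads")
    if cfg["image_size"] % cfg["patch_size"] != 0:
        raise ValueError("image_size must be divisible by patch_size")
    # Assumption: the entropy gate works in nats, so a uniform prediction
    # over num_classes has entropy ln(num_classes), the largest possible.
    max_entropy = math.log(cfg["num_classes"])
    if cfg["hard_entropy_threshold"] >= max_entropy:
        raise ValueError("entropy threshold can never be exceeded")
    side = cfg["image_size"] // cfg["patch_size"]
    return {
        "head_dim": cfg["embed_dim"] // cfg["num_heads"],
        "grid": (side, side),
        "num_patches": side * side,
    }


print(check_config(json.loads(EXAMPLE)))
```

For the example values this gives a 32 × 32 patch map (1024 patch tokens) and a per-head dimension of 48.
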
---
## Limitations
Known limitations:
- The architecture is experimental.
- Evaluation results are pending.
- The hard-case gate requires threshold tuning.
- The Delta-RBF hard expert may overfit small datasets.
- Inference may be slower for hard samples.
- The model should be compared against simple baselines before claiming improvement.
- This repository does not include DINOv3 weights.
- The custom head may not generalize outside the dataset it was trained on.
---
## License
The ProtoMorph head weights in this repository are released under:
```text
Creative Commons Attribution-ShareAlike 4.0 International
CC BY-SA 4.0
```
You may use, share, and adapt these weights, including commercially, provided that you give appropriate credit and distribute adapted versions under CC BY-SA 4.0 or a compatible license.
This license applies only to the ProtoMorph head weights and related files released in this repository.
It does not apply to:
- DINOv3
- PyTorch
- Hugging Face Transformers
- third-party datasets
- third-party model weights
- upstream dependencies
DINOv3 is not redistributed in this repository. Users are responsible for obtaining DINOv3 separately and complying with its license.
---
## Attribution
If you use this model or build on it, please credit:
```text
ProtoMorph-DINO: Feedback-Gated Prototype Morphing for Hard-Case Image Classification
Author: shiowo
Repository: https://huggingface.co/shiowo/DINO-Protomorph
```
BibTeX:
```bibtex
@software{protomorph_dino_2026,
title = {ProtoMorph-DINO: Feedback-Gated Prototype Morphing for Hard-Case Image Classification},
author = {shiowo},
year = {2026},
url = {https://huggingface.co/shiowo/DINO-Protomorph}
}
```
---
## Disclaimer
This is a research prototype.
The model is provided for experimentation and educational use. It should not be used in production or high-stakes environments without independent validation, dataset auditing, robustness testing, and bias evaluation.