	---
	license: cc-by-sa-4.0
	library_name: pytorch
	pipeline_tag: image-classification
	base_model: facebook/dinov3-vits16-pretrain-lvd1689m
	tags:
	- image-classification
	- computer-vision
	- dinov3
	- pytorch
	- safetensors
	- prototype-learning
	- hard-example-mining
	- feedback-routing
	- experimental
	metrics:
	- accuracy
	- f1
	- precision
	- recall
	---

	# ProtoMorph-DINO

	Feedback-Gated Prototype Morphing for Hard-Case Image Classification

	ProtoMorph-DINO is an experimental image classification head designed to run on top of a frozen DINOv3 vision backbone.

	This model card is for the Hugging Face repository:

	```text
	shiowo/DINO-Protomorph
	```

	This repository currently contains an initial research scaffold and a custom ProtoMorph head checkpoint. Evaluation results are pending because the repository was created before full training and benchmarking.

	This project is independent and is not affiliated with Meta AI, Hugging Face, or the official DINOv3 project.

	---

	## Architecture

	```text
	Image
	↓
	Frozen DINOv3
	↓
	Patch map z0
	↓
	ProtoMorph block 1
	↓
	Layer Memory Attention
	↓
	ProtoMorph block 2
	↓
	Layer Memory Attention
	↓
	Main logits
	↓
	Hard-case gate
	├── easy: return main logits
	└── hard:
	    feedback from top-2 probabilities
	    modulate DINO patch map
	    run Delta-RBF hard expert
	    fuse logits
	```

	---

	## Model Summary

	ProtoMorph-DINO explores whether a frozen foundation vision backbone can be improved with a custom hard-case refinement head.

	For easy images, the model returns the main classifier output directly. For difficult or ambiguous images, the model activates a feedback branch. The feedback branch uses the top-2 predicted probabilities to modulate the DINO patch map, sends the modified representation through a Delta-RBF hard expert, and fuses the refined logits with the main logits.

	The main research question is whether feedback-guided hard-case refinement can improve classification performance over simpler frozen-backbone heads such as a linear probe or MLP classifier.
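
The routing described above can be sketched in plain Python. This is a minimal illustration, not the repository's implementation: the threshold values mirror the Config Example in this card, but the any-of combination rule, the weighted-average fusion, and the names `is_hard_case`, `route`, `hard_branch`, and `fuse_weight` are assumptions made for the example.

```python
import math

# Threshold values mirror the Config Example in this card. Combining them
# with "any of" is an assumption of this sketch.
HARD_PMAX_THRESHOLD = 0.65
HARD_MARGIN_THRESHOLD = 0.15
HARD_ENTROPY_THRESHOLD = 1.35


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def is_hard_case(logits):
    """Flag a prediction as hard when it is unconfident or ambiguous."""
    probs = softmax(logits)
    top1, top2 = sorted(probs, reverse=True)[:2]
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)  # in nats
    return (
        top1 < HARD_PMAX_THRESHOLD
        or (top1 - top2) < HARD_MARGIN_THRESHOLD
        or entropy > HARD_ENTROPY_THRESHOLD
    )


def route(main_logits, hard_branch, fuse_weight=0.5):
    """Easy: return the main logits. Hard: fuse them with refined logits.

    `hard_branch` is a hypothetical callable standing in for the whole
    feedback path (top-2 feedback, patch-map modulation, Delta-RBF expert).
    It receives the main probabilities and returns refined logits.
    """
    if not is_hard_case(main_logits):
        return list(main_logits)
    refined = hard_branch(softmax(main_logits))
    return [
        (1.0 - fuse_weight) * m + fuse_weight * r
        for m, r in zip(main_logits, refined)
    ]
```

Because the hard branch only runs when the gate fires, easy images pay no extra cost in this sketch; the thresholds control how often the slower path is taken.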

	---

	## Current Status

	Status: research scaffold / pre-training setup

	Unless a later release states otherwise, treat the current checkpoint as randomly initialized and suitable only for smoke testing.

	Predictions are not meaningful until the ProtoMorph head is trained on a real dataset.

	---

	## Results

	Evaluation results: Pending

	No benchmark results are reported yet because the repository is being prepared before training and evaluation.

	| Metric | Value |
	|---|---:|
	| Accuracy | Pending |
	| F1 | Pending |
	| Precision | Pending |
	| Recall | Pending |
	| Confusion-pair improvement | Pending |
	| Hard-case routing benefit | Pending |
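
Once predictions are available, the four standard metrics in the table could be computed along the following lines. This is a dependency-free sketch. The card does not state an averaging scheme, so macro averaging is an assumption here, and the function name `classification_metrics` is hypothetical. The two project-specific rows (confusion-pair improvement and hard-case routing benefit) are not defined in this card and are left out.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return accuracy plus macro-averaged precision, recall, and F1."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    def safe_div(num, den):
        # Classes with no predictions or no samples contribute 0.0.
        return num / den if den else 0.0

    precisions = [safe_div(tp[c], tp[c] + fp[c]) for c in labels]
    recalls = [safe_div(tp[c], tp[c] + fn[c]) for c in labels]
    f1s = [safe_div(2 * p * r, p + r) for p, r in zip(precisions, recalls)]

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }
```

Running the same function on each baseline's predictions keeps the comparison consistent across heads.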

	Recommended future baselines:

	| Baseline | Purpose |
	|---|---|
	| DINOv3 + Linear Probe | Minimal frozen-backbone baseline |
	| DINOv3 + MLP Head | Strong simple head baseline |
	| CLIP + Linear Probe | Popular vision-language comparison |
	| ConvNeXt | Strong CNN-style baseline |
	| ViT | Standard transformer baseline |

	---

	## Intended Use

	This model is intended for:

	- image classification research
	- hard-example routing experiments
	- prototype learning experiments
	- frozen-backbone classifier research
	- fine-grained classification experiments
	- educational computer vision experiments

	This model is not intended for safety-critical use.

	Do not use this model for medical, legal, financial, biometric, security-critical, or production decisions without independent validation.

	---

	## Model Files

	Recommended repository layout:

	```text
	.
	├── README.md
	├── LICENSE-WEIGHTS.md
	├── config.json
	├── labels.txt
	├── checkpoints/
	│   ├── config.json
	│   ├── labels.txt
	│   └── protomorph_head.safetensors
	├── infer.py
	├── scripts/
	│   └── upload_to_hf.py
	└── src/
	    └── protomorph/
	```

	The main weight file is:

	```text
	checkpoints/protomorph_head.safetensors
	```

	This file contains only the custom ProtoMorph classification head.

	DINOv3 backbone weights are not included in this repository.
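
One way to confirm that the checkpoint holds only head tensors is to read the safetensors header, which lists every tensor's name, dtype, and shape without loading any weights. The sketch below relies only on the published safetensors layout (an 8-byte little-endian header length followed by a JSON header). The helper names are hypothetical and are not part of this repository.

```python
import json
import struct


def parse_safetensors_header(raw):
    """Parse the JSON header from the leading bytes of a .safetensors file."""
    (header_len,) = struct.unpack("<Q", raw[:8])  # little-endian uint64
    header = json.loads(raw[8:8 + header_len].decode("utf-8"))
    header.pop("__metadata__", None)  # optional string-to-string metadata
    return header


def read_safetensors_header(path):
    """Read only the header bytes from disk, never the tensor data."""
    with open(path, "rb") as handle:
        prefix = handle.read(8)
        (header_len,) = struct.unpack("<Q", prefix)
        return parse_safetensors_header(prefix + handle.read(header_len))


def summarize_tensors(header):
    """Map each tensor name to its (dtype, shape) pair."""
    return {
        name: (info["dtype"], tuple(info["shape"]))
        for name, info in header.items()
    }
```

For example, `summarize_tensors(read_safetensors_header("checkpoints/protomorph_head.safetensors"))` would list the stored tensors; any name that looks like a backbone layer would indicate that more than the head was saved.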

	---

	## Backbone

	Default backbone:

	```text
	facebook/dinov3-vits16-pretrain-lvd1689m
	```

	The backbone is used as a frozen visual feature extractor.

	For RTX 3090-class GPUs, ViT-S/16 is a practical starting point because it keeps VRAM usage manageable while still producing useful patch embeddings.

	---

	## Installation

	Recommended environment:

	```text
	Python 3.11
	PyTorch 2.4.0
	CUDA 12.4 PyTorch wheel
	```

	Install PyTorch:

	```bash
	pip install torch==2.4.0 torchvision==0.19.0 torchaudio==2.4.0 --index-url https://download.pytorch.org/whl/cu124
	```

	Install dependencies:

	```bash
	pip install -r requirements-core.txt
	```

	---

	## RunPod Environment Variables

	This project supports the RunPod environment variable names shown below:

	```text
	hf_key=hf_your_huggingface_write_token_here
	hf_repo=shiowo/DINO-Protomorph
	```

	Standard Hugging Face names are also supported:

	```text
	HF_TOKEN=hf_your_huggingface_write_token_here
	HF_REPO_ID=shiowo/DINO-Protomorph
	```

	Never commit your real Hugging Face token to the repository.
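
A script that accepts both naming schemes might resolve them as in the sketch below. The precedence (standard Hugging Face names first, RunPod names as fallback) and the function name `resolve_hf_settings` are assumptions for illustration; the card does not say how `scripts/upload_to_hf.py` orders them.

```python
import os


def resolve_hf_settings(env=None):
    """Return (token, repo_id) from the environment.

    Standard names (HF_TOKEN, HF_REPO_ID) win over the RunPod names
    (hf_key, hf_repo). This precedence is an assumption of the sketch.
    """
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN") or env.get("hf_key")
    repo_id = env.get("HF_REPO_ID") or env.get("hf_repo")
    if not token:
        raise RuntimeError("Set HF_TOKEN or hf_key before uploading")
    if not repo_id:
        raise RuntimeError("Set HF_REPO_ID or hf_repo before uploading")
    return token, repo_id
```

Reading the token from the environment at run time, rather than from a file in the working tree, is what keeps it out of commits.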

	---

	## Inference

	Run inference from the command line:

	```bash
	python infer.py \
	  --image examples/sample_image.jpg \
	  --config checkpoints/config.json \
	  --checkpoint checkpoints/protomorph_head.safetensors \
	  --labels checkpoints/labels.txt \
	  --topk 5
	```

	For smoke testing only:

	```bash
	python infer.py --image examples/sample_image.jpg --allow-random-head
	```

	If the head is untrained, the output is only useful for checking that the pipeline runs.
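
The `--topk` option presumably turns the fused logits into a ranked list of labels. The following sketch shows one way that could work. The one-label-per-line format for `labels.txt` and the helper names `parse_labels` and `topk_predictions` are assumptions, not taken from `infer.py`.

```python
import math


def parse_labels(text):
    """Split label text into a list, assuming one label per line."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def topk_predictions(logits, labels, k=5):
    """Return the k most probable (label, probability) pairs, best first."""
    if len(logits) != len(labels):
        raise ValueError("logits and labels must have the same length")
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # stable softmax
    total = sum(exps)
    ranked = sorted(
        zip(labels, (e / total for e in exps)),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return ranked[:k]
```

With a randomly initialized head the ranking is arbitrary, so this output only confirms that the label file and the logit vector have matching lengths.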

	---

	## Upload to Hugging Face from RunPod

	After setting `hf_key` and `hf_repo` in RunPod, run:

	```bash
	cd /workspace/protomorph_dinov3_runpod
	source .venv/bin/activate
	python scripts/upload_to_hf.py
	```

	Or use the helper script:

	```bash
	bash runpod/upload_to_hf.sh
	```

	Dry run before upload:

	```bash
	python scripts/upload_to_hf.py --dry-run
	```

	---

	## Config Example

	```json
	{
	  "dino_model_name": "facebook/dinov3-vits16-pretrain-lvd1689m",
	  "num_classes": 10,
	  "embed_dim": 384,
	  "patch_size": 16,
	  "proto_count": 64,
	  "memory_tokens": 16,
	  "rbf_count": 128,
	  "num_heads": 8,
	  "dropout": 0.0,
	  "hard_pmax_threshold": 0.65,
	  "hard_margin_threshold": 0.15,
	  "hard_entropy_threshold": 1.35,
	  "image_size": 512,
	  "use_bf16_autocast": true,
	  "normalize_patch_tokens": true
	}
	```
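
A few quantities follow directly from these values and are worth checking before training. The sketch below derives them under stated assumptions: square inputs, `num_heads` meaning attention heads that split `embed_dim` evenly, and entropy measured in nats. The function name `derive_shapes` is hypothetical, and the patch count covers patch tokens only, not any class or register tokens the backbone may also emit.

```python
import math

# Subset of the Config Example above that the checks depend on.
EXAMPLE_CONFIG = {
    "num_classes": 10,
    "embed_dim": 384,
    "patch_size": 16,
    "num_heads": 8,
    "hard_entropy_threshold": 1.35,
    "image_size": 512,
}


def derive_shapes(config):
    """Validate divisibility constraints and derive dependent sizes."""
    image_size = config["image_size"]
    patch_size = config["patch_size"]
    embed_dim = config["embed_dim"]
    num_heads = config["num_heads"]

    if image_size % patch_size != 0:
        raise ValueError("image_size must be a multiple of patch_size")
    if embed_dim % num_heads != 0:
        raise ValueError("embed_dim must be divisible by num_heads")

    grid = image_size // patch_size
    max_entropy = math.log(config["num_classes"])  # uniform distribution, nats
    if config["hard_entropy_threshold"] >= max_entropy:
        raise ValueError("entropy threshold can never be exceeded")

    return {
        "patch_grid": (grid, grid),
        "num_patches": grid * grid,
        "head_dim": embed_dim // num_heads,
        "max_entropy": max_entropy,
    }
```

For the example values this gives a 32 x 32 patch grid (1024 patch tokens) and 48 dimensions per head. The entropy threshold of 1.35 sits below the maximum of ln(10), about 2.30, so the entropy criterion is reachable for a 10-class problem.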

	---

	## Limitations

	Known limitations:

	- The architecture is experimental.
	- Evaluation results are pending.
	- The hard-case gate requires threshold tuning.
	- The Delta-RBF hard expert may overfit small datasets.
	- Inference may be slower for hard samples.
	- The model should be compared against simple baselines before claiming improvement.
	- This repository does not include DINOv3 weights.
	- The custom head may not generalize outside the dataset it was trained on.

	---

	## License

	The ProtoMorph head weights in this repository are released under:

	```text
	Creative Commons Attribution-ShareAlike 4.0 International
	CC BY-SA 4.0
	```

	You may use, share, and adapt these weights, including commercially, provided that you give appropriate credit and distribute adapted versions under CC BY-SA 4.0 or a compatible license.

	This license applies only to the ProtoMorph head weights and related files released in this repository.

	It does not apply to:

	- DINOv3
	- PyTorch
	- Hugging Face Transformers
	- third-party datasets
	- third-party model weights
	- upstream dependencies

	DINOv3 is not redistributed in this repository. Users are responsible for obtaining DINOv3 separately and complying with its license.

	---

	## Attribution

	If you use this model or build on it, please credit:

	```text
	ProtoMorph-DINO: Feedback-Gated Prototype Morphing for Hard-Case Image Classification
	Author: shiowo
	Repository: https://huggingface.co/shiowo/DINO-Protomorph
	```

	BibTeX:

	```bibtex
	@software{protomorph_dino_2026,
	  title = {ProtoMorph-DINO: Feedback-Gated Prototype Morphing for Hard-Case Image Classification},
	  author = {shiowo},
	  year = {2026},
	  url = {https://huggingface.co/shiowo/DINO-Protomorph}
	}
	```

	---

	## Disclaimer

	This is a research prototype.

	The model is provided for experimentation and educational use. It should not be used in production or high-stakes environments without independent validation, dataset auditing, robustness testing, and bias evaluation.