---
license: cc-by-sa-4.0
library_name: pytorch
pipeline_tag: image-classification
base_model: facebook/dinov3-vits16-pretrain-lvd1689m
tags:
- image-classification
- computer-vision
- dinov3
- pytorch
- safetensors
- prototype-learning
- hard-example-mining
- feedback-routing
- experimental
metrics:
- accuracy
- f1
- precision
- recall
---

# ProtoMorph-DINO

**Feedback-Gated Prototype Morphing for Hard-Case Image Classification**

ProtoMorph-DINO is an experimental image classification head designed to run on top of a frozen DINOv3 vision backbone.

This model card is for the Hugging Face repository:

```text
shiowo/DINO-Protomorph
```

This repository currently contains an initial research scaffold and custom ProtoMorph head checkpoint. Evaluation results are **pending** because the repository is being created before full training and benchmarking.

This project is independent and is not affiliated with Meta AI, Hugging Face, or the official DINOv3 project.

---

## Architecture

```text
Image
  ↓
Frozen DINOv3
  ↓
Patch map z0
  ↓
ProtoMorph block 1
  ↓
Layer Memory Attention
  ↓
ProtoMorph block 2
  ↓
Layer Memory Attention
  ↓
Main logits
  ↓
Hard-case gate
  ├── easy: return main logits
  └── hard:
        feedback from top-2 probabilities
        modulate DINO patch map
        run Delta-RBF hard expert
        fuse logits
```

---

## Model Summary

ProtoMorph-DINO explores whether a frozen foundation vision backbone can be improved with a custom hard-case refinement head.

For easy images, the model returns the main classifier output directly. For difficult or ambiguous images, the model activates a feedback branch. The feedback branch uses the top-2 predicted probabilities to modulate the DINO patch map, sends the modified representation through a Delta-RBF hard expert, and fuses the refined logits with the main logits.

The main research question is whether feedback-guided hard-case refinement can improve classification performance over simpler frozen-backbone heads such as a linear probe or MLP classifier.

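The easy/hard routing described above can be sketched in plain Python. This is a minimal illustration, not the repository's implementation. The three threshold values are taken from the card's Config Example, but the way they are combined (any one signal marks the input as hard), the 50/50 logit blend in `fuse_logits`, and all function names are assumptions made for the example.

```python
import math

# Threshold values copied from the card's Config Example.
HARD_PMAX_THRESHOLD = 0.65
HARD_MARGIN_THRESHOLD = 0.15
HARD_ENTROPY_THRESHOLD = 1.35


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def is_hard_case(logits):
    """Assumed rule: the input is hard if ANY uncertainty signal fires."""
    probs = softmax(logits)
    top1, top2 = sorted(probs, reverse=True)[:2]
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return (
        top1 < HARD_PMAX_THRESHOLD
        or (top1 - top2) < HARD_MARGIN_THRESHOLD
        or entropy > HARD_ENTROPY_THRESHOLD
    )


def fuse_logits(main_logits, expert_logits, alpha=0.5):
    """Hypothetical fusion: convex blend of main and hard-expert logits."""
    return [(1.0 - alpha) * m + alpha * e for m, e in zip(main_logits, expert_logits)]


def route(main_logits, run_hard_branch):
    """Return main logits for easy inputs, fused logits for hard ones.

    `run_hard_branch` is a zero-argument callable standing in for the
    feedback branch (patch-map modulation plus the Delta-RBF expert).
    It is only invoked when the gate fires.
    """
    if not is_hard_case(main_logits):
        return main_logits
    return fuse_logits(main_logits, run_hard_branch())
```

Passing the hard branch as a callable keeps the extra compute off the easy path, which matches the card's note that only hard samples should pay the additional inference cost.
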

---

## Current Status

**Status: research scaffold / pre-training setup**

The current checkpoint may be randomly initialized or only intended for smoke testing unless a later release says otherwise. Predictions are **not meaningful** until the ProtoMorph head is trained on a real dataset.

---

## Results

**Evaluation results: Pending**

No benchmark results are reported yet because the repository is being prepared before training and evaluation.

| Metric | Value |
|---|---:|
| Accuracy | Pending |
| F1 | Pending |
| Precision | Pending |
| Recall | Pending |
| Confusion-pair improvement | Pending |
| Hard-case routing benefit | Pending |

Recommended future baselines:

| Baseline | Purpose |
|---|---|
| DINOv3 + Linear Probe | Minimal frozen-backbone baseline |
| DINOv3 + MLP Head | Strong simple head baseline |
| CLIP + Linear Probe | Popular vision-language comparison |
| ConvNeXt | Strong CNN-style baseline |
| ViT | Standard transformer baseline |

---

## Intended Use

This model is intended for:

- image classification research
- hard-example routing experiments
- prototype learning experiments
- frozen-backbone classifier research
- fine-grained classification experiments
- educational computer vision experiments

This model is **not** intended for safety-critical use. Do not use this model for medical, legal, financial, biometric, security-critical, or production decisions without independent validation.

---

## Model Files

Recommended repository layout:

```text
.
├── README.md
├── LICENSE-WEIGHTS.md
├── config.json
├── labels.txt
├── checkpoints/
│   ├── config.json
│   ├── labels.txt
│   └── protomorph_head.safetensors
├── infer.py
├── scripts/
│   └── upload_to_hf.py
└── src/
    └── protomorph/
```

The main weight file is:

```text
checkpoints/protomorph_head.safetensors
```

This file contains only the custom ProtoMorph classification head. DINOv3 backbone weights are **not** included in this repository.

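Because the checkpoint is meant to hold only head tensors, it can be useful to list its contents without loading any weights. The sketch below reads the JSON header that every `.safetensors` file starts with (an 8-byte little-endian length followed by that many bytes of JSON), using only the standard library. The helper names are hypothetical, and the tensor names stored in the actual ProtoMorph checkpoint are not documented in this card.

```python
import json
import struct


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file without reading tensor data."""
    with open(path, "rb") as handle:
        # The first 8 bytes hold the header length as a little-endian uint64.
        (header_len,) = struct.unpack("<Q", handle.read(8))
        header = json.loads(handle.read(header_len).decode("utf-8"))
    # "__metadata__" is an optional string-to-string map, not a tensor entry.
    header.pop("__metadata__", None)
    return header


def summarize_tensors(header):
    """Map each tensor name to (dtype, shape, number of elements)."""
    summary = {}
    for name, info in sorted(header.items()):
        count = 1
        for dim in info["shape"]:
            count *= dim
        summary[name] = (info["dtype"], tuple(info["shape"]), count)
    return summary
```

Calling `summarize_tensors(read_safetensors_header("checkpoints/protomorph_head.safetensors"))` should then show only head parameters; any unexpectedly large backbone-sized tensors would suggest the file was exported incorrectly.
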

---

## Backbone

Default backbone:

```text
facebook/dinov3-vits16-pretrain-lvd1689m
```

The backbone is used as a frozen visual feature extractor.

For RTX 3090-class GPUs, ViT-S/16 is a practical starting point because it keeps VRAM usage manageable while still producing useful patch embeddings.

---

## Installation

Recommended environment:

```text
Python 3.11
PyTorch 2.4.0
CUDA 12.4 PyTorch wheel
```

Install PyTorch:

```bash
pip install torch==2.4.0 torchvision==0.19.0 torchaudio==2.4.0 --index-url https://download.pytorch.org/whl/cu124
```

Install dependencies:

```bash
pip install -r requirements-core.txt
```

---

## RunPod Environment Variables

This project supports the RunPod environment variable names shown below:

```text
hf_key=hf_your_huggingface_write_token_here
hf_repo=shiowo/DINO-Protomorph
```

Standard Hugging Face names are also supported:

```text
HF_TOKEN=hf_your_huggingface_write_token_here
HF_REPO_ID=shiowo/DINO-Protomorph
```

Never commit your real Hugging Face token to the repository.

---

## Inference

Run inference from the command line:

```bash
python infer.py \
  --image examples/sample_image.jpg \
  --config checkpoints/config.json \
  --checkpoint checkpoints/protomorph_head.safetensors \
  --labels checkpoints/labels.txt \
  --topk 5
```

For smoke testing only:

```bash
python infer.py --image examples/sample_image.jpg --allow-random-head
```

If the head is untrained, the output is only useful for checking that the pipeline runs.

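To make the `--labels` and `--topk` options concrete, here is a small sketch of how a top-k report could be produced from raw logits. It assumes `labels.txt` holds one class name per line in logit order and that probabilities come from a softmax; neither detail is confirmed by this card, and `top_k_predictions` and `load_labels` are illustrative names rather than functions from `infer.py`.

```python
import math


def load_labels(path):
    """Read one class name per line, skipping blank lines (assumed file layout)."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def top_k_predictions(logits, labels, k=5):
    """Return the k most probable (label, probability) pairs, best first."""
    if len(logits) != len(labels):
        raise ValueError("logits and labels must have the same length")
    # Numerically stable softmax.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]
```

With a randomly initialized head the ranking carries no meaning, so a report like this is only a check that the label count matches `num_classes` and that the probabilities are well formed.
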

---

## Upload to Hugging Face from RunPod

After setting `hf_key` and `hf_repo` in RunPod, run:

```bash
cd /workspace/protomorph_dinov3_runpod
source .venv/bin/activate
python scripts/upload_to_hf.py
```

Or use the helper script:

```bash
bash runpod/upload_to_hf.sh
```

Dry run before upload:

```bash
python scripts/upload_to_hf.py --dry-run
```

---

## Config Example

```json
{
  "dino_model_name": "facebook/dinov3-vits16-pretrain-lvd1689m",
  "num_classes": 10,
  "embed_dim": 384,
  "patch_size": 16,
  "proto_count": 64,
  "memory_tokens": 16,
  "rbf_count": 128,
  "num_heads": 8,
  "dropout": 0.0,
  "hard_pmax_threshold": 0.65,
  "hard_margin_threshold": 0.15,
  "hard_entropy_threshold": 1.35,
  "image_size": 512,
  "use_bf16_autocast": true,
  "normalize_patch_tokens": true
}
```

---

## Limitations

Known limitations:

- The architecture is experimental.
- Evaluation results are pending.
- The hard-case gate requires threshold tuning.
- The Delta-RBF hard expert may overfit small datasets.
- Inference may be slower for hard samples.
- The model should be compared against simple baselines before claiming improvement.
- This repository does not include DINOv3 weights.
- The custom head may not generalize outside the dataset it was trained on.

---

## License

The ProtoMorph head weights in this repository are released under:

```text
Creative Commons Attribution-ShareAlike 4.0 International
CC BY-SA 4.0
```

You may use, share, and adapt these weights, including commercially, provided that you give appropriate credit and distribute adapted versions under CC BY-SA 4.0 or a compatible license.

This license applies only to the ProtoMorph head weights and related files released in this repository. It does not apply to:

- DINOv3
- PyTorch
- Hugging Face Transformers
- third-party datasets
- third-party model weights
- upstream dependencies

DINOv3 is not redistributed in this repository. Users are responsible for obtaining DINOv3 separately and complying with its license.


---

## Attribution

If you use this model or build on it, please credit:

```text
ProtoMorph-DINO: Feedback-Gated Prototype Morphing for Hard-Case Image Classification
Author: shiowo
Repository: https://huggingface.co/shiowo/DINO-Protomorph
```

BibTeX:

```bibtex
@software{protomorph_dino_2026,
  title  = {ProtoMorph-DINO: Feedback-Gated Prototype Morphing for Hard-Case Image Classification},
  author = {shiowo},
  year   = {2026},
  url    = {https://huggingface.co/shiowo/DINO-Protomorph}
}
```

---

## Disclaimer

This is a research prototype. The model is provided for experimentation and educational use. It should not be used in production or high-stakes environments without independent validation, dataset auditing, robustness testing, and bias evaluation.