Text-to-YOLO-Weights Hypernetwork

A hypernetwork that takes a text description of a computer vision detection task and generates YOLO detector weights (LoRA-style adapters) in a single forward pass.

Architecture

Based on Drag-and-Drop LLMs (DnD) and Neural Network Diffusion (p-diff):

  • Text Encoder: Frozen sentence-transformers/all-MiniLM-L6-v2 (384-dim)
  • Hyper-Convolutional Decoder: Cascaded 1D conv blocks mapping text embeddings → weight vectors
  • LoRA Adapter Generation: Outputs low-rank A/B matrices for YOLOv8 detection head layers
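
To illustrate why the decoder emits low-rank A/B matrices rather than full weight matrices, here is a rough parameter-count comparison. The layer size (64 x 64) and the rank (4) are illustrative assumptions, not values taken from this repository.

```python
def full_param_count(out_dim: int, in_dim: int) -> int:
    """Number of entries in a dense (out_dim x in_dim) weight matrix."""
    return out_dim * in_dim


def lora_param_count(out_dim: int, in_dim: int, rank: int) -> int:
    """Entries in A (rank x in_dim) plus B (out_dim x rank)."""
    return rank * (in_dim + out_dim)


# Hypothetical layer size and rank, chosen only for illustration.
out_dim, in_dim, rank = 64, 64, 4

full = full_param_count(out_dim, in_dim)
lora = lora_param_count(out_dim, in_dim, rank)

print(f"full matrix: {full} values, LoRA pair: {lora} values")
```

Under these assumed sizes the hypernetwork would need to predict 8x fewer values per layer, which keeps the flattened output vector small.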

How It Works

Text Description (e.g. "Detect license plates in traffic images")
    ↓
Sentence-BERT Encoder → 384-dim embedding
    ↓
Hyper-Convolutional Decoder (cascaded 1D conv blocks)
    ↓
Flattened Weight Vector
    ↓
Reshape into LoRA A/B matrices per detection head layer
    ↓
Apply to frozen YOLOv8-n backbone
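
The last two steps of the pipeline above (reshape, then apply) can be sketched with NumPy as follows. This is a minimal sketch under stated assumptions, not the repository's implementation: the layer names, shapes, rank, and the helpers `split_lora` and `apply_lora` are all hypothetical, and each target layer is treated as a 2-D matrix (a convolution kernel would first need its kernel dimensions flattened).

```python
import numpy as np


def split_lora(flat, layer_shapes, rank):
    """Slice a flat vector into one (A, B) pair per layer.

    A has shape (rank, in_dim) and B has shape (out_dim, rank).
    """
    adapters, offset = {}, 0
    for name, (out_dim, in_dim) in layer_shapes.items():
        a_size, b_size = rank * in_dim, out_dim * rank
        A = flat[offset:offset + a_size].reshape(rank, in_dim)
        offset += a_size
        B = flat[offset:offset + b_size].reshape(out_dim, rank)
        offset += b_size
        adapters[name] = (A, B)
    if offset != flat.size:
        raise ValueError(f"expected {offset} values, got {flat.size}")
    return adapters


def apply_lora(W, A, B, scale=1.0):
    """Return the adapted weight W + scale * (B @ A); W itself is not modified."""
    return W + scale * (B @ A)


# Hypothetical layer names and (out_dim, in_dim) shapes.
layer_shapes = {"head.cv2": (64, 64), "head.cv3": (80, 64)}
rank = 4
total = sum(rank * (o + i) for o, i in layer_shapes.values())

rng = np.random.default_rng(0)
flat = rng.normal(size=total)           # stand-in for the decoder output
adapters = split_lora(flat, layer_shapes, rank)

W = rng.normal(size=(64, 64))           # stand-in for a frozen base weight
A, B = adapters["head.cv2"]
W_adapted = apply_lora(W, A, B)
```

Keeping the base weight frozen and adding the low-rank product means a new text prompt only swaps the adapter pairs, not the detector itself.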

Files

  • text_to_yolo_hypernet.py: Core architecture (encoder + decoder)
  • train_hypernet.py: Full training loop with noise augmentation
  • synthetic_data_generator.py: Generate synthetic training data
  • prepare_dataset.py: Prepare data from HF Hub fine-tuned YOLO models
  • inference.py: Text prompt → weights → YOLO inference

Training

# Generate synthetic training data
python synthetic_data_generator.py --num_samples 500 --perturbation_scale 0.05

# Train hypernetwork
python train_hypernet.py --epochs 200 --batch_size 8 --lr 1e-4
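
One plausible reading of `--perturbation_scale` is that synthetic samples are made by adding Gaussian noise to a base weight vector. The sketch below shows that idea only; it is an assumption about the technique, not a description of what synthetic_data_generator.py actually does. The helper `perturb_weights` and the base vector are hypothetical.

```python
import numpy as np


def perturb_weights(base, scale, num_samples, seed=0):
    """Return num_samples noisy copies of a flat weight vector.

    The noise standard deviation is scale * std(base), so scale=0.05
    means noise at roughly 5% of the typical weight magnitude.
    """
    rng = np.random.default_rng(seed)
    sigma = scale * base.std()
    noise = rng.normal(0.0, sigma, size=(num_samples, base.size))
    return base[None, :] + noise


# Hypothetical base vector standing in for real adapter weights.
base = np.linspace(-1.0, 1.0, 1088)
samples = perturb_weights(base, scale=0.05, num_samples=500)
print(samples.shape)
```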

Inference

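# generate_yolo_weights is assumed to be exported by text_to_yolo_hypernet
from text_to_yolo_hypernet import (
    Config, TextEncoder, HyperWeightDecoder, generate_yolo_weights,
)

config = Config()
encoder = TextEncoder(config.text_encoder_model)
# layer_shapes (shapes of the target YOLOv8 detection head layers) must be defined before this call
decoder = HyperWeightDecoder(config, layer_shapes)

# Generate LoRA adapters from a text prompt in a single forward pass
adapters = generate_yolo_weights("Detect cars and pedestrians", decoder, encoder, config)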

Research Background

License

AGPL-3.0 (same as Ultralytics YOLO)

Generated by ML Intern

This model repository was generated by ML Intern, an agent for machine learning research and development on the Hugging Face Hub.

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mabbam/text-to-yolo-weights-hypernet"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

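This snippet is the generic Hub loading template. The hypernetwork described above is not a causal language model, so loading it this way may not work; the Inference section shows the repository's own classes. For other non-causal architectures, replace AutoModelForCausalLM with the appropriate AutoModel class.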
