---
library_name: pytorch
tags:
- contrastive-learning
- tag-classification
- semantic-search
- embeddings
- persona-conditioned
- pretrained-backbone
---
# modernbert-base-tag-classification
This model card describes a **pretrained backbone** (`answerdotai/ModernBERT-base`) used as-is for zero-shot tag classification within a contrastive-learning pipeline.
## Model Description
This model uses the `answerdotai/ModernBERT-base` backbone directly, without fine-tuning. It is intended for zero-shot tag classification: queries and candidate tags are embedded with the pretrained model and tags are ranked by semantic similarity to the query.
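The core idea can be sketched without the project code: embed the query and each candidate tag, then rank tags by cosine similarity. The toy 3-dimensional vectors below stand in for real 768-dimensional backbone embeddings (a hypothetical illustration, not the repository's API):

```python
import torch
import torch.nn.functional as F

# Toy embeddings standing in for backbone outputs (real ones are 768-dim).
query_emb = torch.tensor([[1.0, 0.0, 0.5]])
tag_embs = torch.tensor([
    [0.9, 0.1, 0.4],  # "pytorch" (close to the query)
    [0.0, 1.0, 0.0],  # "cooking" (unrelated)
    [0.8, 0.0, 0.6],  # "vision"
])
tags = ["pytorch", "cooking", "vision"]

# Cosine similarity between the query and each tag embedding.
scores = F.cosine_similarity(query_emb, tag_embs)  # shape (3,)
order = torch.argsort(scores, descending=True)
ranked = [(tags[i], scores[i].item()) for i in order]
# "pytorch" ranks first, "cooking" last.
```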
## Usage
See the README.md for detailed usage examples using our module abstractions.
## Model Architecture
- **Backbone**: `answerdotai/ModernBERT-base`
- **Type**: Pretrained backbone (no fine-tuning)
- **Embedding Dimension**: 768 for `answerdotai/ModernBERT-base` (varies if the backbone is swapped)
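With CLS pooling (`pooling_mode="cls"` in the example below), the text embedding is the hidden state of the first token, so the embedding dimension equals the backbone's hidden size (768 for ModernBERT-base). A minimal sketch with a stand-in tensor in place of real model output:

```python
import torch

# Stand-in for transformer output: (batch, seq_len, hidden_size).
# ModernBERT-base has hidden_size = 768.
last_hidden_state = torch.randn(2, 8, 768)

# CLS pooling: the first token's hidden state is the text embedding.
cls_embeddings = last_hidden_state[:, 0, :]
print(cls_embeddings.shape)  # torch.Size([2, 768])
```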
## Usage Example
```python
"""
Example: Using ModernBERT-base for Tag Classification
This example shows how to use the pretrained ModernBERT-base backbone
for zero-shot tag classification using our module abstractions.
Installation:
pip install git+https://github.com/Pieces/TAG-module.git@main
# Or: pip install -e .
Note: ModernBERT requires Python < 3.14 due to torch.compile compatibility.
"""
import torch
from tags_model.models.backbone import SharedTextBackbone
from playground.validate_from_checkpoint import compute_ranked_tags
# Load the pretrained backbone
print("Loading ModernBERT-base...")
backbone = SharedTextBackbone(
    model_name="answerdotai/ModernBERT-base",
    embedding_dim=768,
    freeze_backbone=True,
    pooling_mode="cls",
    trust_remote_code=True,  # Required for ModernBERT
)
backbone.eval()
print("✓ Model loaded!")
# Example query
query_text = "Machine learning model for image classification using PyTorch"
# Candidate tags to rank
candidate_tags = [
    "pytorch", "machine-learning", "deep-learning", "computer-vision",
    "neural-networks", "cnn", "image-classification", "tensorflow",
    "data-science", "python",
]
print(f"\nQuery: {query_text}")
print(f"Candidate tags: {candidate_tags}\n")
# Encode query and tags
with torch.inference_mode():
    query_emb = backbone.encode_texts([query_text], max_length=512, return_dict=False)[0]
    tag_embs = backbone.encode_texts(candidate_tags, max_length=512, return_dict=False)
print(f"Query embedding shape: {query_emb.shape}")
print(f"Tag embeddings shape: {tag_embs.shape}")
# Rank tags by similarity
ranked_tags = compute_ranked_tags(
    query_emb=query_emb,
    pos_embs=torch.empty(0, 768),  # No positives for zero-shot
    neg_embs=torch.empty(0, 768),  # No negatives for zero-shot
    general_embs=tag_embs,
    positive_tags=[],
    negative_tags=[],
    general_tags=candidate_tags,
)
# Display top-ranked tags
print("\n" + "="*60)
print("Top Ranked Tags:")
print("="*60)
for tag, rank, label, score in ranked_tags[:5]:
    print(f"{rank:2d}. {tag:20s} (score: {score:.4f})")
print("\n" + "="*60)
print("Example complete!")
```
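Whether `encode_texts` returns L2-normalized embeddings is not stated here. If it does not, normalizing first makes plain dot products equal cosine similarities, which keeps the ranking scale-independent. A small sketch:

```python
import torch
import torch.nn.functional as F

embs = torch.randn(4, 768)               # stand-in embeddings
normed = F.normalize(embs, p=2, dim=-1)  # unit-length rows

# After normalization, dot product == cosine similarity.
sims = normed @ normed.T                 # (4, 4) similarity matrix
```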
### Running the Example
```bash
# Install the repository first
pip install git+https://github.com/Pieces/TAG-module.git@main
# Or for local development:
pip install -e .
# Run the example
python modernbert_example.py
```
## Citation
If you use this model, please cite:
```bibtex
@software{tag_module,
  title = {TAG Module: Persona-Conditioned Contrastive Learning for Tag Classification},
  author = {Your Name},
  year = {2025},
  url = {https://github.com/yourusername/tag-module}
}
```
## License
Please refer to the license of the backbone model, `answerdotai/ModernBERT-base`.