# modern-BERT-MTG-Commander
A fine-tune of ModernBERT trained in a Siamese network (bi-encoder) architecture on card data to predict how likely a card is to be included in a Commander deck. The resulting embedding space has a much richer semantic understanding of Magic cards than the base model's.
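The training details aren't spelled out here, but a Siamese bi-encoder generally runs both inputs through one shared-weight encoder and scores the pair by cosine similarity of the two embeddings. A minimal sketch of that structure, using a stand-in linear encoder (the real encoder is ModernBERT):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BiEncoder(nn.Module):
    """Siamese setup: one shared encoder, pairs scored by cosine similarity."""
    def __init__(self, encoder: nn.Module):
        super().__init__()
        self.encoder = encoder  # the same weights embed both sides of the pair

    def forward(self, a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
        ea = F.normalize(self.encoder(a), p=2, dim=1)
        eb = F.normalize(self.encoder(b), p=2, dim=1)
        return (ea * eb).sum(dim=1)  # cosine similarity per pair, in [-1, 1]

# Stand-in encoder over dummy 16-dim inputs, purely for illustration
model = BiEncoder(nn.Linear(16, 8))
a, b = torch.randn(4, 16), torch.randn(4, 16)
scores = model(a, b)  # shape (4,)
```

Because both sides share weights, cards that tend to appear in the same decks are pulled toward each other in the embedding space, which is what produces the "after" neighborhoods below.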
## Before

Base model neighbors for 'Yuriko, the Tiger's Shadow':
- Nashi, Moon Sage's Scion (cos=0.9907)
- Fallen Shinobi (cos=0.9889)
- A-Nashi, Moon Sage's Scion (cos=0.9889)
- Walker of Secret Ways (cos=0.9884)
- Orochi Soul-Reaver (cos=0.9883)
- A-Dokuchi Silencer (cos=0.9875)
- Higure, the Still Wind (cos=0.9874)
- Prosperous Thief (cos=0.9870)
- Ninja of the Deep Hours (cos=0.9869)
- Zareth San, the Trickster (cos=0.9868)
(high density, low relevance)
## After

Fine-tuned neighbors for 'Yuriko, the Tiger's Shadow':
- Ingenious Infiltrator (cos=0.7685)
- Higure, the Still Wind (cos=0.7446)
- Moon-Circuit Hacker (cos=0.7341)
- Dokuchi Silencer (cos=0.7043)
- A-Dokuchi Silencer (cos=0.7041)
- Covert Technician (cos=0.7028)
- Sakashima's Student (cos=0.6982)
- Ninja of the Deep Hours (cos=0.6972)
- A-Prosperous Thief (cos=0.6883)
- Azra Smokeshaper (cos=0.6842)
(low density, high relevance)
## Usage
```python
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

def mean_pool(model_output, attention_mask):
    token_embeddings = model_output.last_hidden_state
    mask = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    summed = torch.sum(token_embeddings * mask, dim=1)
    counts = torch.clamp(mask.sum(dim=1), min=1e-9)
    return summed / counts

model_name = "nishtahir/modern-BERT-MTG-Commander"
device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name).eval().to(device)

# Cards are serialized in the Scryfall Text Format
card_data = """Commit {3}{U}
Instant
Put target spell or nonland permanent into its owner's library second from the top.
----
Memory {4}{U}{U}
Sorcery
Aftermath (Cast this spell only from your graveyard. Then exile it.)
Each player shuffles their hand and graveyard into their library, then draws seven cards.
"""

encoded = tokenizer([card_data], padding=True, truncation=True, return_tensors="pt")
ids = encoded["input_ids"].to(device)
mask = encoded["attention_mask"].to(device)

with torch.no_grad():
    outputs = model(input_ids=ids, attention_mask=mask)

# Mean pool to get a single embedding vector per input
pooled = mean_pool(outputs, mask)

# L2-normalize so cosine similarity reduces to a dot product
embeddings = F.normalize(pooled, p=2, dim=1)
```
Base model: answerdotai/ModernBERT-base