# Qwen3-0.6B-SAT-VarSelector-Sym
A lightweight Qwen3-0.6B model fine-tuned for SAT branching variable selection in Cube-and-Conquer (CnC) solvers.
## Model Description
This model predicts which variable to branch/cube on next, given a SAT CNF formula state. Instead of generating text, it outputs a classification over variable IDs (1-600).
### Architecture

- Base: `Qwen/Qwen3-0.6B` (causal language model)
- Head: LayerNorm → Linear(hidden_size, 601)
- Pooling: Last non-pad token hidden state
- Masking: Invalid variables (not in CNF) are masked to -10000 before softmax
- Size: ~1.2GB (bfloat16)
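As a hedged sketch of what a head with this shape looks like (the class name `VarClassifierHead` and the pooling helper below are illustrative, not the shipped `QwenVarClassifier`; 1024 is assumed as the Qwen3-0.6B hidden size):

```python
import torch
import torch.nn as nn

class VarClassifierHead(nn.Module):
    """Illustrative head: LayerNorm -> Linear over variable IDs 0..600.

    Variables are 1-based, so index 0 is unused, hence 601 outputs.
    This is a hypothetical reimplementation, not the actual model code.
    """
    def __init__(self, hidden_size: int, max_vars: int = 600):
        super().__init__()
        self.norm = nn.LayerNorm(hidden_size)
        self.proj = nn.Linear(hidden_size, max_vars + 1)

    def forward(self, hidden_states, attention_mask):
        # Pool the hidden state of the last non-pad token per sequence.
        last_idx = attention_mask.sum(dim=1) - 1                   # (batch,)
        pooled = hidden_states[torch.arange(hidden_states.size(0)), last_idx]
        return self.proj(self.norm(pooled))                        # (batch, 601)

head = VarClassifierHead(hidden_size=1024, max_vars=600)
h = torch.randn(2, 16, 1024)                # dummy hidden states
mask = torch.ones(2, 16, dtype=torch.long)  # no padding
logits = head(h, mask)
print(logits.shape)  # torch.Size([2, 601])
```

The 601-way output plus invalid-variable masking turns branching into a standard classification problem over variable IDs.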
## Training Details
| Parameter | Value |
|---|---|
| Dataset | 8,110 training / 902 validation samples |
| Task | Predict expert-selected branching variable |
| Epochs | ~6 (best checkpoint at epoch 5.83) |
| Hardware | 8×H100 GPUs |
| Framework | DeepSpeed ZeRO-3 |
| Learning Rate | 5e-6 |
| Max Sequence Length | 8192 tokens |
| Max Variables | 600 |
## Performance
| Metric | Value |
|---|---|
| Top-1 Accuracy | 23.95% |
| Eval Loss | 3.30 |
| Inference Speed | ~45ms/sample (H100) |
## Usage
```python
import torch
from transformers import AutoTokenizer
from sft_qwen_var_classifier import QwenVarClassifier, cnf_valid_mask

# Load model
model = QwenVarClassifier("Qwen/Qwen3-0.6B", max_vars=600)
state_dict = torch.load("pytorch_model.bin", map_location="cpu")
model.load_state_dict(state_dict, strict=False)
model = model.to("cuda", dtype=torch.bfloat16)
model.eval()

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-0.6B")

# Prepare CNF input (DIMACS format)
cnf_text = """p cnf 100 250
1 -2 3 0
-1 2 -4 0
...
"""

# Tokenize
inputs = tokenizer(cnf_text, return_tensors="pt", truncation=True, max_length=8192)
inputs = {k: v.to("cuda") for k, v in inputs.items()}

# Get valid variable mask
valid_mask = torch.tensor([cnf_valid_mask(cnf_text, max_vars=600)], dtype=torch.bool, device="cuda")

# Predict
with torch.no_grad():
    outputs = model(**inputs)
    logits = outputs["logits"]
    logits = logits.masked_fill(~valid_mask, -1e4)
    predicted_var = logits.argmax(dim=-1).item()

print(f"Predicted branching variable: {predicted_var}")
```
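The `cnf_valid_mask` helper ships in `sft_qwen_var_classifier.py`. A plausible pure-Python sketch of what it computes, a boolean mask over IDs 0..max_vars marking the variables that actually occur in the formula (this is a hypothetical reimplementation; the shipped helper may differ in details):

```python
def cnf_valid_mask_sketch(cnf_text: str, max_vars: int = 600) -> list[bool]:
    """Mark which variable IDs (1..max_vars) appear in a DIMACS CNF string.

    Index 0 corresponds to no variable and stays False, matching the
    601-way classification head. Hypothetical sketch, not the shipped helper.
    """
    mask = [False] * (max_vars + 1)
    for line in cnf_text.splitlines():
        line = line.strip()
        if not line or line.startswith(("p", "c")):  # skip header/comments
            continue
        for tok in line.split():
            try:
                lit = int(tok)
            except ValueError:
                continue  # tolerate stray tokens
            var = abs(lit)  # literal sign does not matter, 0 terminates a clause
            if 1 <= var <= max_vars:
                mask[var] = True
    return mask

mask = cnf_valid_mask_sketch("p cnf 4 2\n1 -2 3 0\n-1 2 -4 0\n", max_vars=600)
print([i for i, v in enumerate(mask) if v])  # [1, 2, 3, 4]
```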
## Files
| File | Description |
|---|---|
| `pytorch_model.bin` | Model weights (~1.2GB, bfloat16) |
| `sft_qwen_var_classifier.py` | Model class definition (required for loading) |
| `tokenizer.json` | Tokenizer configuration |
| `tokenizer_config.json` | Tokenizer settings |
## Use Cases
- Cube-and-Conquer SAT Solving: Guide cube splitting decisions
- Research: Study neural branching heuristics
- Benchmarking: Compare against traditional heuristics (VSIDS, etc.)
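For the benchmarking use case, one trivial baseline to compare the model against is a static occurrence-count heuristic: branch on the variable that appears in the most clauses. A hedged sketch (a stand-in for dynamic heuristics like VSIDS, for comparison only; `occurrence_heuristic` is an illustrative name, not part of this repo):

```python
from collections import Counter

def occurrence_heuristic(cnf_text: str) -> int:
    """Baseline: return the variable with the most literal occurrences.

    Static stand-in for dynamic heuristics such as VSIDS; for comparison only.
    """
    counts = Counter()
    for line in cnf_text.splitlines():
        line = line.strip()
        if not line or line.startswith(("p", "c")):  # skip header/comments
            continue
        for tok in line.split():
            # "0" terminates a clause in DIMACS and is not a variable
            if tok.lstrip("-").isdigit() and tok != "0":
                counts[abs(int(tok))] += 1
    return counts.most_common(1)[0][0]

# Variable 1 occurs in all three clauses, so the baseline picks it.
print(occurrence_heuristic("p cnf 4 3\n1 -2 0\n-1 3 0\n1 -4 0\n"))  # 1
```

Running both this baseline and the model on the same instances gives a quick sanity check on whether the learned selector beats a purely syntactic signal.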
## Limitations
- Maximum 600 variables supported
- Maximum 8192 tokens for CNF input
- Trained on specific CNF distribution
- Best used as a starting point for further fine-tuning on your specific problem domain
## Related Models
- `Yale-ROSE/Qwen3-0.6B-SAT-VarSelector` - Earlier version
## Citation
If you use this model, please cite the Transformer-CnC paper.
## License
Apache 2.0