# Qwen3-0.6B-SAT-VarSelector-Sym
A lightweight Qwen3-0.6B model fine-tuned for SAT branching variable selection in Cube-and-Conquer (CnC) solvers.
## Model Description
This model predicts which variable to branch/cube on next, given a SAT CNF formula state. Instead of generating text, it outputs a classification over variable IDs (1-600).
### Architecture

- Base: `Qwen/Qwen3-0.6B` (causal language model)
- Head: LayerNorm → Linear(hidden_size, 601)
- Pooling: Last non-pad token hidden state
- Masking: Invalid variables (not in CNF) are masked to -10000 before softmax
- Size: ~1.2GB (bfloat16)
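As a hedged sketch of what a head with this shape looks like (the class name `VarClassifierHead` and the pooling helper below are illustrative, not the shipped `QwenVarClassifier`; 1024 is assumed as the Qwen3-0.6B hidden size):

```python
import torch
import torch.nn as nn

class VarClassifierHead(nn.Module):
    """Illustrative head: LayerNorm -> Linear over variable IDs 0..600.

    Variables are 1-based, so index 0 is unused, hence 601 outputs.
    This is a hypothetical reimplementation, not the actual model code.
    """
    def __init__(self, hidden_size: int, max_vars: int = 600):
        super().__init__()
        self.norm = nn.LayerNorm(hidden_size)
        self.proj = nn.Linear(hidden_size, max_vars + 1)

    def forward(self, hidden_states, attention_mask):
        # Pool the hidden state of the last non-pad token per sequence.
        last_idx = attention_mask.sum(dim=1) - 1                   # (batch,)
        pooled = hidden_states[torch.arange(hidden_states.size(0)), last_idx]
        return self.proj(self.norm(pooled))                        # (batch, 601)

head = VarClassifierHead(hidden_size=1024, max_vars=600)
h = torch.randn(2, 16, 1024)                # dummy hidden states
mask = torch.ones(2, 16, dtype=torch.long)  # no padding
logits = head(h, mask)
print(logits.shape)  # torch.Size([2, 601])
```

The 601-way output plus invalid-variable masking turns branching into a standard classification problem over variable IDs.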
## Training Details
| Parameter | Value |
|---|---|
| Dataset | 8,110 training / 902 validation samples |
| Task | Predict expert-selected branching variable |
| Epochs | ~6 (best checkpoint at epoch 5.83) |
| Hardware | 8×H100 GPUs |
| Framework | DeepSpeed ZeRO-3 |
| Learning Rate | 5e-6 |
| Max Sequence Length | 8192 tokens |
| Max Variables | 600 |
## Performance
| Metric | Value |
|---|---|
| Top-1 Accuracy | 23.95% |
| Eval Loss | 3.30 |
| Inference Speed | ~45ms/sample (H100) |
## Usage
```python
import torch
from transformers import AutoTokenizer
from sft_qwen_var_classifier import QwenVarClassifier, cnf_valid_mask

# Load model
model = QwenVarClassifier("Qwen/Qwen3-0.6B", max_vars=600)
state_dict = torch.load("pytorch_model.bin", map_location="cpu")
model.load_state_dict(state_dict, strict=False)
model = model.to("cuda", dtype=torch.bfloat16)
model.eval()

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-0.6B")

# Prepare CNF input (DIMACS format)
cnf_text = """p cnf 100 250
1 -2 3 0
-1 2 -4 0
...
"""

# Tokenize
inputs = tokenizer(cnf_text, return_tensors="pt", truncation=True, max_length=8192)
inputs = {k: v.to("cuda") for k, v in inputs.items()}

# Get valid variable mask
valid_mask = torch.tensor([cnf_valid_mask(cnf_text, max_vars=600)], dtype=torch.bool, device="cuda")

# Predict
with torch.no_grad():
    outputs = model(**inputs)
    logits = outputs["logits"]
    logits = logits.masked_fill(~valid_mask, -1e4)
    predicted_var = logits.argmax(dim=-1).item()

print(f"Predicted branching variable: {predicted_var}")
```
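The `cnf_valid_mask` helper ships in `sft_qwen_var_classifier.py`. A plausible pure-Python sketch of what it computes, a boolean mask over IDs 0..max_vars marking the variables that actually occur in the formula (this is a hypothetical reimplementation; the shipped helper may differ in details):

```python
def cnf_valid_mask_sketch(cnf_text: str, max_vars: int = 600) -> list[bool]:
    """Mark which variable IDs (1..max_vars) appear in a DIMACS CNF string.

    Index 0 corresponds to no variable and stays False, matching the
    601-way classification head. Hypothetical sketch, not the shipped helper.
    """
    mask = [False] * (max_vars + 1)
    for line in cnf_text.splitlines():
        line = line.strip()
        if not line or line.startswith(("p", "c")):  # skip header/comments
            continue
        for tok in line.split():
            try:
                lit = int(tok)
            except ValueError:
                continue  # tolerate stray tokens
            var = abs(lit)  # literal sign does not matter, 0 terminates a clause
            if 1 <= var <= max_vars:
                mask[var] = True
    return mask

mask = cnf_valid_mask_sketch("p cnf 4 2\n1 -2 3 0\n-1 2 -4 0\n", max_vars=600)
print([i for i, v in enumerate(mask) if v])  # [1, 2, 3, 4]
```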
## Files
| File | Description |
|---|---|
| `pytorch_model.bin` | Model weights (~1.2GB, bfloat16) |
| `sft_qwen_var_classifier.py` | Model class definition (required for loading) |
| `tokenizer.json` | Tokenizer configuration |
| `tokenizer_config.json` | Tokenizer settings |
## Use Cases
- Cube-and-Conquer SAT Solving: Guide cube splitting decisions
- Research: Study neural branching heuristics
- Benchmarking: Compare against traditional heuristics (VSIDS, etc.)
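For the benchmarking use case, one trivial baseline to compare the model against is a static occurrence-count heuristic: branch on the variable that appears in the most clauses. A hedged sketch (a stand-in for dynamic heuristics like VSIDS, for comparison only; `occurrence_heuristic` is an illustrative name, not part of this repo):

```python
from collections import Counter

def occurrence_heuristic(cnf_text: str) -> int:
    """Baseline: return the variable with the most literal occurrences.

    Static stand-in for dynamic heuristics such as VSIDS; for comparison only.
    """
    counts = Counter()
    for line in cnf_text.splitlines():
        line = line.strip()
        if not line or line.startswith(("p", "c")):  # skip header/comments
            continue
        for tok in line.split():
            # "0" terminates a clause in DIMACS and is not a variable
            if tok.lstrip("-").isdigit() and tok != "0":
                counts[abs(int(tok))] += 1
    return counts.most_common(1)[0][0]

# Variable 1 occurs in all three clauses, so the baseline picks it.
print(occurrence_heuristic("p cnf 4 3\n1 -2 0\n-1 3 0\n1 -4 0\n"))  # 1
```

Running both this baseline and the model on the same instances gives a quick sanity check on whether the learned selector beats a purely syntactic signal.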
## Limitations
- Maximum 600 variables supported
- Maximum 8192 tokens for CNF input
- Trained on specific CNF distribution
- Best used as a starting point for further fine-tuning on your specific problem domain
## Related Models
- `Yale-ROSE/Qwen3-0.6B-SAT-VarSelector` - Earlier version
## Citation
If you use this model, please cite the Transformer-CnC paper.
## License
Apache 2.0