---
base_model: Qwen/Qwen3.5-2B
pipeline_tag: text-generation
tags:
- base_model:adapter:Qwen/Qwen3.5-2B
- lora
- transformers
- lean4
- theorem-proving
- tactic-prediction
- formal-verification
- mathlib
- peft
license: mit
datasets:
- markm39/openproof-tactic-pairs
---

# OpenProof Tactic 2B

A Lean 4 tactic prediction model fine-tuned from [Qwen/Qwen3.5-2B](https://huggingface.co/Qwen/Qwen3.5-2B) on 190K+ Mathlib tactic pairs.

Given a Lean 4 proof goal state, the model predicts the next tactic to apply. Designed for use in best-first search (BFS) proof engines like [OpenProof](https://github.com/mxthematic/openproof).

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base = "Qwen/Qwen3.5-2B"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.bfloat16, device_map="cuda")
model = PeftModel.from_pretrained(model, "markm39/openproof-tactic-2b")

# Input: goal state followed by :::
prompt = "n m : Nat\n⊢ n + m = m + n:::"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.6, top_k=10)
print(tokenizer.decode(out[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
# Output: induction m with | zero => simp | succ m ih => simp [Nat.add_assoc, ih, Nat.add_comm]
```

## Training

- **Base model**: Qwen/Qwen3.5-2B
- **Method**: SFT with LoRA (rank 64, alpha 128)
- **Trainable params**: 43.6M / 1.93B (2.3%)
- **Data**: [markm39/openproof-tactic-pairs](https://huggingface.co/datasets/markm39/openproof-tactic-pairs) -- 190K+ Lean 4 goal-tactic pairs extracted from Mathlib
- **Input format**: `{goal_state}:::`
- **Output format**: tactic string
- **Epochs**: ~1.8 (12,500 steps)
- **Batch size**: effective 128 (2 per device × 32 gradient accumulation steps)
- **Learning rate**: 2e-5 cosine schedule
- **Best eval loss**: 0.5492 (checkpoint-12500)
- **Hardware**: RunPod GPU instance
- **Framework**: HuggingFace TRL + PEFT
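
The input format above can be sketched as a tiny helper. This is an illustration of the `{goal_state}:::` convention, not code from the training repository; the function name is an assumption.

```python
def format_prompt(goal_state: str) -> str:
    """Append the ::: separator that marks the end of the goal state,
    matching the model's training input format."""
    return f"{goal_state}:::"

# A Lean 4 goal state uses real newlines between hypotheses and the goal line.
goal = "n m : Nat\n⊢ n + m = m + n"
print(format_prompt(goal))  # n m : Nat\n⊢ n + m = m + n:::
```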

## Intended Use

Tactic prediction for automated theorem proving in Lean 4. The model generates candidate tactics for a given proof goal state, which are then verified by the Lean type checker. Designed for best-first search: sample multiple candidates and verify each.

Not intended for general-purpose text generation.
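
The sample-and-verify loop described above can be sketched as follows. `sample_tactics` and `lean_check` are stand-ins (assumptions) for the model's generate call and the Lean type checker; they are not part of OpenProof's actual API.

```python
def pick_tactic(goal, sample_tactics, lean_check, n_candidates=8):
    """Sample candidate tactics for a goal and return the first one
    the checker accepts; None signals the caller to backtrack."""
    for tactic in sample_tactics(goal, n_candidates):
        if lean_check(goal, tactic):
            return tactic
    return None

# Toy stand-ins for illustration only: a fixed candidate list and a
# checker that accepts "simp".
candidates = ["ring", "simp", "omega"]
result = pick_tactic(
    "⊢ n + m = m + n",
    lambda goal, n: candidates[:n],
    lambda goal, tac: tac == "simp",
)
print(result)  # simp
```

In a real engine the verified tactic would open new subgoals, which are pushed back onto the search frontier rather than ending the loop.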

## Limitations

- Trained on Mathlib tactics only -- may not generalize to custom Lean libraries
- LoRA adapter (must be loaded on top of Qwen3.5-2B base)
- SFT only (no RL/GRPO refinement yet)
- May hallucinate lemma names that don't exist in Mathlib

## Links

- [OpenProof](https://github.com/mxthematic/openproof) -- the theorem prover that uses this model
- [Training code](https://github.com/mxthematic/openproof-ml)
- [Training data](https://huggingface.co/datasets/markm39/openproof-tactic-pairs)

## Acknowledgments

Training data derived from [LeanDojo](https://leandojo.org/) (Yang et al.), [Lean Workbook](https://huggingface.co/datasets/internlm/Lean-Workbook) (InternLM), and [Goedel-Pset](https://huggingface.co/datasets/Goedel-LM/Goedel-Pset) (Goedel-LM). Tactic extraction from whole proofs uses [Pantograph](https://github.com/leanprover/Pantograph) kernel-level invocations.