Iteration 15: full-pipeline test - C(2p,p) ≡ 2 (mod p) verified 4a5b19f Vilin97 Claude Opus 4.6 (1M context) committed on about 2 hours ago
Add repair_lean_proofs tool for instant sorry-filling f3ef14f Vilin97 Claude Opus 4.6 (1M context) committed on about 5 hours ago
Update landing page examples to showcase verified capabilities 0253f58 Vilin97 Claude Opus 4.6 (1M context) committed on about 8 hours ago
Iteration 12: reliability stress test - 22/30 verified 177cf65 Vilin97 Claude Opus 4.6 (1M context) committed on about 11 hours ago
Add extract_sorry_lemmas tool for automated decomposition 7879db5 Vilin97 Claude Opus 4.6 (1M context) committed on about 14 hours ago
Add KaTeX + markdown rendering to status page 5e2d2e6 Vilin97 Claude Opus 4.6 (1M context) committed on about 17 hours ago
Rate-limit Aristotle checks, add stuck detection for repeated errors 19f0d9f Vilin97 Claude Opus 4.6 (1M context) committed on about 20 hours ago
Iteration 8: boundary testing validates system capability 0534a89 Vilin97 Claude Opus 4.6 (1M context) committed on about 23 hours ago
Improve rejection emails, validate vacuous proof detection 15e13fe Vilin97 Claude Opus 4.6 (1M context) committed on 1 day ago
Add programmatic vacuous proof detection before LLM self-review 67992f3 Vilin97 Claude Opus 4.6 (1M context) committed on 1 day ago
Iteration 5: validation + HF deployment with all improvements 7e9a472 Vilin97 Claude Opus 4.6 (1M context) committed on 1 day ago
Submit to Aristotle early, add Lean error categorization 33e31b8 Vilin97 Claude Opus 4.6 (1M context) committed on 1 day ago
Strict self-review: extract theorem statements, ignore comments 60985bd Vilin97 Claude Opus 4.6 (1M context) committed on 1 day ago
Remove blocking wait_for_aristotle - 9x proof throughput improvement 2a8c189 Vilin97 Claude Opus 4.6 (1M context) committed on 2 days ago
Add hard context reset on LLM errors, first iteration log 8a27a76 Vilin97 Claude Opus 4.6 (1M context) committed on 2 days ago
Add self-review gate, increase iteration limit, prefer Lean-with-sorry for Aristotle 5c59a5f Vilin97 Claude Opus 4.6 (1M context) committed on 2 days ago
Refactor to FastAPI + background job architecture 1b86fa5 Vilin97 Claude Opus 4.6 (1M context) committed on 4 days ago
Remove Qwen 3.5, fix sorry/theorem validation, markdown emails c262c9c Vilin97 Claude Opus 4.6 (1M context) committed on 4 days ago
Fix sorry detection, explanation generation, add ISSUES.md 0a40264 Vilin97 Claude Opus 4.6 (1M context) committed on 4 days ago
12/12 Putnam 2025 verified - complete sweep, $1.13 total 759bac6 Vilin97 Claude Opus 4.6 (1M context) committed on 5 days ago
9/9 Putnam 2025 problems verified! Full test log update a6cab41 Vilin97 Claude Opus 4.6 (1M context) committed on 5 days ago
Auto-finalize on verified proof - pq-group solved in 3m23s d065f0b Vilin97 Claude Opus 4.6 (1M context) committed on 5 days ago
Add stats to email: time, cost, tokens, tool call counts 05d1842 Vilin97 Claude Opus 4.6 (1M context) committed on 5 days ago
Email now includes proof.lean and research_log.md attachments 1568228 Vilin97 Claude Opus 4.6 (1M context) committed on 5 days ago
Update LOG with pq-group 20min active-proving result d95748d Vilin97 Claude Opus 4.6 (1M context) committed on 6 days ago
Fix examples format, strengthen no-idle prompt, fix email deploy f14dc06 Vilin97 Claude Opus 4.6 (1M context) committed on 6 days ago
Add email notification for long-running proofs 033cbc7 Vilin97 Claude Opus 4.6 (1M context) committed on 6 days ago
Add context compression to prevent LLM 400 errors 1a40616 Vilin97 Claude Opus 4.6 (1M context) committed on 6 days ago
Active proving: Kimi + Qwen race against Aristotle 81ad344 Vilin97 Claude Opus 4.6 (1M context) committed on 6 days ago
Update LOG: pq-group fast path verified but agent didn't finalize 6da83ba Vilin97 Claude Opus 4.6 (1M context) committed on 6 days ago
Update LOG.md with UW analysis prelim results and new tool tests ab10662 Vilin97 Claude Opus 4.6 (1M context) committed on 6 days ago
Add Loogle search and Qwen 3.5 Lean prover d6a0482 Vilin97 Claude Opus 4.6 (1M context) committed on 6 days ago
Update LOG.md with FATE-X #10 test, update README lean version to v4.28 7247599 Vilin97 Claude Opus 4.6 (1M context) committed on 6 days ago
Improve false-statement handling, increase Aristotle timeout to 2h, better visibility 4093e2c Vilin97 Claude Opus 4.6 (1M context) committed on 6 days ago
Update LOG.md with comprehensive test results (20 queries) 5dc7261 Vilin97 Claude Opus 4.6 (1M context) committed on 6 days ago
Add VeriDeepResearch: verified math research chatbot 925bbe2 Vilin97 Claude Opus 4.6 (1M context) committed on 6 days ago