---
license: apache-2.0
library_name: transformers
base_model: Qwen/Qwen2.5-7B-Instruct
pipeline_tag: text-generation
tags:
- distillation
- reasoning-trace-extraction
- openthoughts
- qwen3
- victim-model
datasets:
- Chia-Mu-Lab/openthoughts_distill_victim_data_10k_clean_swag_qwen3_32b
---

# ot-q3_32b-clean

Qwen2.5-7B-Instruct **student** model fine-tuned by full-parameter SFT (s1 recipe) on **Qwen3-32B (OpenThoughts SWAG, V3-attack cleaned)** reasoning traces.

This repo is part of a 4-victim study comparing student distillation outcomes when the teacher's reasoning traces are extracted via the V3 attack (`-orig`) vs. when the V3 attack wrapper is stripped before training (`-clean`).

## How to load a specific epoch

Each `epoch_N/` subfolder is a self-contained, loadable HF checkpoint.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

REPO = "Chia-Mu-Lab/ot-q3_32b-clean"
model = AutoModelForCausalLM.from_pretrained(REPO, subfolder="epoch_5", torch_dtype="bfloat16")
tok = AutoTokenizer.from_pretrained(REPO, subfolder="epoch_5")
```

## Per-epoch evaluation

All numbers are accuracies in percent on the canonical eval suite (GSM8K-MATH500, AIME24, AIME25, JEEBench Math subset strict/partial, LiveCodeBench v5 pass@1). The `base` row is the Qwen2.5-7B-Instruct starting point, evaluated identically. **Bold** values across this row indicate per-victim peaks.

| Epoch | Ckpt | MATH500 | AIME24 | AIME25 | JEE Math (strict / partial) | LCB pass@1 |
|---:|:---|---:|---:|---:|---:|---:|
| 0 | base (Qwen2.5-7B-Instruct) | 71.0 | 8.9 | 2.2 | 32.2 / 35.9 | 15.8 |
| 1 | step-00619 | 54.2 | 2.2 | 8.9 | 23.2 / 27.2 | 18.3 |
| 2 | step-01239 | 59.8 | 8.9 | 6.7 | 25.2 / 29.9 | 17.9 |
| 3 | step-01858 | 69.5 | 12.2 | 15.6 | 35.1 / 38.9 | 16.1 |
| 4 | step-02478 | 72.8 | 14.4 | 17.8 | 36.4 / 41.1 | 15.8 |
| 5 | step-03095 | 74.9 | 12.2 | 15.6 | 39.3 / 43.6 | 14.7 |

## Training recipe

* Base model: **Qwen/Qwen2.5-7B-Instruct**
* Teacher traces: `Chia-Mu-Lab/openthoughts_distill_victim_data_10k_clean_swag_qwen3_32b`
* Recipe: s1 paper exact full fine-tune (FSDP full-shard, no LoRA)
* Block size: **32768** tokens · effective batch **16** (mb=1, ga=4, 4×H200)
* Optimizer: AdamW, lr=1e-5 cosine, warmup_ratio=0.05, bf16
* Epochs: **5**, `save_strategy=epoch`

## Files

```
ot-q3_32b-clean/
  README.md
  metrics.csv     ← machine-readable per-epoch metric table
  epoch_1/        ← full HF checkpoint dir (config.json, model-*.safetensors,
  epoch_2/          tokenizer*, etc.)
  epoch_3/
  epoch_4/
  epoch_5/
```

## Caveats / known issues

* All epochs here are from the canonical s1-distill 3-exp sweep (2026-05-20), evaluated with the unified math500+AIME+JEE+LCB scorer pipeline.
* JEE Math here refers to the subject="math" subset (≈236 of 515 questions) scored per the official `dair-iitd` `compute_metrics.py`. The strict number is the headline accuracy; partial gives MCQ(multiple) partial credit.
* These models are research artifacts for the steel-reasoning-trace project (reasoning-trace extraction attack study); do not use for production.
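
## Reading the per-epoch table

As a reading aid for the per-epoch table and the batch-size line in the training recipe, the sketch below finds the best fine-tuned epoch for each metric, reports its gain over the `base` row, and recomputes the effective batch size. The numbers are copied by hand from this README. The names `RESULTS`, `METRICS`, `peak_epochs`, and `effective_batch` are hypothetical helpers introduced for illustration and are not part of the repo; the layout of `metrics.csv` is not documented here, so the sketch does not read it.

```python
# Per-epoch accuracies (percent), copied by hand from the table in this README.
# Column order is an illustrative choice, not the layout of metrics.csv.
METRICS = ["MATH500", "AIME24", "AIME25", "JEE strict", "JEE partial", "LCB pass@1"]
RESULTS = {
    0: [71.0, 8.9, 2.2, 32.2, 35.9, 15.8],    # base (Qwen2.5-7B-Instruct)
    1: [54.2, 2.2, 8.9, 23.2, 27.2, 18.3],
    2: [59.8, 8.9, 6.7, 25.2, 29.9, 17.9],
    3: [69.5, 12.2, 15.6, 35.1, 38.9, 16.1],
    4: [72.8, 14.4, 17.8, 36.4, 41.1, 15.8],
    5: [74.9, 12.2, 15.6, 39.3, 43.6, 14.7],
}


def peak_epochs(results, metrics):
    """Map each metric to (best_epoch, best_value, delta_vs_base).

    Only fine-tuned epochs (epoch != 0) compete for the peak;
    epoch 0 is the base model and serves as the reference.
    """
    base = results[0]
    peaks = {}
    for i, name in enumerate(metrics):
        tuned = {ep: row[i] for ep, row in results.items() if ep != 0}
        best = max(tuned, key=tuned.get)
        peaks[name] = (best, tuned[best], round(tuned[best] - base[i], 1))
    return peaks


def effective_batch(micro_batch, grad_accum, num_gpus):
    """Effective batch = per-GPU micro-batch x grad-accumulation steps x GPUs."""
    return micro_batch * grad_accum * num_gpus


if __name__ == "__main__":
    for name, (ep, val, delta) in peak_epochs(RESULTS, METRICS).items():
        print(f"{name:12s} peak at epoch {ep}: {val:.1f} ({delta:+.1f} vs base)")
    # mb=1, ga=4, 4 GPUs, as listed in the training recipe
    print("effective batch:", effective_batch(1, 4, 4))
```

On these numbers, the math benchmarks peak at epoch 4 or 5, while LCB pass@1 peaks at epoch 1 and falls below the base value by epoch 5, so the choice of `subfolder` depends on which benchmark matters for the use case.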