Instructions to use HaadesX/Iconoclast with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use HaadesX/Iconoclast with Transformers:
# Load model directly from transformers import AutoModel model = AutoModel.from_pretrained("HaadesX/Iconoclast", dtype="auto") - Notebooks
- Google Colab
- Kaggle
ICONOCLAST β iLabs Cluster Handover & Session State
Last updated: 2026-04-22 09:10 EDT Session objective: Scale ICONOCLAST benchmarks to 11+ open-source models for publishable results
1. What Is Running Right Now
Active SLURM Jobs (as of 22:05 EDT)
| Job ID | Model / Script | Status | Node | Notes |
|---|---|---|---|---|
| 130619 | Qwen 3.5-9B (Base) | RUNNING | rlab2 | Re-running with set_submodule patch. |
| 130618 | Qwen 2.5-3B (Base) | RUNNING | rlab7 | Continuing from earlier. |
| 130620 | ICONOCLAST Sequential Sweep | RUNNING | ilab2 | Runs 7 models one-by-one. |
| 130621 | HERETIC Sequential Sweep | PENDING | (Dep) | Waits for 130620 to finish. Runs baselines. |
| 130640 | Large-N Evaluator Sweep | PENDING | (Dep) | Waits for 130621 to finish. Evaluates best parameters on 520 prompts. |
Sequential Runner (Job 130620) β Models in Order
This job runs 7 models one after another, cleaning up disk cache between each:
google/gemma-2-2b-itβ run namegemma2-2b-seqmistralai/Mistral-7B-Instruct-v0.3β run namemistral-7b-seqmicrosoft/Phi-4-mini-instructβ run namephi4-mini-seqstabilityai/stablelm-2-zephyr-1_6bβ run namestablelm2-1p6b-seq01-ai/Yi-1.5-9B-Chatβ run nameyi-1p5-9b-seqtiiuae/Falcon3-7B-Instructβ run namefalcon3-7b-seqallenai/OLMo-2-0425-1B-Instructβ run nameolmo2-1b-seq
Log file: ~/iconoclast/logs/iconoclast-seq-130620.out
2. Completed Results (from prior sessions)
These models already have batch_summary.json files on the cluster:
| Model | Run Name | Checkpoint Dir | Verdict |
|---|---|---|---|
| Qwen3-1.7B | qwen3-1p7b-rutgers-paper-directness |
/common/users/vp752/iconoclast_ilabs/checkpoints/qwen3-1p7b-rutgers-paper-directness/ |
ICONOCLAST |
| Qwen2.5-3B-Instruct | qwen2-5-3b-rutgers-benchmark |
/common/users/vp752/iconoclast_ilabs/checkpoints/qwen2-5-3b-rutgers-benchmark/ |
ICONOCLAST |
| Qwen3-4B-Instruct | qwen3-4b-rutgers-benchmark-v2 |
/common/users/vp752/iconoclast_ilabs/checkpoints/qwen3-4b-rutgers-benchmark-v2/ |
ICONOCLAST |
| Phi-3.5-mini-instruct | phi35-mini-rutgers-nullspace-benchmark-v3 |
/common/users/vp752/iconoclast_ilabs/checkpoints/phi35-mini-rutgers-nullspace-benchmark-v3/ |
ICONOCLAST |
Current Scorecard (4-0 from prior sessions)
| Model | ICONOCLAST Refusals | ICONOCLAST Overrefusals | ICONOCLAST KL | HERETIC Refusals | HERETIC Overrefusals | HERETIC KL | Verdict |
|---|---|---|---|---|---|---|---|
| Qwen3-1.7B | 0/48 | 0/48 | 0.0310 | 3/48 | 0/48 | 0.0332 | ICONOCLAST |
| Qwen2.5-3B | 2/20 | 1/64 | 0.0943 | 2/20 | 1/64 | 0.3257 | ICONOCLAST |
| Qwen3-4B | 2/20 | 0/64 | 0.7976 | 3/20 | 1/64 | 0.0996 | ICONOCLAST |
| Phi-3.5-mini | 3/20 | 2/64 | 0.0981 | 7/20 | 2/64 | 0.2492 | ICONOCLAST |
3. Pending Results (waiting for jobs to finish)
Once the running jobs complete, their results will appear as batch_summary.json files in:
/common/users/vp752/iconoclast_ilabs/checkpoints/<run-name>/batch_summary.json
Models pending results:
| Model | Run Name | Quant | Expected Checkpoint |
|---|---|---|---|
| Llama-3.1-8B-Instruct | llama3-1-8b-rutgers-benchmark |
bnb_4bit | checkpoints/llama3-1-8b-rutgers-benchmark/ |
| SmolLM2-1.7B-Instruct | smollm2-1p7b-rutgers-benchmark |
none | checkpoints/smollm2-1p7b-rutgers-benchmark/ |
| Gemma-2-2B-IT | gemma2-2b-seq |
none | checkpoints/gemma2-2b-seq/ |
| Mistral-7B-Instruct-v0.3 | mistral-7b-seq |
bnb_4bit | checkpoints/mistral-7b-seq/ |
| Phi-4-mini-instruct | phi4-mini-seq |
none | checkpoints/phi4-mini-seq/ |
| StableLM-2-Zephyr-1.6B | stablelm2-1p6b-seq |
none | checkpoints/stablelm2-1p6b-seq/ |
| Yi-1.5-9B-Chat | yi-1p5-9b-seq |
bnb_4bit | checkpoints/yi-1p5-9b-seq/ |
| Falcon3-7B-Instruct | falcon3-7b-seq |
bnb_4bit | checkpoints/falcon3-7b-seq/ |
| OLMo-2-1B-Instruct | olmo2-1b-seq |
none | checkpoints/olmo2-1b-seq/ |
4. How to Check Status
SSH into the cluster
ssh vp752@ilab.cs.rutgers.edu
Check running jobs
squeue -u vp752
Check job history (completed/failed)
sacct -u vp752 --starttime=2026-04-21 --format=JobID%10,JobName%15,State%12,ExitCode,Elapsed%10
Tail the sequential runner log
tail -f ~/iconoclast/logs/iconoclast-seq-130468.out
Tail a specific job's log
tail -f ~/iconoclast/logs/iconoclast-<JOBID>.out
tail -f ~/iconoclast/logs/iconoclast-<JOBID>.err
List all batch_summary.json files (completed benchmarks)
find /common/users/vp752/iconoclast_ilabs/checkpoints/ -name batch_summary.json
Generate the comparison table (once results exist)
python3 ~/iconoclast/scripts/summarize_multimodel_benchmark.py \
--spec "ModelName|/path/to/iconoclast/batch_summary.json|/path/to/heretic/batch_summary.json"
5. Known Issues & Fixes Applied
Disk Quota
- Problem: Concurrent jobs all downloading models simultaneously blow the per-user quota on
/common/users/vp752/. - Fix: Created
scripts/run_sequential_benchmark.slurmwhich runs models one-at-a-time andrm -rfthe cache between each. - Key: Never run more than ~2 model downloads concurrently.
Transformers Version
- Upgraded to
transformers==5.5.4(from 4.57.6) to supportqwen3_5architecture. - Also upgraded
huggingface_hub==1.11.0,tokenizers==0.22.2, plus new depstyper,annotated-doc,shellingham,click. - Installed via
--no-depsto avoid pulling in a new PyTorch/CUDA stack that would blow disk quota. - Risk: The new transformers v5 may have breaking changes for some older model architectures. If a model fails with
Failed to load model with all configured dtypes, check if it's an architecture compatibility issue.
Quantization
- Only
"none"and"bnb_4bit"are supported by ICONOCLAST's config validator. - Models >4B params need
bnb_4bitto fit on RTX A4000 (16GB) / A5000 (22GB). bitsandbytesis installed in the site-packages.
HF Token
HF_TOKENis set inscripts/run_rutgers_ilabs.slurm(line 51) and in the sequential runner.- Required for gated repos like
meta-llama/Llama-3.1-8B-Instructandgoogle/gemma-2-2b-it.
Qwen2.5-3B (Base) β Job 130448
- Ran for 31 minutes, produced trial data, but crashed with
AssertionError: Should not reach.in Optuna. - The Optuna study DB may have partial results. Check if
batch_summary.jsonwas written before crash. - Last observed metrics: KL=0.0408, Refusals=1/20, Overrefusals=3/64 (excellent).
Qwen3.5-9B (Base) & Mistral-7B
- Problem:
transformers v5.5.4removed/changed internal methods, causing'Qwen3_5ForConditionalGeneration' object has no attribute 'set_submodule'and similar errors for Mistral. - Fix: Applied a monkey-patch to
torch.nn.Moduleinsrc/iconoclast/model.pythat injectsset_submoduleif missing. - Status: Qwen 3.5-9B is currently re-running as Job 130619.
Gemma-2-2B (Chat Template)
- Problem: Gemma 2 chat template does not support the "system" role, causing crashes during evaluation.
- Fix: Updated
Model.generateinsrc/iconoclast/model.pyto automatically merge system prompts into the first user message if the chat template fails. - Status: Currently being retried in the sequential runner (Job 130620).
6. Key File Locations
Local (your Mac)
/Volumes/Auxilary/Side_Projects/NLP_PROJECT_NEW/iconoclast/
βββ PUBLISHABLE_RESULTS.md # Draft paper with results table
βββ HANDOVER_ILABS.md # This file
βββ config.*.benchmark.rutgers.toml # All model configs
βββ scripts/
β βββ run_rutgers_ilabs.slurm # Single-model SLURM script
β βββ run_sequential_benchmark.slurm # Multi-model sequential runner
β βββ setup_rutgers_env.sh # Environment bootstrap
β βββ sync_to_rutgers.sh # rsync to cluster
β βββ summarize_multimodel_benchmark.py # Results aggregator
β βββ bootstrap_and_submit_rutgers_*.sh # Per-model submit scripts
βββ src/iconoclast/
βββ main.py # Core pipeline (Optuna objective, ablation loop)
βββ direction.py # Null-space projection (dampening factor)
βββ model.py # Model loading & weight editing
Remote (iLabs cluster)
/common/home/vp752/iconoclast/ # Project source (synced from local)
/common/users/vp752/iconoclast_ilabs/ # Persistent storage root
βββ bootstrap-venv/ # Python venv for pip
βββ python312-site/ # All pip packages (transformers, optuna, etc.)
βββ checkpoints/ # Optuna study DBs + batch_summary.json
β βββ qwen3-1p7b-rutgers-paper-directness/
β βββ qwen2-5-3b-rutgers-benchmark/
β βββ qwen3-4b-rutgers-benchmark-v2/
β βββ phi35-mini-rutgers-nullspace-benchmark-v3/
β βββ llama3-1-8b-rutgers-benchmark/ # Pending
β βββ smollm2-1p7b-rutgers-benchmark/ # Pending
β βββ gemma2-2b-seq/ # Pending (sequential)
β βββ mistral-7b-seq/ # Pending (sequential)
β βββ ... (more from sequential runner)
βββ job-stage/ # Temporary per-job project copies
βββ job-cache/ # Temporary per-job HF model downloads
7. What To Do Next
Step 1: Check if jobs finished
ssh vp752@ilab.cs.rutgers.edu
squeue -u vp752
sacct -u vp752 --starttime=2026-04-21
Step 2: List all completed results
find /common/users/vp752/iconoclast_ilabs/checkpoints/ -name batch_summary.json -newer /common/users/vp752/iconoclast_ilabs/checkpoints/phi35-mini-rutgers-nullspace-benchmark-v3/batch_summary.json
Step 3: Run Qwen3.5-9B if disk is free
# Clean old caches first
rm -rf /common/users/vp752/iconoclast_ilabs/job-cache/*
# Then submit
cd ~/iconoclast
ICONOCLAST_CONFIG_TEMPLATE=config.qwen3_5_9b_base.benchmark.rutgers.toml \
ICONOCLAST_RUN_NAME=qwen3-5-9b-base-rutgers-benchmark-v2 \
sbatch scripts/run_rutgers_ilabs.slurm
Step 4: Verify the HERETIC Baselines
To prove ICONOCLAST is better, we need a side-by-side comparison with the standard HERETIC ablation (orthogonal ablation without null-space projection).
scripts/run_heretic_baselines.slurmis queued to run automatically after the main sweep.- It will produce
batch_summary.jsonfiles for all HERETIC models.
Step 5: Large-N Evaluation (520 Prompts)
To provide statistically significant proof, we evaluate the best trial configurations on a 520-prompt holdout set (mlabonne/harmful_behaviors).
scripts/run_large_eval_sweep.slurmis queued to run automatically after the HERETIC baselines.- The results for each model will be written to
/common/users/vp752/iconoclast_ilabs/large_evals/<model-name>_large_eval.json.
Step 6: Generate the final comparison table
python3 ~/iconoclast/scripts/summarize_multimodel_benchmark.py \
--spec "Qwen3-1.7B|.../iconoclast/batch_summary.json|.../heretic/batch_summary.json" \
# ... one --spec per model
Step 6: Update PUBLISHABLE_RESULTS.md
Fill in the pending rows in the results table with actual numbers.
Step 7: Write the in-depth analysis
Key questions to answer:
- Scaling hypothesis: Does KL divergence decrease with model size? (Compare 1B vs 3B vs 8B vs 9B)
- Architecture universality: Does ICONOCLAST work across Qwen, Llama, Gemma, Mistral, Phi, etc.?
- Base vs Instruct: Is the raw base model easier to edit than the RLHF-aligned instruct model?
8. Environment Variables Reference
| Variable | Purpose |
|---|---|
ICONOCLAST_CONFIG_TEMPLATE |
Which .toml config file to use |
ICONOCLAST_RUN_NAME |
Unique name for the Optuna study (changing this forces a fresh study) |
ICONOCLAST_EXIT_AFTER_OPTIMIZATION |
Set true for batch mode (no interactive menu) |
ICONOCLAST_STUDY_CHECKPOINT_DIR |
Where Optuna DB + batch_summary.json are saved |
HF_TOKEN |
HuggingFace token for gated repos |
PERSIST_ROOT |
/common/users/vp752/iconoclast_ilabs |
9. Quick Reference Commands
# Sync local changes to cluster
./scripts/sync_to_rutgers.sh
# Submit a single model benchmark
ICONOCLAST_CONFIG_TEMPLATE=config.xxx.toml \
ICONOCLAST_RUN_NAME=xxx-benchmark \
sbatch scripts/run_rutgers_ilabs.slurm
# Submit the sequential 7-model runner
sbatch scripts/run_sequential_benchmark.slurm
# Cancel a job
scancel <JOBID>
# Check disk usage
du -sh /common/users/vp752/iconoclast_ilabs/job-cache/
# Clean up all caches (only when no jobs are running!)
rm -rf /common/users/vp752/iconoclast_ilabs/job-cache/*
rm -rf /common/users/vp752/iconoclast_ilabs/job-stage/*