ayshajavd/vuln-classifier-training-notebooks


🔒 Vulnerability Classifier Training Notebooks

✅ ALL 4 NOTEBOOKS COMPLETE — Project Finished

Final Results

Model	Key Metric	Score
GraphCodeBERT Classifier	Macro F1	0.476 (+311% vs baseline 0.116)
	Weighted F1	0.945
	Safe Detection F1	0.982
CodeT5+ Fixer	BLEU	81.0
	ROUGE-L	0.788
	Eval Loss	0.175 (3.1x lower than v1's 0.547)

📋 Run Order (on Kaggle with free T4 GPU)

Notebook 1: Classifier Phase 1 (~2-3 hours) ✅ COMPLETE

Loads microsoft/graphcodebert-base
Freezes bottom 8/12 layers
Trains top layers + classifier head with ASL loss (4 epochs)
Saved to HF Hub branch phase1-checkpoint
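
The freezing step can be sketched as a name-based filter over the model's parameters. This is a minimal sketch, not the notebook's actual code. It assumes the parameter names follow the Hugging Face RoBERTa layout (`encoder.layer.<i>.`), that "bottom" means layers 0 to 7, and that the embeddings are frozen along with them. `should_freeze` and `freeze_bottom_layers` are hypothetical helper names.

```python
import re

# Matches the transformer block index in names such as
# "roberta.encoder.layer.3.attention.self.query.weight".
_LAYER_RE = re.compile(r"encoder\.layer\.(\d+)\.")


def should_freeze(param_name, num_frozen_layers=8, freeze_embeddings=True):
    """Return True if the named parameter belongs to a frozen part of the encoder."""
    match = _LAYER_RE.search(param_name)
    if match:
        # Layers 0 .. num_frozen_layers-1 are the "bottom" layers.
        return int(match.group(1)) < num_frozen_layers
    if param_name.startswith("embeddings.") or ".embeddings." in param_name:
        # Assumption: embeddings are frozen together with the bottom layers.
        return freeze_embeddings
    # Everything else (classifier head, pooler) stays trainable.
    return False


def freeze_bottom_layers(model, num_frozen_layers=8):
    """Set requires_grad on a PyTorch model according to should_freeze."""
    for name, param in model.named_parameters():
        param.requires_grad = not should_freeze(name, num_frozen_layers)
```

With a loaded 12-layer model, `freeze_bottom_layers(model)` would leave layers 8 to 11 and the classifier head trainable, which matches the Phase 1 description above.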

Notebook 2: Classifier Phase 2 (~4-5 hours) ✅ COMPLETE

Loads Phase 1 checkpoint from HF Hub
Unfreezes ALL layers
Full fine-tuning with lower LR (5 epochs, early stopping)
Saved to HF Hub branch phase2-checkpoint

Notebook 3: Thresholds + Calibration (~30 min) ✅ COMPLETE

Loads Phase 2 model
Per-class threshold optimization on validation set
Temperature scaling calibration (T=0.6163)
Full test set evaluation with classification report
Pushed final model to main branch with all configs
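
Per-class threshold optimization can be sketched as a grid search that picks, for each class, the cut-off with the best validation score. The notes do not say which metric or grid Notebook 3 uses, so this sketch assumes per-class F1 and a grid from 0.05 to 0.95 in steps of 0.05. `f1_binary` and `best_thresholds` are hypothetical names.

```python
def f1_binary(y_true, y_pred):
    """F1 score for one class given 0/1 labels and 0/1 predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_thresholds(probs, labels, grid=None):
    """Pick one threshold per class by maximizing F1 on validation data.

    probs and labels are lists of rows, one row per sample and
    one column per class.
    """
    if grid is None:
        grid = [i / 100 for i in range(5, 96, 5)]  # assumed search grid
    thresholds = []
    for c in range(len(probs[0])):
        y_true = [row[c] for row in labels]
        scores = [row[c] for row in probs]
        best_t, best_f1 = 0.5, -1.0
        for t in grid:
            y_pred = [1 if s >= t else 0 for s in scores]
            f1 = f1_binary(y_true, y_pred)
            if f1 > best_f1:  # keep the lowest threshold among ties
                best_t, best_f1 = t, f1
        thresholds.append(best_t)
    return thresholds
```

Searching on the validation set and reporting on the test set, as the steps above do, keeps the chosen thresholds from leaking into the final metrics.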

Notebook 4: Fixer Training (~7 hours) ✅ COMPLETE

Use notebook4_fixer_training_v3_FINAL.py (the definitive version)
Trained CodeT5+ 220M with CWE-aware input format
lr=1e-4 with a constant scheduler (configuration taken from the T5APR and MultiMend papers)
Early stopped at epoch 6 (best=epoch 3, eval_loss=0.1752)
BLEU=81.0, ROUGE-L=0.788 on test set
Pushed to HF Hub
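
The early-stopping result above (best at epoch 3, stopped at epoch 6) is consistent with a patience of 3 evaluations, although the notes do not state the patience value. The following is a minimal sketch of that rule under this assumption; `early_stopping_epoch` is a hypothetical helper, and the loss curve in the usage note is made up for illustration.

```python
def early_stopping_epoch(eval_losses, patience=3):
    """Return (stop_epoch, best_epoch), both 1-indexed.

    Training stops once `patience` consecutive evaluations have failed
    to improve on the best eval loss seen so far.
    """
    best_loss = float("inf")
    best_epoch = 0
    for epoch, loss in enumerate(eval_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    # Ran out of epochs before patience was exhausted.
    return len(eval_losses), best_epoch
```

For an invented curve such as `[0.40, 0.25, 0.18, 0.19, 0.20, 0.21, 0.22]`, the function returns `(6, 3)`: the best checkpoint is from epoch 3 and training halts after epoch 6.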

🚀 Deployed Resources

Resource	Repository	Status
Classifier Model	graphcodebert-vuln-classifier	✅ Live
Fixer Model	codet5p-vuln-fixer	✅ Live
Dataset	code-security-vulnerability-dataset	✅ 175K samples
Demo Space	code-security-analyzer	✅ v2 deployed

What Was Improved (v1 → v2)

Improvement	Description
GraphCodeBERT-base	125M params, 12 layers (was CodeBERTa-small 83M, 6 layers)
Asymmetric Loss (ASL)	γ⁻=4, γ⁺=0; chosen for the class imbalance (90% of samples are safe)
Two-phase training	Phase 1: freeze bottom 8 layers → Phase 2: full fine-tune
Per-class thresholds	Optimal threshold per CWE (instead of a global 0.3)
Temperature calibration	Temperature scaling (T=0.6163) so predicted probabilities are calibrated
CodeT5+ 220M fixer	3.7x larger than old flan-t5-small
CWE-aware input	Fixer model knows what vulnerability to fix
lr=1e-4 constant	Research-validated (T5APR + MultiMend papers)
BLEU + ROUGE eval	Fix quality measured with BLEU and ROUGE-L on the test set
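
The Asymmetric Loss row above can be illustrated with a small, dependency-free sketch. It assumes independent sigmoid outputs per class and uses the γ⁻=4, γ⁺=0 values from the table. The probability margin is not stated in these notes, so `margin=0.05` is an assumed placeholder, and `asymmetric_loss` is a hypothetical name rather than the notebook's function.

```python
import math


def asymmetric_loss(logits, targets, gamma_neg=4.0, gamma_pos=0.0,
                    margin=0.05, eps=1e-8):
    """Mean asymmetric loss over a flat list of logits and 0/1 targets."""
    total = 0.0
    for z, y in zip(logits, targets):
        p = 1.0 / (1.0 + math.exp(-z))  # sigmoid probability
        if y == 1:
            # Positive term: with gamma_pos=0 this is plain cross-entropy.
            total += -((1.0 - p) ** gamma_pos) * math.log(max(p, eps))
        else:
            # Negative term: shift the probability down by the margin, then
            # down-weight easy negatives with the focusing factor p_m ** gamma_neg.
            p_m = max(p - margin, 0.0)
            total += -(p_m ** gamma_neg) * math.log(max(1.0 - p_m, eps))
    return total / len(logits)
```

The design intent is that the many easy negatives contribute almost nothing to the loss, while positives keep their full cross-entropy weight. A negative with a logit of -3, for example, falls below the assumed margin and adds zero loss.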

⚠️ Notebook 4 Notes

Use notebook4_fixer_training_v3_FINAL.py — the other versions have bugs:

notebook4_fixer_training.py — ❌ Original (15 critical bugs)
notebook4_fixer_training_v2_FIXED.py — ❌ Partially fixed (still crashes)
notebook4_fixer_training_v3_FINAL.py — ✅ All bugs fixed

Key modifications needed for Kaggle (2025):

Cell 1: Uninstall peft first (pip uninstall -y peft) to fix StrictDataclassDefinitionError
Cell 8: Use lr=1e-4, lr_scheduler_type="constant", fp16=True, predict_with_generate=False
Cell 9: Add trainer.args.predict_with_generate = True before trainer.predict()
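
The Cell 8 and Cell 9 changes can be sketched as follows. The four settings come from the notes above; everything else is an assumption. The dictionary is meant to be unpacked into `transformers.Seq2SeqTrainingArguments` alongside the notebook's other arguments, and `predict_with_generation` is a hypothetical wrapper around the Cell 9 step.

```python
# Cell 8: keyword arguments for transformers.Seq2SeqTrainingArguments.
# Only the settings named in the notes are listed; output_dir, batch size,
# epochs and so on would come from the notebook itself.
FIXER_TRAINING_KWARGS = {
    "learning_rate": 1e-4,
    "lr_scheduler_type": "constant",
    "fp16": True,
    "predict_with_generate": False,  # evaluation during training reports loss only
}


def predict_with_generation(trainer, dataset):
    """Cell 9: switch generation on just before the final prediction pass.

    `trainer` is expected to be a Seq2SeqTrainer (or any object exposing
    `.args.predict_with_generate` and `.predict`).
    """
    trainer.args.predict_with_generate = True
    return trainer.predict(dataset)
```

Keeping generation off during training presumably keeps the per-epoch evaluation cheap, since generated text is only needed once, for the BLEU and ROUGE-L scores on the test set.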

📁 Files

notebook1_classifier_phase1.py — Phase 1 training ✅
notebook2_classifier_phase2.py — Phase 2 fine-tuning ✅
notebook3_thresholds_calibration_eval.py — Optimization + evaluation ✅
notebook4_fixer_training_v3_FINAL.py — ✅ Fixer training (use this one)
notebook4_fixer_training.py — ❌ Original (broken)
notebook4_fixer_training_v2_FIXED.py — ❌ Partially fixed (still crashes)
updated_app.py — ✅ Deployed to Space (v2 with calibration + thresholds)
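
How calibration and thresholds might combine at inference time is sketched below. This is not the code in updated_app.py. It assumes sigmoid outputs, that the temperature divides the logits before the sigmoid, and that thresholds are applied to the calibrated probabilities. The class names and threshold values in the usage note are invented, and `calibrated_probs` and `predict_labels` are hypothetical names.

```python
import math

TEMPERATURE = 0.6163  # value reported in these notes


def calibrated_probs(logits, temperature=TEMPERATURE):
    """Temperature-scaled sigmoid probabilities, one per class."""
    return [1.0 / (1.0 + math.exp(-z / temperature)) for z in logits]


def predict_labels(logits, thresholds, class_names, temperature=TEMPERATURE):
    """Return the class names whose calibrated probability reaches its threshold."""
    probs = calibrated_probs(logits, temperature)
    return [
        name
        for name, p, t in zip(class_names, probs, thresholds)
        if p >= t
    ]
```

For example, `predict_labels([-2.0, 0.5, 1.5], [0.5, 0.7, 0.6], ["safe", "CWE-79", "CWE-89"])` returns `["CWE-89"]`. Because T is below 1, the scaling pushes probabilities away from 0.5, so thresholds tuned on calibrated probabilities should not be reused on uncalibrated ones.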

📚 References

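T5APR: Empowering Automated Program Repair across Languages through Checkpoint Ensemble (arXiv:2309.15742). Source of the lr=1e-4, constant-scheduler setup for CodeT5 code repair.
MultiMend: Multilingual Program Repair with Context Augmentation and Multi-Hunk Patch Generation (arXiv:2501.16044). Same configuration, validated on 6 APR benchmarks.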
