---
title: DeBERTa CWE Classification Training
emoji: 🤖
colorFrom: purple
colorTo: blue
sdk: gradio
sdk_version: 5.0.0
app_file: app.py
pinned: false
license: mit
hardware: 4x-l4
python_version: '3.10'
disable_embedding: false
---

πŸ€– DeBERTa CWE Classification - Fine-Tuning Interface

Production-grade Gradio interface for training DeBERTa models on CVE→CWE classification task with real-time monitoring.

## Features

- 📊 **Real-time Training Monitoring** - live progress updates and metrics streaming
- 📈 **Interactive Dashboard** - visualize loss, accuracy, F1 score, and learning rate
- ⚙️ **Hyperparameter Configuration** - full control over training parameters
- 💾 **Model Export** - automatic export to a local directory
- 🎯 **Optimal Settings** - pre-configured with the best hyperparameters (10 epochs, batch 16)
- 🔥 **GPU Acceleration** - automatic CUDA/MPS/CPU detection
- ⏸️ **Training Control** - early stopping and checkpoint management
- 📝 **Live Logs** - real-time training log streaming

## Dataset

- **Name:** stasvinokur/cve-and-cwe-dataset-1999-2025
- **Size:** ~300K CVE-CWE pairs from 1999-2025
- **Task:** Single-label classification (CVE description → CWE-ID)
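For the single-label setup above, CWE IDs must be mapped to integer class labels. A minimal sketch with illustrative rows (the example rows and column names are assumptions; check the dataset card for the real schema):

```python
# Illustrative rows only -- the real dataset has ~300K CVE-CWE pairs.
rows = [
    {"description": "SQL injection in login form", "cwe": "CWE-89"},
    {"description": "Buffer overflow in parser", "cwe": "CWE-120"},
    {"description": "Stored XSS in comment field", "cwe": "CWE-79"},
]

# Deterministic label mapping: sort the unique CWE IDs, then index them.
label2id = {cwe: i for i, cwe in enumerate(sorted({r["cwe"] for r in rows}))}
id2label = {i: cwe for cwe, i in label2id.items()}
labels = [label2id[r["cwe"]] for r in rows]
```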

## Optimal Hyperparameters

- Epochs: 10 (for best quality)
- Batch Size: 16 (effective: 64 with gradient accumulation)
- Learning Rate: 2e-5 with cosine schedule
- Warmup Ratio: 0.1
- Gradient Accumulation: 4 steps
- Early Stopping: patience of 5

## Usage

  1. Select model architecture (Base recommended)
  2. Configure hyperparameters (or use defaults)
  3. Click "πŸš€ Start Training"
  4. Monitor real-time progress in dashboard
  5. Model exports automatically to local directory
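The real-time dashboard in steps 3-4 typically relies on a generator that yields metric snapshots, which a Gradio UI can stream. An illustrative sketch (not the Space's actual code; `stream_metrics` and the record fields are hypothetical):

```python
def stream_metrics(history):
    """Yield one dashboard snapshot per logged training step."""
    for step, record in enumerate(history, start=1):
        yield {"step": step, "loss": record["loss"], "f1": record.get("f1")}

# Example: two logged steps, the second including an eval F1 score.
history = [{"loss": 1.2}, {"loss": 0.8, "f1": 0.61}]
snapshots = list(stream_metrics(history))
```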

## Output

The trained model is saved within the Space to `./models/deberta-cwe-final/`.

To use it locally, download the files from the Space's "Files" tab and copy them to your local CWE MCP directory.

## Hardware

This Space requires a GPU for efficient training. It is configured with 4x NVIDIA L4 GPUs (`4x-l4`).

**Training Time Estimates:**

- DeBERTa-Base on A10G: ~2-3 hours
- DeBERTa-Large on A10G: ~6-8 hours
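A back-of-the-envelope check on these figures: with ~300K examples and an effective batch of 64, each epoch is roughly 4,700 optimizer steps (dataset size and batch figures come from this README; exact counts depend on the train/eval split):

```python
dataset_size = 300_000        # ~300K CVE-CWE pairs (from this README)
effective_batch = 16 * 4      # per-device batch 16 * 4 accumulation steps
steps_per_epoch = dataset_size // effective_batch
total_steps = steps_per_epoch * 10   # 10 epochs
```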

## Developed By

Berghem - Smart Information Security. Licensed under MIT.
