PEFT
qlora
sft
trl
qwen3
tmf921
intent-based-networking
network-slicing
rtx-6000-ada
ml-intern

Commit History

Add nohup launcher for focused 4-layer training
7559485
verified

nraptisss committed on

Add focused 4-layer SFT script: train only on tmf921/camara/3gpp/etsi_zsm (removes weak O1/A1/lifecycle layers)
0c12387
verified

nraptisss committed on

Fix RFT: batch generation, reduce to 200 prompts x 8 samples (~24h feasible on RTX 6000 Ada)
05ea6fa
verified

nraptisss committed on

Add Best-of-N rejection sampling + RFT pipeline for value fidelity improvement
f82a9bd
verified

nraptisss committed on

Add correct GRPO evaluation script that loads SFT-merged base + GRPO adapter
9b7e923
verified

nraptisss committed on

GRPO v3: fix truncation — use full 1536 completion length + G=4, we have 42GB headroom
bb2b127
verified

nraptisss committed on

Fix GRPO v2: lower temp=0.3, dense reward shaping, higher beta=0.1 to stay near SFT, G=2 safe for 48GB
b5f0025
verified

nraptisss committed on

Tune GRPO hyperparams for RTX 6000 Ada 48GB: reduce steps/grad_accum/completion_length for ~6h runtime, prevent OOM
ebe5562
verified

nraptisss committed on

Fix GRPO script: skip CPU merge, load base+adapter directly on GPU in 4-bit to avoid RAM OOM
70f2c09
verified

nraptisss committed on

Add GRPO nohup launch script for RTX 6000 Ada / A100
6f837b1
verified

nraptisss committed on

Add GRPO post-SFT training script with multi-component reward (JSON validity + key F1 + value F1 + layer-weighted bonus)
eb5665a
verified

nraptisss committed on

Upload scripts/run_all_baselines.sh
f198a32
verified

nraptisss committed on

Add baseline evaluation script for Llama/GPT-4o-mini comparison
0e7b293
verified

nraptisss committed on

Fix semantic evaluator to recover metadata by prediction id
d27f0bc
verified

nraptisss committed on

Add prototype semantic evaluator for O1 NRM and A1 policy
02caf44
verified

nraptisss committed on

Add sampled zero-shot baseline runner
e0c5f96
verified

nraptisss committed on

Add stage1 evaluation reproduction script
e16ed3a
verified

nraptisss committed on

Add qualitative failure example sampler
77fad9d
verified

nraptisss committed on

Add results packaging script
aaf8c59
verified

nraptisss committed on

Fix nohup evaluator to support merged model paths
eccc07b
verified

nraptisss committed on

Fix stage2 push_to_hub dataset metadata validation
0187cea
verified

nraptisss committed on

Add stage2 weak-layer nohup runner
c3b8793
verified

nraptisss committed on

Add stage2 adapter continuation trainer
60bd01c
verified

nraptisss committed on

Add weak-layer stage2 dataset builder
7474a91
verified

nraptisss committed on

Add normalized evaluator for existing predictions
63d52bc
verified

nraptisss committed on

Update nohup evaluator defaults for faster resumable batched generation
f4beb76
verified

nraptisss committed on

Speed up and resume OOD evaluation with batched dynamic generation
6f5475f
verified

nraptisss committed on

Harden Trackio Space validation to avoid startup crash
5a23de5
verified

nraptisss committed on

Fix TRL conversational dataset detection and remove warmup_ratio deprecation
608f732
verified

nraptisss committed on

Ensure CUDA GPU preflight and RTX 6000 Ada install path
91d636a
verified

nraptisss committed on

Add nohup run management and resumable checkpoint support
a896ecd
verified

nraptisss committed on

Add RTX 6000 Ada QLoRA training and evaluation repo
d9ba941
verified

nraptisss committed on