PEFT
qlora
sft
trl
qwen3
tmf921
intent-based-networking
network-slicing
rtx-6000-ada
ml-intern

Commit History

Fix RFT: batch generation, reduce to 200 prompts x 8 samples (~24h feasible on RTX 6000 Ada)
05ea6fa

nraptisss committed on

Add Best-of-N rejection sampling + RFT pipeline for value fidelity improvement
f82a9bd

nraptisss committed on