RL Casino Models - a ScottBiggs2 Collection

updated Jan 16

Model checkpoints generated during an ongoing research effort into how far RL fine-tuning of LLMs can be accelerated and how that affects tuning quality.

ScottBiggs2/LLaMA-3.1-8B-Instruct-DPO-Triton-Sparse

Text Generation • 8B • Updated Dec 30, 2025

ScottBiggs2/LLaMA-3.1-8B-Instruct-DPO-Baseline

Text Generation • 8B • Updated Dec 30, 2025