ScottBiggs2/LLaMA-3.1-8B-Instruct-DPO-Triton-Sparse
Text Generation • 8B • Updated • 26
Model checkpoints generated during an ongoing research effort into the acceleration potential and tuning quality of LLMs with RL fine-tuning.