rtferraz/tucano2-commerce

tucano2-commerce / notebooks
288 kB
  • 2 contributors
History: 21 commits
rtferraz
add: base vs tuned comparison cell for V4.2 final evaluation
0c9199c verified about 2 months ago
  • cell_comparison_base_vs_tuned.py
    16.3 kB
    add: base vs tuned comparison cell for V4.2 final evaluation about 2 months ago
  • grpo_vertex_v3.ipynb
    82.6 kB
    apply v3 task-aware thinking controls and delete deprecated notebook 2 months ago
  • v4_1_instruct_grpo.ipynb
    50.3 kB
    notebooks: add V4.1 GRPO notebook (parser fix, 600 steps, LR 5e-6, constant_with_warmup) 2 months ago
  • v4_2_instruct_grpo.ipynb
    90.8 kB
    feat(rewards): add sentiment mismatch penalty to prevent extraction reward hacking about 2 months ago
  • v4_instruct_grpo.ipynb
    47.7 kB
    v4: ROOT CAUSE FIX — use standard PEFT not Unsloth get_peft_model (fused LoRA kernels have dtype bug #4891). Revert to load_in_4bit=True, dtype=None matching V3. 2 months ago