qwen3-8b-bs4-rl / README.md
seconds-0's picture
Upload folder using huggingface_hub
5f0a9b3 verified
metadata
library_name: transformers
license: apache-2.0
base_model: Qwen/Qwen3-8B
tags:
  - lora
  - fine-tuned
  - beautifulsoup
  - html-parsing
  - rl-trained
pipeline_tag: text-generation

BS4-Qwen3-8B Step 395

Qwen3-8B fine-tuned on BeautifulSoup HTML parsing tasks using reinforcement learning.

Training Details

  • Base model: Qwen/Qwen3-8B
  • Training method: RL with prime-rl
  • Training steps: 395
  • Hardware: 2x H100 80GB
  • LoRA config: rank=8, alpha=32
  • Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj

License

Apache 2.0 (same as base model)