metadata
library_name: transformers
license: apache-2.0
base_model: Qwen/Qwen3-8B
tags:
- lora
- fine-tuned
- beautifulsoup
- html-parsing
- rl-trained
pipeline_tag: text-generation
BS4-Qwen3-8B Step 395
Qwen3-8B fine-tuned with LoRA on BeautifulSoup HTML parsing tasks using reinforcement learning. This checkpoint was saved at training step 395.
Training Details
- Base model: Qwen/Qwen3-8B
- Training method: RL with prime-rl
- Training steps: 395
- Hardware: 2x H100 80GB
- LoRA config: rank=8, alpha=32
- Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
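The LoRA settings above could be written out roughly as follows. This is a minimal sketch, not the training configuration itself: the card does not say whether prime-rl builds its adapters through PEFT, so the use of `peft.LoraConfig` is an assumption, as are `task_type="CAUSAL_LM"` and every unlisted option (dropout, bias, and so on), which stay at library defaults. Only the rank, alpha, and target modules come from the list above.

```python
# Values taken from the Training Details list.
LORA_RANK = 8
LORA_ALPHA = 32
TARGET_MODULES = [
    "q_proj", "k_proj", "v_proj", "o_proj",  # attention projections
    "gate_proj", "up_proj", "down_proj",     # MLP projections
]

# Standard LoRA scales the low-rank update by alpha / rank.
lora_scaling = LORA_ALPHA / LORA_RANK

# Keyword arguments for peft.LoraConfig. task_type is an assumption;
# the card does not state it.
lora_kwargs = dict(
    r=LORA_RANK,
    lora_alpha=LORA_ALPHA,
    target_modules=TARGET_MODULES,
    task_type="CAUSAL_LM",
)

# Build the PEFT config only if the library is installed, so the
# sketch still runs without it.
try:
    from peft import LoraConfig
    lora_config = LoraConfig(**lora_kwargs)
except ImportError:
    lora_config = None

print(f"LoRA scaling factor (alpha / rank): {lora_scaling}")
print(f"Adapted modules per layer: {len(TARGET_MODULES)}")
```

With rank 8 and alpha 32, the adapter update is scaled by a factor of 4 under the standard alpha / rank rule, and adapters attach to all seven attention and MLP projection matrices in each transformer layer.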
License
Apache 2.0 (same as base model)