qwen3-8b-bs4-rl / README.md

seconds-0

Upload folder using huggingface_hub

5f0a9b3 verified 4 months ago

preview code

raw

history blame contribute delete

617 Bytes

metadata

library_name: transformers
license: apache-2.0
base_model: Qwen/Qwen3-8B
tags:
  - lora
  - fine-tuned
  - beautifulsoup
  - html-parsing
  - rl-trained
pipeline_tag: text-generation

BS4-Qwen3-8B Step 395

Qwen3-8B fine-tuned on BeautifulSoup HTML parsing tasks using reinforcement learning.

Training Details

Base model: Qwen/Qwen3-8B
Training method: RL with prime-rl
Training steps: 395
Hardware: 2x H100 80GB
LoRA config: rank=8, alpha=32
Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj

License

Apache 2.0 (same as base model)