Spaces: Veer15/faultline-env-train
Branch: main
Path: faultline-env-train/training (78.3 kB)
3 contributors
History: 44 commits
Latest commit: 007ac94 by Viraj, "docs: correct HF endpoint deployment command", about 1 month ago
Name | Size | Last commit | Updated
artifacts/ | - | Add episode replay dashboard and GRPO training pipeline | about 1 month ago
config/ | - | config: max_steps 100 -> 60 to fit budget at empirical 150s/step | about 1 month ago
env_adapter/ | - | env_client: retry transient 5xx + typed EnvUnavailableError; trainer survives env outages | about 1 month ago
grpo/ | - | env_client: retry transient 5xx + typed EnvUnavailableError; trainer survives env outages | about 1 month ago
jobs/ | - | trainer: A+B+C+D - soften think-mask, strict parser, add format reward, max_steps 100 | about 1 month ago
notebooks/ | - | Add episode replay dashboard and GRPO training pipeline | about 1 month ago
prompts/ | - | Add episode replay dashboard and GRPO training pipeline | about 1 month ago
publish/ | - | publish: mark merged models as transformers text-generation artifacts | about 1 month ago
rollouts/ | - | Fix training pipeline: TRL>=0.25 rollout, real generation, curriculum/W&B callbacks | about 1 month ago
spaces/ | - | docs: correct HF endpoint deployment command | about 1 month ago
README.md | 10.4 kB | Switch base model to Qwen/Qwen3-8B (Qwen3.5-9B is multimodal Qwen3_5ForConditionalGeneration, unsupported by unsloth) | about 1 month ago
__init__.py | 0 Bytes | Add episode replay dashboard and GRPO training pipeline | about 1 month ago