Spaces: Veer15 / faultline-env-train
Sleeping

faultline-env-train / training
78.3 kB
  • 3 contributors
History: 44 commits
Viraj
docs: correct HF endpoint deployment command
007ac94 about 1 month ago
  • artifacts
    Add episode replay dashboard and GRPO training pipeline about 1 month ago
  • config
    config: max_steps 100 -> 60 to fit budget at empirical 150s/step about 1 month ago
  • env_adapter
    env_client: retry transient 5xx + typed EnvUnavailableError; trainer survives env outages about 1 month ago
  • grpo
    env_client: retry transient 5xx + typed EnvUnavailableError; trainer survives env outages about 1 month ago
  • jobs
    trainer: A+B+C+D - soften think-mask, strict parser, add format reward, max_steps 100 about 1 month ago
  • notebooks
    Add episode replay dashboard and GRPO training pipeline about 1 month ago
  • prompts
    Add episode replay dashboard and GRPO training pipeline about 1 month ago
  • publish
    publish: mark merged models as transformers text-generation artifacts about 1 month ago
  • rollouts
    Fix training pipeline: TRL>=0.25 rollout, real generation, curriculum/W&B callbacks about 1 month ago
  • spaces
    docs: correct HF endpoint deployment command about 1 month ago
  • README.md
    10.4 kB
    Switch base model to Qwen/Qwen3-8B (Qwen3.5-9B is multimodal Qwen3_5ForConditionalGeneration, unsupported by unsloth) about 1 month ago
  • __init__.py
    0 Bytes
    Add episode replay dashboard and GRPO training pipeline about 1 month ago
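
The commit message on env_adapter and grpo mentions retrying transient 5xx responses and raising a typed EnvUnavailableError so the trainer can survive environment outages. The repository's actual code is not shown here, so the following is only a minimal sketch of that general pattern. Apart from the EnvUnavailableError name, everything in it (call_with_retry, send, max_attempts, base_delay, the exponential backoff) is a hypothetical choice made for illustration.

```python
import time
from typing import Callable, Tuple


class EnvUnavailableError(RuntimeError):
    """Raised when the environment keeps returning 5xx after all retries."""


def call_with_retry(
    send: Callable[[], Tuple[int, str]],  # hypothetical: returns (status, body)
    max_attempts: int = 3,                # assumed default, not from the repo
    base_delay: float = 0.5,              # assumed default, not from the repo
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() until it returns a non-5xx status or attempts run out."""
    last_status = None
    for attempt in range(max_attempts):
        status, body = send()
        if not 500 <= status < 600:
            # Success or a non-transient error: hand it back to the caller.
            return status, body
        last_status = status
        if attempt < max_attempts - 1:
            # Exponential backoff between attempts: base, 2*base, 4*base, ...
            sleep(base_delay * 2 ** attempt)
    raise EnvUnavailableError(
        f"env returned {last_status} after {max_attempts} attempts"
    )
```

Under these assumptions, a training loop could catch EnvUnavailableError around a rollout and skip or defer that step rather than crash, which is one way to read "trainer survives env outages".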