Spaces: israaaML/fsds_cleaning_env

fsds_cleaning_env

303 kB

2 contributors

History: 6 commits

Claude Sonnet 4.6

add compare_agents.py: 4-way benchmark (Random/Heuristic/SFT/GRPO)

2968ead 2 months ago

configs
v2: curriculum scheduling, SFT pipeline, reward redesign, agent guide 2 months ago
examples
v3: benchmark results, final report, agent/eval improvements, smoke test fixes 2 months ago
server
fix: sanitize numpy/pandas types in submit_solution JSON serialization 2 months ago
tests
v2: curriculum scheduling, SFT pipeline, reward redesign, agent guide 2 months ago
training
v2: curriculum scheduling, SFT pipeline, reward redesign, agent guide 2 months ago
.dockerignore

80 Bytes
v2: curriculum scheduling, SFT pipeline, reward redesign, agent guide 2 months ago
.gitattributes

1.52 kB
initial commit 2 months ago
.gitignore

109 Bytes
v2: curriculum scheduling, SFT pipeline, reward redesign, agent guide 2 months ago
AGENT_GUIDE.md

17.5 kB
v2: curriculum scheduling, SFT pipeline, reward redesign, agent guide 2 months ago
AGENT_PLAN.md

11 kB
v2: curriculum scheduling, SFT pipeline, reward redesign, agent guide 2 months ago
Dockerfile

574 Bytes
Upload folder using huggingface_hub 2 months ago
FINAL_REPORT.md

8.09 kB
v3: benchmark results, final report, agent/eval improvements, smoke test fixes 2 months ago
README.md

9.69 kB
v2: curriculum scheduling, SFT pipeline, reward redesign, agent guide 2 months ago
__init__.py

520 Bytes
v2: curriculum scheduling, SFT pipeline, reward redesign, agent guide 2 months ago
agents.py

16.9 kB
v3: benchmark results, final report, agent/eval improvements, smoke test fixes 2 months ago
benchmark_guides.md

4.42 kB
v3: benchmark results, final report, agent/eval improvements, smoke test fixes 2 months ago
client.py

502 Bytes
Upload folder using huggingface_hub 2 months ago
compare_agents.py

7.33 kB
add compare_agents.py: 4-way benchmark (Random/Heuristic/SFT/GRPO) 2 months ago
curriculum.py

8.6 kB
v2: curriculum scheduling, SFT pipeline, reward redesign, agent guide 2 months ago
dataset_generators.py

16.7 kB
v3: benchmark results, final report, agent/eval improvements, smoke test fixes 2 months ago
demonstrations.py

18.5 kB
v2: curriculum scheduling, SFT pipeline, reward redesign, agent guide 2 months ago
evaluate_agent.py

6.03 kB
v3: benchmark results, final report, agent/eval improvements, smoke test fixes 2 months ago
evaluation_tasks.py

2.21 kB
v2: curriculum scheduling, SFT pipeline, reward redesign, agent guide 2 months ago
metrics.py

4.65 kB
v2: curriculum scheduling, SFT pipeline, reward redesign, agent guide 2 months ago
model.py

106 Bytes
Upload folder using huggingface_hub 2 months ago
models.py

1.17 kB
Upload folder using huggingface_hub 2 months ago
openenv.yaml

100 Bytes
Upload folder using huggingface_hub 2 months ago
plan_results.md

22.5 kB
v2: curriculum scheduling, SFT pipeline, reward redesign, agent guide 2 months ago
project_recommendations.md

7.36 kB
v2: curriculum scheduling, SFT pipeline, reward redesign, agent guide 2 months ago
pyproject.toml

909 Bytes
v2: curriculum scheduling, SFT pipeline, reward redesign, agent guide 2 months ago
results_heuristic.json

14.4 kB
v3: benchmark results, final report, agent/eval improvements, smoke test fixes 2 months ago
results_llm.json

12.1 kB
v3: benchmark results, final report, agent/eval improvements, smoke test fixes 2 months ago
results_random.json

13.9 kB
v3: benchmark results, final report, agent/eval improvements, smoke test fixes 2 months ago
reward.py

2.17 kB
v2: curriculum scheduling, SFT pipeline, reward redesign, agent guide 2 months ago
training_colab.py

16 kB
v3: benchmark results, final report, agent/eval improvements, smoke test fixes 2 months ago
training_sft.py

7.37 kB
v2: curriculum scheduling, SFT pipeline, reward redesign, agent guide 2 months ago