Refine GRPO evaluation details and clarify model performance comparisons in Blog.md 83f7214 thepikachu committed on Apr 26
Revise reasoning for choosing SFT + agentic loop over SFT + GRPO in deployment documentation 94c19b9 thepikachu committed on Apr 26
Update README.md to correct title and enhance color scheme, app configuration, and metadata b8672db thepikachu committed on Apr 26
Add initial blog post detailing ArchitectureEnv and its design approach 29c929e thepikachu committed on Apr 26
Add front matter to README.md for enhanced metadata and project visibility 1cbc836 thepikachu committed on Apr 26
Update README.md for improved clarity and structure, including quick start instructions and enhanced project description. 26e1800 thepikachu committed on Apr 26
Add analysis of SFT vs GRPO performance and rationale for model selection c7bd5c9 thepikachu committed on Apr 26
Remove outdated inference reward curve plot and add new loss and reward curve plots for improved analysis. ebba8ad thepikachu committed on Apr 26
Update LocalModelClient initialization to use MODEL_REPO_ID instead of MODEL_DIR 1c83fd8 thepikachu committed on Apr 25
Update README.md with installation instructions and modify agentic_inference.py to print MODEL_REPO_ID instead of MODEL_DIR 0ede54d thepikachu committed on Apr 25
Refactor code structure for improved readability and maintainability 19e79af thepikachu committed on Apr 5
Revise README.md for clarity and structure; update benchmark description and task details 7731b28 thepikachu committed on Apr 4
Simplify main function by removing host and port arguments in app.py d932af4 thepikachu committed on Apr 4