Commit History

Fix health check: use Python instead of curl
abb18c6

mahithakur Claude Haiku 4.5 committed on

Fix HuggingFace Space deployment: install curl for health checks and increase startup timeout
49487a6

mahithakur Claude Haiku 4.5 committed on

Fix healthcheck to use correct port with PORT environment variable
3eeae84

mahithakur committed on

Fix root path 404: add redirect from / to /ui/
1ab43d1

mahithakur committed on

Fix HuggingFace Space deployment: use same-origin WebSocket and dynamic port
840c18a

mahithakur Claude Haiku 4.5 committed on

Fix WebSocket protocol for HTTPS deployment: use wss:// for secure connections
d8bca9a

mahithakur Claude Haiku 4.5 committed on

Rewrite README with philosophical, accessible prose for broader audience
7b76f88

mahithakur committed on

Formatted README
df53ef9

mahithakur committed on

Clean up submission links table formatting
b936ff4

mahithakur committed on

Update README with HF Space URL and submission links
d69b3cd

mahithakur committed on

Remove VS cache files, add to gitignore
b8587e5

mahithakur committed on

Add blog post and HF submission checklist for hackathon
6ba15c6

mahithakur committed on

Model name update
9d6a95e

mahithakur committed on

Updated readme
02fa6e2

mahithakur committed on

Document Colab 100-step training summary in README
759305e

mahithakur committed on

Readme cleanup
4bb4e67

mahithakur committed on

Add 100-step Colab GRPO training results
d25e8b9

mahithakur committed on

Fix eval decode prompt length slicing
fd3e88c

mahithakur committed on

Fix GRPO reward mapping and evaluation generation
b2874f4

mahithakur committed on

Include prompt column for GRPOTrainer
4fb3c37

mahithakur committed on

Use finite datasets.Dataset for GRPOTrainer compatibility
3476016

mahithakur committed on

Yield tensors from GRPO dataset generator
e817e69

mahithakur committed on

Fix GRPO dataloader batching on Kaggle
8568d9f

mahithakur committed on

Fix GRPO sample id mapping and Kaggle training setup
2066092

mahithakur committed on

Add sample training results and judge report for hackathon submission
95658b4

mahithakur committed on

Fix division by zero in final avg improvement calculation
022be04

mahithakur committed on

Fix category enum: use lowercase for ProbeAction categories
b91005f

mahithakur committed on

Fix eval_report.py: use step() return value and extract reward from observation
b3ab235

mahithakur committed on

Remove unsupported data_collator argument from GRPOTrainer
97ddd73

mahithakur committed on

Add data collator to GRPOTrainer for on-the-fly prompt tokenization
f2d68bf

mahithakur committed on

Revert to raw prompt format—let GRPOTrainer handle tokenization
e49866a

mahithakur committed on

Convert tokenized outputs to torch tensors for proper batching
9ba7512

mahithakur committed on

Tokenize prompts in dataset generator for GRPOTrainer compatibility
75be7bc

mahithakur committed on

Fix prompt format: convert from chat list to string for GRPOTrainer compatibility
77793ce

mahithakur committed on

Fix dataset format: only yield 'prompt' field to avoid tensor concatenation errors
80c91ac

mahithakur committed on

Remove tokenizer argument from GRPOTrainer—not supported in installed TRL version
4c16e6a

mahithakur committed on

Fix path bootstrap in train_grpo.py: use parent.parent to reach project root
bc2ac25

mahithakur committed on

Fix GRPO batch size config for Kaggle P100: batch_size=4, grad_accum=2 (global=8, divisible by num_generations=2)
feefe4a

mahithakur committed on

Add eval_report.py for before/after training comparison
4e029fe

mahithakur committed on

Add pre-training baseline results and graphs
754af78

mahithakur committed on

Fix JSON parsing and environment bugs
c22ceaa

mahithakur committed on

Blog post for later Hugging Face use
02eeb03

Thakur, Mahipal committed on

Updated readme
cd9d2e3

Thakur, Mahipal committed on

UI Integration
44bd7bd

Thakur, Mahipal committed on

Added meaningful comments
4ec7361

Thakur, Mahipal committed on

Code improvements
fa66cd4

Thakur, Mahipal committed on

Updated readme file
ab07180

Thakur, Mahipal committed on

refactor: remove legacy architecture, promote clean structure to repo root
85fab7b

Thakur, Mahipal committed on

Worked on folder structure
bb51474

Thakur, Mahipal committed on

Judge matching changes with name changes
104c835

Thakur, Mahipal committed on