Commit History

Strip leaked W&B API key from training notebook
345af89
verified

Swastikr committed on

Drop 'small model' claim (training not yet converged)
fec9b0b
verified

Swastikr committed on

Point to new 300-step 7B W&B run (axwh8hrb)
ab92c1f
verified

Swastikr committed on

Drop 'small model' claim (training not yet converged)
16e8be7
verified

Swastikr committed on

Strip em-dashes; fix trap-library facts (4 held-out, 7 categories); explicit hardware-optimisation framing; cleaner 1x2 results grid
4a8c955
verified

Swastikr committed on

Drop 'small model' claim (training not yet converged)
e7387a6
verified

Swastikr committed on

Point to new 300-step 7B W&B run (axwh8hrb)
75e0dc4
verified

Swastikr committed on

Point to new 300-step 7B W&B run (axwh8hrb)
7be363f
verified

Swastikr committed on

Sync chunk: docs (diagrams + plots)
22bb5b2
verified

Swastikr committed on

Sync chunk: training scripts + notebook
d377bd4
verified

Swastikr committed on

Sync chunk: top-level files (md + manifest + dockerfile + pyproject)
08c7594
verified

Swastikr committed on

Point to new 300-step 7B W&B run (axwh8hrb)
a97a57b
verified

Swastikr committed on

Drop 'small model' claim (training not yet converged)
7f64b0e
verified

Swastikr committed on

Drop 'small model' claim (training not yet converged)
4611d54
verified

Swastikr committed on

Add training metrics grid; fix hardware-target claims (no RISC-V/Cortex-A78); add HF Jobs link
f6092ff
verified

Swastikr committed on

Push diagrams + updated README (blog assets, env explainer)
8d51569
verified

Swastikr committed on

Auto-update blog with latest job results
21843b0
verified

Swastikr committed on

Add HF job links (login-gated) to blog links section
8cc4b1d
verified

Swastikr committed on

Add public W&B run link for 300-step rerun
31cf2be
verified

Swastikr committed on

Upload training_runs/partial-20260426-061351/grpo_component_means.png with huggingface_hub
1e1bb8a
verified

Swastikr committed on

Upload training_runs/partial-20260426-061351/grpo_reward_curve.png with huggingface_hub
b38e4e6
verified

Swastikr committed on

Upload training_runs/partial-20260426-061351/baseline_vs_trained_metrics.png with huggingface_hub
e008f4d
verified

Swastikr committed on

Upload training_runs/partial-20260426-061351/reward_distribution.png with huggingface_hub
f854c6b
verified

Swastikr committed on

Upload training_runs/partial-20260426-061351/summary.json with huggingface_hub
56e1a53
verified

Swastikr committed on

Upload training_runs/partial-20260426-060026/grpo_component_means.png with huggingface_hub
4284d77
verified

Swastikr committed on

Upload training_runs/partial-20260426-060026/grpo_reward_curve.png with huggingface_hub
cb869de
verified

Swastikr committed on

Upload training_runs/partial-20260426-060026/baseline_vs_trained_metrics.png with huggingface_hub
ccde9ce
verified

Swastikr committed on

Upload training_runs/partial-20260426-060026/reward_distribution.png with huggingface_hub
15ac920
verified

Swastikr committed on

Upload training_runs/partial-20260426-060026/summary.json with huggingface_hub
fc22824
verified

Swastikr committed on

Sync latest submission updates (GRPO-only + blog markdown + bugfixes)
bca801b
verified

Swastikr committed on

Upload training_runs/partial-20260425-211348/reward_distribution.png with huggingface_hub
015a9ee
verified

Swastikr committed on

Upload training_runs/partial-20260425-211348/summary.json with huggingface_hub
502d078
verified

Swastikr committed on

Upload training_runs/partial-20260425-210137/reward_distribution.png with huggingface_hub
d418572
verified

Swastikr committed on

Upload training_runs/partial-20260425-210137/summary.json with huggingface_hub
ea93dca
verified

Swastikr committed on

Upload training_runs/partial-20260425-205532/reward_distribution.png with huggingface_hub
e6affec
verified

Swastikr committed on

Upload training_runs/partial-20260425-205532/summary.json with huggingface_hub
163552c
verified

Swastikr committed on

Improve training data quality, teacher policy, ablation, and reward-aligned stage
26a1334
verified

Swastikr committed on

Upload training_runs/partial-20260425-161650/reward_distribution.png with huggingface_hub
76e7027
verified

Swastikr committed on

Upload training_runs/partial-20260425-161650/summary.json with huggingface_hub
285f68a
verified

Swastikr committed on

Add LoRA-enabled script runner with auth-safe snapshot download
46d8540
verified

Swastikr committed on

Add eval progress prints for long runs
adb3c10
verified

Swastikr committed on

Improve notebook runner logging and timeout safety
7469aa3
verified

Swastikr committed on

Use LoRA adapter tuning to avoid T4 OOM
c10efea
verified

Swastikr committed on

Sync latest notebook training fixes
9e86376
verified

Swastikr committed on

Add notebook runner script for HF Jobs
7990b41
verified

Swastikr committed on

Use pure fp16 weights without AMP scaler in SFT
0460459
verified

Swastikr committed on

Disable grad clipping for stable fp16 SFT on T4
973fe97
verified

Swastikr committed on

Fix CUDA dtype and device handling for full training
70766a6
verified

Swastikr committed on

Enable GPU-friendly full training config
ebc80d7
verified

Swastikr committed on

Upload training_runs/notebooks/openenv_hackathon_training.partial.20260425-124204.ipynb with huggingface_hub
bec441e
verified

Swastikr committed on