Supreeth Rao

Supreeth

https://supreethrao.com

AI & ML interests

Reinforcement Learning, Large Language Models, Distributed Computing

Recent Activity

updated a collection 4 days ago

SearchLM

Organizations

None yet

updated a collection 4 days ago

SearchLM

Collection

NL2BM25: teaching Qwen2.5-3B to generate Tantivy boolean queries via SFT + GRPO. Covers reward hacking (GRPO v1) and the shaped-reward fix (GRPO v2). • 4 items • Updated 4 days ago

updated a model 4 days ago

Supreeth/searchlm-nl2bm25-grpo-v2

Text Generation • 3B • Updated 4 days ago • 48

published a model 4 days ago

Supreeth/searchlm-nl2bm25-grpo-v2

Text Generation • 3B • Updated 4 days ago • 48

updated a model 4 days ago

Supreeth/searchlm-nl2bm25-grpo

Text Generation • 3B • Updated 4 days ago • 49

published a model 4 days ago

Supreeth/searchlm-nl2bm25-grpo

Text Generation • 3B • Updated 4 days ago • 49

updated a model 4 days ago

Supreeth/searchlm-nl2bm25-sft-v2

Text Generation • 3B • Updated 4 days ago • 47

published a model 4 days ago

Supreeth/searchlm-nl2bm25-sft-v2

Text Generation • 3B • Updated 4 days ago • 47

updated a model 4 days ago

Supreeth/searchlm-nl2bm25-sft

Text Generation • 3B • Updated 4 days ago • 62

published a model 4 days ago

Supreeth/searchlm-nl2bm25-sft

Text Generation • 3B • Updated 4 days ago • 62

updated a dataset 4 days ago

Supreeth/nl2bm25-sft

Viewer • Updated 4 days ago • 5.08k • 41

published a dataset 4 days ago

Supreeth/nl2bm25-sft

Viewer • Updated 4 days ago • 5.08k • 41

updated 2 models 2 months ago

Supreeth/verirl-rlvr-qwen3-4b-thinking

Updated Apr 26

Supreeth/verirl-sft-qwen3-4b-tooluse-merged

Text Generation • 4B • Updated Apr 26 • 52

published a model 2 months ago

Supreeth/verirl-sft-qwen3-4b-tooluse-merged

Text Generation • 4B • Updated Apr 26 • 52

updated a model 2 months ago

Supreeth/verirl-sft-qwen3-4b-tooluse

Updated Apr 26

published a model 2 months ago

Supreeth/verirl-sft-qwen3-4b-tooluse

Updated Apr 26

updated a Space 2 months ago

VeriRL — Verilog RTL Design Environment

🔬

Step through a VeriRL environment by sending actions and viewing the results
