saketh-chervu/qwen3-06b-wordle-sft-all-data • Text Generation • 0.6B • 1
saketh-chervu/qwen3-06b-wordle-sft-phase2-best • Text Generation • 0.6B • 1
saketh-chervu/qwen3-06b-wordle-dpo-phase2-best • Text Generation • 0.6B • 1
saketh-chervu/qwen3-06b-wordle-dpo-phase2 • Text Generation • 0.6B • 1
saketh-chervu/qwen3-06b-wordle-sft-phase1-best • Text Generation • 0.6B
saketh-chervu/wordle-agent-sft-with-dpo-golden-pairs-qwen3-06b • Text Generation • 0.6B • 1
saketh-chervu/wordle-agent-sft-with-dpo-golden-pairs • Text Generation • 8B
saketh-chervu/wordle-agent-OpenThinker3-7B-sft • Text Generation • 8B • 1
saketh-chervu/wordle-agent-OpenThinker3-7B-sft-golden • Text Generation • 8B • 1
saketh-chervu/wordle-agent-llama-dpo-after-sft • Text Generation • 8B • 1
saketh-chervu/wordle-agent-llama-sft • Text Generation • 8B • 1
saketh-chervu/wordle-agent-dpo-checkpoint-3000 • Text Generation • 8B • 1
saketh-chervu/wordle-agent-dpo-checkpoint-1806 • Text Generation • 8B • 1
saketh-chervu/wordle-agent-sft-golden-2 • Text Generation • 8B • 1
saketh-chervu/llama31-8b-instruct-sft-golden-ft-wordle-agent • Text Generation • 8B • 1
saketh-chervu/llama31-8b-instruct-sft-ft-wordle-agent • Text Generation • 8B • 1
saketh-chervu/llama3-1b-instruct-sft-ft-wordle-agent • Text Generation • 1B • 1
saketh-chervu/llama3-1b-instruct-sft-wordle-agent • Text Generation • 1B • 4
saketh-chervu/qlearning-taxi-v3 • Reinforcement Learning
saketh-chervu/q-FrozenLake-v1-4x4-noSlippery • Reinforcement Learning
saketh-chervu/ppo-LunarLander-v2 • Reinforcement Learning • 1
saketh-chervu/distilroberta-base-finetuned-distilroberta • Fill-Mask • 2
saketh-chervu/distilgpt2-finetuned-wikitext2