4
long
seamoke111
AI & ML interests
None yet
Recent Activity
Upvoted a paper about 16 hours ago: SPPO: Sequence-Level PPO for Long-Horizon Reasoning Tasks
Upvoted a paper 6 months ago: Information Gain-based Policy Optimization: A Simple and Effective Approach for Multi-Turn LLM Agents
Upvoted a paper 7 months ago: The Choice of Divergence: A Neglected Key to Mitigating Diversity Collapse in Reinforcement Learning with Verifiable Reward
Organizations
None yet
seamoke111's models (1)
seamoke111/HTL-CodeLlama-7B • Text Generation • 7B • Updated Jun 20, 2024 • 1