YAML Metadata Warning:empty or missing yaml metadata in repo card
Check out the documentation for more information.
RL-Optimized Chat Model
This repository contains an RL-optimized chat model based on TinyLlama/TinyLlama-1.1B-Chat-v1.0.
Model Description
This model uses a DQN (Deep Q-Network) agent to select the best prompting strategy for a base language model.
Usage
from huggingface_hub import hf_hub_download
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
# 1. Download the RL model
rl_model_path = hf_hub_download(repo_id="yqq1231231/emotional_agent", filename="best_model.pt")
# 2. Load the base language model
base_model_name = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
base_model = AutoModelForCausalLM.from_pretrained(base_model_name)
tokenizer = AutoTokenizer.from_pretrained(base_model_name)
# 3. Load the RL model
rl_agent = torch.load(rl_model_path)
# Now you can use both models together!
Training
This model was trained using reinforcement learning to optimize prompt selection for better responses.
- Downloads last month
- 1
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support