# RL-Optimized Chat Model

This repository contains an RL-optimized chat model based on TinyLlama/TinyLlama-1.1B-Chat-v1.0.

## Model Description

This model uses a DQN (Deep Q-Network) agent to select the best prompting strategy for a base language model.

## Usage
```python
from huggingface_hub import hf_hub_download
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# 1. Download the RL agent checkpoint
rl_model_path = hf_hub_download(repo_id="yqq1231231/emotional_agent", filename="best_model.pt")

# 2. Load the base language model and tokenizer
base_model_name = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
base_model = AutoModelForCausalLM.from_pretrained(base_model_name)
tokenizer = AutoTokenizer.from_pretrained(base_model_name)

# 3. Load the RL agent (map_location avoids device errors on CPU-only machines)
rl_agent = torch.load(rl_model_path, map_location="cpu")

# Now you can use both models together!
```
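The card does not document the agent's architecture or its strategy set, so the sketch below is a hypothetical illustration of the idea: a small Q-network scores a fixed list of prompting strategies for a given state, and the highest-scoring strategy is chosen greedily. The `STRATEGIES` list, `QNetwork` layout, and state dimension are all assumptions, not the actual contents of `best_model.pt`.

```python
import torch
import torch.nn as nn

# Hypothetical strategy set; the strategies used in training are not documented here.
STRATEGIES = [
    "Answer directly and concisely.",
    "Think step by step before answering.",
    "Respond with empathy and warmth.",
]

class QNetwork(nn.Module):
    """Minimal Q-network sketch: maps a state vector to one Q-value per strategy."""

    def __init__(self, state_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64),
            nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)

def select_strategy(agent: nn.Module, state: torch.Tensor) -> str:
    """Greedy action selection: pick the strategy with the highest Q-value."""
    with torch.no_grad():
        q_values = agent(state)
    return STRATEGIES[int(q_values.argmax())]

# Toy usage with a random agent and a zero state vector.
agent = QNetwork(state_dim=8, n_actions=len(STRATEGIES))
strategy = select_strategy(agent, torch.zeros(8))
print(strategy)
```

The selected strategy string would then be prepended to the user's message (e.g. as a system prompt) before calling `base_model.generate`.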
## Training

This model was trained using reinforcement learning to optimize prompt selection for better responses.
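The training details are not published in this card, but a standard DQN update, which the model description implies, looks like the sketch below. The dimensions, reward signal, and hyperparameters are illustrative assumptions; in this setting the reward would come from some score of the generated response's quality.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

STATE_DIM, N_ACTIONS, GAMMA = 8, 3, 0.99  # assumed sizes, not from the card

q_net = nn.Linear(STATE_DIM, N_ACTIONS)       # online Q-network
target_net = nn.Linear(STATE_DIM, N_ACTIONS)  # periodically-synced copy
target_net.load_state_dict(q_net.state_dict())
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)

# A dummy batch of transitions (state, action, reward, next_state, done).
states = torch.randn(4, STATE_DIM)
actions = torch.randint(0, N_ACTIONS, (4, 1))
rewards = torch.randn(4)        # stand-in for a response-quality score
next_states = torch.randn(4, STATE_DIM)
dones = torch.zeros(4)          # 1.0 where the episode ended

# Standard DQN target: r + gamma * max_a' Q_target(s', a')
with torch.no_grad():
    next_q = target_net(next_states).max(dim=1).values
    targets = rewards + GAMMA * next_q * (1.0 - dones)

# Q-values of the actions actually taken, then a TD-error regression step.
q_values = q_net(states).gather(1, actions).squeeze(1)
loss = nn.functional.mse_loss(q_values, targets)

optimizer.zero_grad()
loss.backward()
optimizer.step()
print(float(loss))
```

Repeating this update over transitions sampled from a replay buffer, with the target network refreshed every few hundred steps, is the usual DQN recipe.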