
# RL-Optimized Chat Model

This repository contains an RL-optimized chat model based on `TinyLlama/TinyLlama-1.1B-Chat-v1.0`.

## Model Description

This model uses a DQN (Deep Q-Network) agent to select the most effective prompting strategy for the base language model at each turn.
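The idea can be sketched as follows: a small Q-network scores a fixed set of prompt templates given a state vector, and the highest-scoring template is chosen. The network shape, template list, and state features below are illustrative assumptions, not the repository's actual architecture:

```python
import torch
import torch.nn as nn

# Hypothetical prompt templates the agent could choose between.
PROMPT_TEMPLATES = [
    "Answer concisely: {query}",
    "Think step by step, then answer: {query}",
    "Respond with empathy: {query}",
]

class QNetwork(nn.Module):
    """Maps a state vector to one Q-value per prompt template."""
    def __init__(self, state_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64),
            nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)

q_net = QNetwork(state_dim=8, n_actions=len(PROMPT_TEMPLATES))
state = torch.zeros(1, 8)  # placeholder conversation features
action = q_net(state).argmax(dim=-1).item()  # greedy action = template index
prompt = PROMPT_TEMPLATES[action].format(query="How do transformers work?")
```

The selected `prompt` would then be fed to the base language model in place of the raw query.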

## Usage

```python
from huggingface_hub import hf_hub_download
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# 1. Download the RL agent checkpoint
rl_model_path = hf_hub_download(repo_id="yqq1231231/emotional_agent", filename="best_model.pt")

# 2. Load the base language model
base_model_name = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
base_model = AutoModelForCausalLM.from_pretrained(base_model_name)
tokenizer = AutoTokenizer.from_pretrained(base_model_name)

# 3. Load the RL agent checkpoint (map to CPU so it works without a GPU)
rl_agent = torch.load(rl_model_path, map_location="cpu")

# Now you can use both models together!
```

## Training

The agent was trained with reinforcement learning: it selects a prompting strategy, the base model generates a response, and a reward signal based on response quality drives the Q-network update.
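A minimal sketch of such a loop, assuming a standard DQN setup with an epsilon-greedy policy and a target network (the toy network sizes, hyperparameters, and reward here are illustrative, not the repository's actual training code):

```python
import random
import torch
import torch.nn as nn

n_actions, state_dim = 3, 8
q_net = nn.Linear(state_dim, n_actions)       # online network (toy size)
target_net = nn.Linear(state_dim, n_actions)  # target network, synced periodically
target_net.load_state_dict(q_net.state_dict())
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
gamma, epsilon = 0.99, 0.1

def select_action(state: torch.Tensor) -> int:
    # Epsilon-greedy: explore with probability epsilon, else exploit.
    if random.random() < epsilon:
        return random.randrange(n_actions)
    with torch.no_grad():
        return q_net(state).argmax(dim=-1).item()

def dqn_update(state, action, reward, next_state, done) -> float:
    # Bellman target: r + gamma * max_a' Q_target(s', a'), zeroed at episode end.
    with torch.no_grad():
        target = reward + (1 - done) * gamma * target_net(next_state).max(dim=-1).values
    pred = q_net(state).gather(-1, torch.tensor([action])).squeeze(-1)
    loss = nn.functional.mse_loss(pred, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# One illustrative step: in real training, reward would come from scoring
# the base model's response under the chosen prompting strategy.
s, s_next = torch.zeros(state_dim), torch.zeros(state_dim)
a = select_action(s)
loss = dqn_update(s, a, reward=1.0, next_state=s_next, done=0.0)
```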
