roleplayer-actor-lora by aifeifei798
An expert-level, hyper-specialized LoRA for high-fidelity Chinese role-playing. This adapter transforms the base model into a professional "digital actor," capable of adopting complex personas with remarkable consistency and detail.
Model Description
This is not just another chatbot LoRA. This model was fine-tuned with the specific goal of achieving extreme fidelity to character instructions. It excels at:
- Deep Persona Adoption: Faithfully adheres to complex character backstories, personality traits, and linguistic styles provided in the instruction prompt.
- Natural Dialogue Flow: Generates responses that are coherent, in-character, and contextually appropriate.
- Descriptive Action Generation: A key feature of this model is its ability to spontaneously generate descriptive actions within brackets 【...】, such as 【gazes calmly at the user】, which significantly enhances the immersive role-playing experience. This skill has been observed to generalize to new, unseen characters.
- Extreme Specialization: The model was trained to a very low loss value (~0.25), indicating a state of "perfect convergence" or "artistic overfit" on the high-quality role-playing dataset. This makes it an incredibly "pure" and stable actor.
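Since the bracketed actions are part of the plain text output, an application may want to separate them from the spoken line (for example, to render them in italics or drive an NPC animation). The following is a minimal sketch, not part of the model or its tooling; the helper name `split_actions` is hypothetical, and it assumes actions are always wrapped in the full-width brackets 【 and 】 without nesting.

```python
import re

# Matches one action wrapped in full-width brackets, e.g. 【gazes calmly at the user】.
# Assumption: brackets are never nested.
ACTION_PATTERN = re.compile(r"【([^【】]*)】")


def split_actions(reply: str) -> tuple[str, list[str]]:
    """Return (spoken_text, actions) for a single model reply (hypothetical helper)."""
    actions = [match.strip() for match in ACTION_PATTERN.findall(reply)]
    # Replace each bracketed span with a space, then collapse repeated whitespace.
    spoken = ACTION_PATTERN.sub(" ", reply)
    spoken = re.sub(r"\s+", " ", spoken).strip()
    return spoken, actions


reply = "You kid don't know about Old Li's Noodle House nearby? 【points towards the intersection】"
spoken, actions = split_actions(reply)
print(spoken)
print(actions)
```

Replacing each action with a space keeps English words from running together; for Chinese output, where no spaces separate words, substituting an empty string may be the better choice.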
This LoRA is ideal for applications requiring deep character immersion, such as interactive storytelling, advanced NPC development for games, or creating highly personalized chatbot companions.
How to Use
This model is a LoRA adapter and must be loaded on top of its base model. The use of unsloth is highly recommended for maximum performance and efficiency.
First, install the necessary libraries:
pip install "unsloth[colab-new]"
pip install "transformers>=4.38.0"
pip install "torch>=2.1.0"
Then, you can use the following Python script to load the model and run inference:
import torch
from unsloth import FastLanguageModel
from transformers import pipeline
# 1. Load the base model
# This model uses the 4-bit quantized version of Gemma-3-4B-it
base_model_path = "unsloth/gemma-3-4b-it-qat-unsloth-bnb-4bit"
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=base_model_path,
    load_in_4bit=True,
    device_map="auto",
)
# 2. Load the LoRA adapter from Hugging Face Hub
print("Loading the 'Professional Actor' LoRA adapter from aifeifei798...")
model.load_adapter("aifeifei798/roleplayer-actor-lora")
print("Adapter loaded successfully!")
# 3. Prepare the prompt using the Alpaca format
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
{}
### Input:
{}
### Response:
{}"""
# --- Example Character: "Lao Pao'er" (The Old Timer) ---
instruction = """You are a retired veteran named "Lao Pao'er," known for being hot-tempered and straight-talking, but with a heart of gold.
Your catchphrase is "Hey, I tell you, kid...".
Your language style is full of authentic Beijing dialect and is concise and powerful."""
user_input = "Hey grandpa, could you tell me where I can find a place to eat around here?"
# --- (Alternate Example: "Virene" the Bounty Hunter) ---
# instruction = """Character Name: Virene
# Background: A mysterious bounty hunter... (full description)"""
# user_input = "I need a bounty hunter for a mission. I've heard you are the best. Can we discuss a partnership?"
prompt = alpaca_prompt.format(
    instruction,
    user_input,
    "",  # The Response is left empty for the model to generate
)
# 4. Run inference
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
generation_args = {
    "max_new_tokens": 256,
    "do_sample": True,
    "temperature": 0.7,
    "top_p": 0.9,
    "top_k": 50,
    "pad_token_id": tokenizer.eos_token_id,
}
outputs = pipe(prompt, **generation_args)
# The pipeline returns a list with one dict per generated sequence.
# Its text includes the full prompt, so keep only what follows the last "### Response:" marker.
response = outputs[0]["generated_text"].split("### Response:")[-1].strip()
print("\n--- Model's Performance ---")
print(response)
# Expected Output (example):
# You kid don't know about Old Li's Noodle House nearby? Lots of people, great taste! 【points towards the intersection】
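The prompt construction and response extraction in the script can also be factored into two small helpers, which makes them testable without loading the model. This is a sketch under stated assumptions: the names `build_prompt` and `extract_response` are hypothetical, the template mirrors the Alpaca format as printed in the script above, and cutting the output at a following "### Instruction:" header is an added safeguard rather than something the model card prescribes.

```python
# Template mirrors the Alpaca format as printed in the script above.
ALPACA_PROMPT = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n"
    "### Instruction:\n{}\n"
    "### Input:\n{}\n"
    "### Response:\n{}"
)
RESPONSE_MARKER = "### Response:"


def build_prompt(instruction: str, user_input: str) -> str:
    """Fill the template, leaving the response slot empty (hypothetical helper)."""
    return ALPACA_PROMPT.format(instruction, user_input, "")


def extract_response(generated_text: str, prompt: str) -> str:
    """Strip the echoed prompt and any follow-on turn from the generated text."""
    if generated_text.startswith(prompt):
        tail = generated_text[len(prompt):]
    else:
        # Fallback: keep everything after the first response marker.
        tail = generated_text.split(RESPONSE_MARKER, 1)[-1]
    # Assumption: if the model starts a new "### Instruction:" block, drop it.
    tail = tail.split("### Instruction:", 1)[0]
    return tail.strip()


prompt = build_prompt("You are a retired veteran named \"Lao Pao'er\".", "Where can I eat?")
# Simulated pipeline output: the prompt followed by a reply and a stray extra turn.
fake_output = prompt + " Old Li's Noodle House! 【points】\n### Instruction:\nignored"
print(extract_response(fake_output, prompt))
```

If only the new text is needed, the Transformers text-generation pipeline also accepts `return_full_text=False`, which avoids echoing the prompt in the first place.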
Training Details
- Base Model: unsloth/gemma-3-4b-it-qat-unsloth-bnb-4bit
- Dataset: shibing624/roleplay-zh-sharegpt-gpt4-data
- Training Framework: unsloth
- Hardware: 1x NVIDIA RTX 3070 (8GB VRAM)
- Key Hyperparameters:
  - per_device_train_batch_size: 1
  - gradient_accumulation_steps: 8
  - max_seq_length: 2048
  - learning_rate: 2e-4
- Training Insight: The training was manually stopped at epoch 0.45 (step 3410 of 7638) because the training loss had already converged to an exceptionally low value of ~0.25, which was taken as a sign that the model was already fitting the dataset very closely.
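The figures above can be cross-checked with a little arithmetic. The sketch below assumes that the 7638 total steps correspond to exactly one epoch (which is what "epoch 0.45 at step 3410" suggests) and that a single GPU was used; the derived dataset size is therefore only an estimate and ignores a possibly partial final batch.

```python
# Values taken from the training details above.
per_device_train_batch_size = 1
gradient_accumulation_steps = 8
num_devices = 1        # the card lists a single RTX 3070
stopped_step = 3410
total_steps = 7638     # assumption: this is the step count for one full epoch

# Sequences contributing to each optimizer step.
effective_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_devices
)
# Fraction of the epoch completed when training was stopped.
epoch_fraction = stopped_step / total_steps
# Training sequences processed before stopping.
sequences_seen = stopped_step * effective_batch_size
# Estimated number of training sequences in one epoch.
approx_dataset_size = total_steps * effective_batch_size

print(effective_batch_size)      # 8
print(round(epoch_fraction, 2))  # 0.45
print(sequences_seen)            # 27280
print(approx_dataset_size)       # 61104
```

Under these assumptions, the model saw roughly 27,000 of an estimated 61,000 training sequences before the run was stopped.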
Limitations and Bias
- Specialist, Not a Generalist: This model is a hyper-specialized actor. Its capabilities in other domains (e.g., coding, factual Q&A, scientific reasoning) may be significantly degraded compared to the base model. It prioritizes staying in character over providing factual accuracy.
- Data Bias: The model will reflect the biases and stereotypes present in the roleplay-zh-sharegpt-gpt4-data dataset.
- Language: The model is primarily trained on Chinese data and will perform best in Mandarin Chinese.
Author
This model was trained by aifeifei798. A journey of deep learning, intense debugging, and creative exploration led to the birth of this "Professional Actor." All credit for this excellent LoRA goes to them.
Model tree for aifeifei798/roleplayer-actor-lora
Base model
google/gemma-3-4b-pt