NaviGen GRPO Adapter - step600
This repository contains the GRPO-trained LoRA adapter used by NaviGen, a personalized generative recommendation model for producing user-aware image and video generation instructions.
NaviGen represents each item with a dual identifier that couples a collaborative code and a textual code in one token stream. This adapter is the reinforcement learning stage of the NaviGen pipeline: it further aligns the stage-2 supervised model with user intent through reward-guided optimization.
Model Details
- Model name: NaviGen GRPO Adapter, step600
- Model type: PEFT LoRA adapter for causal language modeling
- Base model: NaviGen-stage2-base
- Backbone family: Qwen3-style causal LM
- Training stage: GRPO reinforcement learning after two-stage SFT
- Adapter format: adapter_model.safetensors
- PEFT version: 0.19.1
The adapter targets the main attention and MLP projection layers:
q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
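As a rough illustration of what this target list covers, the sketch below mimics the suffix rule PEFT applies when target_modules is given as a list of names: a module is selected when its dotted name equals a target or ends with a dot followed by the target. The module names in the example are assumed from the usual Qwen-style layer layout and are not read from this repository.

```python
# Sketch only: reproduces the name-suffix rule for list-valued target_modules.
TARGET_MODULES = ["q_proj", "k_proj", "v_proj", "o_proj",
                  "gate_proj", "up_proj", "down_proj"]


def is_targeted(module_name: str, targets=TARGET_MODULES) -> bool:
    """Return True if the dotted module name equals or ends with a target name."""
    return any(module_name == t or module_name.endswith("." + t) for t in targets)


# Hypothetical module names in the style of a Qwen-like decoder layer.
example_names = [
    "model.layers.0.self_attn.q_proj",
    "model.layers.0.mlp.down_proj",
    "model.layers.0.input_layernorm",
    "model.embed_tokens",
    "lm_head",
]
selected = [n for n in example_names if is_targeted(n)]
```

Under these assumed names, only the attention and MLP projections are selected; normalization layers, embeddings, and the output head are left without adapter weights.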
Intended Use
This adapter is intended for research on personalized generative recommendation, especially settings where a model should infer user preference from historical item identifiers and produce more specific, relevant, and visually generatable generation instructions.
Typical uses include:
- Personalized prompt or instruction generation for image/video models
- Next-item or identifier prediction under the NaviGen token format
- Reproduction and analysis of the NaviGen RL stage
- Ablation studies comparing SFT and GRPO-aligned checkpoints
This adapter is not a standalone model. It must be loaded on top of the corresponding NaviGen stage-2 base model.
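One way to guard against pairing the adapter with the wrong base model is to inspect adapter_config.json before loading. The sketch below assumes the standard PEFT config keys peft_type, base_model_name_or_path, and target_modules; the helper name check_adapter_config and the sample values are illustrative and are not copied from this repository's files.

```python
import json


def check_adapter_config(config: dict, expected_base: str) -> list:
    """Collect human-readable problems found in a PEFT adapter config dict."""
    problems = []
    if config.get("peft_type") != "LORA":
        problems.append(f"unexpected peft_type: {config.get('peft_type')!r}")
    recorded_base = config.get("base_model_name_or_path")
    if recorded_base != expected_base:
        problems.append(
            f"adapter records base {recorded_base!r}, not {expected_base!r}"
        )
    if not config.get("target_modules"):
        problems.append("target_modules is missing or empty")
    return problems


# Illustrative config text; in practice, read adapter_config.json from disk.
sample_text = """
{
  "peft_type": "LORA",
  "base_model_name_or_path": "NaviGen-stage2-base",
  "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj",
                     "gate_proj", "up_proj", "down_proj"]
}
"""
sample_config = json.loads(sample_text)
problems = check_adapter_config(sample_config, "NaviGen-stage2-base")
```

The recorded base path may be a local directory from the training machine, so a mismatch here is a prompt to double-check rather than proof of an incompatible model.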
Quick Start
Install the main dependencies:
pip install torch transformers peft safetensors accelerate
Load the adapter with PEFT:
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model_id = "NaviGen-stage2-base"
adapter_id = "NaviGen-grpo-step600"

tokenizer = AutoTokenizer.from_pretrained(adapter_id, trust_remote_code=True)
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype="auto",
    device_map="auto",
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()
Replace base_model_id and adapter_id with the final repository names used in your release.
Input Format
The adapter follows the NaviGen training format. Inputs should use the same tokenizer and special tokens released with this checkpoint. In general, prompts contain user history, item identifiers, and task instructions serialized in the NaviGen token stream.
For reproducibility, use the tokenizer files included in this repository:
- tokenizer.json
- tokenizer_config.json
- special_tokens_map.json
- added_tokens.json
- chat_template.jinja
- vocab.json
- merges.txt
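A quick presence check on a local copy of the repository can catch an incomplete download before tokenization silently falls back to defaults. This is a minimal sketch using only the file names listed above; the helper name missing_files and the idea of checking a local directory are assumptions for illustration.

```python
from pathlib import Path

# Tokenizer files named in this model card.
TOKENIZER_FILES = [
    "tokenizer.json", "tokenizer_config.json", "special_tokens_map.json",
    "added_tokens.json", "chat_template.jinja", "vocab.json", "merges.txt",
]


def missing_files(repo_dir, required=TOKENIZER_FILES) -> list:
    """Return the required file names that are absent from repo_dir."""
    root = Path(repo_dir)
    return [name for name in required if not (root / name).is_file()]
```

Calling missing_files on the downloaded directory should return an empty list before the tokenizer is loaded from it.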
Training Summary
NaviGen is trained with two stages of SFT followed by an RL stage:
- Stage-1 SFT: learns item identifier and preference-aware representations.
- Stage-2 SFT: distills preference reasoning and instruction writing from searched supervision.
- GRPO alignment: optimizes the model with hierarchical and self-consistent rewards to better match user intent and generation quality.
This checkpoint corresponds to the GRPO adapter saved at training step 600.
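For readers unfamiliar with GRPO, its core step scores a group of completions sampled for the same prompt and normalizes each reward against the group's mean and standard deviation. The sketch below shows only that normalization. The reward values are made-up placeholders, NaviGen's hierarchical and self-consistent reward functions are not reproduced here, and the epsilon term and the choice of population standard deviation are assumptions.

```python
from statistics import fmean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize per-completion rewards within one prompt's sample group."""
    mean = fmean(rewards)
    std = pstdev(rewards)
    # eps keeps the division finite when every completion gets the same reward.
    return [(r - mean) / (std + eps) for r in rewards]


# Placeholder rewards for four completions sampled from the same prompt.
rewards = [0.2, 0.8, 0.5, 0.5]
advantages = group_relative_advantages(rewards)
```

Completions that score above the group average receive a positive advantage and are reinforced; those below average are pushed down, so no separate value model is needed.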
Limitations
- The adapter depends on the matching NaviGen base model and tokenizer.
- Outputs are sensitive to the exact prompt format and identifier vocabulary.
- The model is designed for research use and has not been audited for all production safety requirements.
- Generated instructions may still contain irrelevant, underspecified, or visually difficult content.
Files
Core files for inference:
- adapter_config.json
- adapter_model.safetensors
- tokenizer and chat template files
Training-resume states such as optimizer or scheduler checkpoints are not required for normal inference.
Citation
If you use this model, please cite the NaviGen paper once the citation is released.
@article{navigen,
  title = {NaviGen: Personalized Generative Recommendation with Dual Identifiers},
  author = {NaviGen Authors},
  journal = {TBA},
  year = {2026}
}