
Model Summary

This is a generative sequence-to-sequence model for search query rewriting. After supervised training, it is fine-tuned with a reinforcement learning framework that uses a policy gradient algorithm, with reward functions designed to diversify the generated queries by paraphrasing keywords. It can be paired with sparse retrieval methods, such as BM25-based retrieval, to improve document recall in search.
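As a rough illustration of how reformulated queries can raise recall with a sparse retriever, the sketch below builds a tiny in-memory BM25 index and takes the union of the documents retrieved by the original query and by each rewrite. Everything here is an assumption made for illustration: `TinyBM25`, `search_with_rewrites`, the toy documents, and the hand-written rewrites (stand-ins for model output) are hypothetical and not part of the model's repository.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; a real system would use a proper analyzer.
    return text.lower().split()


class TinyBM25:
    """Minimal Okapi BM25 index over whitespace-tokenized documents (illustrative only)."""

    def __init__(self, docs, k1=1.5, b=0.75):
        self.k1, self.b = k1, b
        self.doc_tokens = [tokenize(d) for d in docs]
        self.doc_freqs = [Counter(toks) for toks in self.doc_tokens]
        self.avg_len = sum(len(t) for t in self.doc_tokens) / len(docs)
        n = len(docs)
        df = Counter(term for toks in self.doc_tokens for term in set(toks))
        # Smoothed IDF that stays positive for every indexed term.
        self.idf = {t: math.log(1 + (n - f + 0.5) / (f + 0.5)) for t, f in df.items()}

    def score(self, query, idx):
        freqs = self.doc_freqs[idx]
        length = len(self.doc_tokens[idx])
        total = 0.0
        for term in tokenize(query):
            tf = freqs.get(term, 0)
            if tf == 0:
                continue
            norm = tf + self.k1 * (1 - self.b + self.b * length / self.avg_len)
            total += self.idf[term] * tf * (self.k1 + 1) / norm
        return total

    def search(self, query, top_k=3):
        scored = [(self.score(query, i), i) for i in range(len(self.doc_tokens))]
        hits = [(s, i) for s, i in scored if s > 0]
        hits.sort(key=lambda pair: (-pair[0], pair[1]))
        return [i for _, i in hits[:top_k]]


def search_with_rewrites(index, query, rewrites, top_k=3):
    """Union of document ids retrieved by the original query and by each rewrite."""
    seen = []
    for q in [query, *rewrites]:
        for doc_id in index.search(q, top_k):
            if doc_id not in seen:
                seen.append(doc_id)
    return seen


docs = [
    "how to bake great cookie dough at home",
    "simple recipe for chocolate chip biscuits",
    "tips for baking crisp biscuits in the oven",
    "car engine maintenance schedule",
]
index = TinyBM25(docs)
query = "how to bake great cookie"
# Hand-written stand-ins for what the model might generate.
rewrites = ["recipe for baking biscuits", "tips for crisp biscuits"]

baseline = index.search(query)
expanded = search_with_rewrites(index, query, rewrites)
print("original query only:", baseline)
print("with rewrites:", expanded)
```

In this toy corpus the original query matches only the document that shares its exact keywords, while the paraphrased rewrites also reach the two biscuit documents that BM25 cannot match lexically from the original wording.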

Intended use cases

Query rewriting for search (web, e-commerce), virtual assistants and chatbots, and information retrieval.

Model Description

Training Procedure

The sequence-to-sequence model is initialized from Google's T5-base model.
It is first trained with supervision on the MS-MARCO query pairs dataset.
It is then fine-tuned with a reinforcement learning (RL) framework so that the generated queries are both diverse and relevant.
Fine-tuning uses a policy gradient approach: for a given input query, a set of trajectories (reformulated queries) is sampled from the model, a reward is computed, and the policy gradient algorithm is applied to update the model.
Rewards are computed heuristically to strengthen the model's paraphrasing capability. They can be replaced with other domain-specific or goal-specific reward functions as needed.
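The policy gradient step described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the training code from the repository: `paraphrase_reward` is a hypothetical diversity-only heuristic (one minus the Jaccard overlap of token sets), the mean-reward baseline in `reinforce_loss` is an assumed variance-reduction choice, and the log-probabilities are made-up numbers standing in for what the model would assign to each sampled query. A practical reward would also need a relevance term.

```python
def paraphrase_reward(source, candidate):
    """Hypothetical diversity reward: 1 - Jaccard overlap between token sets.

    Returns 0.0 for an empty candidate or an exact copy of the source tokens.
    """
    src = set(source.lower().split())
    cand = set(candidate.lower().split())
    if not cand:
        return 0.0
    overlap = len(src & cand) / len(src | cand)
    return 1.0 - overlap


def reinforce_loss(log_probs, rewards):
    """REINFORCE surrogate loss with a mean-reward baseline (assumed here).

    log_probs[i] is the summed token log-probability of trajectory i under
    the current policy. Minimizing the loss raises the probability of
    trajectories whose reward is above the batch average.
    """
    baseline = sum(rewards) / len(rewards)
    advantages = [r - baseline for r in rewards]
    return -sum(a * lp for a, lp in zip(advantages, log_probs)) / len(rewards)


source = "how to bake great cookie"
# Sampled trajectories (reformulated queries); hand-written for illustration.
trajectories = [
    "how to bake great cookie",
    "best way to bake cookies",
    "cookie baking tips",
]
# Made-up sequence log-probabilities standing in for model outputs.
log_probs = [-2.0, -6.0, -7.5]

rewards = [paraphrase_reward(source, t) for t in trajectories]
loss = reinforce_loss(log_probs, rewards)
print("rewards:", rewards)
print("loss:", loss)
```

In an actual training loop the log-probabilities would be differentiable tensors produced by the model, and the loss would be backpropagated to update its weights. Swapping in a different reward only requires replacing `paraphrase_reward`.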

Refer here for more details.

Model Sources

Repository: https://github.com/PraveenSH/RL-Query-Reformulation

How to use

To get diverse reformulations, generate with sampling and a repetition penalty, as in the sample code below.

import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

MODEL_ID = "prhegde/t5-query-reformulation-RL"

# Load the tokenizer and model, then switch the model to inference mode.
tokenizer = T5Tokenizer.from_pretrained(MODEL_ID)
model = T5ForConditionalGeneration.from_pretrained(MODEL_ID)
model.eval()

input_sequence = "how to bake great cookie"
input_ids = tokenizer(input_sequence, return_tensors="pt").input_ids
print(f'Input: {input_sequence}')

# Draw nsent independent samples; each call can return a different reformulation.
nsent = 4
with torch.no_grad():
    for _ in range(nsent):
        # Sampling (no beam search) with a repetition penalty encourages diverse outputs.
        output = model.generate(input_ids, max_length=35, num_beams=1, do_sample=True, repetition_penalty=1.8)
        target_sequence = tokenizer.decode(output[0], skip_special_tokens=True)
        print(f'Target: {target_sequence}')
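Because sampling can return the same reformulation more than once, or echo the input query, it may help to filter the samples before sending them to a retriever. The sketch below is one possible way to do that; `unique_reformulations` and `generate_reformulations` are hypothetical helper names, not part of the model's repository. The second helper draws all samples in a single `generate` call through the `num_return_sequences` argument, and imports `torch` and `transformers` lazily so the filtering helper can be used on its own.

```python
MODEL_ID = "prhegde/t5-query-reformulation-RL"


def unique_reformulations(query, candidates):
    """Drop empty strings, duplicates, and echoes of the original query.

    Comparison ignores case and extra whitespace; order of first
    appearance is preserved.
    """
    seen = {" ".join(query.lower().split())}
    kept = []
    for cand in candidates:
        key = " ".join(cand.lower().split())
        if key and key not in seen:
            seen.add(key)
            kept.append(cand.strip())
    return kept


def generate_reformulations(query, n=4, model_id=MODEL_ID):
    """Sample n reformulations in one generate() call and de-duplicate them."""
    # Imported here so unique_reformulations works without these packages.
    import torch
    from transformers import T5ForConditionalGeneration, T5Tokenizer

    tokenizer = T5Tokenizer.from_pretrained(model_id)
    model = T5ForConditionalGeneration.from_pretrained(model_id)
    model.eval()

    input_ids = tokenizer(query, return_tensors="pt").input_ids
    with torch.no_grad():
        outputs = model.generate(
            input_ids,
            max_length=35,
            do_sample=True,
            repetition_penalty=1.8,
            num_return_sequences=n,
        )
    decoded = tokenizer.batch_decode(outputs, skip_special_tokens=True)
    return unique_reformulations(query, decoded)
```

Calling `generate_reformulations("how to bake great cookie")` downloads the model on first use and may return fewer than `n` strings once duplicates are removed.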

Model size: 0.2B params (Safetensors, tensor type F32)
