 - en
 ---
+# Query Generation with LoRA Fine-Tuning
+This project fine-tunes a language model using supervised fine-tuning (SFT) and LoRA adapters to generate queries from documents. The model was trained on the [`prdev/qtack-gq-embeddings-unsupervised`](https://huggingface.co/datasets/prdev/qtack-gq-embeddings-unsupervised) dataset using an A100 GPU.
+## Overview
+- **Objective:**
+  The goal is to train a model that, given a document, generates a relevant query. Each training example is formatted with custom markers:
+  - `<|document|>\n` precedes the document text.
+  - `<|query|>\n` precedes the query text.
+  - An EOS token is appended at the end to signal termination.
+- **Text Chunking:**
+  For best results, **chunk your text** into smaller, coherent pieces before passing it to the model. When given a long document, the model tends to focus on specific details rather than the overall context.
+- **Training Setup:**
+  The model is fine-tuned using the Unsloth framework with LoRA adapters, taking advantage of an A100 GPU for efficient training.
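
The marker format and the chunking advice above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the project's actual preprocessing code: the helper names `chunk_text`, `format_prompt`, and `format_example`, the sentence-based splitting rule, and the 200-word default are all hypothetical. The `"</s>"` string in the usage lines is only a placeholder; in practice the EOS string would come from `tokenizer.eos_token`.

```python
import re
from typing import List

# Markers described in the overview above.
DOCUMENT_MARKER = "<|document|>\n"
QUERY_MARKER = "<|query|>\n"


def chunk_text(text: str, max_words: int = 200) -> List[str]:
    """Greedily pack whole sentences into chunks of at most max_words words.

    Assumption: sentences end in '.', '!' or '?'. A single sentence longer
    than max_words is kept whole as its own chunk rather than being cut.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: List[str] = []
    current: List[str] = []
    count = 0
    for sentence in sentences:
        if not sentence:
            continue  # Skip empty pieces (e.g. empty input text).
        n_words = len(sentence.split())
        # Start a new chunk if adding this sentence would exceed the limit.
        if current and count + n_words > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n_words
    if current:
        chunks.append(" ".join(current))
    return chunks


def format_prompt(document: str) -> str:
    """Build an inference prompt: document marker, text, then query marker."""
    return DOCUMENT_MARKER + document + "\n" + QUERY_MARKER


def format_example(document: str, query: str, eos_token: str) -> str:
    """Build a training example: the prompt, the target query, then EOS."""
    return format_prompt(document) + query + eos_token


# Usage with placeholder text and a placeholder EOS string.
chunks = chunk_text("One two three. Four five six. Seven eight nine.", max_words=6)
example = format_example(chunks[0], "what comes after three?", "</s>")
```

Splitting on sentence boundaries keeps each chunk coherent, which matches the advice above; a token-based limit using the model's tokenizer would be a reasonable alternative to counting words.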
+## Quick Usage
+The following snippet loads the fine-tuned model and tests it on a chunked document:
+```python
+from unsloth import FastLanguageModel
+from transformers import TextStreamer
+# Load the fine-tuned model and tokenizer from the Hugging Face Hub.
+# Replace 'your_username/your_model_repo_name' with your actual model repository.
+model, tokenizer = FastLanguageModel.from_pretrained("your_username/your_model_repo_name", load_in_4bit=True)
+# Enable faster inference if supported.
+FastLanguageModel.for_inference(model)
+# Example document chunk (ensure text is appropriately chunked).
+document_chunk = (
+    "liberal arts. 1. the academic course of instruction at a college intended to provide general knowledge "
+    "and comprising the arts, humanities, natural sciences, and social sciences, as opposed to professional or technical subjects."
+)
+# Create the prompt using custom markers.
+prompt = (
+    "<|document|>\n" + document_chunk + "\n<|query|>\n"
+)
+# Tokenize the prompt.
+inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
+# Set up a TextStreamer to view token-by-token generation.
+streamer = TextStreamer(tokenizer, skip_prompt=True)
+# Generate a query from the document.
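+_ = model.generate(
+    input_ids=inputs["input_ids"],
+    attention_mask=inputs["attention_mask"],  # Avoids ambiguity about which tokens are padding.
+    streamer=streamer,
+    max_new_tokens=100,
+    do_sample=True,  # Required for temperature and min_p to take effect.
+    temperature=0.7,
+    min_p=0.1,
+    eos_token_id=tokenizer.eos_token_id,  # Ensures proper termination.
+)
+```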
 # Uploaded model
 - **Developed by:** prdev