Phi-2 QLoRA Fine-tuned Assistant

This is a fine-tuned version of Microsoft's Phi-2 model using QLoRA (Quantized Low-Rank Adaptation). The model has been trained to provide helpful responses for various tasks including coding, writing, and general assistance.

Model Details

Base Model: Microsoft Phi-2 (2.7B parameters)
Fine-tuning Method: QLoRA (4-bit quantization)
Training Data: Custom dataset focused on programming and professional communication
Hardware Used: NVIDIA RTX 4090 (24GB VRAM)
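
As a rough illustration of why 4-bit quantization lets a 2.7B-parameter model fit comfortably on a single 24GB card, the sketch below estimates the memory taken by the weights alone. It is back-of-the-envelope arithmetic, not a measurement from this training run. It ignores quantization constants, activations, the LoRA adapter weights, and optimizer state. The helper name weight_memory_gib is hypothetical.

```python
def weight_memory_gib(num_params: float, bits_per_param: float) -> float:
    """Approximate memory needed for model weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 1024**3


PHI2_PARAMS = 2.7e9  # parameter count stated in Model Details

fp16_gib = weight_memory_gib(PHI2_PARAMS, 16)  # half-precision baseline
int4_gib = weight_memory_gib(PHI2_PARAMS, 4)   # 4-bit quantized weights

print(f"fp16 weights:  {fp16_gib:.2f} GiB")
print(f"4-bit weights: {int4_gib:.2f} GiB")
```

Under these assumptions the weights shrink from roughly 5 GiB to roughly 1.3 GiB, which leaves most of the VRAM for activations and the trainable adapter.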

Usage

You can interact with the model through the Gradio interface by visiting the "Spaces" tab of this repository.

Local Installation

To run the model locally:

1. Clone this repository.
2. Install dependencies:

pip install -r requirements.txt

3. Run the Gradio app:

python gradio_app.py

Parameters

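Max Length: Sets the maximum length of the generated response (64-1024 tokens)
Temperature: Controls randomness in generation; lower values make output more deterministic (0.1-1.0)
Top P: Nucleus sampling threshold; sampling is restricted to the smallest set of tokens whose cumulative probability reaches this value (0.1-1.0)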
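
To make the effect of Temperature and Top P concrete, here is a minimal, library-free sketch of how the two settings are commonly applied to a model's output scores before a token is drawn. It is a conceptual illustration only, not the code in gradio_app.py. Generation libraries implement this internally. The function names and the example logits are hypothetical.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the smallest set of most likely tokens whose cumulative
    probability reaches top_p, then renormalize over that set."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


def sample_token(logits, temperature=0.7, top_p=0.9, rng=random):
    """Draw one token index using temperature scaling and top-p filtering."""
    probs = softmax_with_temperature(logits, temperature)
    candidates = top_p_filter(probs, top_p)
    ids = list(candidates)
    weights = [candidates[i] for i in ids]
    return rng.choices(ids, weights=weights, k=1)[0]


example_logits = [2.0, 1.0, 0.1]  # hypothetical scores for three tokens
print(sample_token(example_logits, temperature=0.7, top_p=0.9))
```

With a low Temperature or a low Top P the most likely token dominates, so responses become more repeatable. Raising either value lets less likely tokens through.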

Example Prompts

"Write a Python function to calculate the factorial of a number"
"Explain the concept of machine learning in simple terms"
"Write a professional email requesting a meeting with a client"

Limitations

The model works best with English language input
Response quality may vary depending on the complexity of the prompt
Context length is limited to 2048 tokens, shared between the prompt and the generated response
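
One way to stay inside the 2048-token limit is to trim long prompts before generation. The sketch below assumes that Max Length counts only newly generated tokens and that the prompt has already been converted to a list of token IDs. Both are assumptions, and fit_prompt is a hypothetical helper, not part of this repository.

```python
CONTEXT_LIMIT = 2048  # context length stated in Limitations


def fit_prompt(prompt_ids, max_new_tokens, context_limit=CONTEXT_LIMIT):
    """Drop the oldest prompt tokens so that the prompt plus the
    requested generation length fits within the context window."""
    if not 0 < max_new_tokens < context_limit:
        raise ValueError("max_new_tokens must be between 1 and context_limit - 1")
    budget = context_limit - max_new_tokens  # tokens left for the prompt
    return prompt_ids[-budget:]


long_prompt = list(range(3000))  # stand-in for real token IDs
trimmed = fit_prompt(long_prompt, max_new_tokens=1024)
print(len(trimmed))
```

Keeping the most recent tokens is only one policy. Summarizing or rejecting over-long prompts may suit some applications better.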

License

This model is subject to the Microsoft Phi-2 license terms and conditions.

Acknowledgments

Microsoft for the Phi-2 base model
Hugging Face for the transformers library and model hosting
The QLoRA paper authors for the quantization technique
