# Atharva - Fine-tuned Phi-4 Mini
Atharva is an AI model created by Jaswanth Sanjay, specialized in processing html_duckduckgo data. This model is a fine-tuned version of microsoft/Phi-4-mini-instruct, optimized for extraction tasks from DuckDuckGo HTML search results.
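As an illustration of the kind of extraction request the model is tuned for, a chat payload might be assembled as below. The exact input format (raw HTML wrapped in a user message) is an assumption for illustration, not something the model card documents:

```python
# Hypothetical example: assembling an extraction request for Atharva.
# The system prompt matches the fine-tuning identity; the HTML snippet
# and instruction wording are illustrative assumptions.
SYSTEM_PROMPT = (
    "You are Atharva, an AI model created by Jaswanth Sanjay, "
    "specialized in processing html_duckduckgo data."
)

html_snippet = (
    '<div class="result">'
    '<a class="result__a" href="https://example.com">Example Site</a>'
    "</div>"
)

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {
        "role": "user",
        "content": f"Extract the search results from this HTML:\n{html_snippet}",
    },
]
```

The `messages` list plugs directly into the `apply_chat_template` calls shown in the inference sections below.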
## Inference with Unsloth

If you saved only the LoRA adapter, you can load it with Unsloth for faster inference:
```python
from unsloth import FastLanguageModel

# Load your fine-tuned model (LoRA adapter)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="./fine_tuned_phi_4_mini",
    max_seq_length=2048,
    load_in_4bit=True,  # 4-bit quantization
)
FastLanguageModel.for_inference(model)  # Enable faster inference

# Example inference
SYSTEM_PROMPT = "You are Atharva, an AI model created by Jaswanth Sanjay, specialized in processing html_duckduckgo data."
user_prompt = "What can you do?"

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": user_prompt},
]

prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.7, do_sample=True, use_cache=True)
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(f"Assistant: {response}")
```
## Inference with Hugging Face Transformers (Merged Model)

If you exported the full merged model, you can load it with the standard Hugging Face `transformers` API:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the merged 16-bit model and move it to the GPU
# (the inputs below are placed on "cuda", so the model must be too)
model_path = "./fine_tuned_phi_4_mini_merged_16bit"
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype="auto").to("cuda")
tokenizer = AutoTokenizer.from_pretrained(model_path)

# Example inference
SYSTEM_PROMPT = "You are Atharva, an AI model created by Jaswanth Sanjay, specialized in processing html_duckduckgo data."
user_prompt = "What can you do?"

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": user_prompt},
]

prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
model_inputs = tokenizer([prompt], return_tensors="pt").to("cuda")
outputs = model.generate(
    **model_inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    use_cache=True,
)
response = tokenizer.decode(outputs[0][model_inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(f"Assistant: {response}")
```
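Since the model targets extraction, its responses can be post-processed into structured records. A minimal sketch, assuming the model is prompted to answer in JSON — this output format is an assumption for illustration, not something the fine-tune guarantees:

```python
import json

# Hypothetical raw response text; in practice this comes from the
# `response` variable produced by model.generate above.
response = '[{"title": "Example Site", "url": "https://example.com"}]'

try:
    results = json.loads(response)
except json.JSONDecodeError:
    results = []  # model did not return valid JSON; fall back to raw-text handling

for r in results:
    print(f"{r['title']} -> {r['url']}")  # prints: Example Site -> https://example.com
```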
## Push to Hugging Face Hub (Optional)

If you are logged in to Hugging Face and set the `repo_id` variable to `jaswanthsanjay88/atharva`, the model should already have been pushed successfully. If not, you can run the push command again:
```python
# This code was already run in a previous cell
# model.push_to_hub_merged("jaswanthsanjay88/atharva", tokenizer, token=True)
```
To load the model directly from the Hugging Face Hub:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("jaswanthsanjay88/atharva")
tokenizer = AutoTokenizer.from_pretrained("jaswanthsanjay88/atharva")
```
## GGUF Export and Ollama Usage
If you exported the model to GGUF format, you can use it with tools like Ollama or LM Studio.
The GGUF file(s) will be located in the ./fine_tuned_phi_4_mini_gguf directory (or ./fine_tuned_phi_4_mini_gguf.zip if you downloaded the archive).
To use with Ollama, create a Modelfile like this (replace model.gguf with your actual GGUF filename):
```
# Modelfile content
FROM ./model.gguf

PARAMETER stop "<|endoftext|>"
PARAMETER stop "<|end|>"

# Optionally set a custom system prompt
SYSTEM "You are Atharva, an AI model created by Jaswanth Sanjay, specialized in processing html_duckduckgo data."
```
Then, build and run the model with Ollama:

```shell
# Navigate to the directory containing your GGUF model and Modelfile
cd ./fine_tuned_phi_4_mini_gguf

# Create the model
ollama create atharva-phi4-mini -f Modelfile

# Run the model
ollama run atharva-phi4-mini
```
Note: The quantization method used was q4_k_m.
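Once `ollama run` works, the model can also be queried programmatically through Ollama's local REST API. A sketch of the request payload follows; it assumes an `ollama serve` instance listening on the default port 11434, and the actual network call is left commented out:

```python
import json

# Build a chat request for Ollama's /api/chat endpoint.
payload = {
    "model": "atharva-phi4-mini",
    "messages": [
        {"role": "user", "content": "What can you do?"},
    ],
    "stream": False,  # return one complete response instead of a token stream
}

body = json.dumps(payload).encode("utf-8")

# To actually send it (requires a running `ollama serve`):
# import urllib.request
# req = urllib.request.Request(
#     "http://localhost:11434/api/chat",
#     data=body,
#     headers={"Content-Type": "application/json"},
# )
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["message"]["content"])
```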