Atharva - Fine-tuned Phi-4 Mini

Atharva is an AI model created by Jaswanth Sanjay, specialized in processing html_duckduckgo data. It is a fine-tuned version of microsoft/Phi-4-mini-instruct, optimized for extracting information from DuckDuckGo HTML search results.
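Since the model targets extraction from DuckDuckGo result pages, a typical prompt pairs the raw HTML with a question about it. The helper below is a hypothetical sketch of that pattern — the exact prompt format the model was trained on is not documented here:

```python
# Hypothetical helper: wrap a DuckDuckGo HTML snippet in an extraction prompt.
# The wording of the instruction is an assumption, not the training format.
def build_extraction_prompt(html: str, question: str) -> str:
    """Combine a search-results HTML snippet with an extraction question."""
    return (
        "Extract the requested information from the following "
        "DuckDuckGo HTML search results.\n\n"
        f"HTML:\n{html}\n\n"
        f"Question: {question}"
    )

html_snippet = '<div class="result"><a href="https://example.com">Example</a></div>'
prompt = build_extraction_prompt(html_snippet, "List the result URLs.")
```

The resulting string would then be passed as the user message in the chat examples below.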

Inference with Unsloth

If you saved only the LoRA adapter, you can load and use it with Unsloth for faster inference:

from unsloth import FastLanguageModel

# Load your fine-tuned model (LoRA adapter)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="./fine_tuned_phi_4_mini",
    max_seq_length=2048,
    load_in_4bit=True, # 4-bit quantization
)
FastLanguageModel.for_inference(model) # Enable faster inference

# Example inference
SYSTEM_PROMPT = "You are Atharva, an AI model created by Jaswanth Sanjay, specialized in processing html_duckduckgo data."
user_prompt = "What can you do?"

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": user_prompt}
]

prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.7, do_sample=True, use_cache=True)
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(f"Assistant: {response}")

Inference with HuggingFace (Merged Model)

If you exported the full merged model, you can load it using standard Hugging Face transformers:

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the merged 16-bit model
# device_map="auto" places the model on the GPU so it matches the inputs
# moved to "cuda" below; without it the model stays on CPU and generate() fails.
model_path = "./fine_tuned_phi_4_mini_merged_16bit"
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_path)

# Example inference
SYSTEM_PROMPT = "You are Atharva, an AI model created by Jaswanth Sanjay, specialized in processing html_duckduckgo data."
user_prompt = "What can you do?"

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": user_prompt}
]

prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
model_inputs = tokenizer([prompt], return_tensors="pt").to("cuda")

outputs = model.generate(
    **model_inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    use_cache=True,
)
response = tokenizer.decode(outputs[0][model_inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(f"Assistant: {response}")
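A note on the decode step above: the slice over `outputs[0]` exists because `generate` returns the prompt tokens followed by the newly generated tokens, so decoding the full sequence would repeat the prompt. A minimal sketch with toy token ids:

```python
# generate() echoes the prompt ids before the new ids, so decoding the
# full output would repeat the prompt; slice it off first.
prompt_ids = [101, 2054, 2064]               # toy prompt token ids
generated = prompt_ids + [2017, 2079]        # toy generate() output
new_token_ids = generated[len(prompt_ids):]  # keep only the reply tokens
```

In the real code, `model_inputs["input_ids"].shape[1]` plays the role of `len(prompt_ids)`.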

Push to Hugging Face Hub (Optional)

If you are logged in to Hugging Face and have set the repo_id variable to jaswanthsanjay88/atharva, your model should already have been pushed. If not, you can run the push command again:

# This code was already run in a previous cell
# model.push_to_hub_merged("jaswanthsanjay88/atharva", tokenizer, token=True)

To load your model directly from Hugging Face Hub:

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("jaswanthsanjay88/atharva")
tokenizer = AutoTokenizer.from_pretrained("jaswanthsanjay88/atharva")

GGUF Export and Ollama Usage

If you exported the model to GGUF format, you can use it with tools like Ollama or LM Studio.

The GGUF file(s) will be located in the ./fine_tuned_phi_4_mini_gguf directory (or ./fine_tuned_phi_4_mini_gguf.zip if you downloaded the archive).

To use with Ollama, create a Modelfile like this (replace model.gguf with your actual GGUF filename):

# Modelfile content (the Modelfile sits next to the GGUF file, so the path is relative)
FROM ./model.gguf
PARAMETER stop "<|endoftext|>"
PARAMETER stop "<|end|>"

# Optionally set a custom system prompt
SYSTEM "You are Atharva, an AI model created by Jaswanth Sanjay, specialized in processing html_duckduckgo data."
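Without a TEMPLATE directive, Ollama may not format chat turns the way the model expects. The fragment below is an assumption based on the Phi-3-style chat markers that the stop tokens above suggest — verify it against the tokenizer's chat template before relying on it:

```
# Assumed chat template (Phi-3-style markers); confirm against the tokenizer
TEMPLATE """<|system|>
{{ .System }}<|end|>
<|user|>
{{ .Prompt }}<|end|>
<|assistant|>
"""
```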

Then, build and run your model with Ollama:

# Navigate to the directory containing your GGUF model and Modelfile
cd ./fine_tuned_phi_4_mini_gguf

# Create the model
ollama create atharva-phi4-mini -f Modelfile

# Run the model
ollama run atharva-phi4-mini
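Besides the interactive CLI, a created model can be queried over Ollama's local REST API. The sketch below builds a request body for the /api/chat endpoint; a running Ollama server on the default port 11434 is assumed:

```python
import json

# Request body for Ollama's /api/chat endpoint
# (served at http://localhost:11434/api/chat while Ollama is running).
payload = {
    "model": "atharva-phi4-mini",
    "messages": [
        {"role": "user", "content": "What can you do?"},
    ],
    "stream": False,  # return one JSON object instead of a token stream
}
body = json.dumps(payload)
# POST `body` to the endpoint with e.g. urllib.request or curl.
```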

Note: The quantization method used was q4_k_m (a 4-bit k-quant that balances file size and output quality).

Model size: 4B params
Tensor type: BF16 (Safetensors)