license: mit
language:
  - en
metrics:
  - perplexity
  - accuracy
base_model:
  - microsoft/Phi-3-mini-4k-instruct
library_name: transformers
tags:
  - legal

🧑‍⚖️ vakil-phi3-mini-4k-instruct-finetuned

📌 Overview

This repository hosts a fine‑tuned version of microsoft/Phi‑3‑mini‑4k‑instruct (3.8B parameters), adapted for Indian legal knowledge tasks.
The model was instruction‑tuned with LoRA (Low‑Rank Adaptation) on curated datasets covering constitutional acts and statutory sections.
The goal of the fine‑tuning is to make the model's answers to Indian legal queries more accurate, context‑aware, and explainable.


⚙️ Training Details

  • Base Model: microsoft/Phi‑3‑mini‑4k‑instruct
  • Fine‑Tuning Method: LoRA (parameter‑efficient fine‑tuning)
  • Domain Data: Indian constitutional acts, statutory sections, and related legal texts
  • Training Infrastructure: RunPod RTX A6000 GPU
  • Training Duration: 18 hours, 2 epochs
  • Optimization Goal: Reduce training loss and improve domain‑specific accuracy
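
As a rough illustration of why LoRA counts as parameter‑efficient: instead of updating a full weight matrix of shape d_out × d_in, it trains two low‑rank factors of shapes d_out × r and r × d_in. The sketch below only does the arithmetic. The layer size and rank are illustrative assumptions, not the settings used for this model (the card does not state the rank or target modules).

```python
def full_params(d_out: int, d_in: int) -> int:
    """Number of weights in a dense d_out x d_in matrix."""
    return d_out * d_in


def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Weights in the two LoRA factors: B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


# Hypothetical example values (assumptions, not taken from this model card).
d_out, d_in, rank = 3072, 3072, 16

full = full_params(d_out, d_in)
lora = lora_trainable_params(d_out, d_in, rank)

print(f"full matrix:  {full:,} weights")
print(f"LoRA factors: {lora:,} weights ({lora / full:.2%} of the full matrix)")
```

With these assumed numbers the adapter trains roughly 1% of the weights of the layer it wraps, which is what makes an 18‑hour run on a single GPU plausible for a model of this size.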

📊 Evaluation

  • Intrinsic Evaluation:

    • Reduced perplexity compared to the base model
    • Improved accuracy on domain‑specific test sets
  • Extrinsic Evaluation:

    • Better parsing of statutes and structured legal outputs
    • Enhanced contextual reasoning in legal Q&A tasks
  • Qualitative Observations:

    • More consistent responses when asked about constitutional provisions
    • Improved ability to generate structured JSON outputs for legal sections
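
For readers unfamiliar with the perplexity metric named above, here is a minimal sketch of how it is derived from per‑token log‑probabilities: the exponential of the average negative log‑likelihood. The log‑probability lists are made‑up placeholders for illustration only, not measurements from this model or its base.

```python
import math


def perplexity(token_log_probs: list[float]) -> float:
    """Perplexity = exp of the mean negative log-likelihood (natural log) per token."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    avg_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_nll)


# Placeholder values (assumptions for illustration, NOT reported results).
base_log_probs = [-2.1, -1.7, -2.4, -1.9]
tuned_log_probs = [-1.2, -0.9, -1.5, -1.1]

print("base :", round(perplexity(base_log_probs), 3))
print("tuned:", round(perplexity(tuned_log_probs), 3))
```

Lower perplexity means the model assigns higher probability to the held‑out text, so a fair comparison needs both models scored on the same test set with the same tokenization.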

🚀 Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the fine-tuned checkpoint and its tokenizer from the Hugging Face Hub.
model_name = "ManjunathCode10x/vakil-phi3-mini-4k-instruct-finetuned"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Tokenize the prompt and return PyTorch tensors.
inputs = tokenizer("Explain Article 21 of the Indian Constitution:", return_tensors="pt")

# max_new_tokens bounds only the generated text; max_length would also count the prompt.
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))