Model Card for LLaMA-2-7B-Chat Fine-tuned on Pakistani Legal Q&A Dataset (QLoRA)

Model Details

Model Description

This repository contains LoRA adapter weights for LLaMA-2-7B-Chat, fine-tuned on a Pakistani legal Q&A dataset using QLoRA (4-bit quantization).

The model is intended for legal information retrieval and educational purposes only.
⚠️ It should not be used as a substitute for professional legal advice.
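
For context, QLoRA fine-tuning loads the frozen base model in 4-bit precision and trains only small low-rank adapter matrices on top of it. Below is a minimal sketch of such a setup; the rank, alpha, and target modules are illustrative assumptions, not the exact values used for this adapter.

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 quantization for the frozen base weights
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-chat-hf",
    quantization_config=bnb_config,
    device_map="auto",
)
base = prepare_model_for_kbit_training(base)

# LoRA adapters on the attention projections (hyperparameters are assumed)
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable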


  • Base model: meta-llama/Llama-2-7b-chat-hf
  • Fine-tuning method: QLoRA (4-bit quantization)
  • Framework: Hugging Face Transformers + PEFT
  • Primary purpose: Legal information and education (non-advisory)

🚀 Quick Start (Google Colab)

You need a GPU runtime (preferably a T4 or better). In Colab, go to:
Runtime ➝ Change runtime type ➝ Select T4 GPU. A quick check that the GPU is active is shown below.
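
To confirm the runtime actually has a GPU before loading the model:

import torch

# Fails early if the Colab runtime has no GPU attached
assert torch.cuda.is_available(), "No GPU detected - switch the runtime type to GPU."
print(torch.cuda.get_device_name(0))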

Uses

Direct Use

  • Legal information retrieval
  • Educational purposes: Understanding Pakistani laws, procedures, and definitions

Downstream Use

  • Integration into legal research assistants
  • Support in law-related educational tools

Out-of-Scope Use

  • Real legal decision-making
  • Providing confidential legal advice
  • Any non-Pakistani law domain queries

Dataset

  • Source: Collected from official Pakistani government websites hosting public legal documents, acts, and regulations.
  • Format: Converted from PDF to a structured Q&A format (Dataset.csv); see the loading sketch below.
  • Contents: Includes questions about legal definitions, processes, and roles as per Pakistani law.
  • Size: 1941 rows
  • Language: English
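
The CSV can be loaded directly with the 🤗 Datasets library. A minimal sketch, assuming question/answer columns (the column names are an assumption, not confirmed by this card):

from datasets import load_dataset

# Load the Q&A pairs from the CSV file
ds = load_dataset("csv", data_files="Dataset.csv")["train"]
print(len(ds))   # expected: 1941 rows
print(ds[0])     # e.g. {"question": ..., "answer": ...} (assumed column names)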

Bias, Risks, and Limitations

  • The model’s knowledge is limited to the dataset scope and law versions at the time of collection
  • May hallucinate answers for out-of-domain or ambiguous queries
  • Not updated for recent law amendments unless retrained

Recommendations

Users must verify answers against official legal sources before acting upon them.


License

  • Base model: LLaMA-2 license by Meta
  • Dataset: Public government documents (open to public use, verify each source)

Ethical Considerations & Risks

  • Do not use for real legal decision-making.
  • May misinterpret complex or ambiguous legal terms.
  • Should not replace a qualified lawyer or legal expert.

Evaluation

Example Usage

  • Q: What is the significance of Article 181 of the Limitation Act, 1908, in relation to applications filed under various statutes, as interpreted by the High Court in this case?
  • A: Article 181 of the Limitation Act, 1908, is significant because it provides a general rule for the computation of time for filing applications, including those under various statutes. The High Court's interpretation of this article, as seen in the case of "Mst. Naseem Bibi v. Mst. Hameeda Bibi", is that the limitation period begins to run on the day the application is made, rather than on the date of the event or occurrence that triggered the application. This interpretation ensures that applications are filed within the prescribed time frame, and it highlights the importance of considering the specific provision and context of each statute when determining the applicable limitation period.
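
Because the base model is Llama-2-chat, wrapping questions in the Llama-2 chat prompt template generally yields better-formed answers. A minimal sketch; the system message here is an illustrative assumption, not part of this repository:

question = "What is the significance of Article 181 of the Limitation Act, 1908?"
system = "You provide information about Pakistani law for educational purposes only."

# Llama-2 chat template; the tokenizer adds the leading <s> token itself
prompt = f"[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{question} [/INST]"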

Citation

@misc{fizza2025paklawqlora,
  author    = {Fizza Arif},
  title     = {LLaMA-2-7B-Chat fine-tuned on Pakistani Legal Q&A Dataset (QLoRA)},
  year      = {2025},
  publisher = {Hugging Face}
}

How to Get Started with the Model

This repository contains only the LoRA adapter weights. When you call from_pretrained(...), 🤗 Transformers automatically tries to fetch the base model meta-llama/Llama-2-7b-chat-hf. That base repo is gated (you need to request access on Hugging Face), and you must be logged in with your HF token.


from huggingface_hub import login
login(token="YourHFTokenHere")

from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel
import torch

base_model = "meta-llama/Llama-2-7b-chat-hf"   # gated model
adapter_model = "fizzarif7/llama2_pklaw_gpt"   # your LoRA fine-tuned repo

# --- Load tokenizer ---
tokenizer = AutoTokenizer.from_pretrained(base_model)

# --- 4-bit quantization config to save VRAM ---
bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)

# --- Load base model ---
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map="auto"
)

# --- Load LoRA adapter ---
model = PeftModel.from_pretrained(model, adapter_model)

# --- Example usage ---
prompt = "What is the importance of the general manager under Pakistani corporate law?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=200,
    do_sample=True,   # sampling must be enabled for temperature/top_p to take effect
    temperature=0.7,
    top_p=0.9
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
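
Optionally, the adapter can be merged into the base weights to produce a standalone checkpoint that no longer needs PEFT at inference time. A minimal sketch; merging requires the base model in half precision, so this reloads it without 4-bit quantization and needs considerably more memory (the output path is hypothetical):

# Reload the base model in fp16 (merging is not done on 4-bit weights)
base_fp16 = AutoModelForCausalLM.from_pretrained(
    base_model, torch_dtype=torch.float16, device_map="auto"
)

# Apply the adapter, then fold its weights into the base model
merged = PeftModel.from_pretrained(base_fp16, adapter_model).merge_and_unload()
merged.save_pretrained("llama2-pklaw-merged")      # hypothetical output directory
tokenizer.save_pretrained("llama2-pklaw-merged")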

