LegalParam GGUF Models

GGUF quantized versions of bharatgenai/LegalParam for use with Ollama.

Model Information

Original Model: bharatgenai/LegalParam

  • Architecture: ParamBharatGen (LLaMA-based)
  • Parameters: 2.9B
  • Context Length: 2048 tokens
  • Purpose: Specialized AI assistant for Indian law

Available Quantizations

| Quantization | File Size | Description | Use Case |
|---|---|---|---|
| Q4_K_M | 1.7GB | 4-bit quantized | Recommended for most use cases |
| Q6_K | 2.2GB | 6-bit quantized | Higher quality, moderate resource usage |
| F16 | 5.4GB | 16-bit float (no quantization) | Highest quality, requires more memory |
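The sizes above follow directly from the parameter count and the average bits per weight of each format. A rough sanity-check sketch (the bits-per-weight values are typical llama.cpp K-quant averages, not exact figures, and the card's sizes line up with binary gigabytes):

```python
# Rough file-size estimator for the table above. Bits-per-weight values are
# typical llama.cpp K-quant averages (assumptions, not exact); the listed
# sizes match binary gigabytes (GiB), so we divide by 2**30.
PARAMS = 2.9e9  # parameter count from this card

BITS_PER_WEIGHT = {
    "Q4_K_M": 4.85,  # mixed 4-/6-bit blocks
    "Q6_K": 6.56,
    "F16": 16.0,
}

def estimated_size_gb(quant: str, params: float = PARAMS) -> float:
    """Approximate GGUF file size in GiB for a quantization level."""
    return params * BITS_PER_WEIGHT[quant] / 8 / 2**30

for q in BITS_PER_WEIGHT:
    print(f"{q}: ~{estimated_size_gb(q):.1f} GiB")
```

Actual files run slightly larger because GGUF also stores metadata and tokenizer data.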

Quick Start

1. Install Ollama

curl -fsSL https://ollama.com/install.sh | sh

2. Create the Model

Choose a quantization level:

# Q4_K_M (Recommended - 1.7GB)
ollama create legalparam:q4 -f Modelfile

# Q6_K (Higher quality - 2.2GB)
ollama create legalparam:q6 -f Modelfile-q6

# F16 (Highest quality - 5.4GB)
ollama create legalparam:f16 -f Modelfile-f16

3. Run the Model

# Interactive chat
ollama run legalparam:q4

# Single query
ollama run legalparam:q4 "What steps should a farmer take to legally transfer agricultural land ownership?"

Python Usage

from ollama import Client

client = Client()

response = client.chat(model='legalparam:q4', messages=[
  {'role': 'user', 'content': 'What are the fundamental rights in the Indian Constitution?'}
])

print(response['message']['content'])

Model File Details

All Modelfiles include:

  • Correct chat template matching the tokenizer's format
  • Stop tokens (</s>, <user>, <assistant>) to prevent infinite generation loops
  • Optimized parameters for legal question answering
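For reference, a Modelfile along these lines would cover those points (the GGUF filename is illustrative, and the exact TEMPLATE shipped with this repo may differ):

```
FROM ./legalparam-q4_k_m.gguf

TEMPLATE """<user>
{{ .Prompt }}
<assistant>
"""

PARAMETER stop "</s>"
PARAMETER stop "<user>"
PARAMETER stop "<assistant>"
```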

Chat Template Format

<user>
{user_message}
<assistant>
{assistant_response}
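When calling the model outside Ollama's chat endpoint (e.g. via llama.cpp or `/api/generate`, where no template is applied for you), the prompt must be wrapped in these markers manually. A minimal sketch, assuming the newline placement shown in the template above:

```python
# Build a raw prompt string following the chat template above.
# The exact whitespace handling of the tokenizer's template is an
# assumption here; Ollama's chat endpoint applies this for you.

def format_prompt(user_message: str) -> str:
    """Wrap a user message in LegalParam's <user>/<assistant> markers."""
    return f"<user>\n{user_message}\n<assistant>\n"

print(format_prompt("What is judicial review in India?"))
```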

Context Window

  • Default: 2048 tokens (combined input + output)
  • Scaling: Can be extended with RoPE scaling in Ollama (experimental)
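Because input and output share the 2048-token window, long prompts leave little room for the answer. A crude planning check (the 4-characters-per-token heuristic is a rough English-text approximation, not the model's real tokenizer):

```python
# Rough context-budget check for the 2048-token window. The
# 4-characters-per-token heuristic is a ballpark estimate only.
CONTEXT_WINDOW = 2048

def fits_in_context(prompt: str, max_output_tokens: int = 512) -> bool:
    """Estimate whether prompt + reply fit in the context window."""
    est_prompt_tokens = len(prompt) // 4 + 1
    return est_prompt_tokens + max_output_tokens <= CONTEXT_WINDOW

print(fits_in_context("Explain the Land Acquisition Act."))  # short prompt fits
print(fits_in_context("x" * 10000))  # ~2500 estimated tokens, too long
```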

Example Queries

The model excels at Indian legal queries:

  • "Explain the First Amendment of the Indian Constitution"
  • "What is the procedure for filing a civil suit in India?"
  • "What are the key provisions of the Land Acquisition Act?"
  • "Explain the concept of judicial review in India"
  • "What are the powers of the Supreme Court of India?"

Technical Specifications

Model Architecture

  • Hidden size: 2048
  • Layers: 32
  • Attention heads: 16
  • KV heads: 8 (Grouped Query Attention)
  • Vocabulary: 256,006 tokens
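These numbers roughly account for the stated 2.9B parameters. A back-of-the-envelope check, assuming a LLaMA-style MLP with an intermediate size of 8192 and untied input/output embeddings (both assumptions not stated on this card):

```python
# Back-of-the-envelope parameter count from the architecture numbers above.
# The MLP intermediate size (8192) and untied embeddings are assumptions,
# so treat the result as approximate.
vocab, hidden, layers = 256_006, 2048, 32
heads, kv_heads = 16, 8
head_dim = hidden // heads      # 128
kv_dim = kv_heads * head_dim    # 1024 (Grouped Query Attention)
intermediate = 8192             # assumed

attn = 2 * hidden * hidden + 2 * hidden * kv_dim  # q, o + k, v projections
mlp = 3 * hidden * intermediate                   # gate, up, down (LLaMA-style)
per_layer = attn + mlp

embeddings = 2 * vocab * hidden                   # input + output (assumed untied)
total = layers * per_layer + embeddings
print(f"~{total / 1e9:.2f}B parameters")
```

The estimate lands near 3B, consistent with the stated 2.9B given the assumptions.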

Special Tokens

  • <s>: Beginning of sequence (BOS)
  • </s>: End of sequence (EOS)
  • <user>: User message marker
  • <assistant>: Assistant message marker

Limitations

  • Context limited to 2048 tokens
  • Training data cutoff: August 2023
  • Optimized for Indian law queries
  • May not perform well on non-legal topics

Original Model

This is a quantized version of bharatgenai/LegalParam. For the original PyTorch model, training details, and full documentation, please refer to the original repository.

License

Please refer to the original model repository for licensing information.

Conversion Process

These models were converted from the original HuggingFace format to GGUF using llama.cpp with the following process:

  1. Loaded original model with transformers
  2. Converted to GGUF format
  3. Quantized to Q4_K_M, Q6_K, and F16 precision
  4. Validated with Ollama inference engine

Troubleshooting

Model repeats or loops

  • Ensure you're using the provided Modelfiles
  • Stop tokens are pre-configured to prevent infinite loops

Out of memory errors

  • Try a smaller quantization (Q4_K_M instead of Q6_K)
  • Reduce num_ctx parameter in Ollama
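The `num_ctx` reduction can be made persistent by adding a line to the Modelfile before recreating the model (1024 is an arbitrary smaller value):

```
PARAMETER num_ctx 1024
```

It can also be set for a single session from Ollama's interactive prompt with `/set parameter num_ctx 1024`.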

Poor quality responses

  • Try F16 quantization for highest quality
  • Ensure proper prompt formatting with <user> and <assistant> tags
