	---
	license: apache-2.0
	language:
	- en
	- kn
	metrics:
	- accuracy
	base_model:
	- distilbert/distilbert-base-multilingual-cased
	pipeline_tag: text-classification
	tags:
	- legal
	- code-mixed
	---

	# IndicLaw-Class: Code-Mixed Legal Intent Classifier

	`IndicLaw-Class` is a lightweight multilingual transformer-based classifier that identifies legal intent from code-mixed Indian queries (e.g., Kannada-English, Hinglish). It is fine-tuned on citizen-style queries for real-world legal triage applications.

	---

	## Model Overview

	- Architecture: [`distilbert-base-multilingual-cased`](https://huggingface.co/distilbert-base-multilingual-cased)
	- Task: Multi-class text classification (6 legal categories)
	- Input Style: Informal, code-mixed queries such as:
	  - `divorce file maadbeku without husband consent`
	  - `builder flat delay case haakbeku`
	  - `rent refund maadbeku, owner refusing`

	---

	## Legal Categories

	The model classifies input into one of the following categories:

	| Label               | Description                          |
	|---------------------|--------------------------------------|
	| Family Law          | Divorce, custody, alimony, marriage  |
	| Property Law        | Inheritance, land disputes, transfer |
	| Criminal Law        | FIRs, police misconduct, assault     |
	| Consumer Complaints | E-commerce, refund issues, builders  |
	| Rent & Tenancy      | Eviction, deposit disputes, lease    |
	| Public Services     | Certificates, ID updates, ration     |
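
A classifier of this kind emits one raw score (logit) per category, and the predicted label is the category with the highest softmax probability. The sketch below illustrates that decoding step in plain Python. The index order in `LABELS` is an assumption copied from the table for illustration, and the logits are made up; the real index-to-label mapping ships with the model files.

```python
import math

# Assumed index order, copied from the table above for illustration only.
# The real mapping comes from the model files, not from this list.
LABELS = [
    "Family Law",
    "Property Law",
    "Criminal Law",
    "Consumer Complaints",
    "Rent & Tenancy",
    "Public Services",
]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits, labels=LABELS):
    """Return (label, confidence) for the highest-scoring class."""
    if len(logits) != len(labels):
        raise ValueError("expected one logit per label")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Made-up logits, not real model output.
label, confidence = decode([0.2, -1.0, 0.1, 0.4, 3.5, -0.3])
print(f"{label} ({confidence:.2f})")
```

A low top probability can be a useful signal that a query falls outside the six categories and should be routed to a human.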

	---

	## Environmental Impact
	Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact/#compute) presented in Lacoste et al. (2019).

	- Hardware Type: More information needed
	- Hours used: More information needed
	- Cloud Provider: More information needed
	- Compute Region: More information needed
	- Carbon Emitted: More information needed
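
Once the fields above are known, a first-order estimate multiplies the energy drawn during training by the carbon intensity of the local grid. The sketch below shows that arithmetic. It is an approximation under stated assumptions, not the calculator's exact method, and the function name and input values are hypothetical placeholders rather than measurements for this model.

```python
def estimate_emissions_kg(power_watts, hours, grid_kg_per_kwh):
    """Estimate kg CO2eq as energy used (kWh) times grid carbon intensity."""
    if min(power_watts, hours, grid_kg_per_kwh) < 0:
        raise ValueError("inputs must be non-negative")
    energy_kwh = power_watts / 1000 * hours
    return energy_kwh * grid_kg_per_kwh


# Placeholder inputs for illustration only; NOT measurements for this model.
example = estimate_emissions_kg(power_watts=250, hours=2, grid_kg_per_kwh=0.5)
print(f"{example:.2f} kg CO2eq")
```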

	---

	## Citation

	```bibtex
	@misc{nishanth_prakash_2025,
	  author    = { nishanth prakash },
	  title     = { IndicLaw-Class (Revision 87ae96e) },
	  year      = 2025,
	  url       = { https://huggingface.co/nprak26/IndicLaw-Class },
	  doi       = { 10.57967/hf/5964 },
	  publisher = { Hugging Face }
	}
	```
	---
	## How to Get Started With the Model

	```python
	from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

	# Load the model and tokenizer from a local folder
	model_dir = "./indiclaw-classifier"

	tokenizer = AutoTokenizer.from_pretrained(model_dir)
	model = AutoModelForSequenceClassification.from_pretrained(model_dir)

	# Load the label map from labels.txt (one "<index>\t<label>" pair per line)
	label_map = {}
	with open(f"{model_dir}/labels.txt", "r", encoding="utf-8") as f:
	    for line in f:
	        line = line.strip()
	        if not line:
	            continue  # skip blank lines
	        idx, label = line.split("\t")
	        label_map[int(idx)] = label

	# Create the pipeline
	classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)

	# Test inputs
	examples = [
	    "wife divorce file maadbeku",
	    "flat possession delay aadmele builder case file madbeku",
	    "tenant evict maadbeku no notice",
	]

	# Run predictions
	for text in examples:
	    result = classifier(text)[0]
	    # The pipeline returns labels such as "LABEL_3" (or a bare "3");
	    # take the trailing integer and look it up in the label map.
	    label_id = int(result["label"].split("_")[-1])
	    label_name = label_map[label_id]
	    print(f"Input: {text}\nPredicted: {label_name} (confidence: {result['score']:.2f})\n")
	```

	---