# LitLex-Llama: Lithuanian Legal AI
LitLex is a specialized Large Language Model (LLM) fine-tuned to understand and interpret the Administrative Code of the Republic of Lithuania (ANK).
Built with Meta Llama 3 and optimized using Unsloth, this model demonstrates high capability in citing legal articles, calculating fines, and explaining regulations in the Lithuanian language.
## Model Details

- Model Type: Causal Language Model (LLM)
- Base Model: unsloth/llama-3-8b-bnb-4bit (quantized)
- Language: Lithuanian
- Architecture: LoRA (Low-Rank Adaptation)
- Developer: Lukash
- License: MIT
## How to Run (Inference)

You can run this model with the unsloth library for faster inference, or with standard transformers.

### Installation

```bash
pip install unsloth torch transformers
```

### Python Code
```python
from unsloth import FastLanguageModel
import torch

# Load the model and tokenizer
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "lukashm/LitLex-Llama-LT-v1",
    max_seq_length = 2048,
    dtype = None,
    load_in_4bit = True,
)
FastLanguageModel.for_inference(model)

# Define the prompt template
alpaca_prompt = """Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{}

### Response:
"""

# Ask a question (Lithuanian: "What fine applies for exceeding the speed limit by more than 50 km/h?")
question = "Kokia bauda gresia už greičio viršijimą daugiau kaip 50 km/h?"
inputs = tokenizer([alpaca_prompt.format(question)], return_tensors="pt").to("cuda")

# Generate and decode the answer
outputs = model.generate(**inputs, max_new_tokens=256, use_cache=True)
response = tokenizer.batch_decode(outputs)[0]
print(response.split("### Response:\n")[1].replace("<|end_of_text|>", ""))
```
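The final parsing step above can be factored into a small helper, which also makes it easy to sanity-check without a GPU. The sample string below is purely illustrative, not real model output:

```python
def extract_response(decoded: str, eos_token: str = "<|end_of_text|>") -> str:
    """Strip the prompt prefix and special tokens from a decoded generation."""
    answer = decoded.split("### Response:\n", 1)[1]
    return answer.replace(eos_token, "").strip()

# Illustrative decoded string (not a real model generation):
sample = (
    "Below is an instruction that describes a task...\n\n"
    "### Instruction:\nKokia bauda gresia?\n\n"
    "### Response:\nIliustracinis atsakymas.<|end_of_text|>"
)
print(extract_response(sample))  # -> Iliustracinis atsakymas.
```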
## Training Details

### Dataset

The model was fine-tuned on a custom dataset (`ank_dataset.json`) derived from the official Administrative Code of the Republic of Lithuania (ANK), as published on e-seimas.lrs.lt.

- Size: ~500 high-quality instruction/output pairs.
- Content: specific focus on traffic violations, public-order offenses, and administrative fines.
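The exact schema of `ank_dataset.json` is not published. A plausible minimal record, and the Alpaca-style formatting typically used to turn such pairs into training text, might look like this (the `instruction`/`output` field names and the sample answer text are assumptions for illustration):

```python
# Assumed record shape -- the real ank_dataset.json schema is not published.
record = {
    "instruction": "Kokia bauda gresia už greičio viršijimą daugiau kaip 50 km/h?",
    "output": "Pagal ANK numatoma administracinė bauda.",  # illustrative answer text
}

ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{}\n\n### Response:\n{}"
)

def to_training_text(rec: dict) -> str:
    """Render one instruction/output pair as a single training string."""
    return ALPACA_TEMPLATE.format(rec["instruction"], rec["output"])

text = to_training_text(record)
print(text.endswith(record["output"]))  # -> True
```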
### Hyperparameters

- Optimization: Unsloth (QLoRA)
- Steps: 500
- Batch Size: 2 (gradient accumulation: 4)
- Learning Rate: 2e-4
- LoRA Rank (r): 64 (a high rank, chosen for better fact retention)
- Final Loss: ~0.08
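For reference, the hyperparameters listed above can be collected into a single config; the key names below follow common `transformers`/LoRA conventions and are an assumption, since the actual training script is not published:

```python
# Sketch only: key names follow common TrainingArguments/LoRA conventions;
# the actual LitLex training script is not published.
train_config = {
    "max_steps": 500,
    "per_device_train_batch_size": 2,
    "gradient_accumulation_steps": 4,
    "learning_rate": 2e-4,
    "lora_r": 64,
}

# Effective batch size seen by the optimizer per update step:
effective_batch = (train_config["per_device_train_batch_size"]
                   * train_config["gradient_accumulation_steps"])
print(effective_batch)  # -> 8
```

With gradient accumulation, each optimizer step averages gradients over 2 × 4 = 8 examples, which is what the learning rate of 2e-4 is tuned against.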
## Limitations & Disclaimer

This model is a proof of concept (MVP) intended for educational and research purposes.

- Hallucinations: while stylistically fluent, the model may occasionally cite incorrect article numbers (e.g., confusing Art. 348 with Art. 416).
- Scope: the model specializes in administrative law (ANK) and is not aware of Criminal Code (BK) or Civil Code (CK) nuances unless specifically trained on them.
- Legal Advice: this is an AI assistant, not a lawyer. Always consult official sources or a qualified attorney for legal matters.