Ken1.0-67B - Russian Legal AI Model

Ken1.0-67B is a specialized large language model (67 billion parameters) built for the Russian legal domain. Unlike general-purpose models, Ken1.0 was fine-tuned on a proprietary dataset of more than 30 million legal documents, including the Civil, Criminal, and Labor Codes of the Russian Federation, as well as extensive court practice records.

This model is designed to bridge the gap between generic AI and professional legal expertise.

Key Capabilities

Deep Legal Understanding: Interprets complex Russian legal terminology and context
Advisory: Drafts preliminary consultations on civil, criminal, and labor law
Document Analysis: Processes and summarizes legal texts
Code Navigation: Provides references to specific articles and regulations

Technical Specifications

Parameters: 67 billion
Language: Russian
Context Window: 32,768 tokens (~24,000 words)
Precision: FP16
Model Size: 135 GB
Hardware Requirements:
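- GPU: A100 (80GB) or H100 recommended for inference; the 135 GB of FP16 weights exceed a single 80 GB card, so plan for multiple GPUs or 4-bit/8-bit quantization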
- CPU: 200GB+ RAM for CPU inference
Framework: PyTorch, Transformers
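
The model size listed above follows from the parameter count and the precision. A minimal sketch of that arithmetic, assuming 2 bytes per FP16 parameter and decimal gigabytes (1 GB = 1e9 bytes); the helper name `weights_size_gb` is illustrative, and the estimate covers the weights only, not the KV cache or activations:

```python
def weights_size_gb(num_params: float, bytes_per_param: float) -> float:
    """Approximate size of the model weights in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


PARAMS = 67e9  # parameter count stated in this model card

fp16_gb = weights_size_gb(PARAMS, 2)    # FP16: 2 bytes per parameter
int8_gb = weights_size_gb(PARAMS, 1)    # 8-bit: roughly 1 byte per parameter
int4_gb = weights_size_gb(PARAMS, 0.5)  # 4-bit: roughly half a byte per parameter

print(f"FP16:  {fp16_gb:.1f} GB")
print(f"8-bit: {int8_gb:.1f} GB")
print(f"4-bit: {int4_gb:.1f} GB")
```

The FP16 estimate of about 134 GB is close to the 135 GB listed above. The 8-bit and 4-bit figures ignore quantization metadata such as scales, so real quantized checkpoints are somewhat larger.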

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    "KenanKoiushov/Ken1.0-67B",
    device_map="auto",
    torch_dtype="auto"
)
tokenizer = AutoTokenizer.from_pretrained("KenanKoiushov/Ken1.0-67B")

# Prepare messages
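messages = [
    # System prompt: "You are a legal consultant on Russian law."
    {"role": "system", "content": "Вы - юридический консультант по российскому праву."},
    # User question: "What is the limitation period in civil cases?"
    {"role": "user", "content": "Какой срок исковой давности по гражданским делам?"}
]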

# Generate response
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=1024,
    temperature=0.7,
    top_p=0.9,
    do_sample=True
)

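# Decode only the newly generated tokens; outputs[0] also contains the prompt
new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
response = tokenizer.decode(new_tokens, skip_special_tokens=True)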
print(response)
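
Long legal documents can exceed the 32,768-token context window listed under Technical Specifications. A minimal sketch of a budget check, assuming the window is shared between the prompt and the generated tokens; `CONTEXT_WINDOW`, `max_prompt_tokens`, and `fits_context` are illustrative names, not part of the model's API:

```python
CONTEXT_WINDOW = 32_768  # context length stated in this model card


def max_prompt_tokens(max_new_tokens: int, window: int = CONTEXT_WINDOW) -> int:
    """Tokens left for the prompt after reserving room for generation."""
    if max_new_tokens < 0 or max_new_tokens > window:
        raise ValueError("max_new_tokens must be between 0 and the window size")
    return window - max_new_tokens


def fits_context(prompt_tokens: int, max_new_tokens: int,
                 window: int = CONTEXT_WINDOW) -> bool:
    """True if the prompt plus the reserved generation budget fits in the window."""
    return prompt_tokens <= max_prompt_tokens(max_new_tokens, window)


# Prompt budget when reserving 1024 new tokens, as in the usage example
print(max_prompt_tokens(1024))
```

In the usage example, the prompt length is available as `inputs["input_ids"].shape[-1]` after tokenization. A document that does not fit would need to be split or summarized in parts.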

Training Details

Base Architecture: Transformer-based decoder model
Training Data: 30M+ Russian legal documents
- Civil Code of the Russian Federation
- Criminal Code of the Russian Federation
- Labor Code of the Russian Federation
- Court practice and case law
- Legal commentary and analysis
Specialization: Extensive training on Russian legal corpus

Performance Notes

GPU Inference: ~2-5 seconds per response (A100 80GB)
CPU Inference: 5-15 minutes per response (36+ cores recommended)
Quantization: 4-bit/8-bit quantization supported for reduced memory footprint
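
This card does not name a quantization backend. A minimal sketch of quantized loading, assuming the `bitsandbytes` integration in Transformers (`BitsAndBytesConfig`) is installed and suitable for this checkpoint; the `load_quantized` helper is illustrative and has not been tested against this model:

```python
def load_quantized(model_id: str, bits: int = 4):
    """Load a causal LM with 4-bit or 8-bit bitsandbytes quantization (assumed backend)."""
    if bits not in (4, 8):
        raise ValueError("bits must be 4 or 8")

    # Imported lazily so the argument check above runs without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    if bits == 4:
        quant_config = BitsAndBytesConfig(
            load_in_4bit=True,
            bnb_4bit_compute_dtype=torch.float16,  # compute in FP16, the precision listed in this card
        )
    else:
        quant_config = BitsAndBytesConfig(load_in_8bit=True)

    return AutoModelForCausalLM.from_pretrained(
        model_id,
        device_map="auto",
        quantization_config=quant_config,
    )


# Usage (downloads the full checkpoint, so it is left commented out):
# model = load_quantized("KenanKoiushov/Ken1.0-67B", bits=4)
```

Quantization trades some output quality for memory, so answers on legal questions should be spot-checked against the FP16 model before relying on a quantized variant.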

Limitations

This model is designed for informational and educational purposes only
Not a substitute for professional legal advice
May occasionally generate incorrect or outdated legal information
Always consult qualified legal professionals for actual legal matters
Trained primarily on Russian Federation law (as of training date)
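
Because the model may cite incorrect or outdated articles, it can help to pull every article reference out of a response so that a person can check each one against the current text of the code. A minimal sketch, assuming references appear in the common forms "ст. 196 ГК РФ" or "статья 196 ГК РФ"; the pattern, the `extract_article_refs` helper, and the sample text are illustrative and cover only the Civil (ГК), Criminal (УК), and Labor (ТК) Codes:

```python
import re

# Matches "ст. N" or any inflected form of "статья N", followed by a code abbreviation and "РФ".
ARTICLE_REF = re.compile(
    r"(?:ст\.|стать[а-яё]+)\s*(\d+(?:\.\d+)?)\s+(ГК|УК|ТК)\s+РФ",
    re.IGNORECASE,
)


def extract_article_refs(text: str) -> list[tuple[str, str]]:
    """Return (article number, code abbreviation) pairs in order of appearance."""
    return [(m.group(1), m.group(2).upper()) for m in ARTICLE_REF.finditer(text)]


# Hypothetical model output used only to exercise the pattern
sample = (
    "Согласно ст. 196 ГК РФ общий срок исковой давности составляет три года. "
    "См. также статью 200 ГК РФ."
)

for number, code in extract_article_refs(sample):
    print(f"Check article {number} of {code} РФ against the current text")
```

The pattern only finds references; it does not confirm that the cited article exists or says what the model claims, so manual verification is still required.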

License

This model is released for research and educational purposes. Commercial use requires separate licensing.

Disclaimer

Ken1.0-67B provides general legal information and should not be considered professional legal advice. The model's outputs do not establish an attorney-client relationship. Users should always verify information with qualified legal professionals before making legal decisions.

Citation

If you use this model in your research, please cite:

@misc{ken1.0-67b,
  title={Ken1.0-67B: A Specialized Legal Language Model for Russian Law},
  author={Koiushov, Kenan},
  year={2025},
  publisher={HuggingFace},
  url={https://huggingface.co/KenanKoiushov/Ken1.0-67B}
}

Contact

For questions, feedback, or commercial licensing inquiries, please open an issue on the model repository.

Model Version: 1.0
Release Date: December 2025
Status: Production Ready
