---
base_model: unsloth/gemma-3-4b-it-unsloth-bnb-4bit
library_name: peft
pipeline_tag: text-generation
tags:
- base_model:adapter:unsloth/gemma-3-4b-it-unsloth-bnb-4bit
- lora
- sft
- transformers
- trl
- unsloth
- vision-language-model
- vlm
- legal
- json-extraction
- arabic
language:
- ar
- en
---
# Egypt Constitution VLM

This model is a fine-tuned Vision-Language Model (VLM) based on **Gemma-3-4B-IT** (quantized to 4-bit via Unsloth). It is designed to extract structured JSON data (constitutional articles, page metadata, legal intent, and named entities) directly from scanned images of Arabic constitutional and legal documents.

## Model Details

### Model Description

This model applies Parameter-Efficient Fine-Tuning (PEFT) with LoRA to the Gemma-3 architecture. Given an image of a scanned legal document and an extraction instruction, it transcribes the Arabic text and structures the output into a predefined JSON schema in a single pass. The vision layers, language layers, attention modules, and MLP modules were all targeted during fine-tuning.
* **Developed by:** Mahmoud Essam
* **Model type:** Vision-Language Model (VLM) with LoRA adapters
* **Language(s) (NLP):** Arabic (content extraction), English (JSON keys/schema)
* **License:** Apache 2.0 (adapter weights). Note that the Gemma-3 base model is distributed under the Gemma Terms of Use rather than Apache 2.0.
* **Finetuned from model:** `unsloth/gemma-3-4b-it-unsloth-bnb-4bit`
## Uses

### Direct Use

The primary use case is digitizing scanned Arabic legal documents (specifically the Egyptian Constitution) and extracting structured data from them. Given a document image as input, the model outputs a structured JSON object containing:
* **Page Metadata:** Source document, page number, language.
* **Hierarchy Context:** Part and Chapter titles.
* **Articles:** Raw text, cleaned body text, legal intent, key entities, Arabic summaries, and keywords.
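
As a rough illustration of the structure described above, the sketch below checks a parsed output for the expected top-level and per-article keys. The key names are taken from the extraction prompt in this card; the helper name `missing_keys` and the sample values are hypothetical, and the inner fields of `page_metadata` and `hierarchy_context` are not checked because their exact key names are not specified here.

```python
from typing import Any

# Key names taken from the extraction prompt in this card.
TOP_LEVEL_KEYS = ("page_metadata", "hierarchy_context", "articles")
ARTICLE_KEYS = ("article_id", "article_number", "content", "training_features", "text_raw")
CONTENT_KEYS = ("body_text", "key_entities", "legal_intent")
FEATURE_KEYS = ("summary_ar", "keywords")


def missing_keys(page: dict[str, Any]) -> list[str]:
    """Return dotted paths of expected keys that are absent (hypothetical helper)."""
    missing = [key for key in TOP_LEVEL_KEYS if key not in page]
    articles = page.get("articles")
    if not isinstance(articles, list):
        return missing  # no article list to inspect further
    for index, article in enumerate(articles):
        prefix = f"articles[{index}]"
        if not isinstance(article, dict):
            missing.append(prefix)
            continue
        missing += [f"{prefix}.{key}" for key in ARTICLE_KEYS if key not in article]
        for parent, children in (("content", CONTENT_KEYS), ("training_features", FEATURE_KEYS)):
            section = article.get(parent)
            if isinstance(section, dict):
                missing += [f"{prefix}.{parent}.{c}" for c in children if c not in section]
    return missing


# Illustrative sample only; the values are placeholders, not real model output.
sample = {
    "page_metadata": {},
    "hierarchy_context": {},
    "articles": [
        {
            "article_id": "placeholder-id",
            "article_number": 1,
            "content": {"body_text": "...", "key_entities": [], "legal_intent": "..."},
            "training_features": {"summary_ar": "..."},
            "text_raw": "...",
        }
    ],
}
print(missing_keys(sample))
```

An empty list means every expected key is present. The sample deliberately omits `keywords`, so the call prints `['articles[0].training_features.keywords']`.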
### Out-of-Scope Use

* Recognition of highly illegible handwritten Arabic documents.
* General-purpose conversational AI or chat tasks (it is highly specialized for JSON extraction).
* Processing documents in languages other than Arabic.
## Bias, Risks, and Limitations

* **Domain Specificity:** The model is heavily optimized for formal Arabic legal and constitutional texts. It may hallucinate or underperform on everyday conversational Arabic or on very different document layouts (e.g., newspapers, unstructured letters).
* **Resolution Sensitivity:** The model relies on a specific image preprocessing pipeline (resizing and padding to `1024x1024`). Feeding raw, unprocessed images may degrade performance.

### Recommendations

Input images should go through the same preprocessing steps used during training: grayscale conversion, autocontrast, median-filter denoising, sharpening, and resizing with white padding to `1024x1024`. Human-in-the-loop verification is recommended for critical legal digitization tasks.
## How to Get Started with the Model

The code blocks below use Unsloth and are meant to be run in sequence.

### Getting the Model

```python
# Import Unsloth first so its patches are applied before torch/transformers load.
from unsloth import FastVisionModel
import torch
import json
from PIL import Image, ImageOps, ImageFilter

# 1. Load Model & Tokenizer (for vision models the "tokenizer" is a processor that also handles images)
model_id = "Humachine/egypt-constitution-vlm"
model, tokenizer = FastVisionModel.from_pretrained(
    model_name=model_id,
    load_in_4bit=True,
    trust_remote_code=True
)
FastVisionModel.for_inference(model)  # switch Unsloth to inference mode
```
### Image Preprocessing Logic

```python
def preprocess_image(image_path: str, target_size: tuple = (1024, 1024)) -> Image.Image:
    """Grayscale, enhance, and pad a scanned page onto a white square canvas."""
    image = Image.open(image_path).convert('L')              # grayscale
    image = ImageOps.autocontrast(image, cutoff=1)           # stretch contrast
    image = image.filter(ImageFilter.MedianFilter(size=3))   # denoise
    image = image.filter(ImageFilter.SHARPEN)                # sharpen
    # Shrink in place to fit target_size while preserving the aspect ratio.
    image.thumbnail(target_size, Image.Resampling.LANCZOS)
    # Center the page on a white canvas of the target size.
    padded = Image.new('L', target_size, color=255)
    offset = ((target_size[0] - image.width) // 2, (target_size[1] - image.height) // 2)
    padded.paste(image, offset)
    return padded.convert('RGB')
```
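
To make the resize-and-pad step concrete, the sketch below reproduces only its arithmetic in plain Python. The helper `fit_and_pad_geometry` and the 1700x2200 example scan are hypothetical, and the rounding is an assumption: Pillow's `thumbnail` may round the scaled size slightly differently.

```python
def fit_and_pad_geometry(width, height, target=(1024, 1024)):
    """Return ((new_w, new_h), (x_off, y_off)) for an aspect-preserving fit plus centering."""
    # Never upscale: thumbnail-style resizing only shrinks images.
    scale = min(target[0] / width, target[1] / height, 1.0)
    new_w = max(1, round(width * scale))
    new_h = max(1, round(height * scale))
    # Center the resized page on the target canvas; padding fills the rest.
    offset = ((target[0] - new_w) // 2, (target[1] - new_h) // 2)
    return (new_w, new_h), offset


size, offset = fit_and_pad_geometry(1700, 2200)
print(size, offset)
```

For a 1700x2200 portrait scan this prints `(791, 1024) (116, 0)`: the page fills the full height and receives 116 pixels of padding on the left.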
### Prepare Inputs

```python
image_path = "/content/0100.jpg"  # path to a scanned page
image = preprocess_image(image_path)
instruction = (
    "Extract all articles from this Arabic constitutional document. "
    "Return a JSON object with keys: page_metadata, hierarchy_context, and articles. "
    "Each article must include: article_id, article_number, content "
    "(body_text, key_entities, legal_intent), training_features (summary_ar, keywords), "
    "and text_raw."
)
messages = [
    {"role": "user", "content": [{"type": "image", "image": image}, {"type": "text", "text": instruction}]}
]
```
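### Generate Output

```python
# Render the chat prompt as text; the template inserts the image placeholder.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
# Pass the image together with the text so the pixel values are actually encoded.
inputs = tokenizer(
    images=image, text=text, add_special_tokens=False, return_tensors="pt"
).to("cuda", dtype=torch.bfloat16)
# temperature only takes effect when sampling is enabled.
output_tokens = model.generate(**inputs, max_new_tokens=2048, use_cache=True, do_sample=True, temperature=0.2)
output_text = tokenizer.decode(output_tokens[0], skip_special_tokens=True)
print(output_text)
```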
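
The decoded text may contain more than the JSON object itself, for example the echoed prompt (an assumption, since this card does not show sample output). A minimal sketch of pulling out the first parseable JSON object using only the standard library follows; `extract_first_json` is a hypothetical helper name and the demo string is a placeholder.

```python
import json


def extract_first_json(text: str) -> dict:
    """Return the first JSON object found in text, or raise ValueError."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at the given index.
            value, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            value = None
        if isinstance(value, dict):
            return value
        start = text.find("{", start + 1)  # try the next candidate brace
    raise ValueError("no JSON object found in model output")


# Placeholder text standing in for real model output.
demo = 'Here is the result: {"page_metadata": {}, "articles": []} Done.'
print(extract_first_json(demo))
```

In practice the helper would be called on `output_text`; a `ValueError` is a reasonable signal to route the page to manual review.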
## Citation

If you use this model in your research or application, please cite it as follows:

```bibtex
@misc{essam2026egypt,
  author       = {Mahmoud Essam},
  title        = {Egypt Constitution VLM: A Vision-Language Model for Arabic Legal JSON Extraction},
  howpublished = {Hugging Face Repositories},
  year         = {2026},
  url          = {https://huggingface.co/Humachine/egypt-constitution-vlm}
}
```