	---
	base_model: unsloth/gemma-3-4b-it-unsloth-bnb-4bit
	library_name: peft
	pipeline_tag: text-generation
	tags:
	- base_model:adapter:unsloth/gemma-3-4b-it-unsloth-bnb-4bit
	- lora
	- sft
	- transformers
	- trl
	- unsloth
	- vision-language-model
	- vlm
	- legal
	- json-extraction
	- arabic
	language:
	- ar
	- en
	---

	# Egypt Constitution VLM

	This model is a Vision-Language Model (VLM) fine-tuned from Gemma-3-4B-IT (quantized to 4-bit via Unsloth). It extracts structured JSON data, including constitutional articles, page metadata, legal intent, and named entities, directly from scanned images of Arabic constitutional and legal documents.

	## Model Details

	### Model Description

	This model applies Parameter-Efficient Fine-Tuning (PEFT) with LoRA to the Gemma-3 architecture. Given an image of a scanned legal document and a specific instruction, it transcribes the Arabic text and structures the output into a predefined JSON schema in a single pass. The vision layers, language layers, attention modules, and MLP modules were all targeted during fine-tuning.

	* Developed by: Mahmoud Essam
	* Model type: Vision-Language Model (VLM) with LoRA Adapters
	* Language(s) (NLP): Arabic (content extraction), English (JSON keys/schema)
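	* License: Apache 2.0, as declared by the author. Note that the Gemma 3 base model is distributed under Google's Gemma Terms of Use rather than Apache 2.0.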
	* Finetuned from model: `unsloth/gemma-3-4b-it-unsloth-bnb-4bit`
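The four component groups named in the description correspond to switches on Unsloth's `FastVisionModel.get_peft_model`. The sketch below only collects them as keyword arguments. The rank, alpha, and dropout values are hypothetical placeholders, since this card does not state the values used in training.

```python
# Keyword arguments one might pass to Unsloth's FastVisionModel.get_peft_model.
# The four finetune_* switches mirror the model description above.
# ASSUMPTION: r, lora_alpha, and lora_dropout are illustrative placeholders;
# the actual training values are not documented in this card.
lora_kwargs = {
    "finetune_vision_layers": True,
    "finetune_language_layers": True,
    "finetune_attention_modules": True,
    "finetune_mlp_modules": True,
    "r": 16,
    "lora_alpha": 16,
    "lora_dropout": 0.0,
}

# List which component groups are switched on for fine-tuning
targeted = sorted(
    name for name, value in lora_kwargs.items()
    if name.startswith("finetune_") and value is True
)
print(targeted)
```

With a loaded base model, these would be applied as `FastVisionModel.get_peft_model(model, **lora_kwargs)`.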

	## Uses

	### Direct Use

	The primary use case is digitizing scanned Arabic legal documents (specifically the Egyptian Constitution) and extracting structured data from them. Given a document image as input, the model outputs a structured JSON object containing:
	* Page Metadata: Source document, page number, language.
	* Hierarchy Context: Part and Chapter titles.
	* Articles: raw text, cleaned body text, legal intent, key entities, Arabic summaries, and keywords.
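As a rough illustration of that output shape, the sketch below checks a parsed page for the key names listed in the extraction instruction under "Prepare Inputs". The helper name `find_schema_problems` and the example values are hypothetical. The sub-keys of `page_metadata` and `hierarchy_context` are not spelled out in this card, so they are not checked.

```python
# Top-level and per-article keys taken from the extraction instruction in this card.
TOP_LEVEL_KEYS = ("page_metadata", "hierarchy_context", "articles")
ARTICLE_KEYS = ("article_id", "article_number", "content", "training_features", "text_raw")
CONTENT_KEYS = ("body_text", "key_entities", "legal_intent")
FEATURE_KEYS = ("summary_ar", "keywords")


def find_schema_problems(page: dict) -> list[str]:
    """Return human-readable problems; an empty list means the shape looks right."""
    problems = [f"missing top-level key: {k}" for k in TOP_LEVEL_KEYS if k not in page]
    articles = page.get("articles", [])
    if not isinstance(articles, list):
        return problems + ["articles is not a list"]
    for i, article in enumerate(articles):
        if not isinstance(article, dict):
            problems.append(f"articles[{i}] is not an object")
            continue
        problems += [f"articles[{i}] missing {k}" for k in ARTICLE_KEYS if k not in article]
        for parent, keys in (("content", CONTENT_KEYS), ("training_features", FEATURE_KEYS)):
            child = article.get(parent)
            if isinstance(child, dict):
                problems += [f"articles[{i}].{parent} missing {k}" for k in keys if k not in child]
    return problems


# Hypothetical example page; all values are placeholders, not real model output.
example_page = {
    "page_metadata": {},
    "hierarchy_context": {},
    "articles": [
        {
            "article_id": "placeholder-id",
            "article_number": 1,
            "content": {"body_text": "...", "key_entities": [], "legal_intent": "..."},
            "training_features": {"summary_ar": "...", "keywords": []},
            "text_raw": "...",
        }
    ],
}
print(find_schema_problems(example_page))
```

A non-empty result is a reasonable signal to route the page to manual review.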

	### Out-of-Scope Use

	* Recognition of highly illegible handwritten Arabic documents.
	* General-purpose conversational AI or chat tasks (it is highly specialized for JSON extraction).
	* Processing documents in languages other than Arabic.

	## Bias, Risks, and Limitations

	* Domain Specificity: The model is heavily optimized for formal Arabic legal and constitutional texts. It may hallucinate or underperform on standard conversational Arabic or vastly different document layouts (e.g., newspapers, unstructured letters).
	* Resolution Sensitivity: The model relies heavily on a specific image preprocessing pipeline (resizing and padding to `1024x1024`). Feeding raw, unformatted images may degrade performance.

	### Recommendations

	Input images should go through the same preprocessing steps used during training (grayscale conversion, autocontrast, median-filter denoising, sharpening, resizing, and padding to `1024x1024`) to achieve the best extraction accuracy. Human-in-the-loop verification is recommended for critical legal digitization tasks.

	## How to Get Started with the Model

	Run the code blocks below in order to get started with the model using Unsloth.

	### Getting Model

	```python
	import torch
	import json
	from PIL import Image, ImageOps, ImageFilter
	from unsloth import FastVisionModel

	# 1. Load the model and its processor (Unsloth returns the processor as `tokenizer`)
	model_id = "Humachine/egypt-constitution-vlm"
	model, tokenizer = FastVisionModel.from_pretrained(
	    model_name=model_id,
	    load_in_4bit=True,
	    trust_remote_code=True,
	)
	FastVisionModel.for_inference(model)  # switch Unsloth to inference mode
	```

	### Image Preprocessing Logic
	```python
	def preprocess_image(image_path: str, target_size: tuple = (1024, 1024)) -> Image.Image:
	    # Grayscale, stretch contrast, denoise, and sharpen the scan
	    image = Image.open(image_path).convert('L')
	    image = ImageOps.autocontrast(image, cutoff=1)
	    image = image.filter(ImageFilter.MedianFilter(size=3))
	    image = image.filter(ImageFilter.SHARPEN)
	    # Shrink to fit (aspect ratio preserved), then center on a white canvas
	    image.thumbnail(target_size, Image.Resampling.LANCZOS)

	    padded = Image.new('L', target_size, color=255)
	    offset = ((target_size[0] - image.width) // 2, (target_size[1] - image.height) // 2)
	    padded.paste(image, offset)
	    return padded.convert('RGB')
	```
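To make the resize-and-pad step concrete, here is a minimal sketch of the arithmetic behind `thumbnail` plus centered padding. The helper name `letterbox_geometry` and the sample dimensions are hypothetical. It uses plain rounding, so Pillow's own result may differ by a pixel in edge cases.

```python
def letterbox_geometry(width: int, height: int, target: int = 1024):
    """Approximate the resized size and paste offset used by preprocess_image.

    ASSUMPTION: plain rounding is used here; Pillow's thumbnail() may differ
    by a pixel in edge cases.
    """
    # thumbnail() only ever shrinks, so the scale factor is capped at 1.0
    scale = min(1.0, target / width, target / height)
    new_w = max(1, round(width * scale))
    new_h = max(1, round(height * scale))
    # Center the resized page on the square canvas
    offset = ((target - new_w) // 2, (target - new_h) // 2)
    return (new_w, new_h), offset


# Hypothetical A4 scan at 200 dpi (1654 x 2339 pixels)
print(letterbox_geometry(1654, 2339))
```

For a portrait page the padding lands on the left and right, so the text occupies only part of the `1024x1024` canvas.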


	### Prepare Inputs
	```python
	image_path = "/content/0100.jpg"
	image = preprocess_image(image_path)

	instruction = (
	    "Extract all articles from this Arabic constitutional document. "
	    "Return a JSON object with keys: page_metadata, hierarchy_context, and articles. "
	    "Each article must include: article_id, article_number, content "
	    "(body_text, key_entities, legal_intent), training_features (summary_ar, keywords), "
	    "and text_raw."
	)

	messages = [
	    {
	        "role": "user",
	        "content": [
	            {"type": "image", "image": image},
	            {"type": "text", "text": instruction},
	        ],
	    }
	]
	```

	### Generate Output
	```python
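	# Render the prompt; the chat template already adds the special tokens
	text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

	# The processor needs the image as well as the prompt text
	inputs = tokenizer(
	    images=image, text=text, add_special_tokens=False, return_tensors="pt"
	).to("cuda", dtype=torch.bfloat16)

	# temperature only takes effect when sampling is enabled
	output_tokens = model.generate(**inputs, max_new_tokens=2048, use_cache=True, do_sample=True, temperature=0.2)
	output_text = tokenizer.decode(output_tokens[0], skip_special_tokens=True)

	print(output_text)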
	```
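The decoded string may contain the prompt or other text around the JSON, so downstream code usually has to isolate the object before using it. The following is a minimal sketch using only the standard library. The helper name `extract_first_json_object` and the sample string are hypothetical, and the real output format of the model may differ.

```python
import json


def extract_first_json_object(text: str):
    """Return the first decodable JSON object found in text, or None."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and ignores what follows
            obj, _end = decoder.raw_decode(text, start)
            if isinstance(obj, dict):
                return obj
        except json.JSONDecodeError:
            pass
        start = text.find("{", start + 1)
    return None


# Hypothetical decoded output: surrounding text, then the JSON answer
sample_output = 'Here is the result: {"articles": [], "page_metadata": {}} Done.'
print(extract_first_json_object(sample_output))
```

A `None` result means no valid object was found, which is a natural trigger for a retry or for manual review.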

	## Citation

	If you use this model in your research or application, please cite it as follows:

	```bibtex
	@misc{essam2026egypt,
	  author = {Mahmoud Essam},
	  title = {Egypt Constitution VLM: A Vision-Language Model for Arabic Legal JSON Extraction},
	  howpublished = {Hugging Face Repositories},
	  year = {2026},
	  url = {https://huggingface.co/Humachine/egypt-constitution-vlm}
	}
	```