---
language:
- he
- en
license: llama3.2
base_model: meta-llama/Llama-3.2-1B-Instruct
tags:
- llama-3.2
- hebrew
- instruction-tuned
- sft
- safetensors
- nlp
model_name: Hebrew-GPT
model_type: causal-lm
precision: bfloat16
---

# Hebrew-GPT: Specialized 1B Hebrew Instruction Model

**Hebrew-GPT** is an instruction-tuned Small Language Model (SLM) based on the **Llama-3.2-1B** architecture. It has been engineered to bridge the gap in low-parameter Hebrew linguistic performance, providing a compact yet capable model for Hebrew natural language understanding and generation.

---

## 💎 Model Highlights

* **Linguistic Specialization:** Specifically tuned for the Morphologically Rich Language (MRL) features of Hebrew, including prefix/suffix handling and correct right-to-left (RTL) context awareness.
* **16-bit Precision:** Unlike many quantized small models, this release ships **fully merged BFloat16 weights**, so no quality is lost to post-training quantization.
* **Instruction Optimized:** Trained specifically to follow complex prompts, summarize documents, and engage in dialogue, rather than perform basic text completion only.
* **Efficiency:** At roughly 1 billion parameters, it is optimized for edge deployment, providing high-speed inference on standard consumer hardware.

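
A quick back-of-the-envelope check of the edge-deployment claim: with 1.23B parameters stored in BFloat16 (2 bytes each), the weights alone occupy roughly 2.3 GiB, which fits in typical consumer GPU or laptop memory. Note this counts weights only; the KV cache and activations add more at inference time:

```python
# Weight-memory estimate from the specs below: 1.23B parameters x 2 bytes (BFloat16).
params = 1.23e9
bytes_per_param = 2  # BFloat16 is a 16-bit format
weights_gib = params * bytes_per_param / 1024**3
print(f"~{weights_gib:.2f} GiB for the weights alone")  # ~2.29 GiB
```
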
---

## 🛠️ Technical Specifications

### Architecture
- **Base Architecture:** Llama 3.2
- **Parameters:** 1.23 billion
- **Context Length:** 128k tokens (native support)
- **Weight Format:** Safetensors (standalone)
- **Precision:** BFloat16 (BF16)

### Training Methodology
The model underwent **Supervised Fine-Tuning (SFT)** using a curated multi-source dataset strategy to ensure high-quality Hebrew output without compromising logical reasoning:
* **Hebrew Instruction Set (70%):** Extensive Alpaca-formatted datasets translated and corrected for Hebrew grammar.
* **Hebrew Contextual Knowledge (20%):** Fact-based data from Hebrew wikis and structured Q&A.
* **Logic Preservation (10%):** High-quality English instructional data to maintain cross-lingual reasoning and mathematical stability.

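
The 70/20/10 split above can be sketched as a per-example sampling schedule. This is an illustrative sketch only; the stream names below are hypothetical placeholders, not the actual dataset identifiers:

```python
import random

# Hypothetical sketch of the 70/20/10 SFT data mixture described above.
STREAMS = {
    "hebrew_instructions": 0.70,  # Alpaca-formatted Hebrew instruction data
    "hebrew_knowledge": 0.20,     # Hebrew wikis and structured Q&A
    "english_logic": 0.10,        # English data for reasoning preservation
}

def sample_stream(rng: random.Random) -> str:
    """Choose which stream the next training example is drawn from."""
    names, weights = zip(*STREAMS.items())
    return rng.choices(names, weights=weights, k=1)[0]

rng = random.Random(42)
counts = {name: 0 for name in STREAMS}
for _ in range(10_000):
    counts[sample_stream(rng)] += 1
# Empirical frequencies land close to the target 70/20/10 mixture.
```
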
---

## 📈 Performance & Monitoring

During the development phase, the model was monitored via detailed telemetry to ensure stable convergence. Key metrics tracked included:
- **Gradient Norm Stability:** Monitored to prevent exploding gradients in RTL text generation.
- **VRAM Optimization:** Efficiently managed to maximize batch size and learning stability.
- **Loss Decay:** Consistent downward trend in cross-entropy loss across all three data streams.

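
The gradient-norm telemetry above boils down to tracking one scalar, the global L2 norm across all parameter gradients, and rescaling when it spikes. A minimal dependency-free sketch of that idea (real training loops would use `torch.nn.utils.clip_grad_norm_` instead):

```python
import math

def global_grad_norm(grads):
    """Global L2 norm across all parameter gradients (flattened)."""
    return math.sqrt(sum(g * g for tensor in grads for g in tensor))

def clip_by_global_norm(grads, max_norm):
    """Return gradients rescaled so the global norm never exceeds max_norm."""
    norm = global_grad_norm(grads)
    if norm <= max_norm:
        return grads
    scale = max_norm / norm
    return [[g * scale for g in tensor] for tensor in grads]

grads = [[3.0, 4.0], [0.0, 12.0]]   # two toy "parameter tensors"
print(global_grad_norm(grads))      # 13.0
clipped = clip_by_global_norm(grads, max_norm=1.0)
print(global_grad_norm(clipped))    # ~1.0 after clipping
```
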
---

## 🚀 Quick Start Guide

### Installation

```bash
pip install transformers torch accelerate
```

### Basic Usage (Python)

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "XythicK/Hebrew-GPT"

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Standard Llama-3.2 chat template
messages = [
    # "You are a smart and professional assistant in Hebrew."
    {"role": "system", "content": "אתה עוזר חכם ומקצועי בעברית."},
    # "Write me a short challah recipe for Shabbat."
    {"role": "user", "content": "כתוב לי מתכון קצר לחלה לשבת."},
]

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

### ⚖️ Ethics and Limitations

While Hebrew-GPT is highly capable for its size, users should note:

- **Hallucination:** Like all LLMs, it can generate incorrect facts. Verify critical information.
- **Bias:** The model reflects the biases present in its training data.
- **Parameter Constraints:** As a 1B model, it may struggle with highly technical academic subjects compared to 70B+ models.