XythicK committed on
Commit
5f1d64d
verified
1 Parent(s): 60945e0

Update README.md

Files changed (1): README.md (+113 −8)
README.md CHANGED
@@ -1,10 +1,115 @@
- ---
- base_model: unsloth/llama-3.2-1b-bnb-4bit
- tags:
- - text-generation-inference
- - transformers
- - llama
- license: apache-2.0
  language:
  - en
- ---
+ ---
  language:
+ - he
  - en
+ license: llama3.2
+ base_model: meta-llama/Llama-3.2-1B-Instruct
+ tags:
+ - llama-3.2
+ - hebrew
+ - instruction-tuned
+ - sft
+ - safetensors
+ - nlp
+ model_name: Hebrew-GPT
+ model_type: causal-lm
+ precision: bfloat16
+ ---
+
+ # Hebrew-GPT: Specialized 1B Hebrew Instruction Model 🇮🇱
+
+ **Hebrew-GPT** is an instruction-tuned Small Language Model (SLM) based on the **Llama-3.2-1B** architecture. It is designed to close the gap in Hebrew linguistic performance among low-parameter models, providing a compact yet capable solution for Hebrew natural language understanding and generation.
+
+ ---
+
+ ## 💎 Model Highlights
+
+ * **Linguistic Specialization:** Tuned for the Morphologically Rich Language (MRL) features of Hebrew, including prefix-suffix handling and correct right-to-left (RTL) context awareness.
+ * **16-bit Precision:** Unlike many quantized small models, this version ships **full merged BFloat16 weights**, avoiding the quality loss that quantization can introduce after fine-tuning.
+ * **Instruction Optimized:** Trained to follow complex prompts, summarize documents, and hold dialogue, rather than perform only basic text completion.
+ * **Efficiency:** At roughly 1 billion parameters, it is suited to edge deployment and fast inference on standard consumer hardware.
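To illustrate the prefix handling mentioned above: Hebrew attaches one-letter proclitics (and / to / in / the, etc.) directly to the word stem, so even word segmentation is non-trivial. The sketch below is a deliberately naive illustration of this property, not the model's actual tokenizer; real segmentation is context-dependent and ambiguous.

```python
# The standard Hebrew proclitic letters (vav, bet, lamed, he, mem, shin, kaf).
# Stripping them greedily is a naive heuristic, shown here only to
# illustrate why Hebrew is considered a Morphologically Rich Language.
PREFIXES = "ובלהמשכ"

def naive_segment(word: str) -> tuple[str, str]:
    """Split leading single-letter proclitics from a Hebrew word."""
    prefix = ""
    while len(word) > 3 and word[0] in PREFIXES:
        prefix += word[0]
        word = word[1:]
    return prefix, word

# "ולשבת" = "and for Shabbat": vav (and) + lamed (for) + stem "שבת" (Shabbat)
print(naive_segment("ולשבת"))  # → ("ול", "שבת")
```

A production tokenizer handles this statistically at the subword level; the point is that a single Hebrew surface token can bundle several grammatical functions.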
+
+ ---
+
+ ## 🛠️ Technical Specifications
+
+ ### Architecture
+ - **Base Architecture:** Llama 3.2
+ - **Parameters:** 1.23 billion
+ - **Context Length:** 128K tokens (native support)
+ - **Weight Format:** Safetensors (standalone)
+ - **Precision:** BFloat16 (BF16)
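The BF16 choice can be made concrete: bfloat16 keeps float32's sign bit and all 8 exponent bits but only 7 of the 23 mantissa bits, preserving dynamic range while coarsening precision. A small framework-free sketch of the conversion (round-to-nearest-even on the dropped bits):

```python
import struct

def to_bfloat16_value(x: float) -> float:
    """Round a float32 value to the nearest bfloat16, returned as a float."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    # Keep the top 16 bits (sign + 8 exponent + 7 mantissa bits),
    # rounding to nearest even on the 16 bits that are discarded.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    bf16_bits = (((bits + rounding_bias) >> 16) << 16) & 0xFFFFFFFF
    (y,) = struct.unpack("<f", struct.pack("<I", bf16_bits))
    return y

print(to_bfloat16_value(3.14159))  # → 3.140625 (~3 significant decimal digits)
```

This is why BF16 training is numerically stable (same exponent range as float32) while halving memory versus full precision.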
+
+ ### Training Methodology
+ The model underwent **Supervised Fine-Tuning (SFT)** on a curated multi-source dataset mix designed to yield high-quality Hebrew output without degrading logical reasoning:
+ * **Hebrew Instruction Set (70%):** Alpaca-formatted datasets translated into Hebrew and corrected for grammar.
+ * **Hebrew Contextual Knowledge (20%):** Fact-based data from Hebrew wikis and structured Q&A.
+ * **Logic Preservation (10%):** High-quality English instruction data to maintain cross-lingual reasoning and mathematical stability.
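The 70/20/10 mix above can be sketched as per-batch weighted sampling over the data sources. The source names below are placeholders for illustration, not the actual training corpora:

```python
import random

# Hypothetical labels standing in for the three data streams described above.
MIX = {
    "hebrew_instructions": 0.70,
    "hebrew_knowledge": 0.20,
    "english_logic": 0.10,
}

def sample_batch(rng: random.Random, batch_size: int) -> list[str]:
    """Draw a batch of source labels according to the mixture weights."""
    sources, weights = zip(*MIX.items())
    return rng.choices(sources, weights=weights, k=batch_size)

rng = random.Random(0)
batch = sample_batch(rng, 1000)
print({s: batch.count(s) / len(batch) for s in MIX})  # roughly 0.7 / 0.2 / 0.1
```

Frameworks such as `datasets.interleave_datasets` implement the same idea with a `probabilities` argument; the point is that every batch sees mostly Hebrew instructions with a steady trickle of English reasoning data.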
+
+ ---
+
+ ## 📈 Performance & Monitoring
+
+ During development the model was monitored with detailed telemetry to ensure stable convergence. Key metrics included:
+ - **Gradient Norm Stability:** Tracked to prevent exploding gradients during RTL text generation.
+ - **VRAM Optimization:** Managed to maximize batch size and learning stability.
+ - **Loss Decay:** A consistent downward trend in cross-entropy loss across all three data streams.
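Gradient-norm monitoring like this is typically paired with global norm clipping (what `torch.nn.utils.clip_grad_norm_` does in PyTorch). A minimal framework-free sketch of the rescaling rule, shown on plain lists rather than tensors:

```python
def clip_grad_norm(grads: list[float], max_norm: float) -> list[float]:
    """Rescale gradients so their global L2 norm does not exceed max_norm."""
    total_norm = sum(g * g for g in grads) ** 0.5
    if total_norm <= max_norm:
        return grads  # already within bounds; leave untouched
    scale = max_norm / (total_norm + 1e-6)  # epsilon guards against divide-by-zero
    return [g * scale for g in grads]

print(clip_grad_norm([3.0, 4.0], 1.0))  # norm 5.0 → rescaled to ~[0.6, 0.8]
```

Clipping caps the size of any single update step, which is what keeps the tracked gradient norm from spiking during training.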
+
+ ---
+
+ ## 🚀 Quick Start Guide
+
+ ### Installation
+ ```bash
+ pip install transformers torch accelerate
+ ```
+ ### Basic Usage (Python)
+ ```python
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_id = "XythicK/Hebrew-GPT"
+
+ # Load the tokenizer and the BF16 model weights
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ model = AutoModelForCausalLM.from_pretrained(
+     model_id,
+     torch_dtype=torch.bfloat16,
+     device_map="auto"
+ )
+
+ # Standard Llama-3.2 chat template
+ messages = [
+     {"role": "system", "content": "אתה עוזר חכם ומקצועי בעברית."},  # "You are a smart and professional assistant in Hebrew."
+     {"role": "user", "content": "כתוב לי מתכון קצר לחלה לשבת."},  # "Write me a short recipe for challah for Shabbat."
+ ]
+
+ input_ids = tokenizer.apply_chat_template(
+     messages,
+     add_generation_prompt=True,
+     return_tensors="pt"
+ ).to(model.device)
+
+ outputs = model.generate(
+     input_ids,
+     max_new_tokens=256,
+     do_sample=True,
+     temperature=0.7,
+     top_p=0.9,
+ )
+
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+ ```
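For reference, the prompt string that `apply_chat_template` builds follows the Llama 3 family header format. The hand-rolled sketch below is for clarity only; in real code always rely on the tokenizer's template rather than constructing the string manually.

```python
# Sketch of the Llama 3 family chat format (for understanding, not production).
def render_llama3_prompt(messages: list[dict]) -> str:
    out = "<|begin_of_text|>"
    for m in messages:
        # Each turn is wrapped in role headers and terminated by <|eot_id|>.
        out += (
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n"
            f"{m['content']}<|eot_id|>"
        )
    # add_generation_prompt=True appends an open assistant header,
    # so the model's continuation becomes the assistant turn.
    out += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return out

print(render_llama3_prompt([{"role": "user", "content": "שלום"}]))
```

Seeing the raw format makes it clear why `skip_special_tokens=True` is needed when decoding the generated output.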
+ ### ⚖️ Ethics and Limitations
+ While Hebrew-GPT is highly capable for its size, users should note:
+
+ * **Hallucination:** Like all LLMs, it can generate incorrect facts. Verify critical information.
+ * **Bias:** The model reflects biases present in its training data.
+ * **Parameter Constraints:** As a 1B model, it may struggle with highly technical academic subjects compared to 70B+ models.