---
library_name: transformers
tags:
- chemistry
- biology
- finance
- legal
- music
- code
- art
- climate
- medical
- agent
- text-generation-inference
- duchifat-2
- hebrew
- AI
- conversational
- chatty
license: apache-2.0
language:
- he
- en
base_model:
- Raziel1234/Duchifat-2
pipeline_tag: text-generation
---
# 🕊️ Duchifat-2.3-Instruct: The Paradigm Shift in Hebrew AI
**Duchifat-2.3-Instruct** is an instruction-tuned large language model developed by **TopAI**. As the flagship of the Duchifat series, it is built to change how Hebrew is processed, reasoned about, and generated in the LLM era.
## 💎 The "Language-Native" Architecture
The core innovation of **Duchifat-2.3** lies in its **Language-Native Reasoning** engine. While most models suffer from a "Translation Gap" (reasoning in English and then translating into Hebrew), Duchifat-2.3 was architected to bridge this divide.
### 🧠 Native Cognitive Processing
By optimizing the model's internal weights and tokenizer for Hebrew-specific structures, we have achieved a system that:
- **Internalizes Hebrew Logic:** The model's "Chain of Thought" is executed natively in Hebrew, preserving the unique semantic and syntactic nuances of the language.
- **Eliminates Syntactic Artifacts:** Unlike translated models, Duchifat-2.3 produces text that flows naturally, avoiding the stiff and robotic feel of English-to-Hebrew conversion.
- **Improves Token Efficiency:** The Hebrew-optimized tokenizer represents Hebrew text more densely, so the same passage takes fewer tokens, which speeds up inference and fits more content into the context window.
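
One rough way to check the token-efficiency claim is to measure how many tokens a tokenizer emits per whitespace-separated word. The sketch below is only an illustration: `tokens_per_word` and `char_tokenizer` are hypothetical names, and the character-level tokenizer is a stand-in so the snippet runs without downloading anything. To measure the real model, one could pass the `tokenize` method of a tokenizer loaded with `AutoTokenizer.from_pretrained` instead.

```python
from typing import Callable, List


def tokens_per_word(tokenize: Callable[[str], List[str]], text: str) -> float:
    """Average number of tokens produced per whitespace-separated word."""
    words = text.split()
    if not words:
        raise ValueError("text must contain at least one word")
    return len(tokenize(text)) / len(words)


def char_tokenizer(text: str) -> List[str]:
    """Stand-in tokenizer (an assumption for illustration): one token per non-space character."""
    return [ch for ch in text if not ch.isspace()]


sample = "שלום עולם"  # "Hello world" in Hebrew
ratio = tokens_per_word(char_tokenizer, sample)
print(f"{ratio:.2f} tokens per word")  # lower is denser
```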
---
## 🚀 Advanced Instruction Tuning & Alignment
Duchifat-2.3-Instruct has undergone a sophisticated Supervised Fine-Tuning (SFT) process designed to transform a raw base model into a highly capable, mission-aligned assistant.
### 🛡️ Ethical Generalization & Safety
One of the model's most impressive feats is its ability to generalize safety protocols. It doesn't just rely on a static list of blocked words; it understands the **intent and context** of human interaction.
- **Zero-Shot Moderation:** The model can identify and appropriately handle offensive content, slurs, and harmful prompts it has never encountered during training.
- **Value-Locked Alignment:** The "TopAI" safety standards are deeply embedded, ensuring the model remains helpful, harmless, and honest across all domains.
### 🤖 Multi-Domain Mastery
The model is tuned to excel in diverse environments:
- **Technical & Scientific Research:** Deep understanding of AI architecture, software development, and complex data analysis.
- **Creative & Cultural Context:** Native fluency in Israeli idioms, professional drafting, and nuanced storytelling.
- **Logical Reasoning:** High performance in solving complex puzzles and following multi-stage instructions.
---
## 🎨 The Duchifat Persona: A Digital Partner
We believe that interaction is as important as information. Duchifat-2.3-Instruct carries a unique, refined persona:
- **Quirky & Engaging:** It balances professional rigor with an approachable, brand-aligned voice.
- **Adaptive Tone:** Seamlessly shifts between formal technical documentation and casual, helpful conversation.
- **Identity-Aware:** The model "knows" who it is and remains consistent in its role as a specialized AI assistant.
---
## 🏗️ Technical Specifications
- **Developer:** TopAI
- **Architecture:** Causal Decoder-Only Transformer.
- **Primary Objective:** Hebrew-Native Instruction Following.
- **Secondary Capability:** Full English Fluency and Cross-Lingual reasoning.
- **Optimization:** Tuned for high-precision inference with minimal catastrophic forgetting.
---
## 📊 Benchmark Results
The following evaluation was performed using `lm-evaluation-harness` (0-shot) to assess the model's core reasoning and common-sense capabilities.
| Task | Metric | Value | Significance |
| :--- | :--- | :--- | :--- |
| **PIQA** | Accuracy | **53.65%** | Slightly above chance (50%) |
| **WinoGrande** | Accuracy | **52.25%** | Slightly above chance (50%) |
| **ARC-Easy** | Accuracy (Norm) | **27.86%** | Near chance (~25%) |
| **HellaSwag** | Accuracy | **25.94%** | Near chance (25%) |
**Analysis:**
Duchifat-2.3-Instruct scores highest on the binary-choice tasks (**PIQA** and **WinoGrande**), landing a few points above the 50% chance level. On the four-way multiple-choice benchmarks (**ARC-Easy** and **HellaSwag**) it stays close to the 25% chance level. We attribute this to a trade-off that can accompany aggressive fine-tuning for conversational alignment and Hebrew-native reasoning.
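
To make "above random guessing" concrete, the short sketch below subtracts the chance level from each reported accuracy. The accuracies are copied from the table above. The number of answer options per task is an assumption based on the standard task definitions: two for PIQA and WinoGrande, four for HellaSwag, and four for ARC-Easy, where a small share of questions have a different number of options. `margin_over_chance` is a hypothetical helper name.

```python
# Reported 0-shot accuracies (percent), copied from the benchmark table.
reported = {"PIQA": 53.65, "WinoGrande": 52.25, "ARC-Easy": 27.86, "HellaSwag": 25.94}

# Assumed number of answer options per item (ARC-Easy is approximated as 4).
num_choices = {"PIQA": 2, "WinoGrande": 2, "ARC-Easy": 4, "HellaSwag": 4}


def margin_over_chance(accuracy_pct: float, choices: int) -> float:
    """Percentage points by which an accuracy exceeds uniform random guessing."""
    return accuracy_pct - 100.0 / choices


margins = {
    task: round(margin_over_chance(acc, num_choices[task]), 2)
    for task, acc in reported.items()
}

for task, margin in margins.items():
    print(f"{task}: {margin:+.2f} points vs. chance")
```

Comparing margins rather than raw accuracies puts the two-option and four-option tasks on the same footing.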
## Use
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Settings: load from the Hub
MODEL_ID = "razielAI/Duchifat-2.3-Instruct"
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    trust_remote_code=True,
    torch_dtype=torch.bfloat16 if torch.cuda.is_available() else torch.float32,
).to(device)


def chat():
    print("✨ Duchifat-2 Online (TopAI) | Type 'exit' to quit")
    while True:
        user_input = input("\n👤 User: ")
        # "יציאה" is Hebrew for "exit"
        if user_input.lower() in ["exit", "quit", "יציאה"]:
            break

        # Build the prompt with the special tokens
        prompt = f"<|instruction|>\n{user_input}\n<|assistant|>\n"
        inputs = tokenizer(prompt, return_tensors="pt").to(device)

        # Generate
        with torch.no_grad():
            output_tokens = model.generate(
                **inputs,
                max_new_tokens=256,
                temperature=0.7,
                do_sample=True,
                pad_token_id=tokenizer.eos_token_id,
                eos_token_id=tokenizer.encode("<|eos|>", add_special_tokens=False)[0],
            )

        # Decode and show only the assistant's response
        decoded = tokenizer.decode(output_tokens[0], skip_special_tokens=False)
        response = decoded.split("<|assistant|>")[-1].replace("<|eos|>", "").strip()
        print(f"🤖 Duchifat-2: {response}")


if __name__ == "__main__":
    chat()
```
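
The prompt template and the response parsing used in the script above can be pulled out into two small helpers, which makes them easy to check without loading the model. This is a minimal sketch: `build_prompt` and `extract_response` are hypothetical names, and the tag strings are taken from the script as written rather than from a documented chat template.

```python
INSTRUCTION_TAG = "<|instruction|>"
ASSISTANT_TAG = "<|assistant|>"
EOS_TAG = "<|eos|>"


def build_prompt(user_input: str) -> str:
    """Wrap a single user turn in the instruction/assistant tags."""
    return f"{INSTRUCTION_TAG}\n{user_input}\n{ASSISTANT_TAG}\n"


def extract_response(decoded: str) -> str:
    """Keep only the text after the last assistant tag and drop the EOS marker."""
    return decoded.split(ASSISTANT_TAG)[-1].replace(EOS_TAG, "").strip()


# Simulated decoded output, standing in for tokenizer.decode(...) on a real generation.
prompt = build_prompt("What is 2 + 2?")
simulated_output = prompt + "4" + EOS_TAG
print(extract_response(simulated_output))
```

Splitting on the last assistant tag means the helper still returns only the final answer if the decoded text contains the prompt itself, which is the case when `generate` returns the input tokens followed by the completion.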
## 🌍 Impact and Mission
Duchifat-2.3-Instruct is more than a model; it is a statement on the future of specialized AI. By showing what a dedicated, language-native approach can offer beyond general-purpose "translation" models, **TopAI** aims to set a new standard for the Israeli and global tech ecosystem.
---
**Developed with technical excellence and linguistic precision by TopAI.**