---
library_name: transformers
tags:
- duchifat
- agent
- chemistry
- biology
- art
- medical
- climate
- text-generation-inference
- finance
- music
- legal
- PyTorch
- fine-tuned
- instruct
license: apache-2.0
language:
- he
- en
base_model:
- Raziel1234/Duchifat-2
pipeline_tag: text-generation
---
# Duchifat-2.1-Instruct: Technical Model Card & Documentation
## 1. Executive Summary
Duchifat-2.1-Instruct is a specialized Small Language Model (SLM) developed by razielAI at TopAI. The project aims to bridge the gap between compact model efficiency and high-density reasoning in bilingual (Hebrew/English) environments.
This model is a full-parameter fine-tune (FPFT) of the Duchifat-2 base model, designed to serve as a baseline for instruction-following tasks, technical scripting, and brand-aligned communication.
---
## 2. Model Architecture & Training Philosophy
* **Core Architecture:** Optimized Transformer Decoder-only.
* **Parameter Count:** ~136M (Ultra-compact).
* **Fine-Tuning Method:** Supervised Fine-Tuning (SFT) focusing on Identity Injection and Logic Consistency.
* **Objective:** To provide a low-latency "Reasoning Engine" that can run on consumer-grade hardware without compromising on technical accuracy in English.
---
## 3. Targeted Competencies
### A. Technical Task Execution (English)
The model is optimized for software engineering workflows, including:
* **Modern Web Dev:** Scaffolding React applications with Vite and TypeScript.
* **Python Automation:** System monitoring, data processing, and asynchronous programming.
* **Logic Flow:** Structured step-by-step problem solving for algorithmic queries.
### B. Hebrew Identity & Alignment
Duchifat-2.1-Instruct is trained to represent the TopAI professional persona. It maintains a consistent "Senior Consultant" tone in Hebrew, making it suitable for internal automation and customer-facing interfaces.
### C. RAG (Retrieval-Augmented Generation) Compatibility
The model's training emphasized "Faithfulness to Prompt," which is a critical requirement for RAG systems. It is designed to act as a synthesizer of external knowledge bases.
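As a rough illustration of how retrieved passages might be fed to the model, the sketch below packs them into the `<instruction> ... </instruction>` / `<assistant>` tag format used in the usage example of the Implementation Guide. The function name `build_rag_prompt`, the `max_chars` budget, the numbered-passage layout, and the wording of the instruction are all assumptions made for illustration; the model card does not specify a RAG prompt layout.

```python
def build_rag_prompt(question: str, passages: list[str], max_chars: int = 2000) -> str:
    """Pack retrieved passages and a question into the instruction template.

    Hypothetical helper: the layout of the context is an assumption,
    only the <instruction>/<assistant> tags come from the model card.
    """
    context_lines = []
    used = 0
    for i, passage in enumerate(passages, start=1):
        entry = f"[{i}] {passage.strip()}"
        if used + len(entry) > max_chars:
            break  # stop before exceeding the character budget
        context_lines.append(entry)
        used += len(entry)

    context = "\n".join(context_lines)
    instruction = (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question.strip()}"
    )
    return f"<instruction> {instruction} </instruction>\n<assistant> "


if __name__ == "__main__":
    print(build_rag_prompt("What is a hoopoe?", ["The hoopoe is a bird.", "It has a crest."]))
```

A character budget is used here only because it needs no tokenizer; with a compact model and a short context window, counting tokens with the model's own tokenizer would likely be a safer limit.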
---
## 4. Implementation Guide
### Installation
```bash
pip install transformers torch accelerate
```
### Usage Pattern
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# 1. Model settings from the Hub
model_id = "razielAI/Duchifat-2.1-Instruct"
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"🔄 Loading {model_id} from the Hugging Face Hub...")

# 2. Load the tokenizer and model (trust_remote_code is needed because this is a custom model)
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
)
# Move the model explicitly. Combining device_map="auto" with .to(device)
# is redundant and can fail on accelerate-dispatched models.
model.to(device)
model.eval()


def run_duchifat_inference(user_prompt):
    # The exact prompt format the model saw during training
    full_prompt = f"<instruction> {user_prompt} </instruction>\n<assistant> "

    # Prepare the inputs
    inputs = tokenizer(full_prompt, return_tensors="pt", add_special_tokens=False).to(device)

    with torch.no_grad():
        outputs = model.generate(
            input_ids=inputs["input_ids"],
            attention_mask=inputs["attention_mask"],
            max_new_tokens=20,       # Raise this for longer, more detailed code answers
            do_sample=False,         # Deterministic (greedy) decoding for technical accuracy
            repetition_penalty=4.5,  # Discourage repetition, which small models are prone to
            eos_token_id=tokenizer.eos_token_id,
            pad_token_id=tokenizer.eos_token_id,
        )

    # Decode the full text (prompt + completion)
    full_text = tokenizer.decode(outputs[0], skip_special_tokens=False)

    # Extract the answer from between the tags
    if "<assistant>" in full_text:
        # Keep what follows <assistant>, then drop the closing tag and the EOS marker
        response = full_text.split("<assistant>")[-1]
        response = response.replace("</assistant>", "").replace("<eos>", "").strip()
    else:
        response = full_text.strip()
    return response


# --- Interactive chat loop ---
print("\n" + "=" * 60)
print("🚀 Duchifat-2.1-Instruct: Cloud Mode (Loaded from HF Hub)")
print("Identity: TopAI | Language: Hebrew & English | Ready for instructions.")
print("Type 'exit' or 'יציאה' to quit.")
print("=" * 60)

while True:
    try:
        user_input = input("\n👤 You: ")
        if user_input.lower() in ["exit", "quit", "יציאה"]:
            print("\n🤖 Duchifat-2.1: Closing session. Standby for the next mission... 👋")
            break
        if not user_input.strip():
            continue

        # Run inference
        response = run_duchifat_inference(user_input)

        # Print the answer
        print(f"\n🤖 Duchifat-2.1: {response}")
    except KeyboardInterrupt:
        break
    except Exception as e:
        print(f"\n❌ Runtime Error: {e}")

print("\n" + "=" * 60)
```
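The prompt template and the response-extraction step from the usage pattern can also be written as pure string functions, which makes them testable without downloading the model. The sketch below is one possible variant, not the author's implementation: the names `format_prompt`, `extract_response`, and `STOP_MARKERS` are hypothetical, and instead of deleting the `</assistant>` and `<eos>` markers it truncates the text at the first one, on the assumption that anything generated after a stop marker is unwanted.

```python
# Template and markers taken from the usage example; helper names are hypothetical.
INSTRUCTION_TEMPLATE = "<instruction> {prompt} </instruction>\n<assistant> "
STOP_MARKERS = ("</assistant>", "<eos>")


def format_prompt(user_prompt: str) -> str:
    """Wrap a user prompt in the tag format the model was trained on."""
    return INSTRUCTION_TEMPLATE.format(prompt=user_prompt)


def extract_response(full_text: str) -> str:
    """Return the assistant's answer from decoded model output."""
    # Keep only what follows the last <assistant> tag, if there is one.
    _, found, tail = full_text.rpartition("<assistant>")
    response = tail if found else full_text

    # Truncate at the earliest stop marker (assumption: text after it is noise).
    cut = len(response)
    for marker in STOP_MARKERS:
        idx = response.find(marker)
        if idx != -1:
            cut = min(cut, idx)
    return response[:cut].strip()


if __name__ == "__main__":
    decoded = format_prompt("Say hello") + "Hello!</assistant><eos>"
    print(extract_response(decoded))
```

Truncating rather than replacing means a runaway generation that continues past the closing tag is not glued back onto the answer.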
---
## 5. Performance Evaluation (TBD)
*Note: Formal benchmarking and metric evaluation (e.g., MMLU, HumanEval) for this specific fine-tuned version are currently in progress.*
### Key Evaluation Areas:
* **Code Reliability:** Accuracy of generated syntax in Python/JS.
* **Instruction Adherence:** Success rate in following complex multi-step prompts.
* **Brand Consistency:** Alignment with the TopAI persona over long-turn conversations.
* **Latency:** Tokens-per-second measurement across various hardware (CPU/GPU).
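Since formal benchmarks are still pending, the latency item above could be measured along the following lines. This is a minimal sketch under stated assumptions: `measure_tokens_per_second` is a hypothetical helper, and `generate` stands for any callable that takes a prompt and returns the sequence of newly generated token IDs (for example, a thin wrapper around `model.generate` that strips the prompt tokens).

```python
import time
from typing import Callable, Sequence


def measure_tokens_per_second(
    generate: Callable[[str], Sequence[int]],
    prompts: Sequence[str],
    warmup: int = 1,
) -> float:
    """Return generated tokens per second over all prompts (hypothetical helper)."""
    # Warm-up calls are excluded from timing (lazy initialization, caches).
    for prompt in prompts[:warmup]:
        generate(prompt)

    total_tokens = 0
    start = time.perf_counter()
    for prompt in prompts:
        total_tokens += len(generate(prompt))
    elapsed = time.perf_counter() - start
    return total_tokens / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    # Stand-in generator so the sketch runs without a model.
    def fake_generate(prompt: str) -> list[int]:
        time.sleep(0.01)
        return list(range(10))

    print(f"{measure_tokens_per_second(fake_generate, ['a', 'b', 'c']):.1f} tokens/s")
```

On a GPU, the wrapper passed as `generate` should wait for the device to finish (for example with `torch.cuda.synchronize()`) before returning, otherwise the wall-clock time may understate the real cost.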
---
## 6. Deployment & Quantization
Duchifat-2.1-Instruct's compact size makes it a prime candidate for:
* **Edge Computing:** Deployment on mobile devices or IoT gateways.
* **Private Cloud:** Secure, on-premise inference with minimal VRAM requirements.
* **Scalability:** High-throughput processing for microservices.
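To put the "minimal VRAM" claim in numbers, the weight footprint can be estimated from the approximate parameter count of 136M stated above. The sketch below covers weights only; activations, the KV cache, and any per-group quantization metadata are excluded, and the `int8` and `int4` rows are purely arithmetic, since the model card does not state which quantization methods have been tested. The helper name `weight_footprint_mb` is hypothetical.

```python
# Bytes needed to store one parameter at each precision (weights only).
BYTES_PER_PARAM = {"float32": 4.0, "bfloat16": 2.0, "int8": 1.0, "int4": 0.5}

NUM_PARAMS = 136_000_000  # approximate count from the model card (~136M)


def weight_footprint_mb(num_params: int, dtype: str) -> float:
    """Estimate weight storage in megabytes (1 MB = 1e6 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e6


if __name__ == "__main__":
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype:>8}: ~{weight_footprint_mb(NUM_PARAMS, dtype):.0f} MB")
```

In `bfloat16`, the precision used in the usage example, this works out to roughly 272 MB of weights, which is consistent with running on consumer-grade hardware.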
---
## 7. Ethical Considerations & Constraints
* **SLM Scope:** Users should note that as an SLM, the model excels at specific instructions rather than open-ended creative writing.
* **Bilingual Nuance:** While the model is highly capable, users are encouraged to validate complex Hebrew grammar in high-stakes formal documentation.
* **Safety:** Standard LLM guardrails apply; the model should be used in conjunction with input/output filtering for production environments.
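The input/output filtering mentioned above can be pictured as a thin wrapper around the inference call. The sketch below only shows the shape of such a wrapper: `guarded_generate`, `BLOCKED_PATTERNS`, the length limit, and the single example pattern are all placeholders chosen for illustration, not a recommended policy, and a production system would normally rely on a dedicated moderation component.

```python
import re
from typing import Callable

# Placeholder pattern (digits shaped like 123-45-6789); replace with a real policy.
BLOCKED_PATTERNS = [re.compile(r"\b\d{3}-\d{2}-\d{4}\b")]


def guarded_generate(
    generate: Callable[[str], str],
    user_prompt: str,
    max_prompt_chars: int = 4000,
) -> str:
    """Apply simple input and output checks around an inference callable."""
    # Input filtering: reject oversized or policy-violating prompts.
    if len(user_prompt) > max_prompt_chars:
        return "Request rejected: prompt too long."
    if any(p.search(user_prompt) for p in BLOCKED_PATTERNS):
        return "Request rejected: prompt contains blocked content."

    # Output filtering: redact matches in the model's response.
    response = generate(user_prompt)
    for pattern in BLOCKED_PATTERNS:
        response = pattern.sub("[REDACTED]", response)
    return response


if __name__ == "__main__":
    echo = lambda prompt: f"echo: {prompt}"
    print(guarded_generate(echo, "hello"))
```

Here `generate` can be any function that maps a prompt string to a response string, such as `run_duchifat_inference` from the usage pattern.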
---
## 8. About TopAI
TopAI is an AI research and development hub focused on practical, efficient, and aligned AI solutions.
**Lead Developer:** Raziel
**Organization:** TopAI
**Status:** Version 2.1.0-Instruct (Active Development)