---
library_name: transformers
tags:
- duchifat
- agent
- chemistry
- biology
- art
- medical
- climate
- text-generation-inference
- finance
- music
- legal
- PyTorch
- fine-tuned
- instruct
license: apache-2.0
language:
- he
- en
base_model:
- Raziel1234/Duchifat-2
pipeline_tag: text-generation
---

# Duchifat-2.1-Instruct: Technical Model Card & Documentation

## 1. Executive Summary

Duchifat-2.1-Instruct is a specialized Small Language Model (SLM) developed by razielAI at TopAI. The project aims to combine the efficiency of a compact model with dense reasoning in bilingual (Hebrew/English) settings.

The model is a full-parameter fine-tune (FPFT) of Duchifat-2, designed to serve as a baseline for instruction following, technical scripting, and brand-aligned communication.

---

## 2. Model Architecture & Training Philosophy

* **Core Architecture:** Optimized decoder-only Transformer.
* **Parameter Count:** ~136M (ultra-compact).
* **Fine-Tuning Method:** Supervised Fine-Tuning (SFT) focused on identity injection and logic consistency.
* **Objective:** To provide a low-latency "reasoning engine" that runs on consumer-grade hardware without compromising technical accuracy in English.

---

## 3. Targeted Competencies

### A. Technical Task Execution (English)

The model is optimized for software engineering workflows, including:

* **Modern Web Dev:** Scaffolding React applications with Vite and TypeScript.
* **Python Automation:** System monitoring, data processing, and asynchronous programming.
* **Logic Flow:** Structured step-by-step problem solving for algorithmic queries.

### B. Hebrew Identity & Alignment

Duchifat-2.1-Instruct is trained to represent the TopAI professional persona. It maintains a consistent "Senior Consultant" tone in Hebrew, making it suitable for internal automation and customer-facing interfaces.

### C. RAG (Retrieval-Augmented Generation) Compatibility

The model's training emphasized "Faithfulness to Prompt," a critical requirement for RAG systems. The model is designed to act as a synthesizer of external knowledge bases.
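
As a rough illustration of the RAG use case, retrieved passages could be packed into the instruction prompt as sketched below. The `<instruction>`/`<assistant>` tags come from the usage example in Section 4; the `build_rag_prompt` helper, the `Context`/`Question` labels, and the passage numbering are assumptions made for illustration, since this card does not document how retrieved context should be laid out.

```python
from typing import Sequence

# Tag layout taken from the usage example in this card.
PROMPT_TEMPLATE = "<instruction> {body} </instruction>\n<assistant> "


def build_rag_prompt(question: str, passages: Sequence[str], max_passages: int = 3) -> str:
    """Pack retrieved passages and a question into one instruction prompt.

    Hypothetical helper: the "Context"/"Question" layout is an assumption,
    not a format documented for Duchifat-2.1-Instruct.
    """
    # Drop empty passages and keep at most `max_passages` of them.
    selected = [p.strip() for p in passages if p.strip()][:max_passages]
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(selected, start=1))
    body = (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question.strip()}"
    )
    return PROMPT_TEMPLATE.format(body=body)


if __name__ == "__main__":
    print(build_rag_prompt(
        "Which license applies to the model?",
        ["The model is released under Apache-2.0.", "It targets Hebrew and English."],
    ))
```

Capping `max_passages` keeps the prompt short, which is likely to matter for a model of this size; the right cap would need to be found empirically.
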

---

## 4. Implementation Guide

### Installation

```bash
pip install transformers torch accelerate
```

### Usage Pattern

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# 1. Model settings from the Hub
model_id = "razielAI/Duchifat-2.1-Instruct"
device = "cuda" if torch.cuda.is_available() else "cpu"

print(f"🔄 Loading {model_id} from the Hugging Face Hub...")

# 2. Load the tokenizer and model (trust_remote_code is needed because the model ships custom code)
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
).to(device)  # move explicitly instead of mixing device_map="auto" with .to()
model.eval()


def run_duchifat_inference(user_prompt):
    # The exact prompt format the model saw during training
    full_prompt = f"<instruction> {user_prompt} </instruction>\n<assistant> "

    # Tokenize the prompt
    inputs = tokenizer(full_prompt, return_tensors="pt", add_special_tokens=False).to(device)

    with torch.no_grad():
        outputs = model.generate(
            input_ids=inputs["input_ids"],
            attention_mask=inputs["attention_mask"],
            max_new_tokens=20,       # raise this for longer, more detailed code answers
            do_sample=False,         # deterministic (greedy) decoding for technical accuracy
            repetition_penalty=4.5,  # strong penalty to curb repetition in a small model
            eos_token_id=tokenizer.eos_token_id,
            pad_token_id=tokenizer.eos_token_id,
        )

    # Decode the full sequence (prompt + completion), keeping the tags
    full_text = tokenizer.decode(outputs[0], skip_special_tokens=False)

    # Extract the answer from between the tags
    if "<assistant>" in full_text:
        # Take what follows the last <assistant> tag, then drop the closing tag and EOS marker
        response = full_text.split("<assistant>")[-1]
        response = response.replace("</assistant>", "").replace("<eos>", "").strip()
    else:
        response = full_text.strip()

    return response


# --- Interactive chat loop ---
print("\n" + "=" * 60)
print("🚀 Duchifat-2.1-Instruct: Cloud Mode (Loaded from HF Hub)")
print("Identity: TopAI | Language: Hebrew & English | Ready for instructions.")
print("Type 'exit' or 'יציאה' to quit.")
print("=" * 60)

while True:
    try:
        user_input = input("\n👤 You: ")
        if user_input.strip().lower() in ["exit", "quit", "יציאה"]:
            print("\n🤖 Duchifat-2.1: Closing session. Standby for the next mission... 👋")
            break

        if not user_input.strip():
            continue

        # Run inference
        response = run_duchifat_inference(user_input)

        # Print the answer
        print(f"\n🤖 Duchifat-2.1: {response}")

    except (KeyboardInterrupt, EOFError):
        # Ctrl+C or end of input ends the session
        break
    except Exception as e:
        print(f"\n❌ Runtime Error: {e}")

print("\n" + "=" * 60)
```
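
The prompt formatting and answer extraction in the script above can be separated into pure functions, which makes them testable without loading the model. The sketch below is one possible variation, not the author's code: `format_prompt` and `extract_response` are hypothetical names, and it cuts the text at the first end marker rather than only deleting the marker. Whether `<eos>` is the tokenizer's actual end-of-sequence string is an assumption carried over from the original script.

```python
INSTRUCTION_TEMPLATE = "<instruction> {prompt} </instruction>\n<assistant> "


def format_prompt(user_prompt: str) -> str:
    """Wrap a user message in the tag format shown in the usage example."""
    return INSTRUCTION_TEMPLATE.format(prompt=user_prompt)


def extract_response(full_text: str, end_markers: tuple = ("</assistant>", "<eos>")) -> str:
    """Return the text after the last <assistant> tag, cut at the first end marker."""
    if "<assistant>" not in full_text:
        return full_text.strip()
    response = full_text.split("<assistant>")[-1]
    # Cut at the earliest end marker so anything generated after it is dropped too.
    cut = min(
        (response.find(m) for m in end_markers if m in response),
        default=len(response),
    )
    return response[:cut].strip()


if __name__ == "__main__":
    decoded = format_prompt("Say hello") + "Hello!</assistant><eos>"
    print(extract_response(decoded))
```
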

---

## 5. Performance Evaluation (TBD)

*Note: Formal benchmarking (e.g., MMLU, HumanEval) of this fine-tuned version is in progress.*

### Key Evaluation Areas:

* **Code Reliability:** Accuracy of generated syntax in Python/JS.
* **Instruction Adherence:** Success rate in following complex multi-step prompts.
* **Brand Consistency:** Alignment with the TopAI persona over long multi-turn conversations.
* **Latency:** Tokens-per-second measurements across various hardware (CPU/GPU).
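
Two of these areas lend themselves to simple automated checks. The sketch below is an assumption about how they might be measured, not the project's evaluation harness: `is_valid_python` uses parseability as a crude proxy for code reliability, and `tokens_per_second` times a hypothetical `generate` callable that returns the newly generated token IDs.

```python
import ast
import time
from typing import Callable, List


def is_valid_python(source: str) -> bool:
    """Code-reliability proxy: does the generated text parse as Python?"""
    try:
        ast.parse(source)
        return True
    except (SyntaxError, ValueError):
        return False


def tokens_per_second(generate: Callable[[str], List[int]], prompt: str, runs: int = 3) -> float:
    """Average number of generated tokens per wall-clock second over `runs` calls.

    `generate` is a hypothetical callable returning the new token IDs for a prompt.
    """
    total_tokens = 0
    start = time.perf_counter()
    for _ in range(runs):
        total_tokens += len(generate(prompt))
    elapsed = time.perf_counter() - start
    return total_tokens / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    def fake_generate(prompt: str) -> List[int]:
        time.sleep(0.01)  # stand-in for real model latency
        return list(range(20))

    print(is_valid_python("def f(x):\n    return x + 1"))
    print(f"{tokens_per_second(fake_generate, 'hello'):.1f} tokens/s")
```

Parsing only confirms syntax; it says nothing about whether the generated code is correct, so it complements rather than replaces benchmarks such as HumanEval.
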

---

## 6. Deployment & Quantization

Duchifat-2.1-Instruct's compact size makes it a prime candidate for:

* **Edge Computing:** Deployment on mobile devices or IoT gateways.
* **Private Cloud:** Secure, on-premise inference with minimal VRAM requirements.
* **Scalability:** High-throughput processing for microservices.
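
To put the memory requirement in rough numbers, the arithmetic below estimates the size of the weights alone at several precisions, using the ~136M parameter count from Section 2. It is a back-of-the-envelope sketch: it ignores activations, the KV cache, and framework overhead, and the `int8`/`int4` rows are hypothetical, since this card does not state that quantized checkpoints exist.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4.0, "bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_mib(num_params: int, dtype: str) -> float:
    """Approximate size of the model weights alone, in MiB."""
    return num_params * BYTES_PER_PARAM[dtype] / (1024 ** 2)


if __name__ == "__main__":
    num_params = 136_000_000  # "~136M" from the architecture section
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype:>5}: {weight_memory_mib(num_params, dtype):7.1f} MiB")
```

At bf16, the precision used in the usage example, this comes to roughly 260 MiB of weights.
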

---

## 7. Ethical Considerations & Constraints

* **SLM Scope:** As an SLM, the model is better suited to specific, well-scoped instructions than to open-ended creative writing.
* **Bilingual Nuance:** Although the model is capable in both languages, users should validate complex Hebrew grammar in high-stakes formal documentation.
* **Safety:** Standard LLM guardrails apply; in production environments, the model should be used together with input/output filtering.
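
One way to picture the input/output filtering mentioned above is a thin wrapper around the generation call. This is a minimal sketch under stated assumptions: `guarded_generate`, `compile_blocklist`, and the refusal message are hypothetical, the blocklist terms are placeholders, and a production system would normally rely on a dedicated moderation component rather than keyword matching.

```python
import re
from typing import Callable, Iterable

REFUSAL = "Request blocked by content filter."


def compile_blocklist(terms: Iterable[str]) -> re.Pattern:
    """Build one case-insensitive regex from plain-text terms (placeholder list)."""
    escaped = [re.escape(t) for t in terms if t]
    if not escaped:
        # An empty pattern would match every string, so refuse to build one.
        raise ValueError("blocklist must contain at least one term")
    return re.compile("|".join(escaped), re.IGNORECASE)


def guarded_generate(generate: Callable[[str], str], prompt: str, blocklist: re.Pattern) -> str:
    """Call `generate` only if the prompt passes the filter, then filter the output too."""
    if blocklist.search(prompt):
        return REFUSAL
    output = generate(prompt)
    if blocklist.search(output):
        return REFUSAL
    return output


if __name__ == "__main__":
    blocklist = compile_blocklist(["password", "secret"])
    echo = lambda p: f"You said: {p}"  # stand-in for the model call
    print(guarded_generate(echo, "hello", blocklist))
    print(guarded_generate(echo, "what is the admin password", blocklist))
```
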

---

## 8. About TopAI

TopAI is an AI research and development hub focused on practical, efficient, and aligned AI solutions.

* **Lead Developer:** Raziel
* **Organization:** TopAI
* **Status:** Version 2.1.0-Instruct (Active Development)