Raziel1234 committed · Commit 16e0c9e · verified · 1 Parent(s): d2fc4ce

Update README.md

  1. README.md +136 -3
README.md CHANGED
@@ -1,3 +1,136 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ language:
+ - en
+ - he
+ base_model:
+ - Raziel1234/Duchifat-2
+ pipeline_tag: text-generation
+ library_name: transformers
+ tags:
+ - chemistry
+ - agent
+ - medical
+ - climate
+ - code
+ - art
+ - music
+ - legal
+ - finance
+ - biology
+ - text-generation-inference
+ - Pytorch
+ - causal_lm
+ ---
+
+ # 🚀 Duchifat-V2-Instruct (דוכיפת 2) | Official Model Card
+
+ ## 📝 Executive Summary
+ **Duchifat-V2-Instruct** is an instruction-tuned version of the 136M-parameter Duchifat-V2 base model. Developed by **TopAI**, it is optimized for creative content generation, bilingual (Hebrew/English) dialogue, and task-oriented text processing.
+
+ The base model was pre-trained on 3.27 billion tokens. The **Instruct** version adds targeted fine-tuning that turns it from a plain "text completer" into a **creative writer** that follows prompts in a distinctive, human-like voice.
+
+ ---
+
+ ## 🏗️ Technical Specifications
+ | Component | Specification | Description |
+ | :--- | :--- | :--- |
+ | **Parameters** | 136 Million | Small enough for edge deployment and real-time inference. |
+ | **Architecture** | Decoder-only Transformer | Causal (left-to-right) language model. |
+ | **Layers / Heads** | 12 / 12 | 12 transformer layers, 12 attention heads per layer. |
+ | **Context Window** | 1024 Tokens | Shared by the prompt and the generated text. |
+ | **Tokenizer** | DictaLM 2.0 | Sub-word tokenizer covering Hebrew and English. |
+ | **Training Phase** | After 5-epoch Instruct training | Tuned for instruction following and consistent EOS termination. |
+
+ ---
+
+ ## 🎨 Model Capabilities & "The Creative Writer"
+ Unlike many small models, **Duchifat-V2-Instruct** is tuned for a distinct creative voice. Its main strengths are:
+ * **Narrative Writing:** Crafting stories and monologues with emotional depth.
+ * **Instruction Following:** Responding to specific system prompts and user constraints.
+ * **Bilingual Versatility:** Switching between Hebrew and English according to the language of the prompt.
+ * **Marketing & Copywriting:** Generating slogans, blog posts, and creative ads.
+
+ > **Note:** Because it was pre-trained on the C4 corpus, the model retains broad general knowledge, which makes it better suited to creative collaboration than to purely technical agent tasks.
+
+ ---
+
+ ## 📊 Training Infrastructure
+ * **Dataset:** Curated **C4** (3.27B tokens), 50% Hebrew and 50% English.
+ * **Fine-Tuning:** Instruction tuning on high-quality conversational and creative datasets.
+ * **Optimization:** AdamW, with an emphasis on retaining pre-trained knowledge.
+
+ ---
+
+ ## 💻 Implementation & Inference
+
+ To use the model's instruction-following mode, build prompts from `Instruction:` and `Content:` lines, as in the following script:
+
+ ```python
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ MODEL_ID = "TopAI-1/Duchifat-2-Instruct"
+
+ def run_duchifat_chat():
+     print("--- Loading Duchifat-2 (Post 5-Epoch Instruct Training) ---")
+
+     tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
+     model = AutoModelForCausalLM.from_pretrained(
+         MODEL_ID,
+         trust_remote_code=True,
+         torch_dtype=torch.bfloat16,
+         device_map="auto"
+     )
+     model.eval()
+     model.config.use_cache = False
+
+     chat_history = []
+
+     print("--- Model Ready! ---")
+
+     while True:
+         # The Hebrew prompt reads: "Enter an instruction (or 'exit')"
+         user_input = input("\nהכנס הוראה (או 'יציאה'): ")
+         if user_input.lower() in ["exit", "quit", "יציאה"]:
+             break
+
+         # Add current instruction to memory
+         chat_history.append(f"Instruction: {user_input}")
+
+         # Build prompt with history
+         full_prompt = "\n".join(chat_history) + "\nContent:"
+
+         inputs = tokenizer(full_prompt, return_tensors="pt").to(model.device)
+
+         # Context Window Protection (Max 1024 tokens)
+         if inputs.input_ids.shape[1] > 850:
+             chat_history = chat_history[2:]  # Trim oldest turn
+             full_prompt = "\n".join(chat_history) + "\nContent:"
+             inputs = tokenizer(full_prompt, return_tensors="pt").to(model.device)
+
+         with torch.no_grad():
+             output_tokens = model.generate(
+                 input_ids=inputs.input_ids,
+                 attention_mask=inputs.attention_mask,
+                 max_new_tokens=300,  # Increased for creative writing
+                 do_sample=True,
+                 temperature=0.75,
+                 top_p=0.9,
+                 repetition_penalty=1.15,
+                 pad_token_id=tokenizer.eos_token_id,
+                 use_cache=False
+             )
+
+         full_text = tokenizer.decode(output_tokens[0], skip_special_tokens=True)
+
+         # Extract only the latest response (text after the last "Content:")
+         parts = full_text.split("Content:")
+         answer = parts[-1].strip()
+
+         # Save response to history for context
+         chat_history.append(f"Content: {answer}")
+
+         # Printed label is the model's Hebrew name, "Duchifat-2"
+         print(f"\nדוכיפת-2: {answer}")
+
+ if __name__ == "__main__":
+     run_duchifat_chat()
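
The context-window guard in the script trims the history once and checks the prompt against 850 tokens while also requesting 300 new tokens, so prompt plus output can still exceed the 1024-token window (850 + 300 = 1150). The following is a minimal sketch of a stricter variant, not the author's method. The names `build_prompt`, `trim_history`, `PROMPT_BUDGET`, and the `count_tokens` callback are hypothetical helpers introduced here for illustration; only the `Instruction:` / `Content:` format, the 1024-token window, and the 300-token generation limit come from the README.

```python
from typing import Callable, List

CONTEXT_WINDOW = 1024   # from the model card's Technical Specifications
MAX_NEW_TOKENS = 300    # from the generate() call in the script
# Assumption: reserve room for the full generation inside the window.
PROMPT_BUDGET = CONTEXT_WINDOW - MAX_NEW_TOKENS  # 724 tokens for the prompt


def build_prompt(history: List[str]) -> str:
    """Join stored turns and open a new 'Content:' slot, as the script does."""
    return "\n".join(history) + "\nContent:"


def trim_history(history: List[str],
                 count_tokens: Callable[[str], int],
                 budget: int = PROMPT_BUDGET) -> List[str]:
    """Return a copy of history with the oldest turns dropped until it fits.

    Turns are removed in Instruction/Content pairs. The newest entry is
    always kept, even if it alone is longer than the budget.
    """
    trimmed = list(history)  # do not mutate the caller's list
    while len(trimmed) > 1 and count_tokens(build_prompt(trimmed)) > budget:
        # Drop one full turn (two entries) when possible, otherwise one entry.
        del trimmed[:2 if len(trimmed) > 2 else 1]
    return trimmed
```

With the real tokenizer, the callback could be `lambda text: len(tokenizer(text).input_ids)`, and the chat loop would call `chat_history = trim_history(chat_history, count_tokens)` before building `full_prompt`. Keeping the newest instruction unconditionally avoids the case where a single long first instruction is sliced away and the model receives an empty prompt.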