---
library_name: transformers
tags:
- ai
- chatty
- hoopoe2.4
- conversational
license: apache-2.0
language:
- he
- en
base_model:
- Raziel1234/Duchifat-2
pipeline_tag: text-generation
---

# Duchifat-2.4-Instruct (136M) 🐦

**Duchifat-2.4-Instruct** represents a significant evolution in the Duchifat series. This version (2.4) is a specialized, instruction-tuned model that has been refined through a massive training pipeline to achieve state-of-the-art performance for its size (136M parameters).

## 🚀 What's New in Version 2.4?

Version 2.4 is not just a minor update; it's a complete refinement of the model's behavior and alignment:

- **Advanced Token Density:** v2.4 has been trained on a total of **3.27 billion tokens**, pushing the 136M architecture close to saturation for its size.
- **Structural Alignment:** Unlike previous iterations, 2.4 is natively aligned to the `<|instruction|>` and `<|assistant|>` tokens. The model now treats these as fundamental structural boundaries.
- **Hard-Coded EOS Logic:** We have fixed the termination issues from earlier versions. v2.4 is specifically trained to predict and emit the `<|eos|>` token at the precise end of every instruction and response block, ensuring clean and reliable chat sessions.
- **Improved Hebrew Fluency:** v2.4 leverages the DictaLM-3.0-24B tokenizer logic more effectively, resulting in a more natural "flow" of the Hebrew language without the stuttering found in smaller models.
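
The structural alignment described above implies a fixed single-turn prompt template. A minimal sketch of building it, using only the `<|instruction|>`, `<|assistant|>`, and `<|eos|>` tokens named in the bullets (the helper name `build_prompt` is illustrative, not part of the model's API):

```python
# Build a single-turn prompt in the v2.4 format:
#   <|instruction|>{user text}<|eos|><|assistant|>
# The model is trained to emit <|eos|> when its answer is complete.

def build_prompt(user_text: str) -> str:
    return f"<|instruction|>{user_text}<|eos|><|assistant|>"

print(build_prompt("מה שלומך?"))
# <|instruction|>מה שלומך?<|eos|><|assistant|>
```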

## 🌟 Technical Highlights

- **Model Version:** 2.4 (Instruct)
- **Parameter Count:** 136M
- **Training Scale:** 3.27B Tokens (Mixed C4 Hebrew/English)
- **Architecture:** Optimized Transformer with RoPE and RMSNorm.
- **Inference Speed:** Ultra-low latency, ideal for real-time bilingual applications.
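
For reference, RMSNorm (mentioned in the architecture line above) normalizes activations by their root-mean-square rather than subtracting a mean, which makes it cheaper than LayerNorm. A NumPy sketch of the general technique; the `eps` value is an assumption, not taken from this model's config:

```python
import numpy as np

def rms_norm(x: np.ndarray, gain: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """RMSNorm over the last axis: x / RMS(x) * gain (no mean subtraction, no bias)."""
    rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
    return x / rms * gain

x = np.array([[3.0, 4.0]])
# RMS of [3, 4] is sqrt((9 + 16) / 2) ≈ 3.5355
print(rms_norm(x, gain=np.ones(2)))
```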

## 💻 Implementation (v2.4)

To use the improved logic of v2.4, make sure you pass `trust_remote_code=True` and follow the mandatory prompt format.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

# Note the spelling of "Instruct" (the "r" before the "u")
model_id = "razielAI/Hoopoe-2.4-Instruct"

print(f"Loading the public model {model_id}... please wait.")

try:
    # Load the tokenizer
    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

    # Load the model
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,
        device_map="auto",
        trust_remote_code=True
    )

    # Keep the embedding matrix in sync with the tokenizer's vocab size
    if model.get_input_embeddings().weight.shape[0] != len(tokenizer):
        model.resize_token_embeddings(len(tokenizer))

    streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=False)

    def run_chat():
        print(f"\n--- {model_id} Chat Ready ---")
        model.eval()
        while True:
            user_input = input("\n👤 User: ")
            if user_input.lower() in ["exit", "quit", "יציאה", "ביי"]:
                break

            # Mandatory v2.4 prompt format
            prompt = f"<|instruction|>{user_input}<|eos|><|assistant|>"
            inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to(model.device)

            print("🤖 Hoopoe: ", end="")
            with torch.no_grad():
                model.generate(
                    input_ids=inputs["input_ids"],
                    attention_mask=inputs["attention_mask"],
                    max_new_tokens=512,
                    temperature=0.7,
                    do_sample=True,
                    pad_token_id=tokenizer.eos_token_id,
                    eos_token_id=tokenizer.eos_token_id,
                    repetition_penalty=1.15,
                    streamer=streamer
                )
            print()

    if __name__ == "__main__":
        run_chat()

except Exception as e:
    print(f"\nLoading error: {e}")
    print("\nTip: open the model page on Hugging Face and verify that the username and model name are spelled exactly as above.")
```
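
Because v2.4 is trained to terminate every response with `<|eos|>`, extracting the assistant's reply from a decoded sequence can be a simple string operation. A sketch assuming the decoded text still contains the special tokens (i.e. decoding with `skip_special_tokens=False`); the helper name `extract_reply` is illustrative:

```python
def extract_reply(decoded: str) -> str:
    """Take the text after the last <|assistant|> tag, up to the next <|eos|> (if any)."""
    reply = decoded.rsplit("<|assistant|>", 1)[-1]
    return reply.split("<|eos|>", 1)[0].strip()

decoded = "<|instruction|>Hello<|eos|><|assistant|>Hi there!<|eos|>"
print(extract_reply(decoded))  # Hi there!
```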