# Model Card for Model ID
Patched Llama 3.2 8B from the Llama 3.2 11B Vision model.

The language-model weights of the 11B Vision-Instruct model are copied into the text-only 8B architecture; layers that exist only in the vision model (the interleaved cross-attention layers) are skipped. Here is the complete code for patching the weights:
```python
# Import required libraries
from transformers import AutoProcessor, AutoTokenizer, AutoModelForImageTextToText, AutoModelForCausalLM

# Load the 11B Vision-Instruct model
processor = AutoProcessor.from_pretrained("meta-llama/Llama-3.2-11B-Vision-Instruct")
model = AutoModelForImageTextToText.from_pretrained("meta-llama/Llama-3.2-11B-Vision-Instruct")

# Load the 8B text-only model
s_tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
s_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")

# Prepare input text for testing
input_text = "Write me a poem about Machine Learning."
input_ids = s_tokenizer(input_text, return_tensors="pt")

# Test the original 8B model
outputs = s_model.generate(**input_ids, do_sample=False, max_new_tokens=10)
print("8B Model Output:", s_tokenizer.decode(outputs[0]))

# Patch weights from the 11B model into the 8B model. The 11B language model
# interleaves extra cross-attention layers among the text layers; a KeyError
# means we hit one of them, so the index offset grows and the lookup is retried.
model_weight = model.state_dict()
s_model_dict = s_model.state_dict()
skip_layer = 0  # Offset between 8B and 11B layer indices

for key in s_model_dict.keys():
    if "layers." in key:
        layer_idx = int(key.split("layers.")[1].split(".")[0])  # Extract layer index
        try:
            s_model_dict[key] = model_weight[
                "language_model." + key.replace(f"layers.{layer_idx}.", f"layers.{layer_idx + skip_layer}.")
            ]
        except KeyError:
            skip_layer += 1
            s_model_dict[key] = model_weight[
                "language_model." + key.replace(f"layers.{layer_idx}.", f"layers.{layer_idx + skip_layer}.")
            ]
    else:
        s_model_dict[key] = model_weight["language_model." + key]

# Load the patched weights into the 8B model (reassigning state_dict entries
# alone does not modify the module's parameters)
s_model.load_state_dict(s_model_dict)

# Test the patched 8B model
outputs = s_model.generate(**input_ids, do_sample=False, max_new_tokens=10)
print("Patched 8B Model Output:", s_tokenizer.decode(outputs[0]))

# Test the original 11B model
outputs = model.generate(**input_ids, do_sample=False, max_new_tokens=10)
print("11B Model Output:", s_tokenizer.decode(outputs[0]))
```
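One subtlety worth noting: `state_dict()` returns a dictionary, and reassigning its entries does not by itself change the module's parameters, so the patched dictionary must be loaded back with `load_state_dict`. A minimal illustration with a toy `nn.Linear` (unrelated to the checkpoints above, no downloads needed):

```python
import torch
from torch import nn

torch.manual_seed(0)
m = nn.Linear(2, 2, bias=False)

# Reassigning a state_dict entry leaves the module untouched
sd = m.state_dict()
sd["weight"] = torch.zeros(2, 2)
assert m.weight.abs().sum() > 0  # original random weights still in place

# load_state_dict is what actually copies the tensors into the module
m.load_state_dict(sd)
assert torch.equal(m.weight.data, torch.zeros(2, 2))
```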

### **Example Outputs**

**Prompt:** "Write me a poem about Machine Learning."

**Outputs:**

1. **8B Model Output (Before Patching):**

   ```
   <|begin_of_text|>Write me a poem about Machine Learning.
   Artificial minds, born from code,
   Learning
   ```

2. **Patched 8B Model Output:**

   ```
   <|begin_of_text|>Write me a poem about Machine Learning.
   In silicon halls, where data reigns
   ```

3. **11B Model Output:**

   ```
   <|begin_of_text|>Write me a poem about Machine Learning.
   In silicon halls, where data reigns
   ```
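The patched model reproduces the 11B output because the copy walks past the cross-attention layers that exist only in the 11B language model. The index remapping can be sketched with toy state dicts (the keys and cross-attention positions below are illustrative placeholders, not the real checkpoint layout):

```python
# Build a toy "11B" state dict: 10 layers, with vision-only cross-attention
# layers at (hypothetical) indices 3 and 8 and ordinary text layers elsewhere.
big = {}
text_values = iter(range(8))
for i in range(10):
    if i in (3, 8):
        big[f"language_model.model.layers.{i}.cross_attn.weight"] = "vision-only"
    else:
        big[f"language_model.model.layers.{i}.self_attn.weight"] = next(text_values)

# Remap the toy "8B" keys (layers 0..7) with the same skip logic as the patch
patched = {}
skip_layer = 0
for i in range(8):
    key = f"model.layers.{i}.self_attn.weight"
    while True:
        candidate = "language_model." + key.replace(f"layers.{i}.", f"layers.{i + skip_layer}.")
        if candidate in big:
            patched[key] = big[candidate]
            break
        skip_layer += 1  # step over a cross-attention-only layer

print(skip_layer)                                  # 2 layers were skipped
print(patched["model.layers.7.self_attn.weight"])  # value stored at 11B layer 9
```

Each 8B layer index is shifted by however many cross-attention layers precede it, so the text weights line up one-for-one even though the two checkpoints have different layer counts.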

---

## Model Details