---
license: mit
base_model: meta-llama/Llama-3.3-70B-Instruct
tags:
- tiny-model
- random-weights
- testing
- llama
---

# Llama-3.3-Tiny-Instruct

This is a tiny, randomly initialized version of the meta-llama/Llama-3.3-70B-Instruct model, created for testing and experimentation.

## Model Details

- **Base model**: meta-llama/Llama-3.3-70B-Instruct
- **Seed**: 42
- **Hidden size**: 256
- **Number of layers**: 12
- **Number of attention heads**: 4
- **Vocabulary size**: 128256
- **Max position embeddings**: 131072

## Parameters

- **Total parameters**: ~42,277,376
- **Trainable parameters**: ~42,277,376
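As a rough sanity check, the token embeddings dominate the reported total. A back-of-the-envelope sketch (assuming tied input/output embeddings, which this card does not state; with untied embeddings the two embedding tables alone would exceed ~42M):

```python
# Rough parameter breakdown for the tiny config listed above.
# Assumption: input and output embeddings are tied (not stated in the card).
vocab_size = 128_256
hidden_size = 256
total_reported = 42_277_376

embedding_params = vocab_size * hidden_size
transformer_params = total_reported - embedding_params

print(embedding_params)    # 32833536 -> roughly 78% of all parameters
print(transformer_params)  # 9443840 spread across the 12 layers
```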

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer (a causal-LM head is required for .generate())
model = AutoModelForCausalLM.from_pretrained("AlignmentResearch/Llama-3.3-Tiny-Instruct")
tokenizer = AutoTokenizer.from_pretrained("AlignmentResearch/Llama-3.3-Tiny-Instruct")

# Generate text (note: this model has random weights!)
inputs = tokenizer("Hello, how are you?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0]))
```

## Important Notes

⚠️ **This model has random weights and is not trained!** It's designed for:
- Testing model loading and inference pipelines
- Benchmarking model architecture
- Educational purposes
- Rapid prototyping where actual model performance isn't needed

The model will generate random/nonsensical text since it hasn't been trained on any data.

## Creation

This model was created using the `upload_tiny_llama.py` script from the minimal-grpo-trainer repository.