---
license: mit
base_model: meta-llama/Llama-3.3-70B-Instruct
tags:
- tiny-model
- random-weights
- testing
- llama
---
# Llama-3.3-Tiny-Instruct
This is a tiny, randomly initialized version of the meta-llama/Llama-3.3-70B-Instruct model, created for testing and experimentation purposes.
## Model Details
- **Base model**: meta-llama/Llama-3.3-70B-Instruct
- **Seed**: 42
- **Hidden size**: 256
- **Number of layers**: 12
- **Number of attention heads**: 4
- **Vocabulary size**: 128256
- **Max position embeddings**: 131072
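The details above map directly onto the model's configuration. A hypothetical reconstruction of the key fields, using Hugging Face `LlamaConfig` naming conventions (the actual `config.json` is authoritative; `intermediate_size` is not listed in this card and is omitted here):

```python
# Hypothetical sketch of the tiny model's key config fields, taken from
# the "Model Details" list above. Field names follow the Hugging Face
# LlamaConfig convention.
tiny_config = {
    "model_type": "llama",
    "hidden_size": 256,
    "num_hidden_layers": 12,
    "num_attention_heads": 4,
    "vocab_size": 128256,
    "max_position_embeddings": 131072,
}

# Sanity check: hidden size must divide evenly across attention heads
head_dim = tiny_config["hidden_size"] // tiny_config["num_attention_heads"]
assert tiny_config["hidden_size"] % tiny_config["num_attention_heads"] == 0
print(head_dim)  # 64
```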
## Parameters
- **Total parameters**: ~42,277,376
- **Trainable parameters**: ~42,277,376
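The count above is dominated by the token embedding matrix (128,256 × 256 ≈ 32.8M weights). A rough, illustrative estimate for a Llama-style decoder of this shape, where the MLP intermediate size and number of KV heads are assumptions (they are not listed in this card), so the result lands near but not exactly at the reported total:

```python
# Rough parameter estimate for a tiny Llama-style decoder. The
# intermediate (MLP) size and KV-head count below are assumptions, not
# values from this model's config, so this is an order-of-magnitude
# sketch rather than an exact reproduction of the ~42.3M figure.
def estimate_params(vocab=128256, hidden=256, layers=12,
                    heads=4, kv_heads=4, intermediate=1024,
                    tied_embeddings=True):
    head_dim = hidden // heads
    # attention: q and o projections, plus (possibly grouped) k/v projections
    attn = 2 * hidden * hidden + 2 * hidden * (kv_heads * head_dim)
    # SwiGLU MLP: gate, up, and down projections
    mlp = 3 * hidden * intermediate
    # two RMSNorm weight vectors per layer, plus one final norm
    norms = 2 * hidden * layers + hidden
    embed = vocab * hidden
    if not tied_embeddings:
        embed *= 2  # separate lm_head matrix
    return embed + layers * (attn + mlp) + norms

print(f"~{estimate_params():,} parameters")  # ~45,422,848 parameters
```

Most of the budget sits in the embeddings, which is typical for tiny-hidden-size models with a full-size vocabulary.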
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer (a causal LM class is needed for .generate())
model = AutoModelForCausalLM.from_pretrained("AlignmentResearch/Llama-3.3-Tiny-Instruct")
tokenizer = AutoTokenizer.from_pretrained("AlignmentResearch/Llama-3.3-Tiny-Instruct")

# Generate text (note: this model has random weights!)
inputs = tokenizer("Hello, how are you?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## Important Notes
⚠️ **This model has random weights and is not trained!** It's designed for:
- Testing model loading and inference pipelines
- Benchmarking the model architecture (loading time, memory footprint, throughput)
- Educational purposes
- Rapid prototyping where actual model performance isn't needed
The model will generate random/nonsensical text since it hasn't been trained on any data.
## Creation
This model was created using the `upload_tiny_llama.py` script from the minimal-grpo-trainer repository.