---
base_model:
- google/gemma-3-1b-it
tags:
- tunix
- jax
---
Tiny random Gemma 3 model (5 layers) for tunix.

```py
"""Create a tiny random Gemma 3 model (5 layers) and upload to HuggingFace.

One-off script. The model is text-only with random weights, intended for
fast smoke tests with tunix/JAX.

Usage: uv run python scripts/create_tiny_gemma3.py
"""

import torch
from huggingface_hub import HfApi
from transformers import (
    AutoTokenizer,
    GenerationConfig,
    set_seed,
)
from transformers.models.gemma3 import Gemma3ForCausalLM, Gemma3TextConfig

source_model_id = "google/gemma-3-1b-it"
repo_id = "wassname/gemma3-5lyr-tiny-random"
save_folder = "/tmp/tiny-random/gemma3-5lyr"


def _save_tokenizer() -> None:
    """Copy the source model's tokenizer (same vocab) into ``save_folder``."""
    tokenizer = AutoTokenizer.from_pretrained(source_model_id)
    tokenizer.save_pretrained(save_folder)


def _build_model() -> Gemma3ForCausalLM:
    """Build the tiny text-only model and overwrite every weight with N(0, 0.5).

    Returns:
        A bfloat16 ``Gemma3ForCausalLM`` with deterministic random weights.
    """
    # Tiny text-only config matching tunix ModelConfig in model.py
    config = Gemma3TextConfig(
        vocab_size=262144,
        hidden_size=64,
        intermediate_size=128,
        num_hidden_layers=5,
        num_attention_heads=2,
        head_dim=32,
        num_key_value_heads=1,
        sliding_window=512,
        tie_word_embeddings=True,
    )
    # NOTE(review): private attribute, used here to record provenance in the
    # exported config.json — confirm current transformers still serializes it.
    config._name_or_path = source_model_id

    model = Gemma3ForCausalLM(config).to(torch.bfloat16)

    # Deterministic random init: seed *before* touching any weight, and walk
    # parameters in sorted-name order so the draw sequence is reproducible.
    set_seed(42)
    with torch.no_grad():
        for name, p in sorted(model.named_parameters()):
            torch.nn.init.normal_(p, 0, 0.5)
            print(name, p.shape)
    return model


def main() -> None:
    """Create the tiny model locally, then push the folder to the Hub."""
    _save_tokenizer()

    model = _build_model()
    model.generation_config = GenerationConfig.from_pretrained(source_model_id)
    model.save_pretrained(save_folder)

    # Upload everything (tokenizer + weights + configs) in a single call.
    api = HfApi()
    api.create_repo(repo_id, exist_ok=True)
    api.upload_folder(folder_path=save_folder, repo_id=repo_id)
    print(f"Uploaded to https://huggingface.co/{repo_id}")


# Guard the entry point so importing this module no longer triggers the
# tokenizer download / model build / Hub upload as a side effect.
if __name__ == "__main__":
    main()
```