Text Generation
Transformers
Safetensors
English
gemma4
gemma4-text
gemma4-moe
Mixture of Experts
mixture-of-experts
causal-lm
tinystories
tiny-model
validation
debug-model
Instructions to use shibatch/tinygemma4moe5m with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use shibatch/tinygemma4moe5m with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="shibatch/tinygemma4moe5m")

# Load model directly
from transformers import AutoModel

model = AutoModel.from_pretrained("shibatch/tinygemma4moe5m", dtype="auto")
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use shibatch/tinygemma4moe5m with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "shibatch/tinygemma4moe5m"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "shibatch/tinygemma4moe5m",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
Use Docker
docker model run hf.co/shibatch/tinygemma4moe5m
- SGLang
How to use shibatch/tinygemma4moe5m with SGLang:
Install from pip and serve model
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "shibatch/tinygemma4moe5m" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "shibatch/tinygemma4moe5m",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
Use Docker images
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "shibatch/tinygemma4moe5m" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "shibatch/tinygemma4moe5m",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
- Docker Model Runner
How to use shibatch/tinygemma4moe5m with Docker Model Runner:
docker model run hf.co/shibatch/tinygemma4moe5m
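The curl calls above can also be issued from Python. The following is a minimal sketch, assuming a vLLM or SGLang server is already running locally and exposes the OpenAI-compatible `/v1/completions` endpoint shown above. The helper names `build_completion_payload` and `complete` are hypothetical and not part of any library; only the Python standard library is used.

```python
import json
import urllib.request

MODEL_ID = "shibatch/tinygemma4moe5m"


def build_completion_payload(prompt, max_tokens=512, temperature=0.5):
    """Build the same JSON body that the curl examples send."""
    return {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


def complete(prompt, base_url="http://localhost:8000", **kwargs):
    """POST a prompt to /v1/completions and return the generated text.

    base_url is an assumption: port 8000 matches the vLLM example,
    use http://localhost:30000 for the SGLang example.
    """
    body = json.dumps(build_completion_payload(prompt, **kwargs)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/v1/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        result = json.load(response)
    # OpenAI-style completion responses carry the text in choices[0].text.
    return result["choices"][0]["text"]


# Example call (requires a running server):
# print(complete("Once upon a time,"))
```

Keeping the payload builder separate from the network call makes it possible to inspect or log the request body without a server running.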
{
  "model_type": "gemma4_text",
  "parameter_count": 5370128,
  "tokenizer_size": 1024,
  "final_eval_loss": 2.466223752498627,
  "final_eval_ppl": 11.777886554570602,
  "args": {
    "output_dir": "tinygemma4moe_text_cov",
    "train_files": [
      "data/train-00000-of-00004-2d5a1467fff1081b.parquet",
      "data/train-00001-of-00004-5852b56a2bd28fd9.parquet",
      "data/train-00002-of-00004-a26307300439e943.parquet",
      "data/train-00003-of-00004-d243063613e5a057.parquet"
    ],
    "max_rows": 0,
    "vocab_size": 1024,
    "min_frequency": 2,
    "hidden_size": 160,
    "intermediate_size": 640,
    "num_hidden_layers": 6,
    "num_attention_heads": 5,
    "num_key_value_heads": 1,
    "head_dim": 32,
    "hidden_size_per_layer_input": 24,
    "sliding_window": 128,
    "enable_moe_block": true,
    "num_experts": 4,
    "top_k_experts": 2,
    "expert_interval": 2,
    "router_aux_loss_coef": 0.0,
    "moe_intermediate_size": 320,
    "max_position_embeddings": 1024,
    "layer_pattern": "ssFssF",
    "block_size": 256,
    "batch_size": 32,
    "max_steps": 76597,
    "num_epochs": 1.0,
    "learning_rate": 0.0002,
    "weight_decay": 0.0,
    "grad_clip": 1.0,
    "log_steps": 50,
    "eval_steps": 500,
    "eval_batches": 20,
    "seed": 1234,
    "device": "auto",
    "dtype": "auto",
    "num_workers": 0,
    "generate_prompts": [
      "Once upon",
      "There was a little",
      "One day"
    ],
    "generate_new_tokens": 80
  },
  "generations": {
    "Once upon": "Once upon a time, there was a little girl named Lily. She loved to play with her toys and her friends. One day, Lily's mom said, \"Lily, you can't play with your toys. It's not a toy. It's a big, big, big house.\"\n\nLily was sad and said, \"I'm sorry, Lily. I'm sorry, I'm sorry.",
    "There was a little": "There was a little girl named Lily. She loved to play with her toys and her friends. One day, Lily's mom said, \"Lily, you can't play with your toys. It's not a toy. It's a big toys.\"\n\nLily said, \"I want to play with it. It's a big toys. It's a big toys. It's a big toys.\"\n\nLily",
    "One day": "One day, a little girl named Lily went to the park with her mom. She saw a big, big tree and wanted to play with it. She saw a big, big tree and wanted to play with it. She saw a big tree and wanted to play with it.\n\n\"Hello, little girl!\" said her mom. \"I want to play with you!\"\n\nThe little"
  }
}