Instructions to use raincandy-u/TinyStories-656K with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use raincandy-u/TinyStories-656K with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="raincandy-u/TinyStories-656K")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("raincandy-u/TinyStories-656K")
model = AutoModelForCausalLM.from_pretrained("raincandy-u/TinyStories-656K")

Local Apps

vLLM

How to use raincandy-u/TinyStories-656K with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "raincandy-u/TinyStories-656K"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "raincandy-u/TinyStories-656K",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
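
The same request can be sent from Python. The following is a minimal sketch using only the standard library; it mirrors the curl call above and assumes the vLLM server from the previous step is listening on localhost:8000. The helper names `build_payload` and `complete` are illustrative and not part of vLLM.

```python
import json
import urllib.request

MODEL_ID = "raincandy-u/TinyStories-656K"


def build_payload(prompt, max_tokens=512, temperature=0.5):
    """Build the same JSON body as the curl example."""
    return {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


def complete(prompt, base_url="http://localhost:8000"):
    """POST the payload to /v1/completions and return the generated text."""
    request = urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # OpenAI-compatible completion responses carry the text in choices[0].text
    return body["choices"][0]["text"]
```

With the server running, `complete("Once upon a time,")` should return the model's continuation as a string.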

Use Docker

docker model run hf.co/raincandy-u/TinyStories-656K

SGLang

How to use raincandy-u/TinyStories-656K with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "raincandy-u/TinyStories-656K" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "raincandy-u/TinyStories-656K",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "raincandy-u/TinyStories-656K" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "raincandy-u/TinyStories-656K",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'


TinyStories-656K

This is a language model trained from scratch on the TinyStoriesV2 dataset.

It aims to be a transformer language model capable of generating stories with only about 600k parameters.

Llama architecture
GQA (grouped-query attention)
hidden_size = 128
Uses tie_word_embeddings
vocab_size = 2048 (BPE tokenizer trained from scratch on TinyStoriesV2)
2 transformer layers
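
To see how these settings relate to the parameter budget, the sketch below counts the weights of a bias-free Llama-style decoder. Only `vocab_size`, `hidden_size`, the layer count, and weight tying come from the list above. The values `intermediate_size=384`, `num_heads=8`, and `num_kv_heads=4` are assumptions chosen for illustration, and the published config may differ.

```python
def llama_param_count(vocab_size, hidden_size, num_layers,
                      intermediate_size, num_heads, num_kv_heads,
                      tie_word_embeddings=True):
    """Count the weights of a bias-free Llama-style decoder."""
    head_dim = hidden_size // num_heads
    kv_dim = head_dim * num_kv_heads  # GQA: fewer key/value heads than query heads

    embed = vocab_size * hidden_size
    # q_proj and o_proj are hidden x hidden; k_proj and v_proj are hidden x kv_dim
    attn = 2 * hidden_size * hidden_size + 2 * hidden_size * kv_dim
    mlp = 3 * hidden_size * intermediate_size  # gate, up, and down projections
    norms = 2 * hidden_size  # two RMSNorm weight vectors per layer
    per_layer = attn + mlp + norms

    total = embed + num_layers * per_layer + hidden_size  # plus the final RMSNorm
    if not tie_word_embeddings:
        total += vocab_size * hidden_size  # separate lm_head matrix
    return total


total = llama_param_count(
    vocab_size=2048, hidden_size=128, num_layers=2,      # from the model card
    intermediate_size=384, num_heads=8, num_kv_heads=4,  # assumed, not from the model card
)
print(total)  # 656000
```

Under these assumed values the count is 656,000, of which 2048 × 128 = 262,144 belong to the embedding matrix. Tying the embeddings avoids paying for that matrix a second time in the output head, which matters a great deal at this scale.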

Code: Here

Full Training Arguments

from transformers import TrainingArguments

training_args = TrainingArguments(
  do_train=True,
  per_device_train_batch_size=16,
  gradient_accumulation_steps=1,
  learning_rate=0.004629403549377777,
  lr_scheduler_type="constant",
  bf16=True,
  logging_steps=5,
  num_train_epochs=2,
  save_steps=10000000,
  seed=3407,
  report_to=None
)

Generation

Template:

<|start_story|>Once upon a time,
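
A small helper can apply this template and clean up the result. This is a sketch rather than the author's code: the function names are hypothetical, and the `<|end_story|>` marker is taken from the sample output shown in this card.

```python
START_TOKEN = "<|start_story|>"
END_TOKEN = "<|end_story|>"


def build_prompt(opening="Once upon a time,"):
    """Prefix the story opening with the start marker from the template."""
    return START_TOKEN + opening


def extract_story(generated_text):
    """Drop the start marker and everything from the end marker onward."""
    text = generated_text
    if text.startswith(START_TOKEN):
        text = text[len(START_TOKEN):]
    return text.split(END_TOKEN, 1)[0].strip()
```

Passing `build_prompt()` to the model and `extract_story(...)` over the decoded output keeps the markers out of the final text.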

Generation example:

Once upon a time, there was a little boy named Tim. Tim had a toy car that he loved to play with. One day, he went to the park with his mom. Tim saw a toy car on the ground. Tim wanted to play with the car to his mom and said, "Mom, can I play with your car with my car too?"
His mom said, "Yes, but we must not take turns." Tim felt sad, but he knew he had to go. He asked his mom for help. His mom said, "Okay, let's clean it together." They went to play together and played the toy car. They had a lot of fun.
After they finished the car together, Tim and his mom were surprised. They did not know that the car was not a toy car like it was a magic car. Tim had an idea. He put the car in the car and put the car on it. He pushed the car on the car on the car car and pulled it down. Tim was so happy. He played with the car with his car all day long, and Tim was very happy.<|end_story|>

Recommended generation config:

do_sample=True,
top_k=40,
top_p=0.9,
temperature=0.6
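
These settings can be passed directly to `generate` in Transformers. The sketch below is one way to wire them up; the `max_new_tokens` cap and the function names are assumptions added for illustration, not part of the recommended config.

```python
def generation_kwargs(max_new_tokens=256):
    """Recommended sampling settings; max_new_tokens is an assumed cap."""
    return dict(
        do_sample=True,
        top_k=40,
        top_p=0.9,
        temperature=0.6,
        max_new_tokens=max_new_tokens,
    )


def generate_story(opening="Once upon a time,"):
    """Download the model and sample one story using the prompt template."""
    # Imported lazily so generation_kwargs() works without transformers installed
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "raincandy-u/TinyStories-656K"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    inputs = tokenizer("<|start_story|>" + opening, return_tensors="pt")
    output_ids = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        **generation_kwargs(),
    )
    # The decoded text contains the prompt followed by the sampled continuation
    return tokenizer.decode(output_ids[0])
```

Because sampling is enabled, repeated calls to `generate_story()` will usually produce different stories.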

Downloads last month: 148

Safetensors

Model size: 656k params

Tensor type: BF16

Model tree for raincandy-u/TinyStories-656K

Quantizations: 3 models
