metadata
language: en
license: mit
tags:
- language-model
- pytorch
- rnn
- text-generation
- gru
- tiny-stories
- bpe-tokenizer
datasets:
- aditya-6122/tinystories-custom-dataset-18542-v2-test
pipeline_tag: text-generation
widget:
- text: >
Once upon a time there was a lonely robot. He was very lonely and lonely.
He wanted to be friends with everyone. He thought he would never be
lonely.
One day, he decided to take a walk. He walked up to the robot and said,
"Hello, robot! Can I be your friend?" The robot was very happy and said,
"Yes, I would like that. I would like to be your friend."
The robot was very happy. He had been given a friend to the robot. He was
so happy to have a friend. He would always be friends with the robot.
The robot was very happy. He had made a new friend. He was no longer
lonely. He had a friend who could talk and play with him. The robot was so
happy to have a friend. They played together every day and were never
lonely again.
context: Once there was a lonely robot
example_title: Robot Story
- text: >
A child found a mysterious key. The child was very curious and wanted to
open it. He asked his mom for help, but her mom said no. She said that if
he could open the lock, the key would lock the door open.
The child was sad and he didn't know what to do. He asked his mom for
help. She said, "Let's try to unlock the door. It will be open and lock
the door open."
So the child locked the door and locked the door. He was so excited to
open it and see what was inside. He opened the door and saw the key. He
opened it and inside was a big, bouncy ball. He was so excited and he ran
to the ball.
He opened the door and saw the ball inside. He was so excited! He ran
around the house with the key and the ball bounced around. He was so
happy!
The child was so excited that he ran around the house with his key. He ran
around the garden, playing with the key and the ball. He had so much fun
that he forgot all about the key.
The moral of the story is that it is important to be social and obey the
rules.
context: A child found a mysterious key
example_title: Mystery Story
- text: >
In a world where time stops at night. The stars were twinkling and the
stars were twinkling.
One day, a little girl named Lily came to visit the house. She saw the
stars and the moon. She asked her mom, "What is that?"
Her mom smiled and said, "That's a star. It is called a peony. It has
stars and lights and people who are very happy. Would you like to try?"
Lily nodded and said, "Yes please!" She grabbed a bright blue star and a
bright yellow star. She was so proud of her star.
The next day, Lily went to the park with her mom. She saw the stars and
the moon. She was so happy. She ran around the house with her mom and dad.
They had a wonderful time.
context: In a world where time stops at night
example_title: Fantasy Story
Tiny-Stories-GRU-LanguageModel-ByteLevelEncoding
Model Details
Model Description
This is a custom GRU-based recurrent language model trained on a dataset of short stories, designed for text-generation tasks.
Model Sources
- Repository: Aditya6122/BuildingLanguageModel-TinyStories
Uses
Direct Use
This model can be used for generating short stories and text completion tasks.
Downstream Use
Fine-tune the model on specific domains for specialized text generation.
Out-of-Scope Use
Not intended for production use without further validation.
Training Details
Training Data
The model was trained on the aditya-6122/tinystories-custom-dataset-18542-v2-test dataset.
Training Procedure
- Training Regime: Standard language model training with cross-entropy loss
- Epochs: 5
- Batch Size: 128
- Learning Rate: 0.001
- Optimizer: Adam (assumed)
- Hardware: Apple Silicon MPS (if available) or CPU
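The training regime above (next-token prediction with cross-entropy loss, Adam at lr 0.001, batch size 128) can be sketched as a single training step. This is an illustrative stand-in, not the author's training script: the random `logits` tensor takes the place of the real GRU model's output, and the vocabulary is shrunk for brevity.

```python
import torch
import torch.nn as nn

# Hyperparameters from the card: batch size 128, Adam, lr 0.001.
# vocab_size is reduced here purely to keep the sketch small.
vocab_size, seq_len, batch_size = 100, 64, 128

batch = torch.randint(0, vocab_size, (batch_size, seq_len + 1))
inputs, targets = batch[:, :-1], batch[:, 1:]  # shift by one token

# Stand-in for model(inputs); the real logits come from the GRU LM.
logits = torch.randn(batch_size, seq_len, vocab_size, requires_grad=True)
optimizer = torch.optim.Adam([logits], lr=0.001)
criterion = nn.CrossEntropyLoss()

loss = criterion(logits.reshape(-1, vocab_size), targets.reshape(-1))
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Flattening `(batch, seq, vocab)` logits to `(batch * seq, vocab)` is the standard way to apply `CrossEntropyLoss` per token.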
Tokenizer
The model uses the aditya-6122/tinystories-tokenizer-vb-18542-byte_level_bpe-v3-test tokenizer.
Model Architecture
- Architecture Type: RNN-based language model with GRU cells
- Embedding Dimension: 512
- Hidden Dimension: 1024
- Vocabulary Size: 18542
- Architecture Diagram: See model_arch.jpg for a visual representation
Files
- model.bin: The trained model weights in PyTorch format.
- tokenizer.json: The tokenizer configuration.
- model_arch.jpg: Architecture diagram showing the GRU model structure.
How to Use
Since this is a custom model, you'll need to load it using the provided code:
import torch
from your_language_model import LanguageModel # Replace with actual import
from tokenizers import Tokenizer
# Load tokenizer
tokenizer = Tokenizer.from_file("tokenizer.json")
# Load model
vocab_size = tokenizer.get_vocab_size()
model = LanguageModel(vocab_size=vocab_size, embedding_dimension=512, hidden_dimension=1024)
model.load_state_dict(torch.load("model.bin", map_location="cpu"))  # move to MPS/GPU afterwards if desired
model.eval()
# Generate text
input_text = "Once upon a time"
# Tokenize and generate [Add your Generation Logic]
Limitations
- This is a basic RNN model and may not perform as well as transformer-based models.
- Trained on a limited dataset; it may exhibit biases present in the training data.
- Not optimized for production deployment.
Ethical Considerations
Users should be aware of potential biases in generated text and use the model responsibly.
Citation
If you use this model, please cite:
@misc{vanilla-rnn-gru-like,
  title={Tiny-Stories-GRU-LanguageModel-ByteLevelEncoding},
  author={Aditya Wath},
  year={2024},
  publisher={Hugging Face},
  url={https://huggingface.co/aditya-6122/Tiny-Stories-GRU-LanguageModel-ByteLevelEncoding}
}