---
language: en
license: mit
tags:
- language-model
- pytorch
- rnn
- text-generation
- gru
- tiny-stories
- bpe-tokenizer
datasets:
- aditya-6122/tinystories-custom-dataset-18542-v2-test
pipeline_tag: text-generation
widget:
- text: Once there was a lonely robot.
  example_title: Robot Story
  output:
    text: >
      Once upon a time there was a lonely robot. He was very lonely and
      lonely. He wanted to be friends with everyone. He thought he would never
      be lonely. One day, he decided to take a walk. He walked up to the robot
      and said, "Hello, robot! Can I be your friend?" The robot was very happy
      and said, "Yes, I would like that. I would like to be your friend." The
      robot was very happy. He had been given a friend to the robot. He was so
      happy to have a friend. He would always be friends with the robot. The
      robot was very happy. He had made a new friend. He was no longer lonely.
      He had a friend who could talk and play with him. The robot was so happy
      to have a friend. They played together every day and were never lonely
      again.
- text: A child found a mysterious key.
  example_title: Mystery Key
  output:
    text: >
      A child found a mysterious key. The child was very curious and wanted to
      open it. He asked his mom for help, but her mom said no. She said that
      if he could open the lock, the key would lock the door open. The child
      was sad and he didn't know what to do. He asked his mom for help. She
      said, "Let's try to unlock the door. It will be open and lock the door
      open." So the child locked the door and locked the door. He was so
      excited to open it and see what was inside. He opened the door and saw
      the key. He opened it and inside was a big, bouncy ball. He was so
      excited and he ran to the ball. He opened the door and saw the ball
      inside. He was so excited! He ran around the house with the key and the
      ball bounced around. He was so happy! The child was so excited that he
      ran around the house with his key. He ran around the garden, playing
      with the key and the ball. He had so much fun that he forgot all about
      the key. The moral of the story is that it is important to be social and
      obey the rules.
- text: In a world where time stops at night.
  example_title: Time Stops
  output:
    text: >
      In a world where time stops at night. The stars were twinkling and the
      stars were twinkling. One day, a little girl named Lily came to visit
      the house. She saw the stars and the moon. She asked her mom, "What is
      that?" Her mom smiled and said, "That's a star. It is called a peony. It
      has stars and lights and people who are very happy. Would you like to
      try?" Lily nodded and said, "Yes please!" She grabbed a bright blue star
      and a bright yellow star. She was so proud of her star. The next day,
      Lily went to the park with her mom. She saw the stars and the moon. She
      was so happy. She ran around the house with her mom and dad. They had a
      wonderful time.
---
# Tiny-Stories-GRU-LanguageModel-ByteLevelEncoding
## Model Details

### Model Description

This is a custom language model trained on a dataset of short stories, designed for text generation tasks.
### Model Sources

- Repository: Aditya6122/BuildingLanguageModel-TinyStories
## Uses

### Direct Use

This model can be used for generating short stories and for text-completion tasks.

### Downstream Use

Fine-tune the model on specific domains for specialized text generation.

### Out-of-Scope Use

Not intended for production use without further validation.
## Training Details

### Training Data

The model was trained on the `aditya-6122/tinystories-custom-dataset-18542-v2-test` dataset.

### Training Procedure

- Training Regime: standard language-model training with cross-entropy loss
- Epochs: 5
- Batch Size: 128
- Learning Rate: 0.001
- Optimizer: Adam (assumed)
- Hardware: Apple Silicon MPS (if available) or CPU
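One training step under these settings (cross-entropy loss on next-token prediction, Adam at learning rate 0.001) can be sketched as follows. This is an illustration only: the dimensions are shrunk and the batch is random tokens, not the actual dataset.

```python
import torch
import torch.nn as nn

# Toy dimensions for illustration; the real run used embedding 512,
# hidden 1024, and a vocabulary of 18542.
torch.manual_seed(0)
vocab, embed, hidden, batch, seq = 50, 8, 16, 4, 12

emb = nn.Embedding(vocab, embed)
gru = nn.GRU(embed, hidden, batch_first=True)
head = nn.Linear(hidden, vocab)
params = list(emb.parameters()) + list(gru.parameters()) + list(head.parameters())
opt = torch.optim.Adam(params, lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Fake batch of token ids; shift by one for next-token targets.
tokens = torch.randint(0, vocab, (batch, seq + 1))
inputs, targets = tokens[:, :-1], tokens[:, 1:]

out, _ = gru(emb(inputs))
logits = head(out)                                   # (batch, seq, vocab)
loss = loss_fn(logits.reshape(-1, vocab), targets.reshape(-1))
opt.zero_grad()
loss.backward()
opt.step()
print(f"step loss: {loss.item():.3f}")
```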
### Tokenizer

The model uses the `aditya-6122/tinystories-tokenizer-vb-18542-byte_level_bpe-v3-test` tokenizer.
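To illustrate what a byte-level BPE tokenizer does, the toy example below trains a throwaway one with the `tokenizers` library. This is not the published tokenizer, which should instead be loaded from `tokenizer.json`.

```python
from tokenizers import ByteLevelBPETokenizer

# Train a tiny byte-level BPE tokenizer on a toy corpus (illustration only).
tok = ByteLevelBPETokenizer()
tok.train_from_iterator(
    ["Once upon a time there was a lonely robot.",
     "The robot wanted a friend to play with."] * 50,
    vocab_size=300,
    min_frequency=1,
)

enc = tok.encode("Once upon a time")
print(enc.tokens)          # byte-level subword pieces
print(tok.decode(enc.ids))  # decoding recovers the original text
```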
## Model Architecture

- Architecture Type: RNN-based language model with GRU cells
- Embedding Dimension: 512
- Hidden Dimension: 1024
- Vocabulary Size: 18542
- Architecture Diagram: see `model_arch.jpg` for a visual representation
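The dimensions above admit a straightforward PyTorch reconstruction. The class and attribute names below are assumptions for illustration; the actual implementation in the repository may differ.

```python
import torch.nn as nn

# Hypothetical reconstruction of the described architecture:
# embedding (18542 -> 512), single-layer GRU (512 -> 1024),
# linear projection back to the vocabulary (1024 -> 18542).
class LanguageModel(nn.Module):
    def __init__(self, vocab_size=18542, embedding_dimension=512, hidden_dimension=1024):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dimension)
        self.gru = nn.GRU(embedding_dimension, hidden_dimension, batch_first=True)
        self.fc = nn.Linear(hidden_dimension, vocab_size)

    def forward(self, x, hidden=None):
        out, hidden = self.gru(self.embedding(x), hidden)
        return self.fc(out), hidden  # logits over the vocabulary

model = LanguageModel()
n_params = sum(p.numel() for p in model.parameters())
print(f"parameters: {n_params:,}")
```

With these dimensions the model has roughly 33M parameters, most of them in the embedding table and the output projection.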
## Files

- `model.bin`: the trained model weights in PyTorch format.
- `tokenizer.json`: the tokenizer configuration.
- `model_arch.jpg`: architecture diagram showing the GRU model structure.
## How to Use

Since this is a custom model, you'll need to load it using the code provided in the repository:

```python
import torch
from tokenizers import Tokenizer

from your_language_model import LanguageModel  # replace with the actual import

# Load the tokenizer
tokenizer = Tokenizer.from_file("tokenizer.json")

# Build the model with the published hyperparameters and load the weights
vocab_size = tokenizer.get_vocab_size()
model = LanguageModel(vocab_size=vocab_size, embedding_dimension=512, hidden_dimension=1024)
model.load_state_dict(torch.load("model.bin", map_location="cpu"))  # map_location keeps CPU-only machines working
model.eval()

# Generate text
input_text = "Once upon a time"
# Tokenize and generate [Add your Generation Logic]
```
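The loading snippet leaves the generation logic open. A minimal autoregressive sampling loop might look like the sketch below; it assumes the model's forward pass returns `(logits, hidden)`, and `StubModel` is only a stand-in (not the repository's class) so the sketch runs without the real weights.

```python
import torch

# Stand-in model that returns random logits; replace with the loaded model.
class StubModel(torch.nn.Module):
    def __init__(self, vocab_size):
        super().__init__()
        self.vocab_size = vocab_size

    def forward(self, ids, hidden=None):
        return torch.randn(ids.shape[0], ids.shape[1], self.vocab_size), hidden

@torch.no_grad()
def generate(model, ids, max_new_tokens=20, temperature=1.0):
    hidden = None
    logits, hidden = model(ids, hidden)          # consume the prompt
    for _ in range(max_new_tokens):
        probs = torch.softmax(logits[:, -1, :] / temperature, dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)
        ids = torch.cat([ids, next_id], dim=1)
        logits, hidden = model(next_id, hidden)  # feed only the new token
    return ids

prompt = torch.tensor([[1, 2, 3]])  # token ids from the tokenizer in practice
out = generate(StubModel(vocab_size=100), prompt, max_new_tokens=5)
print(out.shape)  # 3 prompt tokens + 5 sampled ones
```

Passing the hidden state forward and feeding only the newest token each step avoids re-encoding the whole prefix, which is the natural way to decode with an RNN.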
## Limitations

- This is a basic RNN model and may not perform as well as transformer-based models.
- It was trained on limited data and may exhibit biases from the training dataset.
- It is not optimized for production deployment.

## Ethical Considerations

Users should be aware of potential biases in generated text and use the model responsibly.
## Citation

If you use this model, please cite:

```bibtex
@misc{vanilla-rnn-gru-like,
  title={Tiny-Stories-GRU-LanguageModel-ByteLevelEncoding},
  author={Aditya Wath},
  year={2024},
  publisher={Hugging Face},
  url={https://huggingface.co/aditya-6122/Tiny-Stories-GRU-LanguageModel-ByteLevelEncoding}
}
```