---
language: en
license: mit
tags:
- language-model
- pytorch
- rnn
- text-generation
- gru
- tiny-stories
- bpe-tokenizer
datasets:
- aditya-6122/tinystories-custom-dataset-18542-v2-test
pipeline_tag: text-generation
widget:
- text: "Once there was a lonely robot."
  example_title: "Robot Story"
  output:
    text: >
      Once upon a time there was a lonely robot. He was very lonely and lonely. He wanted to be friends with everyone. He thought he would never be lonely.
      One day, he decided to take a walk. He walked up to the robot and said, "Hello, robot! Can I be your friend?" The robot was very happy and said, "Yes, I would like that. I would like to be your friend."
      The robot was very happy. He had been given a friend to the robot. He was so happy to have a friend. He would always be friends with the robot.
      The robot was very happy. He had made a new friend. He was no longer lonely. He had a friend who could talk and play with him. The robot was so happy to have a friend. They played together every day and were never lonely again.
- text: "A child found a mysterious key."
  example_title: "Mystery Key"
  output:
    text: >
      A child found a mysterious key. The child was very curious and wanted to open it. He asked his mom for help, but her mom said no. She said that if he could open the lock, the key would lock the door open.
      The child was sad and he didn't know what to do. He asked his mom for help. She said, "Let's try to unlock the door. It will be open and lock the door open."
      So the child locked the door and locked the door. He was so excited to open it and see what was inside. He opened the door and saw the key. He opened it and inside was a big, bouncy ball. He was so excited and he ran to the ball.
      He opened the door and saw the ball inside. He was so excited! He ran around the house with the key and the ball bounced around. He was so happy!
      The child was so excited that he ran around the house with his key. He ran around the garden, playing with the key and the ball. He had so much fun that he forgot all about the key.
      The moral of the story is that it is important to be social and obey the rules.
- text: "In a world where time stops at night."
  example_title: "Time Stops"
  output:
    text: >
      In a world where time stops at night. The stars were twinkling and the stars were twinkling.
      One day, a little girl named Lily came to visit the house. She saw the stars and the moon. She asked her mom, "What is that?"
      Her mom smiled and said, "That's a star. It is called a peony. It has stars and lights and people who are very happy. Would you like to try?"
      Lily nodded and said, "Yes please!" She grabbed a bright blue star and a bright yellow star. She was so proud of her star.
      The next day, Lily went to the park with her mom. She saw the stars and the moon. She was so happy. She ran around the house with her mom and dad. They had a wonderful time.
---

# Tiny-Stories-GRU-LanguageModel-ByteLevelEncoding


## Model Details


### Model Description
This is a custom GRU-based recurrent language model trained on a dataset of short stories, designed for text-generation tasks.


### Architecture
<img src="https://huggingface.co/aditya-6122/Tiny-Stories-GRU-LanguageModel-ByteLevelEncoding/resolve/main/model_arch.jpg" alt="Architecture" width=1000 />


### Model Sources
- **Repository**: [Aditya6122/BuildingLanguageModel-TinyStories](https://github.com/Aditya6122/BuildingLanguageModel-TinyStories)


## Uses


### Direct Use
This model can be used for generating short stories and for text-completion tasks.


### Downstream Use
Fine-tune the model on domain-specific data for specialized text generation.


### Out-of-Scope Use
Not intended for production use without further validation.


## Training Details


### Training Data
The model was trained on the [aditya-6122/tinystories-custom-dataset-18542-v2-test](https://huggingface.co/datasets/aditya-6122/tinystories-custom-dataset-18542-v2-test) dataset.


### Training Procedure
- **Training Regime**: Standard language model training with cross-entropy loss
- **Epochs**: 5
- **Batch Size**: 128
- **Learning Rate**: 0.001
- **Optimizer**: Adam (assumed)
- **Hardware**: Apple Silicon MPS (if available) or CPU
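
The procedure above amounts to a standard next-token-prediction loop. The following is an illustrative sketch rather than the repository's actual training script: the `train_epoch` helper and the assumption that each batch is a `(batch, seq_len)` tensor of token ids are hypothetical.

```python
import torch
import torch.nn as nn

def train_epoch(model, loader, optimizer, device="cpu"):
    """One epoch of next-token prediction with cross-entropy loss (hypothetical sketch)."""
    criterion = nn.CrossEntropyLoss()
    model.train()
    total_loss = 0.0
    for batch in loader:  # batch: LongTensor of token ids, shape (batch, seq_len)
        batch = batch.to(device)
        inputs, targets = batch[:, :-1], batch[:, 1:]       # predict each next token
        out = model(inputs)
        logits = out[0] if isinstance(out, tuple) else out  # some models also return a hidden state
        loss = criterion(logits.reshape(-1, logits.size(-1)), targets.reshape(-1))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        total_loss += loss.item()
    return total_loss / len(loader)

# The settings listed above would correspond to roughly:
# optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
# device = "mps" if torch.backends.mps.is_available() else "cpu"
# for epoch in range(5):
#     train_epoch(model, loader, optimizer, device=device)
```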


### Tokenizer
The model uses the [aditya-6122/tinystories-tokenizer-vb-18542-byte_level_bpe-v3-test](https://huggingface.co/aditya-6122/tinystories-tokenizer-vb-18542-byte_level_bpe-v3-test) tokenizer.


### Model Architecture
- **Architecture Type**: RNN-based language model with GRU cells
- **Embedding Dimension**: 512
- **Hidden Dimension**: 1024
- **Vocabulary Size**: 18542
- **Architecture Diagram**: See `model_arch.jpg` for a visual representation
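
Those hyperparameters suggest a model along the following lines. This is a hypothetical reconstruction from the numbers above (the actual class lives in the linked GitHub repository); the class name `GRULanguageModel`, the single GRU layer, and the `(logits, hidden)` return value are assumptions.

```python
import torch
import torch.nn as nn

class GRULanguageModel(nn.Module):
    """Token embedding -> GRU -> linear projection back onto the vocabulary."""

    def __init__(self, vocab_size=18542, embedding_dimension=512, hidden_dimension=1024):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dimension)
        self.gru = nn.GRU(embedding_dimension, hidden_dimension, batch_first=True)
        self.fc = nn.Linear(hidden_dimension, vocab_size)

    def forward(self, token_ids, hidden=None):
        # token_ids: (batch, seq_len) -> logits: (batch, seq_len, vocab_size)
        embedded = self.embedding(token_ids)
        output, hidden = self.gru(embedded, hidden)
        return self.fc(output), hidden
```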


## Files
- `model.bin`: The trained model weights in PyTorch format.
- `tokenizer.json`: The tokenizer configuration.
- `model_arch.jpg`: Architecture diagram showing the GRU model structure.


## How to Use


Since this is a custom model, you'll need to load it manually with code along these lines:


```python
import torch
from your_language_model import LanguageModel  # replace with the model class from the linked repository
from tokenizers import Tokenizer

# Load the byte-level BPE tokenizer
tokenizer = Tokenizer.from_file("tokenizer.json")

# Instantiate the model with the hyperparameters listed above and load the weights
vocab_size = tokenizer.get_vocab_size()
model = LanguageModel(vocab_size=vocab_size, embedding_dimension=512, hidden_dimension=1024)
model.load_state_dict(torch.load("model.bin", map_location="cpu"))
model.eval()

# Tokenize a prompt, then add your generation logic
input_text = "Once upon a time"
input_ids = tokenizer.encode(input_text).ids
```
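
The generation step left open above is typically an autoregressive sampling loop. A minimal sketch, assuming the model's `forward` returns `(logits, hidden)` so the GRU hidden state can be carried between steps, and that the tokenizer follows the `tokenizers` API (`encode(...).ids`, `decode(...)`):

```python
import torch

@torch.no_grad()
def generate(model, tokenizer, prompt, max_new_tokens=100, temperature=0.8):
    """Sample tokens one at a time, reusing the GRU hidden state between steps."""
    ids = tokenizer.encode(prompt).ids
    model.eval()
    hidden = None
    # Warm up the hidden state on the whole prompt
    logits, hidden = model(torch.tensor([ids], dtype=torch.long), hidden)
    next_logits = logits[:, -1, :]
    for _ in range(max_new_tokens):
        probs = torch.softmax(next_logits / temperature, dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)  # shape (1, 1)
        ids.append(next_id.item())
        logits, hidden = model(next_id, hidden)            # single-step update
        next_logits = logits[:, -1, :]
    return tokenizer.decode(ids)

# e.g. print(generate(model, tokenizer, "Once upon a time"))
```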


## Limitations
- This is a basic RNN model and may not perform as well as transformer-based models.
- Because it was trained on limited data, it may exhibit biases from the training dataset.
- Not optimized for production deployment.


## Ethical Considerations
Users should be aware of potential biases in generated text and use the model responsibly.


## Citation
If you use this model, please cite:
```bibtex
@misc{vanilla-rnn-gru-like,
  title={Tiny-Stories-GRU-LanguageModel-ByteLevelEncoding},
  author={Aditya Wath},
  year={2024},
  publisher={Hugging Face},
  url={https://huggingface.co/aditya-6122/Tiny-Stories-GRU-LanguageModel-ByteLevelEncoding}
}
```