This is a custom GRU-based language model trained on a dataset of short stories and designed for text generation tasks.
This model can be used for generating short stories and text completion tasks.
The model can be fine-tuned on domain-specific text for specialized text generation.
The model is not intended for production use without further validation.
The model was trained on the aditya-6122/tinystories-custom-dataset-18542-v2-test dataset.
The model uses the aditya-6122/tinystories-tokenizer-vb-18542-byte_level_bpe-v3-test tokenizer.
See model_arch.jpg for a visual representation of the architecture.
model.bin: The trained model weights in PyTorch format.
tokenizer.json: The tokenizer configuration.
model_arch.jpg: Architecture diagram showing the GRU model structure.
Since this is a custom model, you'll need to load it using the provided code:
import torch
from your_language_model import LanguageModel # Replace with actual import
from tokenizers import Tokenizer
# Load tokenizer
tokenizer = Tokenizer.from_file("tokenizer.json")
# Load model
vocab_size = tokenizer.get_vocab_size()
model = LanguageModel(vocab_size=vocab_size, embedding_dimension=512, hidden_dimension=1024)
# map_location="cpu" lets the weights load on machines without a GPU
model.load_state_dict(torch.load("model.bin", map_location="cpu"))
model.eval()
# Generate text
input_text = "Once upon a time"
# Tokenize and generate [Add your Generation Logic]
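The generation step is left open above, so here is a minimal sketch of one way to fill it in. The model's forward signature is not documented here, so the loop is written against a hypothetical callable, `next_token_logits`, that takes the token ids so far and returns one score per vocabulary entry. The names `generate`, `sample_next`, `eos_id`, and the default temperature are illustrative assumptions, not part of the released code.

```python
import math
import random
from typing import Callable, List, Optional, Sequence


def sample_next(logits: Sequence[float], temperature: float, rng: random.Random) -> int:
    """Pick the next token id from raw logits."""
    if temperature <= 0:
        # Greedy decoding: take the highest-scoring token.
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in scaled]
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


def generate(
    next_token_logits: Callable[[List[int]], Sequence[float]],  # hypothetical model wrapper
    prompt_ids: Sequence[int],
    max_new_tokens: int = 50,
    temperature: float = 0.8,  # assumed default, tune as needed
    eos_id: Optional[int] = None,  # assumed end-of-text id, if the tokenizer defines one
    seed: Optional[int] = None,
) -> List[int]:
    """Autoregressively extend prompt_ids one token at a time."""
    rng = random.Random(seed)
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        next_id = sample_next(next_token_logits(ids), temperature, rng)
        ids.append(next_id)
        if eos_id is not None and next_id == eos_id:
            break
    return ids
```

With the objects from the loading code, the prompt ids would come from `tokenizer.encode(input_text).ids` and the result would be turned back into text with `tokenizer.decode(ids)`. The wrapper that turns `model` into `next_token_logits` depends on what `LanguageModel.forward` accepts and returns, so it has to be written against the actual class.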
Users should be aware of potential biases in generated text and use the model responsibly.
If you use this model, please cite:
@misc{vanilla-rnn-gru-like,
title={Tiny-Stories-GRU-LanguageModel-ByteLevelEncoding},
author={Aditya Wath},
year={2024},
publisher={Hugging Face},
url={https://huggingface.co/aditya-6122/Tiny-Stories-GRU-LanguageModel-ByteLevelEncoding}
}