---
language: en
license: mit
library_name: pytorch
tags:
- text-generation
- gpt
- transformers
- language-model
- alice-in-wonderland
- literature
datasets:
- alice-in-wonderland
metrics:
- perplexity
pipeline_tag: text-generation
---
# 1st Demo GPT Based Architecture Model

## Model Description

This is a **GPT-based transformer language model** trained from scratch on Lewis Carroll's "Alice's Adventures in Wonderland". It demonstrates a custom implementation of the GPT architecture for text generation, trained directly on a single work of classic literature.
|
## Model Details

- **Model Type**: GPT (Generative Pre-trained Transformer)
- **Architecture**: Custom transformer-based language model
- **Training Data**: Alice's Adventures in Wonderland by Lewis Carroll
- **Language**: English
- **Library**: PyTorch
- **Model Size**: ~4.2M parameters (based on `complete_gpt_model.pth`)
|
## Training Details

### Dataset
- **Source**: Alice's Adventures in Wonderland (complete text)
- **Size**: 1,033 lines of text
- **Preprocessing**: Custom character-level or subword tokenization (stored in `tokenizer.pkl`)
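
To make the preprocessing step concrete, here is a minimal character-level tokenizer sketch. This is an illustration only; the actual scheme used by `tokenizer.pkl` may differ (see `Notebook1.ipynb`):

```python
class CharTokenizer:
    """Minimal character-level tokenizer (illustrative sketch only;
    the shipped tokenizer.pkl may use a different scheme)."""

    def __init__(self, text):
        chars = sorted(set(text))                  # fixed character vocabulary
        self.stoi = {ch: i for i, ch in enumerate(chars)}
        self.itos = {i: ch for ch, i in self.stoi.items()}
        self.vocab_size = len(chars)

    def encode(self, s):
        return [self.stoi[ch] for ch in s]         # string -> token ids

    def decode(self, ids):
        return ''.join(self.itos[i] for i in ids)  # token ids -> string

tok = CharTokenizer("alice was beginning to get very tired")
print(tok.decode(tok.encode("alice")))  # round-trips back to "alice"
```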
|
### Training Configuration
- **Epochs**: 3 (checkpoint files available for each epoch)
- **Optimizer**: Likely AdamW (standard for transformer models)
- **Training Files**:
  - `checkpoint_epoch_1.pth` (12.2MB)
  - `checkpoint_epoch_2.pth` (12.2MB)
  - `checkpoint_epoch_3.pth` (12.2MB)
  - `best_model.pth` (4.14MB) - Best performing checkpoint
  - `complete_gpt_model.pth` (4.20MB) - Final trained model
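
The epoch checkpoints are roughly three times the size of the final model, which is consistent with them also storing optimizer state. A minimal save/resume round-trip with a stand-in model is sketched below; the dictionary keys are assumptions, so check `Notebook1.ipynb` for the actual layout:

```python
import torch
import torch.nn as nn

# Stand-in for the GPT model; the real architecture lives in Notebook1.ipynb.
model = nn.Linear(8, 8)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

# Save an epoch checkpoint. The key names here are illustrative assumptions.
torch.save({
    'epoch': 2,
    'model_state_dict': model.state_dict(),
    'optimizer_state_dict': optimizer.state_dict(),
}, 'checkpoint_epoch_2.pth')

# Resume later: restore weights and optimizer state, continue at the next epoch.
ckpt = torch.load('checkpoint_epoch_2.pth', map_location='cpu')
model.load_state_dict(ckpt['model_state_dict'])
optimizer.load_state_dict(ckpt['optimizer_state_dict'])
start_epoch = ckpt['epoch'] + 1
print(start_epoch)  # 3
```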
|
## Files in this Repository

| File | Size | Description |
|------|------|-------------|
| `complete_gpt_model.pth` | 4.20MB | Final trained model weights |
| `best_model.pth` | 4.14MB | Best performing model checkpoint |
| `checkpoint_epoch_1.pth` | 12.2MB | Training checkpoint after epoch 1 |
| `checkpoint_epoch_2.pth` | 12.2MB | Training checkpoint after epoch 2 |
| `checkpoint_epoch_3.pth` | 12.2MB | Training checkpoint after epoch 3 |
| `tokenizer.pkl` | 37.3KB | Custom tokenizer for the model |
| `dataset.txt` | 51KB | Training dataset (Alice in Wonderland) |
| `Notebook1.ipynb` | 4.1MB | Training notebook with implementation |
|
## Usage

### Loading the Model

```python
import torch
import pickle

# Load the tokenizer. pickle stores the object by reference, so the
# tokenizer's class definition must be importable when unpickling.
with open('tokenizer.pkl', 'rb') as f:
    tokenizer = pickle.load(f)

# Load the model. The file stores the full model object, so the model class
# from Notebook1.ipynb must be defined in scope; on PyTorch >= 2.6 pass
# weights_only=False so torch.load is allowed to unpickle arbitrary objects.
model = torch.load('complete_gpt_model.pth', map_location='cpu', weights_only=False)
model.eval()
```
|
### Text Generation

```python
def generate_text(model, tokenizer, prompt, max_length=100, temperature=1.0):
    """Autoregressively sample up to max_length new tokens.

    Assumes model(input_ids) returns logits of shape
    (batch, seq_len, vocab_size); adapt the forward call if your
    implementation differs.
    """
    model.eval()
    input_ids = torch.tensor([tokenizer.encode(prompt)], dtype=torch.long)
    with torch.no_grad():
        for _ in range(max_length):
            logits = model(input_ids)
            next_logits = logits[:, -1, :] / temperature   # last position only
            probs = torch.softmax(next_logits, dim=-1)
            next_id = torch.multinomial(probs, num_samples=1)
            input_ids = torch.cat([input_ids, next_id], dim=1)
    return tokenizer.decode(input_ids[0].tolist())

# Example usage
prompt = "Alice was beginning to get very tired"
generated = generate_text(model, tokenizer, prompt)
print(generated)
```
|
## Model Performance

The model was trained for 3 epochs on the Alice in Wonderland dataset. Performance metrics and loss curves can be found in the training notebook (`Notebook1.ipynb`).
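
The metric listed for this model is perplexity, which is simply the exponential of the mean cross-entropy loss (in nats). The loss values below are made-up placeholders, not results from this model:

```python
import math

# Perplexity = exp(mean cross-entropy loss in nats).
# These per-batch losses are illustrative placeholders only.
losses = [2.1, 1.9, 2.0]
perplexity = math.exp(sum(losses) / len(losses))
print(round(perplexity, 2))  # exp(2.0) ~= 7.39
```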
|
### Expected Outputs
Given the training on Alice in Wonderland, the model should generate text in a style similar to Lewis Carroll's writing, with:
- Victorian-era English vocabulary and sentence structure
- Whimsical and fantastical content
- Character references from the original story
- Descriptive and narrative prose style
|
## Training Process

The training was conducted using:
1. **Data Preprocessing**: Text cleaning and tokenization
2. **Model Architecture**: Custom GPT implementation
3. **Training Loop**: 3 epochs with checkpoint saving
4. **Validation**: Best model selection based on validation metrics
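
Steps 3 and 4 can be sketched as the loop below. `train_one_epoch` and `evaluate` are stand-in stubs that return canned losses; the real implementations are in `Notebook1.ipynb`:

```python
def train_one_epoch(epoch):
    # Stub: pretend training loss improves each epoch.
    return [3.2, 2.4, 1.9][epoch - 1]

def evaluate(epoch):
    # Stub: validation loss is best at epoch 2 in this made-up run.
    return [3.0, 2.2, 2.5][epoch - 1]

saved = []
best_val_loss = float('inf')
best_epoch = None
for epoch in range(1, 4):                          # 3 epochs
    train_one_epoch(epoch)
    val_loss = evaluate(epoch)
    saved.append(f'checkpoint_epoch_{epoch}.pth')  # per-epoch checkpoint
    if val_loss < best_val_loss:                   # best-model selection
        best_val_loss, best_epoch = val_loss, epoch

print(best_epoch, saved[-1])  # 2 checkpoint_epoch_3.pth
```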
|
## Limitations

- **Dataset Size**: Trained on a single book, limiting vocabulary and style diversity
- **Domain Specificity**: Optimized for Lewis Carroll's writing style
- **Scale**: Relatively small model compared to modern large language models
- **Context Length**: Limited context window typical of smaller transformer models
|
## Ethical Considerations

- This model is trained on public domain literature (Alice in Wonderland)
- The training data is from 1865 and may contain outdated language or concepts
- The model is intended for educational and demonstration purposes
|
## Citation

If you use this model, please cite:

```bibtex
@misc{karthik2024alice_gpt,
  title={1st Demo GPT Based Architecture Model},
  author={Karthik},
  year={2024},
  howpublished={Hugging Face Model Hub},
  url={https://huggingface.co/karthik-2905/1st_Demo_GPT_Based_Architecture_Model}
}
```
|
## License

This model is released under the MIT License. The training data (Alice's Adventures in Wonderland) is in the public domain.
|
## Contact

For questions or issues, please open an issue in this repository or contact the model author.
|
---

*This model was created as a learning exercise to demonstrate GPT architecture implementation and training on classic literature.*