Spaces:
Sleeping
Sleeping
Upload 7 files
Browse files

- README.md +50 -12
- app.py +24 -0
- models/best_model.pt +3 -0
- requirements.txt +4 -0
- scripts/gpt2_train_per.py +1 -0
- src/inference.py +21 -0
- src/utils.py +41 -0
README.md
CHANGED
|
@@ -1,12 +1,50 @@
|
|
| 1 |
-
-
|
| 2 |
-
|
| 3 |
-
|
| 4 |
-
|
| 5 |
-
|
| 6 |
-
|
| 7 |
-
|
| 8 |
-
|
| 9 |
-
|
| 10 |
-
-
|
| 11 |
-
|
| 12 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
# GPT-2 Spaces Application
|
| 2 |
+
|
| 3 |
+
This project is a Hugging Face Spaces application that utilizes a trained GPT-2 model to generate text based on user input. The application provides an interactive interface for users to input prompts and receive generated outputs.
|
| 4 |
+
|
| 5 |
+
## Project Structure
|
| 6 |
+
|
| 7 |
+
- `app.py`: Main entry point for the application, setting up the Flask API server.
|
| 8 |
+
- `requirements.txt`: Lists the dependencies required for the project.
|
| 9 |
+
- `README.md`: Documentation for the project, including setup and usage instructions.
|
| 10 |
+
- `.gitignore`: Specifies files and directories to be ignored by Git.
|
| 11 |
+
- `models/best_model.pt`: Contains the weights of the best model saved during training.
|
| 12 |
+
- `src/inference.py`: Logic for loading the trained model and generating text.
|
| 13 |
+
- `src/utils.py`: Utility functions for tasks such as tokenization and decoding.
|
| 14 |
+
- `scripts/gpt2_train_per.py`: Training script for the GPT model.
|
| 15 |
+
|
| 16 |
+
## Setup Instructions
|
| 17 |
+
|
| 18 |
+
1. **Clone the Repository**
|
| 19 |
+
```bash
|
| 20 |
+
git clone <repository-url>
|
| 21 |
+
cd gpt2-spaces-app
|
| 22 |
+
```
|
| 23 |
+
|
| 24 |
+
2. **Install Dependencies**
|
| 25 |
+
Make sure you have Python installed, then install the required packages:
|
| 26 |
+
```bash
|
| 27 |
+
pip install -r requirements.txt
|
| 28 |
+
```
|
| 29 |
+
|
| 30 |
+
3. **Run the Application**
|
| 31 |
+
Start the application using:
|
| 32 |
+
```bash
|
| 33 |
+
python app.py
|
| 34 |
+
```
|
| 35 |
+
|
| 36 |
+
4. **Access the Application**
|
| 37 |
+
Open your web browser and go to `http://localhost:7860` to interact with the model.
|
| 38 |
+
|
| 39 |
+
## Usage
|
| 40 |
+
|
| 41 |
+
- Send a POST request to `/generate` with a JSON body such as `{"input_text": "your prompt"}`.
|
| 42 |
+
- The model will generate text based on the provided prompt and return it in the JSON response under `generated_text`.
|
| 43 |
+
|
| 44 |
+
## Model Training
|
| 45 |
+
|
| 46 |
+
The model can be retrained using the script located in `scripts/gpt2_train_per.py`. Ensure you have the necessary data and adjust the training parameters as needed.
|
| 47 |
+
|
| 48 |
+
## License
|
| 49 |
+
|
| 50 |
+
This project is licensed under the MIT License. See the LICENSE file for more details.
|
app.py
ADDED
|
@@ -0,0 +1,24 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
"""Flask API for text generation with a fine-tuned GPT-2 model.

Exposes POST /generate, which accepts JSON {"input_text": "..."} and
returns JSON {"generated_text": "..."}.
"""
from flask import Flask, request, jsonify
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

app = Flask(__name__)

# Load the base GPT-2 architecture, then apply the fine-tuned weights.
# NOTE: the previous `from_pretrained('./models/best_model.pt')` was wrong —
# that API expects a model id or a checkpoint *directory*, not a .pt file.
# Assumes best_model.pt holds a GPT2LMHeadModel state dict — TODO confirm
# against the training script (scripts/gpt2_train_per.py is currently empty).
model = GPT2LMHeadModel.from_pretrained('gpt2')
model.load_state_dict(torch.load('./models/best_model.pt', map_location='cpu'))
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model.eval()  # inference mode: disables dropout

@app.route('/generate', methods=['POST'])
def generate_text():
    """Generate up to 30 tokens of text continuing the posted prompt.

    Returns:
        200 with {"generated_text": str} on success,
        400 if the request carries no non-empty "input_text".
    """
    payload = request.get_json(silent=True) or {}
    input_text = payload.get('input_text', '')
    if not input_text:
        # An empty prompt would produce an empty input tensor and crash
        # inside model.generate(); reject it explicitly instead.
        return jsonify({'error': 'input_text must be a non-empty string'}), 400

    inputs = tokenizer.encode(input_text, return_tensors='pt')

    with torch.no_grad():  # generation only — no gradient bookkeeping needed
        outputs = model.generate(inputs, max_length=30, num_return_sequences=1)

    generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return jsonify({'generated_text': generated_text})

if __name__ == '__main__':
    # 0.0.0.0:7860 matches the README and the Hugging Face Spaces convention.
    # Debug mode removed: the Werkzeug debugger must never run in deployment.
    app.run(host='0.0.0.0', port=7860)
|
models/best_model.pt
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:c30cc8d0758ccf4154a7857ae971917f379a2b781a4149c88c3b2d1bc654a452
|
| 3 |
+
size 40
|
requirements.txt
ADDED
|
@@ -0,0 +1,4 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
torch==1.13.0
|
| 2 |
+
transformers==4.21.1
|
| 3 |
+
gradio==3.1.4
|
| 4 |
+
tiktoken==0.2.0
|
scripts/gpt2_train_per.py
ADDED
|
@@ -0,0 +1 @@
|
|
|
|
|
|
|
| 1 |
+
# This file is intentionally left blank.
|
src/inference.py
ADDED
|
@@ -0,0 +1,21 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
from transformers import GPT2LMHeadModel, GPT2Tokenizer
|
| 2 |
+
import torch
|
| 3 |
+
|
| 4 |
+
class GPT2Inference:
    """Wraps a fine-tuned GPT-2 model and tokenizer for text generation."""

    def __init__(self, model_path):
        """Load the tokenizer and model weights.

        Args:
            model_path: Either a model id / checkpoint directory understood
                by ``from_pretrained``, or a path to a ``.pt`` state-dict
                file (e.g. ``models/best_model.pt``).
        """
        self.tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
        if model_path.endswith(".pt"):
            # from_pretrained cannot read a raw .pt checkpoint; load the
            # base architecture and apply the fine-tuned weights on top.
            # Assumes the file holds a GPT2LMHeadModel state dict — TODO
            # confirm against the training script.
            self.model = GPT2LMHeadModel.from_pretrained("gpt2")
            self.model.load_state_dict(
                torch.load(model_path, map_location="cpu"))
        else:
            self.model = GPT2LMHeadModel.from_pretrained(model_path)
        self.model.eval()  # inference mode: disables dropout

    def generate_text(self, prompt, max_length=30, num_return_sequences=1):
        """Return a list of generated continuations of ``prompt``.

        Args:
            prompt: Text to continue.
            max_length: Total sequence length cap (prompt included).
            num_return_sequences: Number of continuations to return.
        """
        input_ids = self.tokenizer.encode(prompt, return_tensors='pt')
        with torch.no_grad():  # generation only — no gradients needed
            outputs = self.model.generate(
                input_ids,
                max_length=max_length,
                num_return_sequences=num_return_sequences,
            )
        return [self.tokenizer.decode(output, skip_special_tokens=True)
                for output in outputs]
| 15 |
+
|
| 16 |
+
def load_model(model_path='models/best_model.pt'):
    """Build a :class:`GPT2Inference` wrapper around a checkpoint.

    Args:
        model_path: Checkpoint path; defaults to the repository's best
            model so existing zero-argument callers keep working.

    Returns:
        A ready-to-use ``GPT2Inference`` instance.
    """
    return GPT2Inference(model_path)

# Loaded eagerly at import time so callers can use ``inference_model``
# directly. NOTE(review): this makes importing src.inference expensive and
# fails outright if the checkpoint file is missing — consider lazy loading.
inference_model = load_model()
|
src/utils.py
ADDED
|
@@ -0,0 +1,41 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
import torch
|
| 2 |
+
import tiktoken
|
| 3 |
+
|
| 4 |
+
def load_model(model_path: str):
    """Load the trained model from the specified path."""
    # NOTE(review): these imports look broken — src/inference.py defines
    # ``GPT2Inference`` (no ``GPT``), and ``GPTConfig`` is not defined in
    # src/utils.py (this very module). Calling this function will raise
    # ImportError as written; confirm where GPT/GPTConfig actually live
    # (presumably the training script, which is currently empty).
    from src.inference import GPT
    from src.utils import GPTConfig

    # Build the model skeleton from its config, then restore the weights.
    config = GPTConfig()
    model = GPT(config)
    model.load_state_dict(torch.load(model_path))
    model.eval()  # inference mode: disables dropout
    return model
|
| 14 |
+
|
| 15 |
+
def tokenize_input(text):
    """Tokenize the input text using the GPT-2 tokenizer."""
    encoding = tiktoken.get_encoding('gpt2')
    token_ids = encoding.encode(text)
    # Wrap as a (1, seq_len) tensor — downstream code expects a batch axis.
    return torch.tensor(token_ids).unsqueeze(0)
|
| 20 |
+
|
| 21 |
+
def decode_output(tokens):
    """Decode the generated tokens back to text."""
    token_ids = tokens.tolist()
    encoding = tiktoken.get_encoding('gpt2')
    return encoding.decode(token_ids)
|
| 25 |
+
|
| 26 |
+
def generate_text(model, input_text, max_length=30):
    """Generate text using the trained model based on the input text."""
    tokens = tokenize_input(input_text)

    # Autoregressive sampling: append one token per step (sampled from the
    # top-50 next-token probabilities) until the sequence — prompt included —
    # reaches max_length tokens.
    remaining = max_length - tokens.size(1)
    for _ in range(remaining):
        with torch.no_grad():
            # assumes the model returns a tuple whose first element is the
            # (batch, seq, vocab) logits — TODO confirm model output shape
            step_logits = model(tokens)[0][:, -1, :]
            probs = torch.softmax(step_logits, dim=-1)
            top_probs, top_ids = torch.topk(probs, 50, dim=-1)
            pick = torch.multinomial(top_probs, 1)
            next_token = torch.gather(top_ids, -1, pick)
            tokens = torch.cat((tokens, next_token), dim=1)

    # Decode the first (and only) sequence in the batch.
    return decode_output(tokens[0])
|