Coobyk/EpsteinGPT


	# EpsteinGPT - Minimal GPT Model

	This repository contains a minimal GPT model (abbreviated MVT in the code, as in `MVTConfig`) trained on the Epstein email threads dataset.

	## Model Details

	`MinimalGPT` is a custom causal (decoder-only) Transformer inspired by the nanoGPT and minGPT architectures. It was trained from scratch with a custom Byte-Pair Encoding (BPE) tokenizer.

	### Configuration (`config.json`)
	```json
	{
	"vocab_size": 5000,
	"block_size": 256,
	"n_layer": 8,
	"n_head": 8,
	"n_embd": 512,
	"batch_size": 16,
	"dropout": 0.1,
	"bias": false
	}
	```
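
As a rough sanity check on these values, the sketch below estimates the parameter count the configuration implies. It assumes a nanoGPT-style layout (learned position embeddings, a fused QKV projection, a 4x MLP expansion, two LayerNorms per block, and an output head tied to the token embedding). None of that is confirmed by this README, so treat the result as an estimate; `model.py` is the authority. `estimate_parameters` is a hypothetical helper, not part of the repository.

```python
import json

# Values copied from the config.json shown above.
CONFIG = json.loads("""
{
  "vocab_size": 5000,
  "block_size": 256,
  "n_layer": 8,
  "n_head": 8,
  "n_embd": 512,
  "batch_size": 16,
  "dropout": 0.1,
  "bias": false
}
""")


def estimate_parameters(cfg: dict) -> int:
    """Estimate trainable parameters for an assumed nanoGPT-style layout."""
    d = cfg["n_embd"]
    use_bias = cfg["bias"]

    def linear(n_in: int, n_out: int) -> int:
        # Weight matrix plus an optional bias vector.
        return n_in * n_out + (n_out if use_bias else 0)

    def layer_norm() -> int:
        # Scale vector plus an optional bias vector.
        return d * (2 if use_bias else 1)

    embeddings = cfg["vocab_size"] * d + cfg["block_size"] * d
    attention = linear(d, 3 * d) + linear(d, d)    # fused QKV + output projection
    mlp = linear(d, 4 * d) + linear(4 * d, d)      # assumed 4x hidden expansion
    block = attention + mlp + 2 * layer_norm()
    # The output head is assumed to share weights with the token embedding,
    # so it contributes no extra parameters here.
    return embeddings + cfg["n_layer"] * block + layer_norm()


head_dim = CONFIG["n_embd"] // CONFIG["n_head"]  # width of each attention head
total = estimate_parameters(CONFIG)
print(f"head_dim={head_dim}, parameters={total:,}")
```

Under these assumptions each attention head is 64 dimensions wide and the model has roughly 27.9 million parameters. `batch_size` and `dropout` affect training only and do not change the count.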

	## Files Included

	* `epsteingpt_tokenizer.json`: The custom BPE tokenizer used for encoding and decoding text.
	* `EpsteinGPT.pt`: The PyTorch checkpoint containing the trained model weights.
	* `EpsteinGPT.ptl`: A TorchScript export of the trained model in lite-interpreter format, intended for on-device deployment.
	* `model.py`: Defines the `MVTConfig` class and the `MinimalGPT` model architecture.
	* `config.json`: Model configuration in JSON format.
	* `README.md`: This file.

	## How to Use

	To use this model, you would typically:

	1. Load the tokenizer:
	```python
	from tokenizers import Tokenizer
	tokenizer = Tokenizer.from_file("epsteingpt_tokenizer.json")
	```
	2. Build the model: read `config.json` into an `MVTConfig` and instantiate `MinimalGPT` (both defined in `model.py`).
	3. Load the trained weights from `EpsteinGPT.pt` into the model.
	4. Run the model to generate text.
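
Steps 2 and 3 could look roughly like the sketch below. The README does not document the constructor signatures or the checkpoint layout, so several things here are assumptions: that `MVTConfig` accepts the `config.json` fields as keyword arguments, that `MinimalGPT` takes the config object, and that `EpsteinGPT.pt` holds either a bare `state_dict` or one stored under a `"model"` key. `validate_config` and `load_model` are hypothetical helper names. Check `model.py` before relying on any of this.

```python
import json

REQUIRED_KEYS = ("vocab_size", "block_size", "n_layer", "n_head", "n_embd")


def validate_config(cfg: dict) -> dict:
    """Check that the fields needed to build the model are present and consistent."""
    missing = [key for key in REQUIRED_KEYS if key not in cfg]
    if missing:
        raise ValueError(f"config is missing keys: {missing}")
    if cfg["n_embd"] % cfg["n_head"] != 0:
        raise ValueError("n_embd must be divisible by n_head")
    return cfg


def load_model(config_path: str = "config.json", weights_path: str = "EpsteinGPT.pt"):
    """Build MinimalGPT and load its weights (signatures below are assumed)."""
    # Imported lazily so validate_config can be used without PyTorch installed.
    import torch
    from model import MVTConfig, MinimalGPT  # model.py from this repository

    with open(config_path, encoding="utf-8") as f:
        cfg = validate_config(json.load(f))

    # Assumption: config.json fields map one-to-one onto MVTConfig arguments.
    model = MinimalGPT(MVTConfig(**cfg))

    checkpoint = torch.load(weights_path, map_location="cpu")
    # Assumption: a bare state_dict, or a dict that wraps it under "model".
    state_dict = checkpoint["model"] if "model" in checkpoint else checkpoint
    model.load_state_dict(state_dict)
    model.eval()  # switch off dropout for inference
    return model
```

If `load_state_dict` reports unexpected or missing keys, the checkpoint layout differs from what is assumed here, and printing the checkpoint's keys is the quickest way to see how it is organized.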

	For generation, you can refer to the `generate.py` script used during development.
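
Since `generate.py` is not reproduced here, the following is a generic sketch of autoregressive sampling with temperature and top-k filtering, not the author's script. To stay framework-independent it treats the model as a callable `next_logits` (a hypothetical name) that maps a list of token IDs to one logit per vocabulary entry. With the real model, that callable would wrap a forward pass of `MinimalGPT` and return the logits for the last position.

```python
import math
import random
from typing import Callable, List, Optional, Sequence


def sample_next(logits: Sequence[float], temperature: float = 1.0,
                top_k: Optional[int] = None) -> int:
    """Sample one token ID from raw logits with temperature and top-k filtering."""
    # Rank token IDs from most to least likely.
    candidates = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    if top_k is not None:
        candidates = candidates[:top_k]  # keep only the k best tokens
    scaled = [logits[i] / temperature for i in candidates]
    peak = max(scaled)  # subtract the maximum for numerical stability
    weights = [math.exp(value - peak) for value in scaled]
    return random.choices(candidates, weights=weights, k=1)[0]


def generate(next_logits: Callable[[List[int]], Sequence[float]],
             prompt_ids: List[int], max_new_tokens: int,
             block_size: int = 256, temperature: float = 1.0,
             top_k: Optional[int] = None) -> List[int]:
    """Extend prompt_ids one token at a time."""
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        # The model only sees the most recent block_size tokens.
        context = ids[-block_size:]
        ids.append(sample_next(next_logits(context), temperature, top_k))
    return ids
```

With the tokenizer loaded as in step 1, the prompt would come from `tokenizer.encode(text).ids` and the result would be turned back into text with `tokenizer.decode(ids)`. Setting `top_k=1` makes the loop greedy and therefore deterministic.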

	## Training

	The model was trained on a dataset of Epstein email threads. The training process involved:

	1. Tokenizer Training: A BPE tokenizer was trained on the raw text.
	2. Data Preparation: The text was tokenized into sequences of integer token IDs.
	3. Model Training: The `MinimalGPT` model was trained on those sequences using a custom training loop.
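
The first two steps can be sketched as follows. This illustrates the general technique rather than the contents of `train.py`: the whitespace pre-tokenizer, the `[UNK]` special token, and the non-overlapping windows are assumptions, and `train_bpe_tokenizer` and `make_examples` are hypothetical helper names. The tokenizer part uses the Hugging Face `tokenizers` library, the same one used to load `epsteingpt_tokenizer.json`.

```python
from typing import Iterable, List, Tuple


def train_bpe_tokenizer(lines: Iterable[str], vocab_size: int = 5000,
                        out_path: str = "epsteingpt_tokenizer.json"):
    """Train a BPE tokenizer on raw text lines and save it as JSON."""
    # Imported lazily so make_examples can be used without `tokenizers` installed.
    from tokenizers import Tokenizer
    from tokenizers.models import BPE
    from tokenizers.pre_tokenizers import Whitespace
    from tokenizers.trainers import BpeTrainer

    tokenizer = Tokenizer(BPE(unk_token="[UNK]"))
    tokenizer.pre_tokenizer = Whitespace()  # assumption: split on whitespace first
    trainer = BpeTrainer(vocab_size=vocab_size, special_tokens=["[UNK]"])
    tokenizer.train_from_iterator(lines, trainer=trainer)
    tokenizer.save(out_path)
    return tokenizer


def make_examples(ids: List[int], block_size: int) -> List[Tuple[List[int], List[int]]]:
    """Cut a token stream into (input, target) pairs for next-token prediction."""
    examples = []
    # Stop early enough that every target window has block_size tokens.
    for start in range(0, len(ids) - block_size, block_size):
        inputs = ids[start:start + block_size]
        targets = ids[start + 1:start + block_size + 1]  # shifted by one token
        examples.append((inputs, targets))
    return examples
```

Each target sequence is the input shifted left by one position, so the model learns to predict token `t + 1` from tokens up to `t`. With `block_size` set to 256 as in `config.json`, every example covers 256 tokens of context.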

	## Further Information

	For more details on the model architecture and training process, refer to the `model.py` and `train.py` scripts.