Instructions for using erwanf/gpt2-mini with libraries, inference providers, notebooks, and local apps. Follow the links below to get started.
- Libraries
- Transformers
How to use erwanf/gpt2-mini with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="erwanf/gpt2-mini")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("erwanf/gpt2-mini")
model = AutoModelForCausalLM.from_pretrained("erwanf/gpt2-mini")
```
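A quick way to check that the pipeline works is a single generation call. A minimal sketch with the `pipe` defined above (the sampling settings are illustrative, not prescribed by the model card):

```python
# Generate a short continuation with the pipeline defined above.
result = pipe("Hello, I'm a language model,", max_new_tokens=40, do_sample=True)
print(result[0]["generated_text"])
```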
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use erwanf/gpt2-mini with vLLM:
Install from pip and serve the model
```sh
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "erwanf/gpt2-mini"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "erwanf/gpt2-mini",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
    }'
```
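Because the vLLM server exposes an OpenAI-compatible API, it can also be called from Python. A minimal sketch, assuming `pip install openai` and the server above running on port 8000 (for a local server the API key is just a placeholder):

```python
# Query the vLLM server through its OpenAI-compatible completions API.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

completion = client.completions.create(
    model="erwanf/gpt2-mini",
    prompt="Once upon a time,",
    max_tokens=512,
    temperature=0.5,
)
print(completion.choices[0].text)
```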
Use Docker

```sh
docker model run hf.co/erwanf/gpt2-mini
```
- SGLang
How to use erwanf/gpt2-mini with SGLang:
Install from pip and serve the model
```sh
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "erwanf/gpt2-mini" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "erwanf/gpt2-mini",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
    }'
```
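The same completion request can be issued from Python instead of curl. A minimal sketch mirroring the call above, assuming `pip install requests` and a server listening on port 30000:

```python
# The same OpenAI-compatible completion call as the curl above.
import requests

response = requests.post(
    "http://localhost:30000/v1/completions",
    json={
        "model": "erwanf/gpt2-mini",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5,
    },
)
print(response.json()["choices"][0]["text"])
```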
Use Docker images

```sh
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "erwanf/gpt2-mini" \
        --host 0.0.0.0 \
        --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "erwanf/gpt2-mini",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
    }'
```
- Docker Model Runner
How to use erwanf/gpt2-mini with Docker Model Runner:
```sh
docker model run hf.co/erwanf/gpt2-mini
```
# GPT-2 Mini
A smaller GPT-2 model with only 39M parameters. It was pretrained on a subset of OpenWebText, the open-source reproduction of the dataset OpenAI used to pretrain the original GPT-2 models.
## Uses
This model is intended mainly for research and education. Its small size allows for fast experiments in resource-limited settings, while still being able to generate complex and coherent text.
## Getting Started
Use the code below to get started with the model:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model
model = AutoModelForCausalLM.from_pretrained("erwanf/gpt2-mini")
model.eval()

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("erwanf/gpt2-mini")

# Generate text
prompt = "Hello, I'm a language model,"
input_ids = tokenizer.encode(prompt, return_tensors="pt")
output = model.generate(input_ids, do_sample=True, max_length=50, num_return_sequences=5)
output_text = tokenizer.batch_decode(output, skip_special_tokens=True)
print(output_text)
```
Output:
["Hello, I'm a language model, I can't be more efficient in words.\n\nYou can use this as a point to find out the next bit in your system, and learn more about me.\n\nI think a lot of the",
"Hello, I'm a language model, my teacher is a good teacher - a good school teacher – and one thing you have to remember:\n\nIt's not perfect. A school is not perfect; it isn't perfect at all!\n\n",
'Hello, I\'m a language model, but if I can do something for you then go for it (for a word). Here is my blog, the language:\n\nI\'ve not used "normal" in English words, but I\'ve always',
'Hello, I\'m a language model, I\'m talking to you the very first time I used a dictionary and it can be much better than one word in my dictionary. What would an "abnormal" English dictionary have to do with a dictionary and',
'Hello, I\'m a language model, the most powerful representation of words and phrases in the language I\'m using."\n\nThe new rules change that makes it much harder for people to understand a language that does not have a native grammar (even with']
## Training Details
The architecture follows the GPT-2 model, with smaller dimensions and fewer layers, and it uses the same tokenizer as GPT-2. We used the first 2M rows of the OpenWebText dataset, of which 1k rows are held out for the test and validation sets.
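As a rough illustration, that split might be reproduced with the `datasets` library along the following lines. This is a sketch under assumptions: the authors' exact preprocessing, seed, and held-out sizes are not published here, and `Skylion007/openwebtext` is assumed to be the Hub copy of OpenWebText.

```python
# Hypothetical reconstruction of the split: first 2M OpenWebText rows,
# with 1k rows each held out for validation and test (assumed sizes).
from datasets import load_dataset

ds = load_dataset("Skylion007/openwebtext", split="train[:2000000]")
splits = ds.train_test_split(test_size=2000, seed=0)   # hold out 2k rows
held_out = splits["test"].train_test_split(test_size=1000, seed=0)
train_ds = splits["train"]
val_ds, test_ds = held_out["train"], held_out["test"]
```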
### Hyperparameters
| Hyperparameter | Value |
|---|---|
| **Model Parameters** | |
| Vocabulary Size | 50,257 |
| Context Length | 512 |
| Number of Layers | 4 |
| Hidden Size | 512 |
| Number of Attention Heads | 8 |
| Intermediate Size | 2048 |
| Activation Function | GELU |
| Dropout | None |
| **Training Parameters** | |
| Learning Rate | 5e-4 |
| Batch Size | 256 |
| Optimizer | AdamW |
| beta1 | 0.9 |
| beta2 | 0.98 |
| Weight Decay | 0.1 |
| Training Steps | 100,000 |
| Warmup Steps | 4,000 |
| Learning Rate Scheduler | Cosine |
| Training Dataset Size | 1M samples |
| Validation Dataset Size | 1k samples |
| Float Type | bf16 |
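For reference, the model dimensions above map onto a Transformers `GPT2Config` roughly as follows (a sketch; the checkpoint's shipped `config.json` is authoritative):

```python
# Approximate GPT2Config matching the hyperparameter table.
from transformers import GPT2Config

config = GPT2Config(
    vocab_size=50257,
    n_positions=512,            # context length
    n_embd=512,                 # hidden size
    n_layer=4,
    n_head=8,
    n_inner=2048,               # intermediate size
    activation_function="gelu",
    resid_pdrop=0.0,            # dropout disabled
    embd_pdrop=0.0,
    attn_pdrop=0.0,
)
```

These shapes are consistent with the stated 39M parameters: the token embedding alone contributes 50,257 × 512 ≈ 25.7M weights, and the four transformer blocks add roughly 12.6M more.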