You need to agree to share your contact information to access this model

This repository is publicly accessible, but you have to accept the conditions to access its files and content.

PassGPT

PassGPT is a causal language model trained on password leaks. It was first introduced in this paper. This version of the model was trained on passwords from the RockYou leak, keeping only those that were at most 16 characters long. A version of PassGPT trained on passwords up to 10 characters long is also available here, without access restrictions.

This is a curated version of the model reported in the paper: the vocabulary was reduced to the most meaningful characters and training was slightly optimized. Results are slightly better with these changes.

Usage and License Notices

PassGPT is intended and licensed for research use only. The model and code are released under CC BY-NC 4.0 (non-commercial use only) and should not be used outside of research purposes. This model should never be used to attack real systems. Access will be granted upon request; please indicate the details and scope of your project.

Model description

The model inherits the GPT2LMHeadModel architecture and implements a custom BertTokenizer that encodes each character in a password as a single token, avoiding merges. It was trained from a random initialization, and the code for training can be found in the official repository.
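To illustrate what "each character as a single token" means, here is a minimal toy sketch of character-level encoding. It is not the model's actual tokenizer: the vocabulary, the token ids, and the helper names `encode` and `decode` are hypothetical. The length of 18 mirrors the `max_len=18` used in the generation example, which presumably leaves room for 16 characters plus the start and end tokens.

```python
import string

# Hypothetical vocabulary: special tokens first, then printable ASCII characters.
# The real PassGPT vocabulary and ids may differ.
SPECIAL_TOKENS = ["<s>", "</s>", "<pad>", "<unk>"]
CHARS = list(string.ascii_letters + string.digits + string.punctuation)
VOCAB = {tok: i for i, tok in enumerate(SPECIAL_TOKENS + CHARS)}
ID_TO_TOKEN = {i: tok for tok, i in VOCAB.items()}


def encode(password, max_len=18):
    """Map each character to one id, wrap with <s> and </s>, pad to max_len."""
    ids = [VOCAB["<s>"]]
    ids += [VOCAB.get(ch, VOCAB["<unk>"]) for ch in password]
    ids.append(VOCAB["</s>"])
    ids = ids[:max_len]  # truncate on the right if too long
    ids += [VOCAB["<pad>"]] * (max_len - len(ids))
    return ids


def decode(ids):
    """Drop the leading <s> and read characters until </s> is reached."""
    chars = []
    for i in ids[1:]:
        tok = ID_TO_TOKEN[i]
        if tok == "</s>":
            break
        chars.append(tok)
    return "".join(chars)


ids = encode("password1")
print(len(ids), decode(ids))
```

Because no merges are applied, the sequence length seen by the model is the password length plus the special tokens.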

Password Generation

Passwords can be sampled from the model using the built-in generation methods provided by Hugging Face, seeding generation with the start-of-password token (<s>). The following code generates one password with PassGPT. Note that you may need to create an access token to authenticate your download.

import torch
from transformers import GPT2LMHeadModel
from transformers import RobertaTokenizerFast

# Note: recent transformers releases deprecate `use_auth_token` in favor of `token`.
tokenizer = RobertaTokenizerFast.from_pretrained("javirandor/passgpt-16characters",
                                                  use_auth_token="YOUR_ACCESS_TOKEN",
                                                  max_len=18,
                                                  padding="max_length", 
                                                  truncation=True,
                                                  do_lower_case=False,
                                                  strip_accents=False,
                                                  mask_token="<mask>",
                                                  unk_token="<unk>",
                                                  pad_token="<pad>",
                                                  truncation_side="right")

model = GPT2LMHeadModel.from_pretrained("javirandor/passgpt-16characters", use_auth_token="YOUR_ACCESS_TOKEN").eval()

NUM_GENERATIONS = 1

# Generate passwords sampling from the beginning of password token
g = model.generate(torch.tensor([[tokenizer.bos_token_id]]),
                  do_sample=True,
                  num_return_sequences=NUM_GENERATIONS,
                  max_length=18,
                  pad_token_id=tokenizer.pad_token_id,
                  bad_words_ids=[[tokenizer.bos_token_id]])

# Remove start of sentence token
g = g[:, 1:]

decoded = tokenizer.batch_decode(g.tolist())
decoded_clean = [i.split("</s>")[0] for i in decoded] # Get content before end of password token

# Print your sampled passwords!
print(decoded_clean)

You can find a more flexible script for sampling here.
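When `NUM_GENERATIONS` is larger than 1, the decoded strings may contain duplicates, empty samples, or leftover padding. The sketch below shows one possible way to post-process them. It works on plain strings so it runs without the model; `clean_samples` and its parameters are hypothetical and not part of the official repository.

```python
def clean_samples(decoded, eos_token="</s>", pad_token="<pad>", max_chars=16):
    """Cut each sample at the end-of-password token, drop padding,
    discard empty or over-long results, and remove duplicates
    while preserving the original order."""
    seen = set()
    cleaned = []
    for text in decoded:
        # Keep only the content before the first end-of-password token.
        password = text.split(eos_token)[0].replace(pad_token, "")
        if not password or len(password) > max_chars:
            continue
        if password not in seen:
            seen.add(password)
            cleaned.append(password)
    return cleaned


# Hand-written strings standing in for tokenizer.batch_decode output.
samples = ["abc123</s><pad><pad>", "abc123</s>", "</s><pad>", "hello!</s>x"]
print(clean_samples(samples))
```

Deduplicating after sampling keeps the generation call itself unchanged; whether to count duplicates or discard them depends on the analysis being run.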

Cite our work

@article{rando2023passgpt,
  title={PassGPT: Password Modeling and (Guided) Generation with Large Language Models},
  author={Rando, Javier and Perez-Cruz, Fernando and Hitaj, Briland},
  journal={arXiv preprint arXiv:2306.01545},
  year={2023}
}

Paper for javirandor/passgpt-16characters

PassGPT: Password Modeling and (Guided) Generation with Large Language Models

Paper • 2306.01545 • Published Jun 2, 2023