Reformer language model operating on character level, trained on enwik8.
enwik8 is a dataset based on Wikipedia and is often used to measure a model's ability to compress data, e.g. in the scope of the Hutter Prize: https://en.wikipedia.org/wiki/Hutter_Prize.
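As a rough illustration (this conversion is not stated in the card itself): results on enwik8 are usually reported in bits per character (bpc), and a per-character cross-entropy loss in nats converts to bpc by dividing by ln 2:

```python
import math

def nats_to_bpc(loss_nats):
    # per-character cross-entropy in nats -> bits per character
    return loss_nats / math.log(2)

# e.g. a per-character loss of 0.76 nats corresponds to roughly 1.1 bpc
print(round(nats_to_bpc(0.76), 2))  # 1.1
```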
reformer-enwik8 was pretrained on the first 90M chars of enwik8, with the text chunked into batches of 65536 chars (= 2^16).
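The 2^16-char chunking described above can be sketched as follows (a minimal illustration, not the original Trax preprocessing code):

```python
def chunk_text(text, chunk_size=2**16):
    # split the training text into fixed-size character chunks,
    # matching the 65536-char (= 2^16) batches described above
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

chunks = chunk_text("a" * 200000)
print([len(c) for c in chunks])  # [65536, 65536, 65536, 3392]
```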
The model's weights were taken from https://console.cloud.google.com/storage/browser/trax-ml/reformer/enwik8 and converted to Hugging Face's PyTorch Reformer implementation, ReformerModelWithLMHead.
The model is a language model that operates on characters. Therefore, this model does not need a tokenizer. The following function can instead be used for encoding and decoding:
```python
import torch

# Encoding
def encode(list_of_strings, pad_token_id=0):
    max_length = max([len(string) for string in list_of_strings])

    # create empty tensors
    attention_masks = torch.zeros((len(list_of_strings), max_length), dtype=torch.long)
    input_ids = torch.full((len(list_of_strings), max_length), pad_token_id, dtype=torch.long)

    for idx, string in enumerate(list_of_strings):
        # make sure string is in byte format
        if not isinstance(string, bytes):
            string = str.encode(string)

        input_ids[idx, :len(string)] = torch.tensor([x + 2 for x in string])
        attention_masks[idx, :len(string)] = 1

    return input_ids, attention_masks

# Decoding
def decode(outputs_ids):
    decoded_outputs = []
    for output_ids in outputs_ids.tolist():
        # transform ids back to chars; ids < 2 are simply transformed to ""
        decoded_outputs.append("".join([chr(x - 2) if x > 1 else "" for x in output_ids]))
    return decoded_outputs
```
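The mapping behind these functions is simple: each UTF-8 byte b becomes id b + 2, with ids 0 and 1 reserved (0 is the padding id). A torch-free sketch of the same mapping, just to make the round trip concrete (these helper names are illustrative, not part of the card):

```python
def char_ids(text):
    # mirror encode(): each UTF-8 byte b becomes id b + 2,
    # reserving ids 0 (padding) and 1
    return [b + 2 for b in text.encode("utf-8")]

def chars_from_ids(ids):
    # mirror decode(): ids < 2 map to "", everything else back to its byte
    return "".join(chr(i - 2) if i > 1 else "" for i in ids)

ids = char_ids("Hi")
# 'H' is byte 72 -> id 74, 'i' is byte 105 -> id 107
print(ids)                  # [74, 107]
print(chars_from_ids(ids))  # Hi
```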
Text can be generated as follows:

```python
from transformers import ReformerModelWithLMHead

model = ReformerModelWithLMHead.from_pretrained("google/reformer-enwik8")
encoded, attention_masks = encode(["In 1965, Brooks left IBM to found the Department of"])
decode(model.generate(encoded, do_sample=True, max_length=150))

# gives:
# In 1965, Brooks left IBM to found the Department of Journalism in 1968. IBM had jurisdiction himself in 1980, while Brooks resolved, nevertheless thro
```
Note: Language generation using ReformerModelWithLMHead is not optimized yet and is rather slow.