Dataset: fn-aka-mur/wiki40b_ja
How to use oga5/hf-jp-gpt-wiki with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-generation", model="oga5/hf-jp-gpt-wiki", trust_remote_code=True)
# Load model directly
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("oga5/hf-jp-gpt-wiki", trust_remote_code=True, dtype="auto")
How to use oga5/hf-jp-gpt-wiki with vLLM:
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "oga5/hf-jp-gpt-wiki"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "oga5/hf-jp-gpt-wiki",
"prompt": "Once upon a time,",
"max_tokens": 512,
"temperature": 0.5
}'
# Or run with Docker Model Runner:
docker model run hf.co/oga5/hf-jp-gpt-wiki
How to use oga5/hf-jp-gpt-wiki with SGLang:
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
--model-path "oga5/hf-jp-gpt-wiki" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "oga5/hf-jp-gpt-wiki",
"prompt": "Once upon a time,",
"max_tokens": 512,
"temperature": 0.5
}'
# Or start the SGLang server with Docker:
docker run --gpus all \
--shm-size 32g \
-p 30000:30000 \
-v ~/.cache/huggingface:/root/.cache/huggingface \
--env "HF_TOKEN=<secret>" \
--ipc=host \
lmsysorg/sglang:latest \
python3 -m sglang.launch_server \
--model-path "oga5/hf-jp-gpt-wiki" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "oga5/hf-jp-gpt-wiki",
"prompt": "Once upon a time,",
"max_tokens": 512,
"temperature": 0.5
}'
How to use oga5/hf-jp-gpt-wiki with Docker Model Runner:
docker model run hf.co/oga5/hf-jp-gpt-wiki
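The curl calls above can also be issued from Python. The following is a minimal sketch using only the standard library; it assumes the server follows the OpenAI completions response format (a `choices` list whose items carry `text`). The helper names `build_completion_request` and `complete` are illustrative and not part of vLLM or SGLang.

```python
import json
import urllib.request

MODEL_ID = "oga5/hf-jp-gpt-wiki"


def build_completion_request(base_url, prompt, max_tokens=512, temperature=0.5):
    """Build (but do not send) a POST request for the /v1/completions endpoint."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(base_url, prompt, **kwargs):
    """Send the request and return the text of the first completion.

    Assumption: the reply is a JSON object in the OpenAI completions format.
    """
    request = build_completion_request(base_url, prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["choices"][0]["text"]
```

With a server running, `complete("http://localhost:8000", "Once upon a time,")` targets the vLLM example; swap in port 30000 for the SGLang server.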
This repository contains a hyper-small Japanese GPT model exported in a Hugging Face-compatible layout, with a vendored backbone and SentencePiece tokenizer.
Tokenizer: SentencePiece (jp_tok_wiki.model, jp_tok_wiki.vocab)
Loading requires trust_remote_code=True
from transformers import AutoModelForCausalLM
import torch
import sentencepiece as spm
# Load model (trust_remote_code is required)
model = AutoModelForCausalLM.from_pretrained(
    "oga5/hf-jp-gpt-wiki",  # or local folder path
    trust_remote_code=True,
)
model.eval()
# Load SentencePiece tokenizer
sp = spm.SentencePieceProcessor(model_file="jp_tok_wiki.model") # if local
# If running from the Hub, download the files and reference their path, or use hf_hub_download
# from huggingface_hub import hf_hub_download
# tok_path = hf_hub_download("oga5/hf-jp-gpt-wiki", filename="jp_tok_wiki.model")
# sp = spm.SentencePieceProcessor(model_file=tok_path)
eos_id = sp.eos_id()
prompt = "こんにちは。最近あった面白いことは、"
input_ids = sp.encode(prompt, out_type=int)
input_ids = torch.tensor([input_ids], dtype=torch.long)
max_new_tokens = 50
ctx = model.config.context_length
with torch.no_grad():
    for _ in range(max_new_tokens):
        # Keep only the last ctx tokens so the input fits the context window
        idx_cond = input_ids[:, -ctx:]
        out = model(input_ids=idx_cond)
        logits = out["logits"] if isinstance(out, dict) else out.logits
        # Greedy decoding: pick the highest-scoring token at the last position
        next_id = torch.argmax(logits[:, -1, :], dim=-1, keepdim=True)
        if next_id.item() == eos_id:
            break
        input_ids = torch.cat([input_ids, next_id], dim=1)
print(sp.decode(input_ids[0].tolist()))
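The generation loop in the sample is plain greedy decoding with a sliding context window. As a framework-free sketch of the same control flow, the function below works on ordinary Python lists. `next_logits` is a hypothetical callable standing in for the model's forward pass, and `toy_next_logits` is a made-up stand-in used only to make the example runnable.

```python
def greedy_decode(next_logits, input_ids, max_new_tokens, context_length, eos_id):
    """Greedy decoding over a plain list of token ids.

    next_logits (hypothetical) takes the truncated list of token ids and
    returns one score per vocabulary entry for the next token.
    """
    ids = list(input_ids)
    for _ in range(max_new_tokens):
        window = ids[-context_length:]  # keep only the last context_length tokens
        logits = next_logits(window)
        next_id = max(range(len(logits)), key=logits.__getitem__)  # argmax
        if next_id == eos_id:  # stop without appending EOS
            break
        ids.append(next_id)
    return ids


def toy_next_logits(window):
    """Toy stand-in for a model: always predicts (last token + 1) modulo 5."""
    target = (window[-1] + 1) % 5
    return [1.0 if i == target else 0.0 for i in range(5)]


print(greedy_decode(toy_next_logits, [0], max_new_tokens=10, context_length=3, eos_id=4))
# prints [0, 1, 2, 3]
```

In the real sample, the role of `next_logits` is played by calling the model and taking the logits at the last position.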
The backbone is vendored (modeling_custom_gpt.py) so it can be loaded from the Hub without external project files.
AutoTokenizer is not provided. You can load SentencePiece directly as shown above.
model(...).
If you encounter OSError: Not found: "jp_tok_wiki.model" when running the sample, make sure you pass an existing file path to SentencePiece. Here are reliable patterns:
Running from a local folder (e.g. sample/sample.py under .../llmtest01/sample/):
import os
import torch
import sentencepiece as spm
from transformers import AutoModelForCausalLM
# Resolve the repo dir relative to this script file
BASE_DIR = os.path.dirname(os.path.abspath(__file__)) # points to sample/
repo_dir = os.path.normpath(os.path.join(BASE_DIR, "..", "hf_jp_gpt_wiki"))
spm_path = os.path.join(repo_dir, "jp_tok_wiki.model")
print("SPM path:", spm_path, "exists?", os.path.exists(spm_path))
model = AutoModelForCausalLM.from_pretrained(repo_dir, trust_remote_code=True)
model.eval()
sp = spm.SentencePieceProcessor(model_file=spm_path)
Loading from the Hub with hf_hub_download:
import torch
import sentencepiece as spm
from transformers import AutoModelForCausalLM
from huggingface_hub import hf_hub_download
repo_id = "oga5/hf-jp-gpt-wiki"
model = AutoModelForCausalLM.from_pretrained(repo_id, trust_remote_code=True)
model.eval()
# Download the tokenizer model and pass the absolute path to SentencePiece
spm_path = hf_hub_download(repo_id=repo_id, filename="jp_tok_wiki.model")
print("Downloaded SPM path:", spm_path)
sp = spm.SentencePieceProcessor(model_file=spm_path)
Tip: print the current working directory and directory listings to verify paths:
import os
print("CWD:", os.getcwd())
print("Here:", os.listdir("."))
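The two patterns above can be combined into one lookup: try local paths first, then fall back to a download. This is a minimal sketch; `resolve_spm_path` and `download_from_hub` are hypothetical helper names introduced here, and only `hf_hub_download` comes from huggingface_hub.

```python
import os


def resolve_spm_path(candidates, download=None):
    """Return the first existing file in candidates, else the result of download().

    download is an optional zero-argument callable (hypothetical hook) that
    returns a path, for example a wrapper around hf_hub_download.
    """
    for path in candidates:
        if os.path.isfile(path):
            return os.path.abspath(path)
    if download is not None:
        return download()
    raise FileNotFoundError(f"Tokenizer model not found in: {list(candidates)}")


def download_from_hub():
    """Fetch the tokenizer model from the Hub and return its local path."""
    # Imported lazily so the local-only path works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download
    return hf_hub_download(repo_id="oga5/hf-jp-gpt-wiki", filename="jp_tok_wiki.model")
```

A call such as `resolve_spm_path(["jp_tok_wiki.model"], download=download_from_hub)` returns an absolute path that can be passed to `spm.SentencePieceProcessor(model_file=...)`.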
@inproceedings{guo-etal-2020-wiki,
title = "{W}iki-40{B}: Multilingual Language Model Dataset",
author = "Guo, Mandy and
Dai, Zihang and
Vrande{\v{c}}i{\'c}, Denny and
Al-Rfou, Rami",
booktitle = "Proceedings of the Twelfth Language Resources and Evaluation Conference",
month = may,
year = "2020",
address = "Marseille, France",
publisher = "European Language Resources Association",
url = "https://aclanthology.org/2020.lrec-1.297",
pages = "2440--2452",
abstract = "We propose a new multilingual language model benchmark that is composed of 40+ languages spanning several scripts and linguistic families. With around 40 billion characters, we hope this new resource will accelerate the research of multilingual modeling. We train monolingual causal language models using a state-of-the-art model (Transformer-XL) establishing baselines for many languages. We also introduce the task of multilingual causal language modeling where we train our model on the combined text of 40+ languages from Wikipedia with different vocabulary sizes and evaluate on the languages individually. We released the cleaned-up text of 40+ Wikipedia language editions, the corresponding trained monolingual language models, and several multilingual language models with different fixed vocabulary sizes.",
language = "English",
ISBN = "979-10-95546-34-4",
}
If you use this model, please consider citing the original book/code and this repository.