ViGPT2 AIO
ViGPT2 AIO is a Vietnamese GPT-2-style causal language model pretrained for open-ended text generation and general Vietnamese language modeling.
This model was developed to support teaching and hands-on practice in the AIO course, while also serving as a general Vietnamese pretrained language model for experimentation and downstream adaptation.
Model Details
- Model name: ViGPT2 AIO
- Architecture: GPT-2
- Task: Causal language modeling / text generation
- Language: Vietnamese
- Library: Transformers
- Weights format: Safetensors
Training Data
The model was pretrained on a mixture of Vietnamese news and Vietnamese Wikipedia text.
Data Sources
- BKAINewsCorpus from bkai-foundation-models/BKAINewsCorpus
- Vietnamese Wikipedia collected through a custom crawling pipeline, then cleaned from raw wikitext into plain text
Data Processing
Before pretraining, the corpora were cleaned and deduplicated.
- The tokenizer was trained on the raw corpora:
  - bkai_train.parquet
  - vi_wiki_articles_clean.parquet
- The language model was pretrained on deduplicated versions of these corpora.
- In the final training mixture, the Vietnamese Wikipedia corpus was upweighted relative to the news corpus.
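The card does not say how deduplication was performed. As a rough illustration only, the sketch below shows one common approach, exact-match deduplication on a hash of whitespace-normalized, lowercased text. The helper names `normalize` and `deduplicate` and the sample sentences are hypothetical and are not taken from the authors' pipeline.

```python
import hashlib


def normalize(text: str) -> str:
    # Collapse runs of whitespace and lowercase, so copies that differ
    # only in spacing or casing produce the same hash.
    return " ".join(text.split()).lower()


def deduplicate(docs):
    # Keep the first occurrence of each normalized document, in order.
    seen = set()
    unique = []
    for doc in docs:
        key = hashlib.sha256(normalize(doc).encode("utf-8")).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(doc)
    return unique


# Hypothetical sample documents; the second differs from the first only in spacing.
docs = ["Hà Nội là thủ đô.", "Hà  Nội là thủ đô.", "Huế là cố đô."]
unique_docs = deduplicate(docs)
print(len(unique_docs))
```

Near-duplicate detection (for example MinHash) would catch more overlap in news text, but whether the authors used anything like it is not stated.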
Training Mixture
The pretraining mixture used:
- bkai_train.parquet with weight 1
- vi_wiki_articles_clean.parquet with weight 3
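The card does not specify how these weights were applied (for example, as per-example sampling ratios or as repetition counts). A minimal sketch, assuming they act as relative sampling weights over the two sources, is shown below; `sampling_probabilities` and `sample_sources` are hypothetical helper names.

```python
import random

# Weights as listed in the card; their exact interpretation is an assumption.
MIXTURE_WEIGHTS = {
    "bkai_train.parquet": 1,
    "vi_wiki_articles_clean.parquet": 3,
}


def sampling_probabilities(weights):
    # Normalize relative weights so they sum to 1.
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def sample_sources(weights, n, seed=0):
    # Draw n source names in proportion to their weights.
    rng = random.Random(seed)
    names = list(weights)
    return rng.choices(names, weights=[weights[k] for k in names], k=n)


probs = sampling_probabilities(MIXTURE_WEIGHTS)
draws = sample_sources(MIXTURE_WEIGHTS, n=1000)
wiki_share = draws.count("vi_wiki_articles_clean.parquet") / len(draws)
print(probs, wiki_share)
```

Under this reading, roughly three of every four sampled examples come from the Wikipedia corpus. If the weights were instead repetition counts, the effective share would also depend on the relative sizes of the two corpora.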
Training Objective
The model was pretrained with a standard causal language modeling objective, where the model learns to predict the next token in a sequence.
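To make the objective concrete, the toy sketch below computes the average negative log-likelihood of each token given its prefix, which is the quantity a causal language model minimizes. The three-token vocabulary and the probabilities are made up for illustration, and `causal_lm_loss` is a hypothetical helper, not the training code used for this model.

```python
import math


def causal_lm_loss(token_ids, next_token_probs):
    # next_token_probs[t] is the predicted distribution over the vocabulary
    # for position t + 1, conditioned on token_ids[: t + 1].
    targets = token_ids[1:]
    nll = [
        -math.log(next_token_probs[t][target])
        for t, target in enumerate(targets)
    ]
    return sum(nll) / len(nll)


# Toy sequence over a vocabulary of size 3 (illustrative values only).
token_ids = [0, 2, 1]
next_token_probs = [
    [0.1, 0.1, 0.8],  # after token 0, the target is token 2
    [0.2, 0.5, 0.3],  # after tokens 0 and 2, the target is token 1
]
loss = causal_lm_loss(token_ids, next_token_probs)
print(loss)
```

In practice, Transformers computes the same loss from logits when `labels` are passed to a causal LM's forward call, shifting the targets by one position internally.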
Limitations
- The model may generate incorrect, nonsensical, or fabricated content.
- Outputs can reflect biases or artifacts present in the pretraining data.
- The model is not a verified factual source and should not be used without human validation in high-stakes settings.
Usage
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

repo_id = "VLAI-AIVN/vigpt2-aio"

# Load the tokenizer; fall back to EOS as the padding token if none is defined.
tokenizer = AutoTokenizer.from_pretrained(repo_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Load the model and keep its padding ID in sync with the tokenizer.
model = AutoModelForCausalLM.from_pretrained(repo_id)
model.config.pad_token_id = tokenizer.pad_token_id

device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device).eval()

prompt = "Việt Nam là một đất nước"
inputs = tokenizer(prompt, return_tensors="pt").to(device)

# Sample a continuation with nucleus sampling and a mild repetition penalty.
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=80,
        do_sample=True,
        temperature=0.8,
        top_p=0.9,
        repetition_penalty=1.1,
        pad_token_id=tokenizer.pad_token_id,
    )

print(tokenizer.decode(outputs[0], skip_special_tokens=True))