---
language:
- zh
- en
library_name: pytorch
tags:
- translation
- transformer
- tiny-llm
- zh-en
pipeline_tag: translation
---

# Tiny-LLM (ZH→EN) Checkpoint

Minimal Transformer encoder–decoder for Chinese → English translation. This repository hosts the inference assets (checkpoint and tokenizer) for use in Python scripts or Gradio apps.

## Files

- `translate-step=290000.ckpt` — PyTorch state_dict checkpoint (Lightning-format state under the `state_dict` key)
- `tokenizer.json` — Hugging Face Tokenizers (BPE) file with special tokens `[UNK]`, `[PAD]`, `[SOS]`, `[EOS]`

## Quick start

Download the files with `huggingface_hub` and wire them into your own model code:

```python
import torch
from huggingface_hub import hf_hub_download
from tokenizers import Tokenizer

# Replace with your repo id if you fork
REPO_ID = "caixiaoshun/tiny-llm-zh2en"

ckpt_path = hf_hub_download(repo_id=REPO_ID, filename="translate-step=290000.ckpt")
tokenizer_path = hf_hub_download(repo_id=REPO_ID, filename="tokenizer.json")

# Example: integrate with a minimal Transformer implementation
# from src.config import Config
# from src.model import TranslateModel
# config = Config()
# config.tokenizer_file = tokenizer_path
# model = TranslateModel(config)
# state = torch.load(ckpt_path, map_location="cpu")["state_dict"]
# # Strip potential Lightning/compile prefixes from state_dict keys
# prefix = "net._orig_mod."
# state = {(k[len(prefix):] if k.startswith(prefix) else k): v for k, v in state.items()}
# model.load_state_dict(state, strict=True)
# model.eval()

tokenizer = Tokenizer.from_file(tokenizer_path)
```

If you deploy on Hugging Face Spaces or ModelScope, set these environment variables so your app fetches its assets from this repo (a loading sketch is appended at the end of this card):

```bash
export HF_REPO_ID=caixiaoshun/tiny-llm-zh2en
export CKPT_FILE=translate-step=290000.ckpt
export TOKENIZER_FILE=tokenizer.json
```

## Notes

- Trained on a Chinese→English parallel dataset (CSV layout with ZH in column 0 and EN in column 1). Make sure the tokenizer and model hyperparameters match your training run.
- Decoding strategies supported in the reference app: greedy, nucleus (top-p), and beam search (a greedy-decoding sketch is appended at the end of this card).

## Intended use

- Educational and demo purposes for small-scale translation tasks.
- Not intended for production-grade translation quality without further training/fine-tuning and evaluation.

## Limitations

- Small model capacity; outputs may be inaccurate or inconsistent on complex inputs.
- Tokenizer and checkpoint must match; a mismatch leads to degraded results or load errors.

## Acknowledgements

- PyTorch for the deep learning framework
- Hugging Face Tokenizers for fast BPE
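## Appendix: consuming the environment variables (sketch)

A minimal sketch of how an app's startup code might read the environment variables from the quick start. The variable names and default values mirror the `bash` snippet above; everything else (the app-side structure) is an assumption, not code shipped in this repo.

```python
import os

from huggingface_hub import hf_hub_download
from tokenizers import Tokenizer

# Fall back to this repo's defaults when the variables are unset.
repo_id = os.environ.get("HF_REPO_ID", "caixiaoshun/tiny-llm-zh2en")
ckpt_file = os.environ.get("CKPT_FILE", "translate-step=290000.ckpt")
tok_file = os.environ.get("TOKENIZER_FILE", "tokenizer.json")

# Download (or reuse the local hub cache) and load the tokenizer.
ckpt_path = hf_hub_download(repo_id=repo_id, filename=ckpt_file)
tokenizer = Tokenizer.from_file(hf_hub_download(repo_id=repo_id, filename=tok_file))
```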
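## Appendix: greedy decoding (sketch)

The reference app's decoding code is not included in this repo, so the following only sketches the greedy strategy listed in the notes. It assumes a hypothetical forward call `model(src_ids, tgt_ids)` returning logits of shape `(batch, tgt_len, vocab_size)`; adapt it to your `TranslateModel`'s actual signature.

```python
import torch
from tokenizers import Tokenizer

@torch.no_grad()
def greedy_translate(model, tokenizer: Tokenizer, text: str, max_len: int = 128) -> str:
    sos_id = tokenizer.token_to_id("[SOS]")
    eos_id = tokenizer.token_to_id("[EOS]")

    src_ids = torch.tensor([tokenizer.encode(text).ids])  # (1, src_len)
    tgt_ids = torch.tensor([[sos_id]])                    # decoder starts from [SOS]

    for _ in range(max_len):
        logits = model(src_ids, tgt_ids)        # assumed shape: (1, tgt_len, vocab_size)
        next_id = int(logits[0, -1].argmax())   # greedy: pick the most likely next token
        if next_id == eos_id:
            break
        tgt_ids = torch.cat([tgt_ids, torch.tensor([[next_id]])], dim=1)

    # Drop the leading [SOS] and detokenize the generated ids.
    return tokenizer.decode(tgt_ids[0, 1:].tolist())

# Usage (with `model` and `tokenizer` loaded as in the quick start):
# print(greedy_translate(model, tokenizer, "你好，世界"))
```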