---
license: gemma
pipeline_tag: text-generation
language:
- en
- km
tags:
- customs
- hs-code
- classification
- cambodia
- gemma
- unsloth
- qlora
base_model:
- unsloth/gemma-4-E4B-it
---

# Gemma‑4 HS Code Classifier (Cambodia Customs)

A **Gemma‑4‑E4B‑it** model fine‑tuned with QLoRA to classify product descriptions into **8‑digit HS codes** and return corresponding Cambodian trade rates (Customs Duty, Special Tax, VAT, Excise Tax).

Built with **[Unsloth](https://github.com/unslothai/unsloth)** for fast, memory‑efficient fine‑tuning on a single T4 GPU.

---

## 🎯 What it does

Given a plain‑English product description, the model generates:

```text
HS Code: 61091000
Unit: PIECE
Customs Duty: 25%
Special Tax: 0%
VAT: 10%
Excise Tax: 0%
```

**⚠️ Important**: The rates in the text are generated by the model and **may be wrong**.
For production, always use the included **lookup table** (`hs_code_lookup.json`) – see [Production use](#-production-use) below.
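
Before looking a predicted code up, it can help to check that it really is an 8‑digit code and to see which chapter and heading it falls under. The sketch below assumes the usual Harmonized System layout (first 2 digits = chapter, first 4 = heading, first 6 = subheading, last 2 = national or regional extension). The helper name `split_hs_code` is hypothetical and is not part of this repository.

```python
import re


def split_hs_code(code: str) -> dict:
    """Split an 8-digit HS code into its hierarchical parts (illustrative helper)."""
    # Tolerate dotted or spaced spellings such as "6109.10.00"
    digits = re.sub(r"[.\s]", "", code)
    if not re.fullmatch(r"\d{8}", digits):
        raise ValueError(f"expected an 8-digit HS code, got {code!r}")
    return {
        "chapter": digits[:2],      # 2-digit chapter
        "heading": digits[:4],      # 4-digit heading
        "subheading": digits[:6],   # 6-digit international subheading
        "tariff_line": digits,      # full 8-digit code
    }


print(split_hs_code("61091000"))
```

A code that fails this check is best treated as "no prediction" rather than passed on to the rate lookup.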

---

## 🚀 Quick start (in Colab or locally)

This repository contains **only the LoRA adapter**, not the full model.
Loading it will automatically download the base model (`unsloth/gemma-4-E4B-it`) and apply the adapter in 4-bit.

```python
%%capture
# Cell 1: install everything needed for the T4 Colab environment.
# Run this as its own notebook cell: %%capture must be the first line and hides pip output.
!pip install sentencepiece protobuf "datasets==4.3.0" "huggingface_hub>=0.34.0" hf_transfer
!pip install --no-deps unsloth_zoo bitsandbytes accelerate xformers peft trl triton unsloth
!pip install --no-deps --upgrade "torchao>=0.16.0"
!pip install --no-deps transformers==5.5.0 "tokenizers>=0.22.0,<=0.23.0"
!pip install torchcodec

# Cell 2: load the adapter and run inference.
# Keep this in a separate cell so that its printed output is not captured.
import json
import re
import warnings

import torch
from huggingface_hub import hf_hub_download

torch._dynamo.config.recompile_limit = 64

# Suppress the specific PyTorch size check warning from bitsandbytes
warnings.filterwarnings(
    "ignore",
    category=FutureWarning,
    message=".*_check_is_size will be removed in a future PyTorch release.*"
)

from unsloth import FastModel

model, tokenizer = FastModel.from_pretrained(
    "Sothay/gemma4-hscode-classifier",  # LoRA adapter on Hugging Face
    load_in_4bit = True,                # required – the adapter was trained in 4-bit
    max_seq_length = 1024,
)

# ---------- Inference with the authoritative lookup table (recommended) ----------
# Download the lookup table from this repository, then load it
lookup_path = hf_hub_download("Sothay/gemma4-hscode-classifier", "hs_code_lookup.json")
with open(lookup_path, encoding="utf-8") as f:
    rate_lookup = json.load(f)

def predict_hs_code(description: str) -> dict:
    system_prompt = (
        "You are a customs compliance AI. Classify the product description to its "
        "correct 8-digit HS code and output the corresponding trade rates (Customs Duty, "
        "Special Tax, VAT, Excise Tax) and unit."
    )
    messages = [
        {"role": "system", "content": [{"type": "text", "text": system_prompt}]},
        {"role": "user", "content": [{"type": "text", "text": f"Description: {description}"}]},
    ]
    # Tokenize explicitly and return a dict so the call matches the raw-output function below
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, tokenize=True,
        return_dict=True, return_tensors="pt",
    ).to("cuda")
    out = model.generate(**inputs, max_new_tokens=80, do_sample=False)
    # Decode only the newly generated tokens, not the prompt
    text = tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
    m = re.search(r"HS Code:\s*([0-9]{4,10})", text)
    code = m.group(1) if m else None
    # Rates come from the lookup table whenever the predicted code is known
    if code and code in rate_lookup:
        return {"hs_code": code, "source": "lookup_table", **rate_lookup[code]}
    return {"hs_code": code, "source": "model_only_UNVERIFIED", "raw_output": text}

print(predict_hs_code("Men's cotton knitted T-shirt"))
```

---

## 🔍 Raw model output (debugging)

If you want to see exactly what the model generated (including the rates it predicted) without the lookup table, use the raw‑output function below.
**Do not** use these rates in production – they are only for debugging or confidence evaluation.

```python
import re

def predict_hs_code_raw(description: str, max_new_tokens=100) -> dict:
    system_prompt = (
        "You are a customs compliance AI. Classify the product description to its "
        "correct 8-digit HS code and output the corresponding trade rates (Customs Duty, "
        "Special Tax, VAT, Excise Tax) and unit."
    )
    messages = [
        {"role": "system", "content": [{"type": "text", "text": system_prompt}]},
        {"role": "user", "content": [{"type": "text", "text": f"Description: {description}"}]},
    ]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, tokenize=True,
        return_dict=True, return_tensors="pt",
    ).to("cuda")
    out = model.generate(**inputs, max_new_tokens=max_new_tokens, use_cache=True, do_sample=False)
    # Decode only the newly generated tokens, not the prompt
    raw_text = tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)

    def extract(pattern, text):
        # Return the first capture group, or None when the field is missing
        m = re.search(pattern, text)
        return m.group(1).strip() if m else None

    return {
        "hs_code": extract(r"HS Code:\s*([0-9.]+)", raw_text),
        "unit": extract(r"Unit:\s*(.*)", raw_text),
        "cd_rate": extract(r"Customs Duty:\s*([\d.]+)%?", raw_text),
        "st_rate": extract(r"Special Tax:\s*([\d.]+)%?", raw_text),
        "vat_rate": extract(r"VAT:\s*([\d.]+)%?", raw_text),
        "et_rate": extract(r"Excise Tax:\s*([\d.]+)%?", raw_text),
        "raw_output": raw_text
    }

# Example
raw = predict_hs_code_raw("Men's cotton knitted T-shirt")
print(raw["raw_output"])
print(raw["hs_code"])  # model’s guess
```
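
For confidence evaluation, one option is to compare the rates parsed from the raw output with the entry in the lookup table and flag every field that disagrees. The sketch below is only an illustration: it assumes that a lookup entry uses the same keys as the raw result (`cd_rate`, `st_rate`, `vat_rate`, `et_rate`), which may not match the actual schema of `hs_code_lookup.json`, and the name `rate_disagreements` is hypothetical.

```python
RATE_KEYS = ("cd_rate", "st_rate", "vat_rate", "et_rate")


def rate_disagreements(raw_result: dict, table_entry: dict) -> dict:
    """Return {key: (model_value, table_value)} for every rate that differs."""
    diffs = {}
    for key in RATE_KEYS:
        model_val = raw_result.get(key)
        table_val = table_entry.get(key)
        try:
            same = float(model_val) == float(table_val)
        except (TypeError, ValueError):
            # A missing or non-numeric value counts as a disagreement
            same = False
        if not same:
            diffs[key] = (model_val, table_val)
    return diffs


# Made-up values for illustration only, not real tariff data
example_raw = {"cd_rate": "15", "st_rate": "0", "vat_rate": "10", "et_rate": "0"}
example_entry = {"cd_rate": 7, "st_rate": 0, "vat_rate": 10, "et_rate": 0}
print(rate_disagreements(example_raw, example_entry))
```

An empty result means the model reproduced every rate in the table entry. A non-empty one is a hint that the prediction deserves a second look, and the table value remains the one to use.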

---

## 🧠 Training details

- **Base model**: `unsloth/gemma-4-E4B-it` (4‑bit QLoRA)
- **Adapter rank**: r=16, alpha=16, targeting all language & attention layers
- **Gradient checkpointing**: Unsloth’s own implementation (avoids Gemma‑4 KV‑shared layer bug)
- **Dataset**: Custom Cambodian HS‑code dataset (`hs_code.csv`) with descriptions, codes, and official rates
  - Cleaned, deduplicated, split into 90/10 train/validation
  - Chat roles fixed to system/user/assistant (Gemma‑4 standard)
- **Training config**: 3 epochs, effective batch size 8, learning rate 2e‑4, linear schedule, eval & save every epoch, best model loaded
- **Hardware**: Google Colab T4 (16 GB) – peak memory ~10 GB thanks to QLoRA
- **Accuracy**: Evaluated on held‑out examples (exact HS‑code match) – see model card for current numbers
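
The exact‑match metric mentioned above can be sketched in a few lines. This is a minimal illustration rather than the evaluation script used for this model; the function name `exact_match_accuracy` and the toy lists are assumptions.

```python
def exact_match_accuracy(predicted, expected) -> float:
    """Fraction of examples whose predicted HS code equals the reference code exactly."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        return 0.0
    # A missing prediction (None) never matches a reference code
    hits = sum(p == e for p, e in zip(predicted, expected))
    return hits / len(expected)


# Toy lists for illustration only, not real evaluation data
predictions = ["61091000", "87042190", None]
references = ["61091000", "87042110", "22030010"]
print(exact_match_accuracy(predictions, references))
```

Exact match is strict: a prediction that lands in the right heading but the wrong 8‑digit line still counts as a miss.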

---

## ⚖️ Production use

> **Always use the lookup table – never trust the model’s generated rates.**

The model is a **classifier**: description → HS code.
Rates are fetched deterministically from `hs_code_lookup.json`, a file extracted from the same official tariff data used during training.

Why?

- A causal LM recalling a rate from memory will occasionally hallucinate – a customs tool with confident, wrong numbers is worse than one that says “I don’t know”.
- The lookup table returns exactly the rates recorded in the source tariff data, so the rates are correct whenever the predicted HS code is correct and the table is up to date.

The `hs_code_lookup.json` file is included in this repository and can be downloaded via:

```python
from huggingface_hub import hf_hub_download

# Returns the local path of the cached file
lookup_path = hf_hub_download("Sothay/gemma4-hscode-classifier", "hs_code_lookup.json")
```

---

## 📦 Files in this repository

| File | Description |
|------|-------------|
| `adapter_model.safetensors` | LoRA adapter weights (few MB) |
| `adapter_config.json` | Adapter configuration (references base model) |
| `tokenizer.json`, `tokenizer_config.json` | Tokenizer files |
| `hs_code_lookup.json` | Authoritative rate table for production inference |
| `README.md` | This file |

> **Note**: Only the adapter is stored here – the full Gemma‑4 base model (`unsloth/gemma-4-E4B-it`) is automatically fetched from the Hugging Face Hub when you call `FastModel.from_pretrained`.
> If you need a **merged 16‑bit model** (for vLLM, TGI, etc.), generate it locally with Unsloth:
> ```python
> model.save_pretrained_merged("merged_fp16", tokenizer, save_method="merged_16bit")
> ```

---

## 🦙 Ollama / llama.cpp (GGUF)

Export a quantized GGUF directly from the loaded adapter:

```python
model.save_pretrained_gguf("gguf_model", tokenizer, quantization_method="q4_k_m")
```

Then load the GGUF in Ollama through a `Modelfile` (see the [Ollama documentation](https://ollama.com)) and set the temperature to 0 so that output is deterministic.
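
As a rough sketch of that step, the snippet below builds the text of a `Modelfile` with the `FROM`, `PARAMETER`, and `SYSTEM` instructions. The helper name `build_modelfile` and the GGUF path are placeholders, not files shipped in this repository; point `FROM` at whatever file the export actually produced. The system prompt is the one used in the quick start.

```python
SYSTEM_PROMPT = (
    "You are a customs compliance AI. Classify the product description to its "
    "correct 8-digit HS code and output the corresponding trade rates (Customs Duty, "
    "Special Tax, VAT, Excise Tax) and unit."
)


def build_modelfile(gguf_path: str, system_prompt: str = SYSTEM_PROMPT) -> str:
    """Return Modelfile text that points at a local GGUF and sets temperature to 0."""
    return (
        f"FROM {gguf_path}\n"
        "PARAMETER temperature 0\n"
        f'SYSTEM """{system_prompt}"""\n'
    )


# Placeholder path: replace it with the GGUF file written by the export step
modelfile_text = build_modelfile("./gguf_model/model.gguf")
print(modelfile_text)
```

Saving the printed text as `Modelfile` and running `ollama create hscode -f Modelfile` (the model name `hscode` is arbitrary) should register the model locally. Rates should still come from the lookup table, not from the generated text.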

---

## 📊 Example predictions

| Description | Predicted HS Code | Unit | CD | ST | VAT | ET |
|-------------|-------------------|------|----|----|-----|----|
| Toyota Hilux pickup, diesel 2.8L | 87042110 | UNIT | 35% | 50% | 10% | 0% |
| iPhone 15 Pro Max 256GB | 85171200 | UNIT | 0% | 0% | 10% | 0% |
| Heineken beer 330ml can | 22030010 | LTR | 35% | 30% | 10% | 0% |

*(Rates from lookup table – not generated by the model.)*

---

## ⚠️ Limitations

- The model may output incorrect HS codes for ambiguous, misspelled, or region‑specific descriptions.
- It was trained on a fixed set of Cambodian HS codes; revisions after the training data cutoff are not covered.
- Duty rates can become outdated – always cross‑check with the latest official tariff schedule.
- The model is a classifier, **not** a legal authority. For binding decisions, consult a customs professional.

---

## 📝 License

This model is a derivative of **Gemma‑4‑E4B‑it** and is subject to the [Gemma license](https://ai.google.dev/gemma/terms).
The HS‑code dataset and lookup table are the property of their respective owners.

---

## 🙏 Acknowledgments

- [Unsloth](https://github.com/unslothai/unsloth) – made QLoRA + Gemma‑4 on a T4 effortless
- [Google DeepMind](https://deepmind.google) – for the Gemma family of models

---

## 📚 Citation

If you use this model, please cite:

```bibtex
@misc{gemma4-hscode-classifier,
  author = {Sothay},
  title = {Gemma‑4 HS Code Classifier (Cambodia Customs)},
  year = 2025,
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/Sothay/gemma4-hscode-classifier}}
}
```

---

**Author**: [Sothay](https://huggingface.co/Sothay)

**Model card version**: 1.2