---
license: gemma
pipeline_tag: text-generation
language:
- en
- km
tags:
- customs
- hs-code
- classification
- cambodia
- gemma
- unsloth
- qlora
base_model:
- unsloth/gemma-4-E4B-it
---

# Gemma‑4 HS Code Classifier (Cambodia Customs)

A **Gemma‑4‑E4B‑it** model fine‑tuned with QLoRA to classify product descriptions into **8‑digit HS codes** and return corresponding Cambodian trade rates (Customs Duty, Special Tax, VAT, Excise Tax).

Built with **[Unsloth](https://github.com/unslothai/unsloth)** for fast, memory‑efficient fine‑tuning on a single T4 GPU.

---

## 🎯 What it does

Given a plain‑English product description, the model generates:

```text
HS Code: 61091000
Unit: PIECE
Customs Duty: 25%
Special Tax: 0%
VAT: 10%
Excise Tax: 0%
```

**⚠️ Important**: The rates in the text are generated by the model and **may be wrong**.  
For production, always use the included **lookup table** (`hs_code_lookup.json`) – see [Production use](#-production-use) below.

---

## 🚀 Quick start (in Colab or locally)

This repository contains **only the LoRA adapter**, not the full model.  
Loading it will automatically download the base model (`unsloth/gemma-4-E4B-it`) and apply the adapter in 4-bit.

```python

%%capture
# %% [Install] Run this as its own notebook cell: `%%capture` only works as the first line of a cell.
# Install everything needed for the T4 Colab environment
!pip install sentencepiece protobuf "datasets==4.3.0" "huggingface_hub>=0.34.0" hf_transfer
!pip install --no-deps unsloth_zoo bitsandbytes accelerate xformers peft trl triton unsloth
!pip install --no-deps --upgrade "torchao>=0.16.0"
!pip install --no-deps transformers==5.5.0 "tokenizers>=0.22.0,<=0.23.0"
!pip install torchcodec
import torch
torch._dynamo.config.recompile_limit = 64


import warnings

# Suppress the specific PyTorch size check warning from bitsandbytes
warnings.filterwarnings(
    "ignore", 
    category=FutureWarning, 
    message=".*_check_is_size will be removed in a future PyTorch release.*"
)

# %% [Load model] Run everything below in a new cell.

from unsloth import FastModel

model, tokenizer = FastModel.from_pretrained(
    "Sothay/gemma4-hscode-classifier",   # LoRA adapter on Hugging Face
    load_in_4bit = True,                 # required – the adapter was trained in 4-bit
    max_seq_length = 1024,
)

# ---------- Inference with the authoritative lookup table (recommended) ----------
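import json, re
from huggingface_hub import hf_hub_download

# Fetch the lookup table from this repository (cached locally after the first call)
lookup_path = hf_hub_download("Sothay/gemma4-hscode-classifier", "hs_code_lookup.json")
with open(lookup_path, encoding="utf-8") as f:
    rate_lookup = json.load(f)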

def predict_hs_code(description: str) -> dict:
    system_prompt = (
        "You are a customs compliance AI. Classify the product description to its "
        "correct 8-digit HS code and output the corresponding trade rates (Customs Duty, "
        "Special Tax, VAT, Excise Tax) and unit."
    )
    messages = [
        {"role": "system", "content": [{"type": "text", "text": system_prompt}]},
        {"role": "user",   "content": [{"type": "text", "text": f"Description: {description}"}]},
    ]
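    # tokenize=True with return_dict=True yields a dict of tensors (input_ids, attention_mask)
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, tokenize=True,
        return_dict=True, return_tensors="pt",
    ).to("cuda")
    out = model.generate(**inputs, max_new_tokens=80, do_sample=False)
    # Decode only the newly generated tokens, skipping the prompt
    text = tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)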

    m = re.search(r"HS Code:\s*([0-9]{4,10})", text)
    code = m.group(1) if m else None
    if code and code in rate_lookup:
        return {"hs_code": code, "source": "lookup_table", **rate_lookup[code]}
    return {"hs_code": code, "source": "model_only_UNVERIFIED", "raw_output": text}

print(predict_hs_code("Men's cotton knitted T-shirt"))
```

---

## 🔍 Raw model output (debugging)

If you want to see exactly what the model generated (including the rates it predicted) without the lookup table, use the raw‑output function below.  
**Do not** use these rates in production – they are only for debugging or confidence evaluation.

```python
def predict_hs_code_raw(description: str, max_new_tokens=100) -> dict:
    system_prompt = (
        "You are a customs compliance AI. Classify the product description to its "
        "correct 8-digit HS code and output the corresponding trade rates (Customs Duty, "
        "Special Tax, VAT, Excise Tax) and unit."
    )
    messages = [
        {"role": "system", "content": [{"type": "text", "text": system_prompt}]},
        {"role": "user",   "content": [{"type": "text", "text": f"Description: {description}"}]},
    ]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, tokenize=True,
        return_dict=True, return_tensors="pt",
    ).to("cuda")

    out = model.generate(**inputs, max_new_tokens=max_new_tokens, use_cache=True, do_sample=False)
    raw_text = tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)

    def extract(pattern, text):
        m = re.search(pattern, text)
        return m.group(1).strip() if m else None

    return {
        "hs_code":   extract(r"HS Code:\s*([0-9.]+)", raw_text),
        "unit":      extract(r"Unit:\s*(.*)", raw_text),
        "cd_rate":   extract(r"Customs Duty:\s*([\d.]+)%?", raw_text),
        "st_rate":   extract(r"Special Tax:\s*([\d.]+)%?", raw_text),
        "vat_rate":  extract(r"VAT:\s*([\d.]+)%?", raw_text),
        "et_rate":   extract(r"Excise Tax:\s*([\d.]+)%?", raw_text),
        "raw_output": raw_text
    }

# Example
raw = predict_hs_code_raw("Men's cotton knitted T-shirt")
print(raw["raw_output"])
print(raw["hs_code"])   # model’s guess
```

---

## 🧠 Training details

- **Base model**: `unsloth/gemma-4-E4B-it` (4‑bit QLoRA)
- **Adapter rank**: r=16, alpha=16, targeting all language & attention layers
- **Gradient checkpointing**: Unsloth’s own implementation (avoids Gemma‑4 KV‑shared layer bug)
- **Dataset**: Custom Cambodian HS‑code dataset (`hs_code.csv`) with descriptions, codes, and official rates
  - Cleaned, deduplicated, split into 90/10 train/validation
  - Chat roles fixed to system/user/assistant (Gemma‑4 standard)
- **Training config**: 3 epochs, effective batch size 8, learning rate 2e‑4, linear schedule, eval & save every epoch, best model loaded
- **Hardware**: Google Colab T4 (16 GB) – peak memory ~10 GB thanks to QLoRA
- **Accuracy**: Evaluated on held‑out examples (exact HS‑code match) – see model card for current numbers
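The exact-match metric named in the list above can be sketched as follows. This is a minimal illustration, not the author's evaluation script: the function name `exact_match_accuracy`, the optional `prefix_len` argument, and the sample codes are all assumptions made for the example.

```python
def exact_match_accuracy(predicted, expected, prefix_len=None):
    """Share of predictions whose HS code equals the reference code.

    prefix_len is a hypothetical convenience: pass 6 to compare only the
    six-digit subheading, or leave it as None for a full-code match.
    """
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        return 0.0
    hits = 0
    for pred, ref in zip(predicted, expected):
        if pred is None:  # the model produced no parsable code
            continue
        if prefix_len is not None:
            pred, ref = pred[:prefix_len], ref[:prefix_len]
        hits += pred == ref
    return hits / len(expected)


# Illustrative values only, not real evaluation data
preds = ["61091000", "85171200", None, "22030090"]
refs = ["61091000", "85171200", "87042110", "22030010"]
full_acc = exact_match_accuracy(preds, refs)
sub_acc = exact_match_accuracy(preds, refs, prefix_len=6)
print(full_acc, sub_acc)
```

Counting unparsable outputs as misses keeps the score honest: a model that declines to answer is not rewarded. The subheading-level comparison is an extra diagnostic, not a metric reported in this card.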

---

## ⚖️ Production use

> **Always use the lookup table – never trust the model’s generated rates.**

The model is a **classifier**: description → HS code.  
Rates are fetched deterministically from `hs_code_lookup.json`, a file extracted from the same official tariff data used during training.

Why?  
- A causal LM recalling a rate from memory will occasionally hallucinate – a customs tool with confident, wrong numbers is worse than one that says “I don’t know”.
- The lookup table returns rates exactly as they appear in the source tariff data, so a rate error can only come from a wrong HS code or an outdated table.

The `hs_code_lookup.json` file is included in this repository and can be downloaded via:

```python
from huggingface_hub import hf_hub_download
hf_hub_download("Sothay/gemma4-hscode-classifier", "hs_code_lookup.json")
```
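Once the file is available, the lookup step itself is a plain dictionary access. The sketch below assumes the table maps 8-digit code strings to a dictionary of rate fields. The field names (`unit`, `cd_rate`, and so on), the helper names, and the sample entry are hypothetical, so check the actual JSON schema before relying on them.

```python
import re

# Hypothetical excerpt standing in for the contents of hs_code_lookup.json
sample_table = {
    "61091000": {"unit": "PIECE", "cd_rate": "25", "st_rate": "0",
                 "vat_rate": "10", "et_rate": "0"},
}


def normalize_code(raw):
    """Strip dots and whitespace, and accept only an 8-digit result."""
    if raw is None:
        return None
    digits = re.sub(r"[.\s]", "", raw)
    return digits if re.fullmatch(r"\d{8}", digits) else None


def rates_for(raw_code, table):
    """Return rates from the table, or a flagged record when the code is unknown."""
    code = normalize_code(raw_code)
    if code is None or code not in table:
        return {"hs_code": code, "source": "not_found", "needs_review": True}
    return {"hs_code": code, "source": "lookup_table",
            "needs_review": False, **table[code]}


found = rates_for("6109.10.00", sample_table)
missing = rates_for("99999999", sample_table)
print(found)
print(missing)
```

Returning a flagged record instead of raising lets a calling service route unknown codes to manual review rather than fall back on model-generated rates.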

---

## 📦 Files in this repository

| File | Description |
|------|-------------|
| `adapter_model.safetensors` | LoRA adapter weights (a few MB) |
| `adapter_config.json` | Adapter configuration (references base model) |
| `tokenizer.json`, `tokenizer_config.json` | Tokenizer files |
| `hs_code_lookup.json` | Authoritative rate table for production inference |
| `README.md` | This file |

> **Note**: Only the adapter is stored here – the full Gemma‑4 base model is automatically fetched from Unsloth when you call `FastModel.from_pretrained`.  
> If you need a **merged 16-bit model** (for vLLM, TGI, etc.), generate it locally with Unsloth:
> ```python
> model.save_pretrained_merged("merged_fp16", tokenizer, save_method="merged_16bit")
> ```

---

## 🦙 Ollama / llama.cpp (GGUF)

Export a quantized GGUF directly from the loaded adapter:

```python
model.save_pretrained_gguf("gguf_model", tokenizer, quantization_method="q4_k_m")
```

Then load the GGUF in [Ollama](https://ollama.com) through a `Modelfile`, and set `temperature` to 0 so that sampling is deterministic.
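A `Modelfile` for this setup might look like the text produced by the sketch below. The GGUF file name and the helper `build_modelfile` are assumptions, so use whatever file `save_pretrained_gguf` actually wrote to `gguf_model/`. `FROM`, `PARAMETER`, and `SYSTEM` are standard Modelfile instructions.

```python
SYSTEM_PROMPT = (
    "You are a customs compliance AI. Classify the product description to its "
    "correct 8-digit HS code and output the corresponding trade rates (Customs Duty, "
    "Special Tax, VAT, Excise Tax) and unit."
)


def build_modelfile(gguf_path, system_prompt=SYSTEM_PROMPT):
    """Render Modelfile text that points at a local GGUF and pins temperature to 0."""
    return (
        f"FROM {gguf_path}\n"
        "PARAMETER temperature 0\n"
        f'SYSTEM """{system_prompt}"""\n'
    )


# Assumed file name; replace it with the GGUF file that was actually exported
modelfile_text = build_modelfile("./gguf_model/model-q4_k_m.gguf")
print(modelfile_text)
```

Save the printed text as `Modelfile`, then register it with `ollama create hscode -f Modelfile` and start it with `ollama run hscode` (the model name `hscode` is arbitrary). Reusing the training system prompt keeps inference consistent with the prompt format the adapter was tuned on.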

---

## 📊 Example predictions

| Description | Predicted HS Code | Unit | CD | ST | VAT | ET |
|-------------|-------------------|------|----|----|-----|----|
| Toyota Hilux pickup, diesel 2.8L | 87042110 | UNIT | 35% | 50% | 10% | 0% |
| iPhone 15 Pro Max 256GB | 85171200 | UNIT | 0% | 0% | 10% | 0% |
| Heineken beer 330ml can | 22030010 | LTR | 35% | 30% | 10% | 0% |

*(Rates from lookup table – not generated by the model.)*

---

## ⚠️ Limitations

- The model may output incorrect HS codes for ambiguous, misspelled, or region‑specific descriptions.
- It was trained on a fixed set of Cambodian HS codes; revisions after the training data cutoff are not covered.
- Duty rates can become outdated – always cross‑check with the latest official tariff schedule.
- The model is a classifier, **not** a legal authority. For binding decisions, consult a customs professional.

---

## 📝 License

This model is a derivative of **Gemma‑4‑E4B‑it** and is subject to the [Gemma license](https://ai.google.dev/gemma/terms).  
The HS‑code dataset and lookup table are the property of their respective owners.

---

## 🙏 Acknowledgments

- [Unsloth](https://github.com/unslothai/unsloth) – made QLoRA + Gemma‑4 on a T4 effortless
- [Google DeepMind](https://deepmind.google) – for the Gemma family of models

---

## 📚 Citation

If you use this model, please cite:

```bibtex
@misc{gemma4-hscode-classifier,
  author = {Sothay},
  title = {Gemma‑4 HS Code Classifier (Cambodia Customs)},
  year = 2025,
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/Sothay/gemma4-hscode-classifier}}
}
```

---

**Author**: [Sothay](https://huggingface.co/Sothay)  
**Model card version**: 1.2