Text Generation
PEFT
Safetensors
GGUF
English
Hindi
lora
qlora
transaction-parser
on-device
conversational
Upload gemma-3-270m/README.md with huggingface_hub
gemma-3-270m/README.md CHANGED (+71 -5)
@@ -55,8 +55,37 @@ sibling fine-tunes of other base models trained on the same data.
 | Eval batch size | 4 |
 | Max seq length | 1024 |
 | Learning rate | 2e-4 (warmup 3%) |
-| Started |
-| Finished |

 ## Download a single GGUF

@@ -66,12 +95,49 @@ huggingface-cli download kartikey31/txn-parser \
   --local-dir .
 ```

-## Inference (

 ```bash
 ./llama-cli -m txn-parser-gemma-3-270m-Q4_K_M.gguf \
   --grammar-file scripts/grammar.gbnf \
-  -
 ```

 ## Reproduce

@@ -83,4 +149,4 @@ python scripts/train_and_publish.py --only gemma-3-270m
 ```

 ---
-*Auto-published by `scripts/train_and_publish.py` on 2026-05-
|
| 55 |
| Eval batch size | 4 |
|
| 56 |
| Max seq length | 1024 |
|
| 57 |
| Learning rate | 2e-4 (warmup 3%) |
|
| 58 |
+
| Started | |
|
| 59 |
+
| Finished | |
|
| 60 |
+
|
| 61 |
+
## System prompt (use this EXACTLY)
|
| 62 |
+
|
| 63 |
+
The model was trained with one specific system prompt and Gemma/Smol/Qwen
|
| 64 |
+
chat template. If you paraphrase the prompt or skip the chat template,
|
| 65 |
+
quality degrades quickly. Copy-paste this verbatim into your inference
|
| 66 |
+
client (no leading/trailing whitespace, no edits):
|
| 67 |
+
|
| 68 |
+
```text
|
| 69 |
+
You convert voice-transcribed transaction descriptions into structured JSON.
|
| 70 |
+
|
| 71 |
+
Output ONLY a JSON object with this schema, no other text:
|
| 72 |
+
{"transactions":[{"amount":<number>,"currency":"INR"|"USD","item":"<lowercase singular noun phrase>","category":"<enum>","type":"expense"|"income"}]}
|
| 73 |
+
|
| 74 |
+
Categories: Food, Drinks, Groceries, Transport, Shopping, Entertainment, Bills, Health, Education, Personal, Gifts, Income, Other.
|
| 75 |
+
|
| 76 |
+
Rules:
|
| 77 |
+
- Currency defaults to INR. Use USD only when the input explicitly says "dollars" or contains "$".
|
| 78 |
+
- Amounts: "k" = Γ1000, "hazaar" = Γ1000, "sau" = Γ100, "lakh" = Γ100000. Convert number-words ("five hundred") to digits.
|
| 79 |
+
- type is "expense" by default; "income" only for explicit salary, cashback, refund, gift received, payment received.
|
| 80 |
+
- For disfluencies and corrections ("500 wait no 600"), output the CORRECTED amount only.
|
| 81 |
+
- For ambiguous items ("that thing", "stuff"), use item "unspecified" and category "Other".
|
| 82 |
+
- Item field: lowercase singular noun phrase ("uber ride", "beer", "chai" β not "Beers" or "Uber").
|
| 83 |
+
- Multi-transaction inputs become multiple array entries in spoken order.
|
| 84 |
+
- Category heuristics: uber/ola/auto/petrol/bus/metro β Transport; beer/wine/chai/coffee/juice β Drinks; rent/electricity/wifi/recharge/gas β Bills; movie/netflix/concert β Entertainment; doctor/medicine/hospital β Health.
|
| 85 |
+
```
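
The "Amounts" rule in the prompt is plain arithmetic, so it can be used to spot-check the amounts the model returns. The sketch below is not part of the repository: `expand_amount` and `MULTIPLIERS` are hypothetical names, and it only handles a digit string with an optional suffix (number-words such as "five hundred" are out of scope).

```python
import re

# Multipliers copied from the "Amounts" rule in the system prompt.
MULTIPLIERS = {"k": 1_000, "hazaar": 1_000, "sau": 100, "lakh": 100_000}

# A number (optionally with a decimal part) followed by an optional suffix.
_AMOUNT_RE = re.compile(
    r"^\s*(\d+(?:\.\d+)?)\s*(k|hazaar|sau|lakh)?\s*$", re.IGNORECASE
)


def expand_amount(text: str) -> float:
    """Expand a spoken amount such as '2k' or '5 hazaar' into a plain number."""
    match = _AMOUNT_RE.match(text)
    if match is None:
        raise ValueError(f"unrecognised amount: {text!r}")
    number, suffix = match.groups()
    factor = MULTIPLIERS[suffix.lower()] if suffix else 1
    return float(number) * factor
```

Comparing `expand_amount("5 hazaar")` against the `amount` field of the parsed output is one cheap way to catch a dropped multiplier.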
+
+Source of truth: [`scripts/_lib.py`](https://github.com/kartikeychoudhary/txn-parser/blob/main/scripts/_lib.py)
+constant `SYSTEM_PROMPT`. Don't retype it; pull from `_lib.py` or this README.

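
Because the prompt pins down an exact output schema, a caller can check a parsed reply before trusting it. The following is a minimal sketch, assuming the `<enum>` for `category` is exactly the "Categories" list and that no extra keys are allowed; `validation_errors` and the constant names are hypothetical and do not come from `scripts/_lib.py`.

```python
# Allowed values, copied from the system prompt.
CATEGORIES = {
    "Food", "Drinks", "Groceries", "Transport", "Shopping", "Entertainment",
    "Bills", "Health", "Education", "Personal", "Gifts", "Income", "Other",
}
CURRENCIES = {"INR", "USD"}
TYPES = {"expense", "income"}
FIELDS = {"amount", "currency", "item", "category", "type"}


def _is_one_of(value: object, allowed: set) -> bool:
    """True when value is a string contained in the allowed set."""
    return isinstance(value, str) and value in allowed


def validation_errors(payload: object) -> list:
    """Return a list of human-readable schema violations (empty if valid)."""
    if not isinstance(payload, dict) or set(payload) != {"transactions"}:
        return ["top level must be an object with only a 'transactions' key"]
    if not isinstance(payload["transactions"], list):
        return ["'transactions' must be an array"]

    errors = []
    for i, txn in enumerate(payload["transactions"]):
        where = f"transactions[{i}]"
        if not isinstance(txn, dict) or set(txn) != FIELDS:
            errors.append(f"{where}: expected exactly the fields {sorted(FIELDS)}")
            continue
        amount = txn["amount"]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(amount, bool) or not isinstance(amount, (int, float)):
            errors.append(f"{where}: amount must be a number")
        if not _is_one_of(txn["currency"], CURRENCIES):
            errors.append(f"{where}: currency must be INR or USD")
        if not _is_one_of(txn["category"], CATEGORIES):
            errors.append(f"{where}: unknown category")
        if not _is_one_of(txn["type"], TYPES):
            errors.append(f"{where}: type must be expense or income")
        item = txn["item"]
        if not isinstance(item, str) or not item.strip() or item != item.lower():
            errors.append(f"{where}: item must be a non-empty lowercase string")
    return errors
```

The check covers structure and casing only; it cannot tell whether an item is singular or whether the category is the right one for the input.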
 ## Download a single GGUF

@@ -66,12 +95,49 @@ huggingface-cli download kartikey31/txn-parser \
   --local-dir .
 ```

+## Inference (Python, llama-cpp-python)
+
+```python
+from llama_cpp import Llama
+
+SYSTEM_PROMPT = '''You convert voice-transcribed transaction descriptions into structured JSON.
+
+Output ONLY a JSON object with this schema, no other text:
+{"transactions":[{"amount":<number>,"currency":"INR"|"USD","item":"<lowercase singular noun phrase>","category":"<enum>","type":"expense"|"income"}]}
+
+Categories: Food, Drinks, Groceries, Transport, Shopping, Entertainment, Bills, Health, Education, Personal, Gifts, Income, Other.
+
+Rules:
+- Currency defaults to INR. Use USD only when the input explicitly says "dollars" or contains "$".
+- Amounts: "k" = ×1000, "hazaar" = ×1000, "sau" = ×100, "lakh" = ×100000. Convert number-words ("five hundred") to digits.
+- type is "expense" by default; "income" only for explicit salary, cashback, refund, gift received, payment received.
+- For disfluencies and corrections ("500 wait no 600"), output the CORRECTED amount only.
+- For ambiguous items ("that thing", "stuff"), use item "unspecified" and category "Other".
+- Item field: lowercase singular noun phrase ("uber ride", "beer", "chai" — not "Beers" or "Uber").
+- Multi-transaction inputs become multiple array entries in spoken order.
+- Category heuristics: uber/ola/auto/petrol/bus/metro → Transport; beer/wine/chai/coffee/juice → Drinks; rent/electricity/wifi/recharge/gas → Bills; movie/netflix/concert → Entertainment; doctor/medicine/hospital → Health.'''
+
+llm = Llama(
+    model_path="txn-parser-gemma-3-270m-Q4_K_M.gguf",
+    n_gpu_layers=-1, n_ctx=2048,
+)
+out = llm.create_chat_completion(
+    messages=[
+        {"role": "system", "content": SYSTEM_PROMPT},
+        {"role": "user", "content": "200 ka samosa"},
+    ],
+    temperature=0.0,
+)
+print(out["choices"][0]["message"]["content"])
+```
+
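
The Python example above prints the raw reply string. A caller will usually want the transactions as Python objects instead. A minimal sketch, assuming the response dict has the OpenAI-style shape used in that example; `transactions_from_response` is a hypothetical helper, not something the repository provides.

```python
import json


def transactions_from_response(response: dict) -> list:
    """Extract the 'transactions' array from a chat-completion response dict."""
    content = response["choices"][0]["message"]["content"]
    try:
        payload = json.loads(content.strip())
    except json.JSONDecodeError as exc:
        # Surface the raw reply so a malformed generation is easy to debug.
        raise ValueError(f"model reply is not valid JSON: {content!r}") from exc
    if not isinstance(payload, dict) or not isinstance(payload.get("transactions"), list):
        raise ValueError(f"reply has no 'transactions' array: {content!r}")
    return payload["transactions"]
```

With the example above this would be called as `transactions_from_response(out)`. Raising on malformed replies, rather than returning an empty list, keeps a bad generation from being mistaken for "no transactions".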
+## Inference (CLI, llama.cpp)

 ```bash
 ./llama-cli -m txn-parser-gemma-3-270m-Q4_K_M.gguf \
   --grammar-file scripts/grammar.gbnf \
+  --system-prompt "$(cat system_prompt.txt)" \
+  -p "200 ka samosa" -n 256
 ```

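
The CLI command reads the prompt from `system_prompt.txt`, and this excerpt does not show where that file comes from. One way to create it is sketched below; `write_prompt_file` is a hypothetical helper, and the prompt text is assumed to be passed in from the `SYSTEM_PROMPT` constant rather than retyped.

```python
from pathlib import Path


def write_prompt_file(prompt: str, path: str = "system_prompt.txt") -> Path:
    """Write the system prompt to disk with no leading or trailing whitespace."""
    target = Path(path)
    # strip() enforces the "no leading/trailing whitespace" requirement;
    # write_text() does not append a newline of its own.
    target.write_text(prompt.strip(), encoding="utf-8")
    return target
```

Shell command substitution (`$(cat ...)`) already drops trailing newlines, so the explicit `strip()` mainly guards against leading whitespace and against other clients that read the file directly.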
 ## Reproduce

@@ -83,4 +149,4 @@ python scripts/train_and_publish.py --only gemma-3-270m
 ```

 ---
+*Auto-published by `scripts/train_and_publish.py` on 2026-05-22T21:33:31.274220+00:00.*