---
base_model: google/functiongemma-270m-it
library_name: transformers
pipeline_tag: text-generation
license: gemma
tags:
- intercomswap
- function-calling
- tool-calling
- lightning
- solana
- gemma
---
# functiongemma-270m-it-intercomswap-v3
IntercomSwap fine-tuned FunctionGemma model for deterministic tool-calling in BTC Lightning <-> USDT Solana swap workflows.
## What Is IntercomSwap
IntercomSwap is a fork of upstream Intercom that keeps the Intercom stack intact and adds a non-custodial swap harness for BTC over Lightning <-> USDT on Solana via a shared escrow program, with deterministic operator tooling, recovery, and unattended end-to-end tests.
GitHub: https://github.com/TracSystems/intercom-swap
Base model: [google/functiongemma-270m-it](https://huggingface.co/google/functiongemma-270m-it)
## Model Purpose
- Convert natural-language operator prompts into validated tool calls.
- Enforce buy/sell direction mapping for swap intents.
- Support repeat/autopost workflows used by IntercomSwap prompt routing.
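As a sketch of what this looks like at the API level, the request below pairs an operator prompt with a swap tool schema and validates the payload before sending. The tool name `create_swap_order` and its fields are illustrative assumptions, not taken from the IntercomSwap repo; the endpoint is any OpenAI-compatible server such as the ones started in the sections that follow.

```bash
# Build a chat-completions request with a hypothetical swap tool schema.
cat > /tmp/swap_request.json <<'EOF'
{
  "model": "TracNetwork/functiongemma-270m-it-intercomswap-v3",
  "messages": [
    {"role": "user", "content": "sell 0.01 BTC for USDT"}
  ],
  "tools": [{
    "type": "function",
    "function": {
      "name": "create_swap_order",
      "description": "Create a BTC <-> USDT swap order (illustrative schema)",
      "parameters": {
        "type": "object",
        "properties": {
          "side": {"type": "string", "enum": ["buy", "sell"]},
          "amount_btc": {"type": "number"}
        },
        "required": ["side", "amount_btc"]
      }
    }
  }]
}
EOF
# Sanity-check that the payload parses as JSON before sending it anywhere.
python3 -m json.tool < /tmp/swap_request.json > /dev/null && echo "payload ok"
# Against a running OpenAI-compatible server:
# curl -s http://localhost:8000/v1/chat/completions \
#   -H 'Content-Type: application/json' -d @/tmp/swap_request.json
```

Server-side code should then check the returned tool call against the same schema before executing anything (see Safety Notes).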
## Repository Layout
- `./`: merged HF checkpoint (Transformers format)
- `./nvfp4`: NVFP4-quantized checkpoint for TensorRT-LLM serving
- `./gguf`: GGUF checkpoints
  - `functiongemma-v3-f16.gguf`
  - `functiongemma-v3-q8_0.gguf`
## Startup By Flavor
### 1) Base HF checkpoint (Transformers)
```bash
python -m vllm.entrypoints.openai.api_server \
--model TracNetwork/functiongemma-270m-it-intercomswap-v3 \
--host 0.0.0.0 \
--port 8000 \
--dtype auto \
--max-model-len 8192
```
Lower memory mode example:
```bash
python -m vllm.entrypoints.openai.api_server \
--model TracNetwork/functiongemma-270m-it-intercomswap-v3 \
--host 0.0.0.0 \
--port 8000 \
--dtype auto \
--max-model-len 4096 \
--max-num-seqs 8
```
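A quick readiness probe against the server started above, using the standard OpenAI-compatible `/v1/models` endpoint (adjust host/port if you changed them):

```bash
# Probe the OpenAI-compatible endpoint; reports "up" once the model is loaded.
BASE_URL="${BASE_URL:-http://localhost:8000}"
if curl -sf --max-time 5 "$BASE_URL/v1/models" > /dev/null; then
  STATUS="up"
else
  STATUS="not reachable"
fi
echo "vLLM server: $STATUS"
```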
### 2) NVFP4 checkpoint (`./nvfp4`)
TensorRT-LLM example with explicit headroom (avoid consuming all VRAM):
```bash
trtllm-serve serve ./nvfp4 \
--backend pytorch \
--host 0.0.0.0 \
--port 8012 \
--max_batch_size 8 \
--max_num_tokens 16384 \
--kv_cache_free_gpu_memory_fraction 0.05
```
Memory tuning guidance:
- Decrease `--max_num_tokens` first.
- Then reduce `--max_batch_size`.
- Keep `--kv_cache_free_gpu_memory_fraction` around `0.05` to preserve safety headroom.
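Applying that order, a reduced-memory variant of the command above might look like this (the halved values are illustrative starting points, not tuned recommendations):

```bash
trtllm-serve serve ./nvfp4 \
  --backend pytorch \
  --host 0.0.0.0 \
  --port 8012 \
  --max_batch_size 4 \
  --max_num_tokens 8192 \
  --kv_cache_free_gpu_memory_fraction 0.05
```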
### 3) GGUF checkpoint (`./gguf`)
Q8_0 (recommended default, balancing quality and memory):
```bash
llama-server \
-m ./gguf/functiongemma-v3-q8_0.gguf \
--host 0.0.0.0 \
--port 8014 \
--ctx-size 8192 \
--batch-size 256 \
--ubatch-size 64 \
--gpu-layers 12
```
F16 (higher quality, higher memory):
```bash
llama-server \
-m ./gguf/functiongemma-v3-f16.gguf \
--host 0.0.0.0 \
--port 8014 \
--ctx-size 8192 \
--batch-size 256 \
--ubatch-size 64 \
--gpu-layers 12
```
Memory tuning guidance:
- Lower `--gpu-layers` to reduce VRAM usage.
- Lower `--ctx-size` to reduce RAM+VRAM KV-cache usage.
- Use `q8_0` for general deployment, `f16` for quality-first offline tests.
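Applying that guidance, a constrained host could reduce context and GPU offload together; the values below are illustrative, and `--gpu-layers 0` keeps the model entirely on CPU:

```bash
llama-server \
  -m ./gguf/functiongemma-v3-q8_0.gguf \
  --host 0.0.0.0 \
  --port 8014 \
  --ctx-size 4096 \
  --batch-size 128 \
  --ubatch-size 32 \
  --gpu-layers 0
```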
## Training Snapshot
- Base family: FunctionGemma 270M instruction-tuned.
- Fine-tune objective: IntercomSwap tool-call routing and argument shaping.
- Corpus profile: operations + intent-routing + tool-calling examples.
## Evaluation Snapshot
From held-out evaluation for this release line:
- Train examples: `6263`
- Eval examples: `755`
- Train loss: `0.01348`
- Eval loss: `0.02012`
## Intended Use
- Local or private deployments where tool execution is validated server-side.
- Deterministic operator workflows for swap infra.
## Out-of-Scope Use
- Autonomous financial decision-making.
- Direct execution of unvalidated user text as shell/actions.
- Safety-critical usage without host-side policy/validation.
## Safety Notes
- Always validate tool name + argument schema server-side.
- Treat network-side payloads as untrusted input.
- Keep wallet secrets and API credentials outside model context.
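As a minimal sketch of the first point, a host-side guard can allow-list tool names before any schema validation or execution happens. The tool names and JSON shape here are hypothetical, not taken from the IntercomSwap repo:

```bash
# Example model output to validate; in practice this comes from the model.
TOOL_CALL='{"name":"create_swap_order","arguments":{"side":"sell","amount_btc":0.01}}'

# Extract the tool name; anything that is not valid JSON yields an empty name.
NAME=$(printf '%s' "$TOOL_CALL" \
  | python3 -c 'import json,sys; print(json.load(sys.stdin)["name"])') || NAME=""

# Allow-list check: only known tools proceed to argument-schema validation.
case "$NAME" in
  create_swap_order|cancel_swap_order)
    echo "tool allowed: $NAME"
    ;;
  *)
    echo "tool rejected: ${NAME:-<unparseable>}"
    ;;
esac
```

The same gate is the natural place to validate the `arguments` object against the tool's declared JSON schema before execution.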
## Provenance
- Derived from: `google/functiongemma-270m-it`
- Integration target: IntercomSwap prompt-mode tool routing