---
base_model: google/functiongemma-270m-it
library_name: transformers
pipeline_tag: text-generation
license: gemma
tags:
- intercomswap
- function-calling
- tool-calling
- lightning
- solana
- gemma
---

# functiongemma-270m-it-intercomswap-v3
IntercomSwap fine-tuned FunctionGemma model for deterministic tool-calling in BTC Lightning <-> USDT Solana swap workflows.

## What Is IntercomSwap

IntercomSwap is a fork of upstream Intercom that keeps the Intercom stack intact and adds a non-custodial swap harness for BTC over Lightning <-> USDT on Solana via a shared escrow program, with deterministic operator tooling, recovery, and unattended end-to-end tests.

GitHub: https://github.com/TracSystems/intercom-swap

Base model: [google/functiongemma-270m-it](https://huggingface.co/google/functiongemma-270m-it)

## Model Purpose

- Convert natural-language operator prompts into validated tool calls.
- Enforce buy/sell direction mapping for swap intents.
- Support repeat/autopost workflows used by IntercomSwap prompt routing.
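
The direction-mapping behavior can be sketched as a plain lookup; the tool name and argument fields below are illustrative placeholders, not the actual IntercomSwap schema:

```python
# Illustrative sketch of buy/sell direction mapping for swap intents.
# Tool and field names are hypothetical, not the IntercomSwap schema.
def map_intent_to_direction(intent: str) -> dict:
    """Map a normalized operator intent to a directed swap tool call."""
    directions = {
        # "buy" here means acquiring BTC over Lightning, paying USDT on Solana
        "buy": {"tool": "swap_order",
                "args": {"side": "buy", "pay": "USDT-SOL", "receive": "BTC-LN"}},
        # "sell" is the reverse direction
        "sell": {"tool": "swap_order",
                 "args": {"side": "sell", "pay": "BTC-LN", "receive": "USDT-SOL"}},
    }
    if intent not in directions:
        raise ValueError(f"unrecognized swap intent: {intent}")
    return directions[intent]
```

The fine-tuned model emits the tool call; a deterministic mapping like this is what the host validates it against.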

## Repository Layout

- `./`: merged HF checkpoint (Transformers format)
- `./nvfp4`: NVFP4-quantized checkpoint for TensorRT-LLM serving
- `./gguf`:
  - `functiongemma-v3-f16.gguf`
  - `functiongemma-v3-q8_0.gguf`

## Startup By Flavor

### 1) Base HF checkpoint (Transformers)

```bash
python -m vllm.entrypoints.openai.api_server \
  --model TracNetwork/functiongemma-270m-it-intercomswap-v3 \
  --host 0.0.0.0 \
  --port 8000 \
  --dtype auto \
  --max-model-len 8192
```

Lower memory mode example:

```bash
python -m vllm.entrypoints.openai.api_server \
  --model TracNetwork/functiongemma-270m-it-intercomswap-v3 \
  --host 0.0.0.0 \
  --port 8000 \
  --dtype auto \
  --max-model-len 4096 \
  --max-num-seqs 8
```
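
Either vLLM instance exposes an OpenAI-compatible API. A minimal stdlib-only client sketch (the prompt text is illustrative; greedy decoding is assumed for deterministic tool routing):

```python
# Minimal client for the vLLM OpenAI-compatible endpoint started above.
# Uses only the standard library; the prompt text is illustrative.
import json
import urllib.request

def build_chat_request(prompt: str) -> dict:
    """Build a /v1/chat/completions payload for deterministic tool routing."""
    return {
        "model": "TracNetwork/functiongemma-270m-it-intercomswap-v3",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
        "temperature": 0.0,  # greedy decoding keeps tool routing deterministic
    }

def post_chat(payload: dict,
              url: str = "http://localhost:8000/v1/chat/completions") -> dict:
    """POST a chat payload and return the decoded JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Example (requires the server to be running):
# post_chat(build_chat_request("swap 0.01 BTC to USDT"))
```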

### 2) NVFP4 checkpoint (`./nvfp4`)

TensorRT-LLM example with explicit headroom (avoid consuming all VRAM):

```bash
trtllm-serve serve ./nvfp4 \
  --backend pytorch \
  --host 0.0.0.0 \
  --port 8012 \
  --max_batch_size 8 \
  --max_num_tokens 16384 \
  --kv_cache_free_gpu_memory_fraction 0.05
```

Memory tuning guidance:

- Decrease `--max_num_tokens` first.
- Then reduce `--max_batch_size`.
- Keep `--kv_cache_free_gpu_memory_fraction` around `0.05` to preserve safety headroom.
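
As a rough guide for why `--max_num_tokens` is the first knob to turn: the KV cache grows linearly with the token budget. A back-of-envelope estimate follows; the layer/head/dimension values are placeholders, not the actual FunctionGemma-270M config, so read the real ones from the checkpoint's `config.json`:

```python
# Back-of-envelope KV-cache sizing; model dimensions are placeholders,
# not the actual FunctionGemma-270M config.
def kv_cache_bytes(num_layers: int, num_kv_heads: int, head_dim: int,
                   max_tokens: int, bytes_per_elem: int = 2) -> int:
    """Estimate KV-cache size: the factor of 2 covers key and value tensors."""
    return 2 * num_layers * num_kv_heads * head_dim * max_tokens * bytes_per_elem

# Example with assumed values (20 layers, 1 KV head, head_dim 256, fp16 cache):
print(kv_cache_bytes(20, 1, 256, 16384) / 2**30)  # GiB for a 16384-token budget
```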

### 3) GGUF checkpoint (`./gguf`)

Q8_0 (recommended default balance):

```bash
llama-server \
  -m ./gguf/functiongemma-v3-q8_0.gguf \
  --host 0.0.0.0 \
  --port 8014 \
  --ctx-size 8192 \
  --batch-size 256 \
  --ubatch-size 64 \
  --gpu-layers 12
```

F16 (higher quality, higher memory):

```bash
llama-server \
  -m ./gguf/functiongemma-v3-f16.gguf \
  --host 0.0.0.0 \
  --port 8014 \
  --ctx-size 8192 \
  --batch-size 256 \
  --ubatch-size 64 \
  --gpu-layers 12
```

Memory tuning guidance:

- Lower `--gpu-layers` to reduce VRAM usage.
- Lower `--ctx-size` to reduce RAM+VRAM KV-cache usage.
- Use `q8_0` for general deployment, `f16` for quality-first offline tests.
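
All three serving flavors expose an OpenAI-compatible chat API. Assuming the serving layer surfaces the model's output as OpenAI-style `tool_calls` (a sketch, not the IntercomSwap routing code), the first call can be extracted like this:

```python
# Sketch: pull the first tool call out of an OpenAI-style chat response.
# Assumes the serving layer populates `tool_calls`; returns None otherwise.
import json

def extract_first_tool_call(response: dict):
    """Return (name, parsed_args) for the first tool call, or None."""
    message = response["choices"][0]["message"]
    calls = message.get("tool_calls") or []
    if not calls:
        return None
    fn = calls[0]["function"]
    return fn["name"], json.loads(fn["arguments"])
```

Whatever this returns should still be validated against the host-side tool registry before execution.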

## Training Snapshot

- Base family: FunctionGemma 270M instruction-tuned.
- Fine-tune objective: IntercomSwap tool-call routing and argument shaping.
- Corpus profile: operations, intent-routing, and tool-calling examples.

## Evaluation Snapshot

From held-out evaluation for this release line:

- Train examples: `6263`
- Eval examples: `755`
- Train loss: `0.01348`
- Eval loss: `0.02012`

## Intended Use

- Local or private deployments where tool execution is validated server-side.
- Deterministic operator workflows for swap infrastructure.

## Out-of-Scope Use

- Autonomous financial decision-making.
- Direct execution of unvalidated user text as shell commands or actions.
- Safety-critical usage without host-side policy and validation.

## Safety Notes

- Always validate the tool name and argument schema server-side.
- Treat network-side payloads as untrusted input.
- Keep wallet secrets and API credentials out of the model context.
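
The first safety note can be sketched as a strict allow-list check; the registry below is an illustrative placeholder, not the actual IntercomSwap tool schema:

```python
# Minimal server-side allow-list validation for model-emitted tool calls.
# Tool names and argument schemas here are illustrative placeholders.
ALLOWED_TOOLS = {
    "swap_order": {"side": str, "amount_sats": int},
    "swap_status": {"order_id": str},
}

def validate_tool_call(name: str, args: dict) -> bool:
    """Accept only registered tools whose arguments match the schema exactly."""
    schema = ALLOWED_TOOLS.get(name)
    if schema is None:
        return False  # unknown tool: reject outright
    if set(args) != set(schema):
        return False  # missing or extra arguments: reject
    return all(isinstance(args[k], t) for k, t in schema.items())
```

Rejecting unknown tools and extra arguments outright is the point: the model proposes, the host decides.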

## Provenance

- Derived from: `google/functiongemma-270m-it`
- Integration target: IntercomSwap prompt-mode tool routing