---
license: mit
language:
- en
- zh
tags:
- onnx
- text-classification
- intent-routing
- code-intelligence
library_name: onnxruntime
pipeline_tag: text-classification
---
# Intent Router (ONNX int8)
A 7-class intent classifier for code query routing. Classifies natural language queries (English and Chinese) into structured intents for code intelligence tools.
## Intents
| Label | Description |
|-------|-------------|
| `locate_symbol` | Find symbol definitions |
| `find_references` | Trace reverse references / impact |
| `trace_dependencies` | Trace forward dependencies / call chains |
| `semantic_search` | Semantic search over code and docs |
| `browse_structure` | Browse package / module structure |
| `cross_layer_trace` | Map between code and business docs |
| `ambiguous` | Query cannot be classified |
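A consumer of the classifier typically needs to turn a predicted label and its confidence into a routing decision. The sketch below shows one possible way to do that. The `route` function and the `threshold` value of 0.5 are hypothetical and are not part of this repository; only the seven label names come from the table above.

```python
# The seven intent labels listed in the table above.
INTENTS = (
    "locate_symbol",
    "find_references",
    "trace_dependencies",
    "semantic_search",
    "browse_structure",
    "cross_layer_trace",
    "ambiguous",
)


def route(intent: str, confidence: float, threshold: float = 0.5) -> str:
    """Return the intent to act on, falling back to 'ambiguous'.

    `threshold` is an illustrative assumption, not a value shipped
    with the model; tune it on your own queries.
    """
    if intent not in INTENTS:
        raise ValueError(f"unknown intent: {intent!r}")
    if confidence < threshold:
        return "ambiguous"
    return intent
```

Treating low-confidence predictions as `ambiguous` lets the caller ask for clarification or fall back to a broader search instead of acting on a weak guess.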
## Files
| File | Required | Description |
|------|----------|-------------|
| `onnx/model.onnx` | Yes | ONNX model graph |
| `onnx/model.onnx_data` | Yes | Model weights (int8 quantized) |
| `model_head.json` | Yes | Classification head (weights + bias) |
| `tokenizer.json` | Yes | Tokenizer |
| `tokenizer_config.json` | Yes | Tokenizer configuration |
| `labels.json` | Yes | Intent label list |
| `config.json` | Yes | Model configuration |
## Inference
Requires [ONNX Runtime](https://onnxruntime.ai/). The model takes tokenized text input and outputs sentence embeddings. The classification head (`model_head.json`) maps embeddings to intent logits.
```
input text → tokenizer → ONNX model → embedding → classification head → intent + confidence
```
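The last two stages of the pipeline above (embedding → classification head → intent + confidence) can be sketched in plain Python. This is a minimal illustration under stated assumptions: the head is treated as a single linear layer with one weight row per label, and confidence is taken as the softmax probability of the top label. The exact layout of `model_head.json` is not documented here, so `weights`, `bias`, and the toy values below are hypothetical stand-ins rather than the real file contents.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_embedding(embedding, weights, bias, labels):
    """Apply a linear head to one embedding and return (label, confidence).

    Assumed layout: `weights` has one row per label, and each row has
    the same length as `embedding`.
    """
    logits = [
        sum(w * x for w, x in zip(row, embedding)) + b
        for row, b in zip(weights, bias)
    ]
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Toy example with a 2-dimensional embedding and two labels.
toy_labels = ["locate_symbol", "ambiguous"]
toy_weights = [[1.0, 0.0], [0.0, 1.0]]
toy_bias = [0.0, 0.0]
intent, confidence = classify_embedding([2.0, 0.0], toy_weights, toy_bias, toy_labels)
print(intent, round(confidence, 3))
```

In a real deployment the embedding would come from running the tokenized query through the ONNX model with ONNX Runtime, and the labels from `labels.json`.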
## Benchmark
Evaluated on a held-out test set of 221 bilingual (Chinese + English) code queries.
| Metric | Value |
|--------|-------|
| Overall accuracy | 96.8% (214/221) |
| Inference latency (CPU, ONNX Runtime) | ~3 ms (p50) |
Per-intent performance:
| Intent | Precision | Recall | F1 |
|--------|-----------|--------|----|
| locate_symbol | 98.4% | 96.8% | 0.976 |
| find_references | 95.1% | 97.5% | 0.963 |
| trace_dependencies | 90.2% | 95.1% | 0.926 |
| semantic_search | 100.0% | 91.5% | 0.956 |
| browse_structure | 91.7% | 100.0% | 0.957 |
| cross_layer_trace | 100.0% | 100.0% | 1.000 |
| ambiguous | 100.0% | 100.0% | 1.000 |
Training set: 284 samples. Test set: 221 samples.
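The F1 column and the overall accuracy can be reproduced from the other reported figures. The short check below assumes F1 is the usual harmonic mean of precision and recall; the numbers are copied from the tables above.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both in the range 0..1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs taken from the per-intent table.
per_intent = {
    "locate_symbol": (0.984, 0.968),
    "find_references": (0.951, 0.975),
    "trace_dependencies": (0.902, 0.951),
    "semantic_search": (1.000, 0.915),
    "browse_structure": (0.917, 1.000),
    "cross_layer_trace": (1.000, 1.000),
    "ambiguous": (1.000, 1.000),
}

for intent, (p, r) in per_intent.items():
    print(f"{intent}: F1 = {f1_score(p, r):.3f}")

# Overall accuracy: 214 correct predictions out of 221 test queries.
accuracy = 214 / 221
print(f"accuracy = {accuracy:.1%}")
```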
## Quantization
Weights are quantized to int8 using dynamic quantization. Total size is about 559 MB.
## Related Project
This model is fine-tuned for [C4A (Context For AI)](https://github.com/context4ai/c4a), a knowledge modeling service that indexes code repositories and business documents for developer teams and AI agents.
## License
MIT