How to use lianghsun/embeddinggemma-300m-cot-router with sentence-transformers:

    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("lianghsun/embeddinggemma-300m-cot-router")

    sentences = [
        "The weather is lovely today.",
        "It's so sunny outside!",
        "He drove to the stadium.",
    ]
    embeddings = model.encode(sentences)

    similarities = model.similarity(embeddings, embeddings)
    print(similarities.shape)
    # [3, 3]
Model Card for embeddinggemma-300m-cot-router
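embeddinggemma-300m-cot-router is a lightweight thinking router fine-tuned from google/embeddinggemma-300m. Given an input prompt, it decides whether the prompt needs thinking mode to be answered correctly and outputs a binary class. When thinking and non-thinking models are chained together, it can serve as a front-end decision maker that reduces inference cost and latency.
⚠️ Key specification: this model is a 300M-parameter embedding-based classifier, not a generative model. It outputs a routing decision (thinking/non-thinking) rather than a text response.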
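The front-end role described above can be sketched as a small dispatcher. This is a minimal illustration, not the author's implementation: the label strings, the `dispatch` function, and the three `toy_*` stand-ins are all hypothetical names, and the toy classifier is a placeholder rule where a real deployment would call the router model.

```python
from typing import Callable

# Hypothetical label names; the card only states that the output is binary.
THINKING = "thinking"
NO_THINKING = "no-thinking"


def dispatch(prompt: str,
             classify: Callable[[str], str],
             thinking_backend: Callable[[str], str],
             direct_backend: Callable[[str], str]) -> str:
    """Send the prompt to the backend selected by the router label."""
    label = classify(prompt)
    if label == THINKING:
        return thinking_backend(prompt)
    if label == NO_THINKING:
        return direct_backend(prompt)
    raise ValueError(f"unexpected router label: {label!r}")


# Stand-ins so the sketch runs without any model or API access.
def toy_classify(prompt: str) -> str:
    # Placeholder rule (assumption); the real router would be called here.
    return THINKING if "prove" in prompt.lower() else NO_THINKING


def toy_thinking_backend(prompt: str) -> str:
    return f"[thinking] {prompt}"


def toy_direct_backend(prompt: str) -> str:
    return f"[direct] {prompt}"


for p in ["Translate this sentence.", "Prove that sqrt(2) is irrational."]:
    print(dispatch(p, toy_classify, toy_thinking_backend, toy_direct_backend))
```

Passing the classifier and both backends as callables keeps the router decision separate from the generation step, which matches the point that the router itself never produces text.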
Model Details
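In recent years, large language models have added thinking/reasoning modes (e.g. the o-series, Qwen3, DeepSeek-R1), but thinking mode substantially increases token usage and latency. In practice, most prompts (for example "write me a thank-you letter" or "translate this passage") do not need chain-of-thought; only the minority of questions that require multi-step reasoning benefit from thinking mode.
embeddinggemma-300m-cot-router is a lightweight classifier designed for this problem: EmbeddingGemma 300M is fine-tuned on tw-think-router-annotation so that it can decide, at the front of the system, whether a request should be sent to a thinking or a no-thinking model/mode.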
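To put rough numbers on the cost motivation, the following back-of-the-envelope sketch compares always-on thinking with routed traffic. Every figure in it (the share of prompts that need reasoning, the tokens per response) is a hypothetical placeholder, not a measurement from this model card, and the function name is illustrative only.

```python
def expected_tokens_per_request(share_thinking: float,
                                tokens_thinking: float,
                                tokens_direct: float) -> float:
    """Average output tokens per request when only a share of traffic uses thinking mode."""
    if not 0.0 <= share_thinking <= 1.0:
        raise ValueError("share_thinking must be within [0, 1]")
    return share_thinking * tokens_thinking + (1.0 - share_thinking) * tokens_direct


# Hypothetical workload (assumption): 20% of prompts need multi-step reasoning,
# a thinking response costs 2000 tokens, a direct response costs 300 tokens.
always_thinking = expected_tokens_per_request(1.0, 2000, 300)
routed = expected_tokens_per_request(0.2, 2000, 300)
savings = 1.0 - routed / always_thinking

print(f"always thinking: {always_thinking:.0f} tokens/request")
print(f"routed:          {routed:.0f} tokens/request")
print(f"savings:         {savings:.0%}")
```

The estimate assumes a perfect router; misrouted prompts would shift tokens in either direction, so real savings depend on the classifier's accuracy on the actual traffic.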
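Key Features
- Embedding-based routing: classification is done with a small sentence-transformers-style model, so inference costs far less than using a large LLM for routing.
- Lightweight at the 300M scale: can be deployed on CPU or edge nodes as a gateway service.
- Saves cost and latency: intercepts unnecessary requests in front of the thinking model, reducing token consumption and response time.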
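The card does not describe how the binary label is derived from the embeddings, so the following is only one possible way to turn an embedding model into a router: nearest-centroid classification with cosine similarity. It is an assumption for illustration, not the author's method. The functions `fit_centroids` and `classify`, the label names, and the `toy_encode` stand-in (keyword counts instead of real embeddings) are all hypothetical.

```python
import math
from typing import Callable, Dict, List, Sequence

Vector = List[float]
Encoder = Callable[[List[str]], Sequence[Sequence[float]]]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def centroid(vectors: Sequence[Sequence[float]]) -> Vector:
    """Component-wise mean of a non-empty set of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def fit_centroids(encode: Encoder, labelled: Dict[str, List[str]]) -> Dict[str, Vector]:
    """Compute one centroid per label from example prompts."""
    return {label: centroid(encode(prompts)) for label, prompts in labelled.items()}


def classify(encode: Encoder, centroids: Dict[str, Vector], prompt: str) -> str:
    """Return the label whose centroid is most similar to the prompt embedding."""
    vec = encode([prompt])[0]
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


def toy_encode(texts: List[str]) -> List[Vector]:
    """Stand-in encoder (assumption): keyword count plus a constant feature."""
    keywords = ("prove", "how many", "why", "step")
    return [[float(sum(k in t.lower() for k in keywords)), 1.0] for t in texts]


examples = {
    "thinking": ["Prove that the sum of two even numbers is even.",
                 "How many ways can 5 people sit in a row, and why?"],
    "no-thinking": ["Write a thank-you letter.", "Translate this paragraph."],
}
centroids = fit_centroids(toy_encode, examples)
print(classify(toy_encode, centroids, "Why is the sky blue? Explain step by step."))
print(classify(toy_encode, centroids, "Say hello."))
```

In principle, the `encode` method of a loaded `SentenceTransformer` could be passed in place of `toy_encode`, since it also maps a list of strings to one vector per string; whether centroids are a good fit for this particular model is untested here.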
Model Description
- Developed by: Liang Hsun Huang
- Funded by: APMIC
- Base model: google/embeddinggemma-300m
- Model type: Sentence-transformer based binary classifier
- Language(s) (NLP): Traditional Chinese, English
- License: MIT
- Finetuned from model: google/embeddinggemma-300m
Model Sources
- Repository: lianghsun/embeddinggemma-300m-cot-router
- Training data: lianghsun/tw-think-router-annotation
Citation
@misc{embeddinggemma_300m_cot_router,
title = {embeddinggemma-300m-cot-router: A Lightweight Thinking-mode Routing Classifier},
author = {Huang, Liang Hsun},
year = {2025},
howpublished = {\url{https://huggingface.co/lianghsun/embeddinggemma-300m-cot-router}}
}
Acknowledgements
- Special thanks to APMIC for the compute support.
Model Card Authors
Model Card Contact