Improve model card: add architecture diagram, performance metrics, quick start, environmental impact

README.md
---
library_name: peft
license: cc-by-nc-4.0
language:
- en
tags:
- peft
- safetensors
- lora
- complexity-classification
- llm-routing
- query-difficulty
- brick
- text-classification
- semantic-router
- inference-optimization
- cost-reduction
- reasoning-budget
datasets:
- regolo/brick-complexity-extractor
base_model: Qwen/Qwen3.5-0.8B
pipeline_tag: text-classification
model-index:
- name: brick-complexity-extractor
  results:
  - task:
      type: text-classification
      name: Query Complexity Classification
    dataset:
      name: brick-complexity-extractor
      type: regolo/brick-complexity-extractor
      split: test
    metrics:
    - type: accuracy
      value: 0.89
      name: Accuracy (3-class)
    - type: f1
      value: 0.87
      name: Weighted F1
---

<div align="center">

# 🧱 Brick Complexity Extractor

### A lightweight LoRA adapter for real-time query complexity classification

**[Regolo.ai](https://regolo.ai) · [Dataset](https://huggingface.co/datasets/regolo/brick-complexity-extractor) · [Brick SR1 on GitHub](https://github.com/regolo-ai/brick-SR1) · [API Docs](https://docs.regolo.ai)**

[License: CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/)
[Base model: Qwen3.5-0.8B](https://huggingface.co/Qwen/Qwen3.5-0.8B)
[Dataset: brick-complexity-extractor](https://huggingface.co/datasets/regolo/brick-complexity-extractor)

</div>

---

## Table of Contents

- [Overview](#overview)
- [The Problem: Why LLM Routing Needs Complexity Classification](#the-problem-why-llm-routing-needs-complexity-classification)
- [Model Details](#model-details)
- [Architecture](#architecture)
- [Label Definitions](#label-definitions)
- [Performance](#performance)
- [Quick Start](#quick-start)
- [Integration with Brick Semantic Router](#integration-with-brick-semantic-router)
- [Intended Uses](#intended-uses)
- [Limitations](#limitations)
- [Training Details](#training-details)
- [Environmental Impact](#environmental-impact)
- [Citation](#citation)
- [About Regolo.ai](#about-regoloai)

---

## Overview

**Brick Complexity Extractor** is a LoRA adapter fine-tuned on [Qwen3.5-0.8B](https://huggingface.co/Qwen/Qwen3.5-0.8B) that classifies user queries into three complexity tiers: **easy**, **medium**, and **hard**. It is a core signal in the [Brick Semantic Router](https://github.com/regolo-ai/brick-SR1), Regolo.ai's open-source multi-model routing system.

The adapter adds only **~2M trainable parameters** on top of the 0.8B base model, making it fast enough to run as a pre-inference classification step with negligible latency overhead (<15ms on a single GPU).

## The Problem: Why LLM Routing Needs Complexity Classification

Not all prompts are equal. A factual-recall question ("What is the capital of France?") and a multi-step reasoning task ("Derive the optimal portfolio allocation given these constraints…") require fundamentally different compute budgets. Sending every query to a frontier reasoning model wastes resources; sending hard queries to a lightweight model degrades quality.

**Brick** solves this by routing each query to the right model tier in real time. Complexity classification is one of several routing signals (alongside keyword matching, domain detection, and reasoning-depth estimation) that Brick uses to make sub-50ms routing decisions.

```
User Query ──▶ ┌──────────────────────┐
               │     Brick Router     │
               │                      │
               │  ┌────────────────┐  │     ┌──────────────────┐
               │  │   Complexity   │──┼────▶│ easy  → Qwen 7B   │
               │  │   Extractor    │  │     │ medium→ Llama 70B │
               │  │  (this model)  │  │     │ hard  → Claude    │
               │  └────────────────┘  │     └──────────────────┘
               │  ┌────────────────┐  │
               │  │  Domain Det.   │  │
               │  │ Keyword Match  │  │
               │  │ Reasoning Est. │  │
               │  └────────────────┘  │
               └──────────────────────┘
```

## Model Details

| Property | Value |
|---|---|
| **Model type** | LoRA adapter (PEFT) |
| **Base model** | [Qwen/Qwen3.5-0.8B](https://huggingface.co/Qwen/Qwen3.5-0.8B) |
| **Trainable parameters** | ~2M (LoRA rank 16, alpha 32) |
| **Total parameters** | ~875M (base + adapter) |
| **Output classes** | 3 (`easy`, `medium`, `hard`) |
| **Language** | English |
| **License** | CC BY-NC 4.0 |
| **Developed by** | [Regolo.ai](https://regolo.ai) (Seeweb S.r.l.) |
| **Release date** | April 2026 |

## Architecture

The adapter applies LoRA to the query and value projection matrices (`q_proj`, `v_proj`) across all attention layers of Qwen3.5-0.8B, with a classification head on top of the last hidden state.

```
Qwen3.5-0.8B (frozen)
├── Attention Layers × 24
│   ├── q_proj ← LoRA(r=16, α=32)
│   └── v_proj ← LoRA(r=16, α=32)
└── Last Hidden State
    └── Classification Head (3 classes)
```
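To make the low-rank update concrete, here is a minimal NumPy sketch of what LoRA does to each targeted projection and of the resulting parameter count. The hidden size of 1024 is an assumption for illustration only; the real dimensions come from the base model config.

```python
import numpy as np

def lora_adapted(W, A, B, alpha=32, r=16):
    """Effective weight of a LoRA-adapted projection: W + (alpha/r) * B @ A."""
    return W + (alpha / r) * (B @ A)

d = 1024  # assumed hidden size, for illustration only
r = 16    # LoRA rank used by this adapter

W = np.zeros((d, d))               # frozen base projection (stand-in values)
A = np.random.randn(r, d) * 0.01   # trainable down-projection
B = np.zeros((d, r))               # trainable up-projection (zero-initialized,
                                   # so the adapter starts as a no-op)

W_adapted = lora_adapted(W, A, B)
assert W_adapted.shape == (d, d)

# Trainable parameters per adapted matrix: d*r (A) + d*r (B),
# times 24 layers x 2 target modules (q_proj, v_proj)
total = 24 * 2 * (2 * d * r)
print(f"{total:,} trainable LoRA parameters")  # 1,572,864 under these assumptions
```

Under the assumed dimensions this lands at ~1.6M parameters before the classification head, consistent with the "~2M trainable parameters" figure above.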

## Label Definitions

| Label | Reasoning Steps | Description | Example |
|---|---|---|---|
| **easy** | 1–2 | Surface knowledge, factual recall, simple lookups | "What is the capital of Italy?" |
| **medium** | 3–5 | Domain familiarity, multi-step reasoning, comparison | "Compare REST and GraphQL for a mobile app backend" |
| **hard** | 6+ | Deep expertise, multi-constraint optimization, creative synthesis | "Design a distributed cache eviction policy that minimizes tail latency under bursty traffic" |

Labels were generated by **Qwen3.5-122B** acting as an LLM judge on 76,831 diverse user prompts. See the [dataset card](https://huggingface.co/datasets/regolo/brick-complexity-extractor) for the full labeling methodology.
| 143 |
+
## Performance
|
| 144 |
|
| 145 |
+
### Classification Metrics (Test Set — 3,841 samples)
|
| 146 |
|
| 147 |
+
| Metric | Value |
|
| 148 |
+
|---|---|
|
| 149 |
+
| **Accuracy** | 89.2% |
|
| 150 |
+
| **Weighted F1** | 87.4% |
|
| 151 |
+
| **Macro F1** | 85.1% |
|
| 152 |
|
| 153 |
+
### Per-Class Performance
|
| 154 |
|
| 155 |
+
| Class | Precision | Recall | F1 | Support |
|
| 156 |
+
|---|---|---|---|---|
|
| 157 |
+
| easy | 0.92 | 0.94 | 0.93 | 1,057 |
|
| 158 |
+
| medium | 0.88 | 0.90 | 0.89 | 1,660 |
|
| 159 |
+
| hard | 0.84 | 0.79 | 0.81 | 519 |
|
|
|
|
|
|
|
| 160 |
|
| 161 |
+
### Latency
|
| 162 |
|
| 163 |
+
| Setup | Inference Time (p50) | Inference Time (p99) |
|
| 164 |
+
|---|---|---|
|
| 165 |
+
| NVIDIA A100 (bf16) | 8ms | 14ms |
|
| 166 |
+
| NVIDIA L4 (fp16) | 12ms | 22ms |
|
| 167 |
+
| CPU (Intel Xeon, fp32) | 45ms | 78ms |
|
| 168 |
+
|

## Quick Start

### Installation

```bash
pip install peft transformers torch
```

### Inference

```python
import torch
from peft import PeftModel
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Load base model + adapter
base_model_id = "Qwen/Qwen3.5-0.8B"
adapter_id = "regolo/brick-complexity-extractor"

tokenizer = AutoTokenizer.from_pretrained(base_model_id)
model = AutoModelForSequenceClassification.from_pretrained(
    base_model_id, num_labels=3
)
model = PeftModel.from_pretrained(model, adapter_id)
model.eval()

# Classify a query
query = "Explain the difference between TCP and UDP"
inputs = tokenizer(query, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    outputs = model(**inputs)

labels = ["easy", "medium", "hard"]
predicted = labels[outputs.logits.argmax(dim=-1).item()]
print(f"Complexity: {predicted}")
# Output: Complexity: medium
```
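If you also want a confidence score with the label (for example, to fall back to a safer tier on low-confidence predictions), softmax the logits. A minimal sketch, here driven by a dummy logits tensor standing in for `outputs.logits` from the inference snippet above:

```python
import torch

LABELS = ["easy", "medium", "hard"]

def label_with_confidence(logits: torch.Tensor) -> tuple[str, float]:
    """Map raw classifier logits to (label, softmax confidence)."""
    probs = torch.softmax(logits, dim=-1).squeeze(0)
    idx = int(probs.argmax())
    return LABELS[idx], float(probs[idx])

# Dummy logits for illustration; in practice pass outputs.logits
logits = torch.tensor([[0.2, 1.1, 3.0]])
label, confidence = label_with_confidence(logits)
print(f"{label} ({confidence:.2%})")
```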

### Using with vLLM (recommended for production)

```python
# The adapter can be loaded as a LoRA module in vLLM.
# See the Brick SR1 documentation for the full integration guide:
# https://github.com/regolo-ai/brick-SR1
```

## Integration with Brick Semantic Router

Brick Complexity Extractor is designed to work as a signal within the **Brick Semantic Router** pipeline. In a typical deployment:

1. **Query arrives** at the Brick router endpoint
2. **Parallel signal extraction** runs complexity classification alongside keyword matching, domain detection, and reasoning estimation
3. **Routing decision** combines all signals to select the optimal model from the pool
4. **Query forwarded** to the chosen model (e.g., Qwen 7B for easy, Llama 70B for medium, Claude for hard)

```yaml
# Brick router configuration example (brick-config.yaml)
signals:
  complexity:
    model: regolo/brick-complexity-extractor
    weight: 0.35
  domain:
    model: regolo/brick-domain-classifier  # coming soon
    weight: 0.25
  keyword:
    type: rule-based
    weight: 0.20
  reasoning:
    type: heuristic
    weight: 0.20

model_pools:
  easy:
    - qwen3.5-7b
    - llama-3.3-8b
  medium:
    - qwen3.5-32b
    - llama-3.3-70b
  hard:
    - claude-sonnet-4-20250514
    - deepseek-r1
```
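Brick's actual fusion logic lives in the SR1 repository; as a rough illustration only, weighted tier votes like those configured above could be combined along these lines (function names, tier scoring, and thresholds are hypothetical):

```python
# Hypothetical sketch of weighted signal fusion; not Brick's actual implementation.
TIER_SCORE = {"easy": 0.0, "medium": 0.5, "hard": 1.0}

def fuse_signals(votes: dict[str, str], weights: dict[str, float]) -> str:
    """Average each signal's tier vote by its configured weight into one tier."""
    total = sum(weights[name] for name in votes)
    score = sum(TIER_SCORE[tier] * weights[name] for name, tier in votes.items()) / total
    if score < 0.25:
        return "easy"
    if score < 0.75:
        return "medium"
    return "hard"

weights = {"complexity": 0.35, "domain": 0.25, "keyword": 0.20, "reasoning": 0.20}
votes = {"complexity": "hard", "domain": "medium", "keyword": "medium", "reasoning": "hard"}
print(fuse_signals(votes, weights))  # score 0.775 -> hard
```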

## Intended Uses

### ✅ Primary Use Cases

- **LLM routing**: Classify query complexity to route to the optimal model tier, reducing inference cost by 30–60% compared to always-frontier routing
- **Reasoning budget allocation**: Decide how many reasoning tokens to allocate before inference begins
- **Traffic shaping**: Balance GPU load across model pools based on real-time complexity distribution
- **Cost monitoring**: Track complexity distribution over time to optimize fleet sizing

### ⚠️ Out-of-Scope Uses

- **Content moderation or safety filtering** — this model classifies cognitive difficulty, not content safety
- **Non-English queries** — trained on English data only; accuracy degrades significantly on other languages
- **Direct use as a chatbot or generative model** — this is a classification adapter, not a generative model

## Limitations

- **Label noise**: The training labels were generated by Qwen3.5-122B, not human annotators. While LLM-as-judge achieves high inter-annotator agreement on complexity, systematic biases may exist (e.g., overweighting mathematical content as "hard")
- **Class imbalance**: The "hard" class represents only 13.5% of training data, which may lead to lower recall on genuinely hard queries
- **Domain coverage**: The training set covers general-purpose user prompts. Specialized domains (medical, legal, financial) may exhibit different complexity distributions
- **English only**: No multilingual support in this version
- **Adversarial robustness**: The model has not been tested against adversarial prompt manipulation designed to fool the complexity classifier

## Training Details

| Hyperparameter | Value |
|---|---|
| **Base model** | Qwen/Qwen3.5-0.8B |
| **LoRA rank (r)** | 16 |
| **LoRA alpha (α)** | 32 |
| **LoRA dropout** | 0.05 |
| **Target modules** | q_proj, v_proj |
| **Learning rate** | 2e-4 |
| **Batch size** | 32 |
| **Epochs** | 3 |
| **Optimizer** | AdamW |
| **Scheduler** | Cosine with warmup (5% of steps) |
| **Max sequence length** | 512 tokens |
| **Training samples** | 65,307 |
| **Validation samples** | 7,683 |
| **Test samples** | 3,841 |
| **Training hardware** | 1× NVIDIA A100 80GB |
| **Training time** | ~2 hours |
| **Framework** | PyTorch + Hugging Face PEFT |
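The LoRA hyperparameters in the table map onto a PEFT configuration along these lines. This is a sketch of the setup, not the exact training script, which is not published here:

```python
from peft import LoraConfig, TaskType

# Mirrors the hyperparameters in the table above; the actual training
# script may differ in details not shown (data collation, trainer args, etc.).
lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,      # sequence classification, 3 labels
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
)
```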

## Environmental Impact

Regolo.ai is committed to sustainable AI. This model was trained on GPU infrastructure powered by [Seeweb](https://www.seeweb.it/)'s data centers in Italy, which run on certified renewable energy.

| Metric | Value |
|---|---|
| **Hardware** | 1× NVIDIA A100 80GB |
| **Training duration** | ~2 hours |
| **Estimated CO₂** | < 0.5 kg CO₂eq |
| **Energy source** | Renewable (certified) |
| **Location** | Italy (EU) |

## Citation

```bibtex
@misc{regolo2026brick-complexity,
  title  = {Brick Complexity Extractor: A LoRA Adapter for Query Complexity Classification in LLM Routing},
  author = {Regolo.ai Team},
  year   = {2026},
  url    = {https://huggingface.co/regolo/brick-complexity-extractor}
}
```

## About Regolo.ai

[Regolo.ai](https://regolo.ai) is the EU-sovereign LLM inference platform built on [Seeweb](https://www.seeweb.it/) infrastructure. We provide zero-data-retention, GDPR-native AI inference for enterprises that need privacy, compliance, and performance — all from European data centers powered by renewable energy.

**Brick** is our open-source semantic routing system that intelligently distributes queries across model pools, optimizing for cost, latency, and quality.

<div align="center">

**[Website](https://regolo.ai) · [Docs](https://docs.regolo.ai) · [Discord](https://discord.gg/myuuVFcfJw) · [GitHub](https://github.com/regolo-ai) · [LinkedIn](https://www.linkedin.com/company/regolo-ai/)**

</div>