**Tags:** Text Classification · PEFT · Safetensors · English · lora · complexity-classification · llm-routing · query-difficulty · brick · semantic-router · inference-optimization · cost-reduction · reasoning-budget
## Usage (PEFT)

How to use regolo/brick-complexity-2-max with PEFT:

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM

# Load the base model, then attach the LoRA adapter on top of it
base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3.5-0.8B")
model = PeftModel.from_pretrained(base_model, "regolo/brick-complexity-2-max")
```
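Once the adapter is loaded, the model expects the ChatML-style prompt used in this card's examples. A minimal sketch of the prompt construction (the system prompt text is quoted from the card's examples; the helper name and structure are our own, not part of the card):

```python
# Hypothetical helper, not part of the card: builds the ChatML prompt the
# card's examples use. The system prompt text is taken verbatim from the card.
SYSTEM = (
    "You are a query difficulty classifier for an LLM routing system.\n"
    "Classify each query as easy, medium, or hard based on the cognitive depth "
    "and domain expertise required to answer correctly.\n"
    "Respond with ONLY one word: easy, medium, or hard."
)

def build_prompt(query: str) -> str:
    """Wrap a user query in the ChatML format expected by the adapter."""
    return (
        f"<|im_start|>system\n{SYSTEM}<|im_end|>\n"
        f"<|im_start|>user\nClassify: {query}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )
```

The resulting string can then be tokenized and passed to `model.generate` as usual.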
## Usage (vLLM)

```python
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

llm = LLM(
    model="Qwen/Qwen3.5-0.8B",
    enable_lora=True,
    max_lora_rank=32,
    dtype="bfloat16",
)
sp = SamplingParams(temperature=0, max_tokens=3)

system = """You are a query difficulty classifier for an LLM routing system.
Classify each query as easy, medium, or hard based on the cognitive depth and domain expertise required to answer correctly.
Respond with ONLY one word: easy, medium, or hard."""
prompt = f"<|im_start|>system\n{system}<|im_end|>\n<|im_start|>user\nClassify: Explain the rendering equation from radiometric first principles<|im_end|>\n<|im_start|>assistant\n"

out = llm.generate(
    [prompt],
    sp,
    lora_request=LoRARequest("brick-complexity-2-max", 1, "regolo/brick-complexity-2-max"),
)
print(out[0].outputs[0].text.strip())
# Output: hard
```
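With greedy decoding and `max_tokens=3`, the completion should be exactly one of the three labels, but defensive parsing of the raw text costs little before routing on it. A hypothetical helper (ours, not part of the card):

```python
# Hypothetical helper, not part of the card: normalizes a raw completion
# into one of the three expected labels, with a fallback for stray output.
VALID_LABELS = {"easy", "medium", "hard"}

def parse_label(completion: str, default: str = "medium") -> str:
    """Return the first valid label token in a completion, else the default."""
    for token in completion.strip().lower().split():
        # Strip stray punctuation the model might emit around the label
        token = token.strip(".,:;!\"'")
        if token in VALID_LABELS:
            return token
    return default
```

Choosing `medium` as the fallback is one reasonable policy; a cost-sensitive deployment might prefer falling back to `hard` so ambiguous queries never land on an underpowered pool.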
## About Brick

[Regolo.ai](https://regolo.ai) is the EU-sovereign LLM inference platform built on [Seeweb](https://www.seeweb.it/) infrastructure. **Brick** is our open-source semantic routing system that intelligently distributes queries across model pools, optimizing for cost, latency, and quality.
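Downstream, a router consumes the classifier's labels to pick a model pool. A minimal illustrative sketch (the pool names and fallback policy are our assumptions, not Brick's actual configuration):

```python
# Illustrative only: pool names and fallback policy are assumptions,
# not Brick's actual configuration.
POOLS = {
    "easy": "small-model-pool",   # cheap, low-latency models
    "medium": "mid-tier-pool",
    "hard": "frontier-pool",      # most capable, most expensive models
}

def route(raw_label: str, default: str = "mid-tier-pool") -> str:
    """Normalize the classifier's output and pick a pool, with a safe fallback."""
    return POOLS.get(raw_label.strip().lower(), default)
```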