massaindustries committed
Commit 59b2353 · verified · 1 Parent(s): 229e943

Update model card: remove specific-LLM references, clarify variant purpose

Files changed (1)
README.md +28 -0
README.md CHANGED
@@ -87,6 +87,34 @@ print(tok.decode(out[0][ids.shape[1]:], skip_special_tokens=True).strip())
 # Output: hard
 ```
 
+## Usage (vLLM)
+
+```python
+from vllm import LLM, SamplingParams
+from vllm.lora.request import LoRARequest
+
+llm = LLM(
+    model="Qwen/Qwen3.5-0.8B",
+    enable_lora=True,
+    max_lora_rank=32,
+    dtype="bfloat16",
+)
+sp = SamplingParams(temperature=0, max_tokens=3)
+
+system = """You are a query difficulty classifier for an LLM routing system.
+Classify each query as easy, medium, or hard based on the cognitive depth and domain expertise required to answer correctly.
+Respond with ONLY one word: easy, medium, or hard."""
+prompt = f"<|im_start|>system\n{system}<|im_end|>\n<|im_start|>user\nClassify: Explain the rendering equation from radiometric first principles<|im_end|>\n<|im_start|>assistant\n"
+
+out = llm.generate(
+    [prompt],
+    sp,
+    lora_request=LoRARequest("brick-complexity-2-max", 1, "regolo/brick-complexity-2-max"),
+)
+print(out[0].outputs[0].text.strip())
+# Output: hard
+```
+
 ## About Brick
 
 [Regolo.ai](https://regolo.ai) is the EU-sovereign LLM inference platform built on [Seeweb](https://www.seeweb.it/) infrastructure. **Brick** is our open-source semantic routing system that intelligently distributes queries across model pools, optimizing for cost, latency, and quality.
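
The added `## Usage (vLLM)` section classifies one prompt at a time. Because `llm.generate` accepts a list of prompts, the same setup extends naturally to batches; a minimal sketch, reusing `llm`, `sp`, and `system` from the snippet above (the example queries here are illustrative, not from the model card):

```python
queries = [
    "What is the capital of France?",
    "Explain the rendering equation from radiometric first principles",
]
prompts = [
    f"<|im_start|>system\n{system}<|im_end|>\n"
    f"<|im_start|>user\nClassify: {q}<|im_end|>\n"
    f"<|im_start|>assistant\n"
    for q in queries
]

# One call batches all prompts; vLLM returns outputs in input order.
outs = llm.generate(
    prompts,
    sp,
    lora_request=LoRARequest("brick-complexity-2-max", 1, "regolo/brick-complexity-2-max"),
)
for q, o in zip(queries, outs):
    print(f"{q!r} -> {o.outputs[0].text.strip()}")
```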
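
The About Brick paragraph describes routing queries across model pools by difficulty. A minimal sketch of that idea built on this classifier's output; the pool names and the `route` helper are hypothetical illustrations, not part of the Brick codebase:

```python
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

# Hypothetical label -> pool mapping; real Brick pools will differ.
POOLS = {"easy": "pool-small", "medium": "pool-medium", "hard": "pool-large"}

SYSTEM = """You are a query difficulty classifier for an LLM routing system.
Classify each query as easy, medium, or hard based on the cognitive depth and domain expertise required to answer correctly.
Respond with ONLY one word: easy, medium, or hard."""

SP = SamplingParams(temperature=0, max_tokens=3)
ADAPTER = LoRARequest("brick-complexity-2-max", 1, "regolo/brick-complexity-2-max")


def route(llm: LLM, query: str) -> str:
    """Classify `query` with the Brick adapter and return the target pool name."""
    prompt = (
        f"<|im_start|>system\n{SYSTEM}<|im_end|>\n"
        f"<|im_start|>user\nClassify: {query}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )
    out = llm.generate([prompt], SP, lora_request=ADAPTER)
    label = out[0].outputs[0].text.strip().lower()
    # An unexpected label falls back to the strongest pool instead of failing.
    return POOLS.get(label, POOLS["hard"])


llm = LLM(model="Qwen/Qwen3.5-0.8B", enable_lora=True, max_lora_rank=32, dtype="bfloat16")
print(route(llm, "What is 2 + 2?"))  # expected: pool-small
```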