Training dataset: DevQuasar/llm_router_dataset-synth
How to use anthonyivn/ModernBERT-Base-llm-router with Transformers:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-classification", model="anthonyivn/ModernBERT-Base-llm-router")
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("anthonyivn/ModernBERT-Base-llm-router")
model = AutoModelForSequenceClassification.from_pretrained("anthonyivn/ModernBERT-Base-llm-router")
```

This model is a fine-tuned version of the answerdotai/ModernBERT-base model, trained on the DevQuasar/llm_router_dataset-synth dataset.
The fine-tuned model achieves the per-epoch validation results shown in the table below.
This model was trained on an RTX 4090 GPU.
See the original answerdotai/ModernBERT-base model card for additional information. This model is intended to classify queries for LLM routing: advanced/complicated queries are labeled 1 (large_llm) and simpler queries are labeled 0 (small_llm).
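As an illustration of how the two labels might drive a routing decision, here is a minimal sketch. The `route_query` helper, the confidence threshold, and the downstream model names are hypothetical additions, not part of this repository; the label strings are assumed to match the `small_llm`/`large_llm` names above.

```python
# Hypothetical downstream targets; substitute your own model endpoints.
LABEL_TO_MODEL = {
    "small_llm": "cheap-fast-model",    # label 0: simple query
    "large_llm": "large-capable-model", # label 1: complex query
}

def route_query(result: dict, threshold: float = 0.5) -> str:
    """Map a text-classification result such as
    {"label": "large_llm", "score": 0.98} to a target model name.
    Falls back to the small model when confidence is below the threshold."""
    if result["score"] < threshold:
        return LABEL_TO_MODEL["small_llm"]
    return LABEL_TO_MODEL[result["label"]]

# Usage with the pipeline loaded above (sketch):
# target = route_query(pipe("Prove that sqrt(2) is irrational.")[0])
```

The threshold-based fallback is one possible design choice: when the classifier is unsure, defaulting to the small model keeps cost low, while defaulting to the large model would favor answer quality instead.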
The following hyperparameters were used during training:
GITHUB URL TO BE ADDED
| Epoch | Validation Loss | F1 |
|---|---|---|
| 1.0 | 0.0296 | 0.9907 |
| 2.0 | 0.0327 | 0.9911 |
| 3.0 | 0.0474 | 0.9933 |
| 4.0 | 0.0563 | 0.9933 |
| 5.0 | 0.0554 | 0.9933 |
Base model: answerdotai/ModernBERT-base