---
language:
- en
license: other
license_name: nvidia-open-model-license
license_link: https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license/
base_model: nvidia/Nemotron-3-Nano-30B-A3B
tags:
- telecommunications
- 3gpp
- o-ran
- ietf
- telecom
- peft
- lora
- nemotron
- mixture-of-experts
- gsma
- network-slicing
- anomaly-detection
- srsran
pipeline_tag: text-generation
library_name: transformers
model-index:
- name: AdaptKey-Nemotron-30b
  results:
  - task:
      type: text-generation
      name: Telecom Domain Benchmark
    metrics:
    - type: accuracy
      value: 596
      name: GSMA Open-Telco Composite Score (vs Baseline 538)
---

# AdaptKey/AdaptKey-Nemotron-30b

## Overview

**AdaptKey-Nemotron-30b** is a LoRA fine-tuned version of NVIDIA's Nemotron-3-Nano-30B model, specialized for telecommunications and network engineering applications. The model was trained on 1.3M+ telecom-domain examples covering 3GPP standards, IETF protocols, network traces, anomaly detection, and network function configuration.

This model achieved a **composite benchmark score of 596** — a **+58-point improvement (+10.8%)** over the NVIDIA Nemotron-3-Nano-30B-A3B baseline of 538 — while using conservative anti-forgetting training strategies to preserve general capabilities.

## Benchmark Results

Evaluated via the **TeleFlow** evaluation system on 2/9/2026. See [Evaluation Methodology](#evaluation-methodology) below for full details on scoring.

| Model | TeleLogs | TeleMath | TeleQnA | TSG-3GPP | TeleYaml | TeleTables | srsRAN | ORAN | **Total** |
|---|---|---|---|---|---|---|---|---|---|
| **Baseline** — NVIDIA-Nemotron-3-Nano-30B-A3B-BF16 | 48.8 | 66.4 | 86.1 | 44.0 | 62.5 | 61.0 | 85.0 | 84.1 | **538** |
| **AdaptKey-Nemotron-30b** (this model) | **61.6** | **74.0** | **88.2** | **48.0** | **79.3** | **72.8** | **86.0** | **86.4** | **596** |
| **Δ improvement** | +12.8 | +7.6 | +2.1 | +4.0 | +16.8 | +11.8 | +1.0 | +2.3 | **+58** |

### Strongest Gains

- **TeleYaml** +16.8 pts (+26.9%) — structured YAML generation for network configs
- **TeleLogs** +12.8 pts (+26.2%) — network log analysis and fault diagnosis
- **TeleTables** +11.8 pts (+19.3%) — tabular reasoning over network parameters

---

## Evaluation Methodology

### Overview

AdaptKey uses a two-tier scoring system designed to minimize judge cost while maximizing evaluation accuracy:

1. **Deterministic scoring** — applied first whenever the answer is objectively verifiable (exact-match multiple choice, numeric answers). Scores are 10 (correct) or 0 (incorrect). The LLM judge is skipped entirely for these cases, eliminating variance and cost.
2. **LLM-as-a-Judge** — invoked for all remaining responses where deterministic checking cannot conclusively score quality.

### Judge Model

| Property | Value |
|---|---|
| Model | `openai/gpt-oss-120b` |
| Temperature | 0.1 (near-deterministic for consistency) |
| Max output tokens | 300 |
| Output format | Structured JSON `{"score": <int>, "reasoning": "<str>"}` |

### Scoring Rubrics

Two rubrics are applied, depending on benchmark type:

#### Rubric A — Free-Text Technical Answers

*Applied to: TeleQnA, TeleMath, TeleLogs, TSG-3GPP*

The judge evaluates three criteria simultaneously:

- **Factual Accuracy** — Are the key technical facts correct?
- **Completeness** — Does the response cover the main points from the reference answer?
- **Correctness** — Are there any incorrect statements that would mislead an engineer?

| Score | Interpretation |
|---|---|
| 10 | All key facts present and correct |
| 7–9 | Mostly correct, minor omissions or imprecisions |
| 4–6 | Partially correct, some important errors or omissions |
| 1–3 | Mostly incorrect or very incomplete |
| 0 | Completely wrong, off-topic, or empty |

#### Rubric B — Structured Configuration Answers

*Applied to: TeleYaml, TeleTables*

The judge evaluates two weighted axes:

- **Structural Validity (40%)** — Is the output a valid configuration with correct syntax?
- **Content Accuracy (60%)** — Do field names and values match the expected configuration? Partial credit is awarded proportionally to the ratio of correct fields to total fields.

| Score | Interpretation |
|---|---|
| 10 | Perfect match — all fields correct |
| 8–9 | Valid structure, 1–2 minor value differences |
| 5–7 | Valid structure, several wrong values or missing fields |
| 1–4 | Invalid structure or mostly wrong |
| 0 | Empty, completely wrong, or unparseable |

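As a worked example of the 40/60 weighting with proportional partial credit (a sketch of the arithmetic; the judge applies this rubric qualitatively rather than via an explicit formula):

```python
def rubric_b_score(structurally_valid: bool, correct_fields: int, total_fields: int) -> float:
    """Illustrative Rubric B arithmetic: 40% structure + 60% content,
    with content credit proportional to the fraction of correct fields."""
    structure = 10.0 if structurally_valid else 0.0
    content = 10.0 * correct_fields / total_fields if total_fields else 0.0
    return round(0.4 * structure + 0.6 * content, 1)


# Valid YAML with 9 of 10 fields correct: 0.4*10 + 0.6*9 = 9.4
assert rubric_b_score(True, 9, 10) == 9.4
# Invalid structure, half the fields recoverable: 0 + 0.6*5 = 3.0
assert rubric_b_score(False, 5, 10) == 3.0
```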
### Judge Prompt Structure

Each judge invocation consists of two messages:

**System message:**

```
You are a strict telecom evaluation judge. Score accurately based on the rubric.
Output ONLY the JSON object.
```

**User message:**

```
Question: {question}

Reference Answer: {reference_answer}

Model Response: {model_response}

Scoring Rubric:
{applicable_rubric}

Output JSON: {"score": <0-10>, "reasoning": "<brief explanation>"}
```

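A minimal sketch of how such a judge call can be assembled and its verdict parsed. The helper names are illustrative, and the clamping of out-of-range scores is an assumption, not documented TeleFlow behavior:

```python
import json

SYSTEM = ("You are a strict telecom evaluation judge. Score accurately based on the rubric.\n"
          "Output ONLY the JSON object.")

USER_TEMPLATE = """Question: {question}

Reference Answer: {reference_answer}

Model Response: {model_response}

Scoring Rubric:
{rubric}

Output JSON: {{"score": <0-10>, "reasoning": "<brief explanation>"}}"""


def build_messages(question, reference_answer, model_response, rubric):
    """Assemble the two-message judge prompt described above."""
    return [
        {"role": "system", "content": SYSTEM},
        {"role": "user", "content": USER_TEMPLATE.format(
            question=question, reference_answer=reference_answer,
            model_response=model_response, rubric=rubric)},
    ]


def parse_verdict(raw: str) -> tuple[int, str]:
    """Extract the structured verdict; clamp the score into 0-10."""
    obj = json.loads(raw)
    return max(0, min(10, int(obj["score"]))), obj.get("reasoning", "")


assert parse_verdict('{"score": 8, "reasoning": "minor omission"}') == (8, "minor omission")
```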
### Retry Policy

If the judge scores a response below a configurable threshold, the model is re-prompted up to **5 times**, and the **best score across all attempts** is recorded. This measures the model's capability ceiling rather than single-shot performance, and is applied consistently across all evaluated models, including the baseline.

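The retry policy amounts to a best-of-N loop. A sketch, where `generate` and `judge` are placeholders and the threshold value of 7 is an assumed default (the document only says it is configurable):

```python
def best_of_n(generate, judge, threshold: int = 7, max_attempts: int = 5) -> int:
    """Re-prompt until the judge score clears the threshold; record the best
    score across all attempts (capability ceiling, not single-shot score)."""
    best = 0
    for _ in range(max_attempts):
        best = max(best, judge(generate()))
        if best >= threshold:
            break
    return best


# Scores improve across attempts (4, 6, 9): the loop stops at 9
scores = iter([4, 6, 9])
assert best_of_n(generate=lambda: "response", judge=lambda r: next(scores)) == 9
```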
### Benchmark-to-Rubric Mapping

| Benchmark | Rubric | Deterministic Bypass |
|---|---|---|
| TeleQnA | A — Free-Text Technical | Where multiple-choice |
| TeleMath | A — Free-Text Technical | Numeric exact-match |
| TeleLogs | A — Free-Text Technical | Classification labels |
| TSG-3GPP | A — Free-Text Technical | Where multiple-choice |
| TeleYaml | B — Structured Configuration | N/A |
| TeleTables | B — Structured Configuration | N/A |
| srsRAN | A — Free-Text Technical | Where multiple-choice |
| ORAN | A — Free-Text Technical | Where multiple-choice |

## What We Did

- **Goal**: Create a specialized telecom AI assistant with expert-level knowledge of 3GPP, IETF, ITU, and TM Forum standards
- **Approach**: LoRA fine-tuning with conservative hyperparameters to prevent catastrophic forgetting
- **Dataset**: 1.3M+ telecom Q&A examples with augmented network slicing and network function configuration data
- **Base model**: NVIDIA Nemotron-3-Nano-30B-A3B (Megatron format)

## Training Data

### Dataset Composition (~1.31M examples)

| Split | Examples |
|---|---|
| Train | 1,303,277 |
| Validation | 5,000 |
| Test | 5,000 |
| **Total** | **1,313,277** |

### Domain Coverage

- **Network Traces & Anomaly Detection**: 5G trace analysis, KPI statistics, anomaly classification
- **Network Slicing**: S-NSSAI configuration, slice types (eMBB, URLLC, mMTC), resource allocation
- **Network Function Configuration**: Open5GS YAML generation, AMF/SMF/UPF configuration
- **3GPP Standards Q&A**: Core network procedures, RAN protocols, signaling
- **Network Forecasting**: Trend analysis, traffic prediction
- **Troubleshooting**: Root cause analysis, diagnostic procedures

### Data Format

```json
{
  "input": "System: You are an expert telecommunications engineer...\nUser: [question with context]",
  "output": "[detailed answer with reasoning]"
}
```

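Because the system and user turns are packed into a single `input` string, a loader has to split them back apart. A sketch, assuming the `System: ...\nUser: ...` layout shown in the example record holds for every row:

```python
def to_messages(record: dict) -> list[dict]:
    """Split the packed 'input' field into chat-style messages.
    Assumes the 'System: ...\\nUser: ...' layout shown above."""
    system_part, user_part = record["input"].split("\nUser: ", 1)
    return [
        {"role": "system", "content": system_part.removeprefix("System: ")},
        {"role": "user", "content": user_part},
        {"role": "assistant", "content": record["output"]},
    ]


record = {
    "input": "System: You are an expert telecommunications engineer.\nUser: What is S-NSSAI?",
    "output": "S-NSSAI identifies a network slice...",
}
assert [m["role"] for m in to_messages(record)] == ["system", "user", "assistant"]
```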
## Training Details

### LoRA Hyperparameters

| Parameter | Value | Notes |
|---|---|---|
| LoRA dim (rank) | 64 | Adapter capacity |
| LoRA alpha | 128 | 2:1 ratio for gentler gradient flow |
| LoRA dropout | 0.1 | Regularization to prevent overfitting |
| Target modules | linear_qkv, linear_proj, linear_fc1, linear_fc2, in_proj, out_proj | Attention, Mamba, and MLP layers |

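For reference, the same adapter shape expressed as a Hugging Face PEFT config. This is an approximation only: training actually used a custom Megatron-Bridge LoRA wrapper, and whether these module names match the exported HF checkpoint's layer naming is an assumption.

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=64,                  # LoRA rank (adapter capacity)
    lora_alpha=128,        # 2:1 alpha-to-rank ratio
    lora_dropout=0.1,      # regularization against overfitting
    target_modules=[       # assumed HF-side names for the Megatron modules above
        "linear_qkv", "linear_proj",
        "linear_fc1", "linear_fc2",
        "in_proj", "out_proj",
    ],
    task_type="CAUSAL_LM",
)
```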
### Training Configuration

| Parameter | Value | Notes |
|---|---|---|
| Base model | Nemotron-3-Nano-30B-A3B (Megatron) | |
| Training iterations | 10,500 | ~1.03 epochs |
| Learning rate | 5e-5 | Conservative to prevent forgetting |
| LR warmup | 525 steps | 5% of total iterations |
| LR decay | Cosine to 10,500 | |
| Global batch size | 128 | |
| Micro batch size | 4 | Per GPU |
| Gradient accumulation | 8 steps | |
| Max sequence length | 2,048 | |
| Precision | BF16 | |
| Checkpoint interval | 1,000 steps | |

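The batch-size bookkeeping above is internally consistent, assuming the 4 GPUs act as 4 data-parallel ranks (TP=1, PP=1; expert parallelism shares the same ranks):

```python
micro_batch = 4      # per GPU
data_parallel = 4    # 4x H100, TP=1, PP=1
grad_accum = 8       # gradient-accumulation steps

global_batch = micro_batch * data_parallel * grad_accum
assert global_batch == 128

# ~1.03 epochs over the 1,303,277-example train split
iterations = 10_500
examples_seen = iterations * global_batch  # 1,344,000
assert round(examples_seen / 1_303_277, 2) == 1.03
```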
### Infrastructure

| Property | Value |
|---|---|
| Hardware | 4x NVIDIA H100 NVL 94GB (NVLink-connected) |
| Framework | NeMo/Megatron-Bridge with custom LoRA wrapper |
| Container | `nvcr.io/nvidia/nemo:25.11.nemotron_3_nano` |
| Training time | 84 hours |

### Parallelism

| Parameter | Value |
|---|---|
| Expert parallel | 4 |
| Tensor parallel | 1 |
| Pipeline parallel | 1 |
| MoE token dispatcher | alltoall |

## Training Progress

| Checkpoint | Train Loss | Val Loss | Val PPL |
|---|---|---|---|
| iter 500 | 0.402 | 0.242 | 1.274 |
| iter 1000 | 0.367 | 0.145 | 1.156 |
| iter 1500 | 0.381 | 0.118 | 1.125 |
| iter 2000 | 0.432 | 0.130 | 1.139 |
| iter 2500 | 0.377 | 0.139 | 1.149 |
| iter 3000 | 0.391 | 0.108 | 1.114 |
| **iter 10500 (final)** | **0.356** | **0.150** | **1.162** |

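The perplexity column is simply exp(validation loss), which makes a quick sanity check possible when reading such tables:

```python
import math

# Val-loss values from selected checkpoints in the table above
checkpoints = {500: 0.242, 1000: 0.145, 3000: 0.108, 10_500: 0.150}
for it, val_loss in checkpoints.items():
    print(f"iter {it}: val_loss={val_loss} -> ppl={math.exp(val_loss):.3f}")

assert round(math.exp(0.150), 3) == 1.162  # matches the final checkpoint
```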
## Version History

| Version | Dataset Size | Val Loss | Val PPL | Benchmark |
|---|---|---|---|---|
| **AdaptKey-Nemotron-30b** (this model) | **1,303,277** | **0.150** | **1.162** | **596 composite** |

### Key Improvements in This Version

- Augmented network slicing examples to address weak benchmark performance
- Enhanced network function configuration coverage
- Improved system prompts (removed misleading "telco expert" framing for non-telco questions)
- +58-point (+10.8% relative) improvement on the composite benchmark over the NVIDIA baseline

## Post-Training Pipeline

```bash
# Merge LoRA weights
torchrun --nproc-per-node=4 \
  /opt/Megatron-Bridge/examples/peft/merge_lora.py \
  --lora-checkpoint /models/AdaptKey-Nemotron-30b-lora/iter_0010500 \
  --hf-model-path /models/nemotron-30b \
  --output /models/AdaptKey-Nemotron-30b-merged

# Export to HuggingFace format
python /opt/Megatron-Bridge/examples/conversion/convert_checkpoints.py export \
  --hf-model /models/nemotron-30b \
  --megatron-path /models/AdaptKey-Nemotron-30b-merged \
  --hf-path /models/AdaptKey-Nemotron-30b-hf-export
```

## Usage

### With Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "AdaptKey/AdaptKey-Nemotron-30b",
    trust_remote_code=True,
    torch_dtype="bfloat16",
    device_map="auto",  # place the 30B model on available GPUs
)
tokenizer = AutoTokenizer.from_pretrained(
    "AdaptKey/AdaptKey-Nemotron-30b",
    trust_remote_code=True,
)

prompt = """System: You are an expert telecommunications engineer. Answer questions accurately based on your knowledge of telecom standards (3GPP, IETF, ITU, TM Forum).

User: Explain the difference between eMBB, URLLC, and mMTC slice types in 5G network slicing."""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

### With vLLM

```python
from vllm import LLM, SamplingParams

llm = LLM(
    model="AdaptKey/AdaptKey-Nemotron-30b",
    trust_remote_code=True,
    tensor_parallel_size=1,
    gpu_memory_utilization=0.90,
)

prompt = "Explain the difference between eMBB, URLLC, and mMTC slice types in 5G network slicing."

sampling_params = SamplingParams(temperature=0.7, max_tokens=512)
outputs = llm.generate([prompt], sampling_params)
print(outputs[0].outputs[0].text)
```

### Docker Compose (vLLM Server)

```yaml
services:
  vllm-adaptkey:
    image: vllm/vllm-openai:latest
    container_name: vllm-adaptkey-nemotron-30b
    runtime: nvidia
    environment:
      - NVIDIA_VISIBLE_DEVICES=0
    ports:
      - "8090:8000"
    volumes:
      - /opt/models:/models:ro
    command: >
      --model /models/AdaptKey-Nemotron-30b
      --trust-remote-code
      --max-model-len 8192
      --gpu-memory-utilization 0.90
      --tensor-parallel-size 1
    restart: unless-stopped
```
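
Once the compose service is up, the server speaks the OpenAI-compatible API on host port 8090. A stdlib-only client sketch; the endpoint path and response shape follow the OpenAI chat completions API, and the question text is illustrative:

```python
import json
from urllib import request

# Request payload for the OpenAI-compatible /v1/chat/completions endpoint
payload = {
    "model": "/models/AdaptKey-Nemotron-30b",  # must match the --model flag above
    "messages": [
        {"role": "system", "content": "You are an expert telecommunications engineer."},
        {"role": "user", "content": "What does S-NSSAI encode in 5G network slicing?"},
    ],
    "max_tokens": 512,
    "temperature": 0.7,
}


def query(base_url: str = "http://localhost:8090") -> str:
    """POST the payload to the vLLM server and return the reply text."""
    req = request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```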

## Lessons Learned

1. **Anti-forgetting strategy works**: Conservative LoRA parameters (rank 64, alpha 128, dropout 0.1) with a 5e-5 LR preserved general capabilities
2. **Data quality matters more than quantity**: Improving weak-area examples had more impact than adding more data
3. **System prompt alignment**: Mismatched system prompts (e.g., "telco expert" for ethics questions) hurt performance
4. **Mixed datasets**: Combining diverse telecom subcategories prevents narrow specialization

## License

This model is derived from NVIDIA's Nemotron-3-Nano-30B and is subject to the [NVIDIA Open Model License Agreement](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license/). Please review the license terms before use in commercial applications.

## Citation

```bibtex
@misc{adaptkey_nemotron_30b_2026,
  title={AdaptKey-Nemotron-30b: A Telecom-Specialized Language Model},
  author={AdaptKey},
  year={2026},
  publisher={HuggingFace},
  url={https://huggingface.co/AdaptKey/AdaptKey-Nemotron-30b}
}
```