---
language:
- en
license: other
license_name: nvidia-open-model-license
license_link: https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license/
base_model: nvidia/Nemotron-3-Nano-30B-A3B
tags:
- telecommunications
- 3gpp
- o-ran
- ietf
- telecom
- peft
- lora
- nemotron
- mixture-of-experts
- gsma
- network-slicing
- anomaly-detection
- srsran
pipeline_tag: text-generation
library_name: transformers
model-index:
- name: AdaptKey-Nemotron-30b
  results:
  - task:
      type: text-generation
      name: Telecom Domain Benchmark
    metrics:
    - type: accuracy
      value: 596
      name: GSMA Open-Telco Composite Score (vs Baseline 538)
---

# AdaptKey/AdaptKey-Nemotron-30b

## Overview

**AdaptKey-Nemotron-30b** is a LoRA fine-tuned version of NVIDIA's Nemotron-3-Nano-30B model, specialized for telecommunications and network engineering applications. The model was trained on 1.3M+ telecom domain examples covering 3GPP standards, IETF protocols, network traces, anomaly detection, and network function configuration.

This model achieved a **composite benchmark score of 596**, a **+58 point improvement (+10.8%)** over the NVIDIA Nemotron-3-Nano-30B-A3B baseline of 538, while using conservative anti-forgetting training strategies to preserve general capabilities.

## Benchmark Results

Evaluated via the **TeleFlow** evaluation system on 2/9/2026. See [Evaluation Methodology](#evaluation-methodology) below for full details on scoring.


| Model | TeleLogs | TeleMath | TeleQnA | 3GPPTSG | TeleYaml | TeleTables | srsRAN | ORAN | **Total** |
|---|---|---|---|---|---|---|---|---|---|
| **Baseline**: NVIDIA-Nemotron-3-Nano-30B-A3B-BF16 | 48.8 | 66.4 | 86.1 | 44 | 62.5 | 61 | 85 | 84.1 | **538** |
| **AdaptKey-Nemotron-30b** (this model) | **61.6** | **74** | **88.2** | **48** | **79.3** | **72.8** | **86** | **86.4** | **596** |
| **Δ improvement** | +12.8 | +7.6 | +2.1 | +4.0 | +16.8 | +11.8 | +1.0 | +2.3 | **+58** |

### Strongest Gains

- **TeleYaml** +16.8 pts (+26.9%): structured YAML generation for network configs
- **TeleLogs** +12.8 pts (+26.2%): network log analysis and fault diagnosis
- **TeleTables** +11.8 pts (+19.3%): tabular reasoning over network parameters

---

## Evaluation Methodology

### Overview

AdaptKey uses a two-tier scoring system designed to minimize judge cost while maximizing evaluation accuracy:

1. **Deterministic scoring**: applied first whenever the answer is objectively verifiable (exact-match multiple choice, numeric answers). Scores are 10 (correct) or 0 (incorrect). The LLM judge is skipped entirely for these cases, eliminating variance and cost.
2. **LLM-as-a-Judge**: invoked for all remaining responses, where deterministic checking cannot conclusively score quality.

### Judge Model

| Property | Value |
|---|---|
| Model | `openai/gpt-oss-120b` |
| Temperature | 0.1 (near-deterministic for consistency) |
| Max output tokens | 300 |
| Output format | Structured JSON `{"score": <0-10>, "reasoning": ""}` |

### Scoring Rubrics

Two rubrics are applied depending on benchmark type:

#### Rubric A: Free-Text Technical Answers

*Applied to: TeleQnA, TeleMath, TeleLogs, TSG-3GPP, srsRAN, ORAN*

The judge evaluates three criteria simultaneously:

- **Factual Accuracy**: Are the key technical facts correct?
- **Completeness**: Does the response cover the main points from the reference answer?
- **Correctness**: Are there any incorrect statements that would mislead an engineer?


| Score | Interpretation |
|---|---|
| 10 | All key facts present and correct |
| 7–9 | Mostly correct, minor omissions or imprecisions |
| 4–6 | Partially correct, some important errors or omissions |
| 1–3 | Mostly incorrect or very incomplete |
| 0 | Completely wrong, off-topic, or empty |

#### Rubric B: Structured Configuration Answers

*Applied to: TeleYaml, TeleTables*

The judge evaluates two weighted axes:

- **Structural Validity (40%)**: Is the output a valid configuration with correct syntax?
- **Content Accuracy (60%)**: Do field names and values match the expected configuration? Partial credit is awarded in proportion to the ratio of correct fields to total fields.

| Score | Interpretation |
|---|---|
| 10 | Perfect match, all fields correct |
| 8–9 | Valid structure, 1–2 minor value differences |
| 5–7 | Valid structure, several wrong values or missing fields |
| 1–4 | Invalid structure or mostly wrong |
| 0 | Empty, completely wrong, or unparseable |

### Judge Prompt Structure

Each judge invocation consists of two messages:

**System message:**

```
You are a strict telecom evaluation judge. Score accurately based on the rubric. Output ONLY the JSON object.
```

**User message:**

```
Question: {question}
Reference Answer: {reference_answer}
Model Response: {model_response}
Scoring Rubric: {applicable_rubric}
Output JSON: {"score": <0-10>, "reasoning": ""}
```

### Retry Policy

If the judge scores a response below a configurable threshold, the model is re-prompted up to **5 times**. The **best score across all attempts** is recorded. This measures the model's capability ceiling rather than single-shot performance, and is applied consistently across all models evaluated, including the baseline.

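
The two-tier scoring and retry policy described above can be sketched roughly as follows. This is a minimal illustration, not the TeleFlow implementation: the function names `score_once` and `best_of_attempts`, the `PASS_THRESHOLD` value of 7, and the `generate` / `judge` callables (stand-ins for the model under test and the `openai/gpt-oss-120b` judge call) are all hypothetical. The card does not say whether re-prompting also applies to deterministically scored items, so the sketch assumes it does.

```python
from typing import Callable

PASS_THRESHOLD = 7  # hypothetical value; the card only says "configurable threshold"
MAX_REPROMPTS = 5   # from the retry policy: re-prompted up to 5 times


def score_once(response: str, reference: str, verifiable: bool,
               judge: Callable[[str, str], int]) -> int:
    """Two-tier scoring: deterministic check first, LLM judge otherwise."""
    if verifiable:
        # Exact match on the normalized answer: 10 if correct, 0 if not.
        same = response.strip().lower() == reference.strip().lower()
        return 10 if same else 0
    # Assumed judge interface: returns an integer score in 0..10.
    return judge(response, reference)


def best_of_attempts(generate: Callable[[], str], reference: str,
                     verifiable: bool, judge: Callable[[str, str], int],
                     threshold: int = PASS_THRESHOLD,
                     max_reprompts: int = MAX_REPROMPTS) -> int:
    """Score the first response, re-prompt while below threshold, keep the best."""
    best = score_once(generate(), reference, verifiable, judge)
    reprompts = 0
    while best < threshold and reprompts < max_reprompts:
        best = max(best, score_once(generate(), reference, verifiable, judge))
        reprompts += 1
    return best


if __name__ == "__main__":
    # Stub model and judge for demonstration only.
    canned = iter(["vague answer", "mostly right answer", "perfect answer"])
    fake_scores = {"vague answer": 3, "mostly right answer": 8, "perfect answer": 10}
    result = best_of_attempts(lambda: next(canned), "reference text", False,
                              lambda resp, ref: fake_scores[resp])
    print(result)
```

Because the loop stops as soon as one attempt reaches the threshold, the recorded score is the best of at most six generations (one initial attempt plus five re-prompts), which is why the reported numbers reflect a capability ceiling rather than single-shot accuracy.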
### Benchmark-to-Rubric Mapping

| Benchmark | Rubric | Deterministic Bypass |
|---|---|---|
| TeleQnA | A: Free-Text Technical | Where multiple-choice |
| TeleMath | A: Free-Text Technical | Numeric exact-match |
| TeleLogs | A: Free-Text Technical | Classification labels |
| TSG-3GPP | A: Free-Text Technical | Where multiple-choice |
| TeleYaml | B: Structured Configuration | N/A |
| TeleTables | B: Structured Configuration | N/A |
| srsRAN | A: Free-Text Technical | Where multiple-choice |
| ORAN | A: Free-Text Technical | Where multiple-choice |

## What We Did

- **Goal**: Create a specialized telecom AI assistant with expert-level knowledge of 3GPP, IETF, ITU, and TM Forum standards
- **Approach**: LoRA fine-tuning with conservative hyperparameters to prevent catastrophic forgetting
- **Dataset**: 1.3M+ telecom Q&A examples with augmented network slicing and network function configuration data
- **Base model**: NVIDIA Nemotron-3-Nano-30B-A3B (Megatron format)

## Training Data

### Dataset Composition (~1.31M examples)

| Split | Examples |
|---|---|
| Train | 1,303,277 |
| Validation | 5,000 |
| Test | 5,000 |
| **Total** | **1,313,277** |

### Domain Coverage

- **Network Traces & Anomaly Detection**: 5G trace analysis, KPI statistics, anomaly classification
- **Network Slicing**: S-NSSAI configuration, slice types (eMBB, URLLC, mMTC), resource allocation
- **Network Function Configuration**: Open5GS YAML generation, AMF/SMF/UPF configuration
- **3GPP Standards Q&A**: Core network procedures, RAN protocols, signaling
- **Network Forecasting**: Trend analysis, traffic prediction
- **Troubleshooting**: Root cause analysis, diagnostic procedures

### Data Format

```json
{
  "input": "System: You are an expert telecommunications engineer...\nUser: [question with context]",
  "output": "[detailed answer with reasoning]"
}
```

## Training Details

### LoRA Hyperparameters

| Parameter | Value | Notes |
|---|---|---|
| LoRA dim (rank) | 64 | Adapter capacity |
| LoRA alpha | 128 | 2:1 alpha-to-rank ratio for gentler gradient flow |
| LoRA dropout | 0.1 | Regularization to prevent overfitting |
| Target modules | linear_qkv, linear_proj, linear_fc1, linear_fc2, in_proj, out_proj | Attention, Mamba, and MLP layers |

### Training Configuration

| Parameter | Value | Notes |
|---|---|---|
| Base model | Nemotron-3-Nano-30B-A3B (Megatron) | |
| Training iterations | 10,500 | ~1.03 epochs |
| Learning rate | 5e-5 | Conservative to prevent forgetting |
| LR warmup | 525 steps | 5% of total iterations |
| LR decay | Cosine over 10,500 iterations | |
| Global batch size | 128 | |
| Micro batch size | 4 | Per GPU |
| Gradient accumulation | 8 steps | |
| Max sequence length | 2,048 | |
| Precision | BF16 | |
| Checkpoint interval | 1,000 steps | |

### Infrastructure

| Property | Value |
|---|---|
| Hardware | 4x NVIDIA H100 NVL 94GB (NVLink connected) |
| Framework | NeMo/Megatron-Bridge with custom LoRA wrapper |
| Container | `nvcr.io/nvidia/nemo:25.11.nemotron_3_nano` |
| Training time | 84 hours |

### Parallelism

| Parameter | Value |
|---|---|
| Expert parallel | 4 |
| Tensor parallel | 1 |
| Pipeline parallel | 1 |
| MoE token dispatcher | alltoall |

## Training Progress

| Checkpoint | Train Loss | Val Loss | Val PPL |
|---|---|---|---|
| iter 500 | 0.402 | 0.242 | 1.274 |
| iter 1000 | 0.367 | 0.145 | 1.156 |
| iter 1500 | 0.381 | 0.118 | 1.125 |
| iter 2000 | 0.432 | 0.130 | 1.139 |
| iter 2500 | 0.377 | 0.139 | 1.149 |
| iter 3000 | 0.391 | 0.108 | 1.114 |
| **iter 10500 (final)** | **0.356** | **0.150** | **1.162** |

## Version History

| Version | Dataset Size | Val Loss | Val PPL | Benchmark |
|---|---|---|---|---|
| **AdaptKey-Nemotron-30b** (this model) | **1,303,277** | **0.150** | **1.162** | **596 composite** |

### Key Improvements in This Version

- Augmented network slicing examples to address weak benchmark performance
- Enhanced network function configuration coverage
- Improved system prompts (removed misleading "telco expert" framing for non-telco questions)
- +58 point (+10.8% relative) improvement on the composite benchmark over the NVIDIA baseline

## Post-Training Pipeline

```bash
# Merge LoRA weights
torchrun --nproc-per-node=4 \
  /opt/Megatron-Bridge/examples/peft/merge_lora.py \
  --lora-checkpoint /models/AdaptKey-Nemotron-30b-lora/iter_0010500 \
  --hf-model-path /models/nemotron-30b \
  --output /models/AdaptKey-Nemotron-30b-merged

# Export to HuggingFace format
python /opt/Megatron-Bridge/examples/conversion/convert_checkpoints.py export \
  --hf-model /models/nemotron-30b \
  --megatron-path /models/AdaptKey-Nemotron-30b-merged \
  --hf-path /models/AdaptKey-Nemotron-30b-hf-export
```

## Usage

### With Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the merged model weights in BF16
model = AutoModelForCausalLM.from_pretrained(
    "AdaptKey/AdaptKey-Nemotron-30b",
    trust_remote_code=True,
    torch_dtype="bfloat16",
)
tokenizer = AutoTokenizer.from_pretrained(
    "AdaptKey/AdaptKey-Nemotron-30b",
    trust_remote_code=True,
)

# Prompt follows the "System: ...\nUser: ..." format used in the training data
prompt = """System: You are an expert telecommunications engineer. Answer questions accurately based on your knowledge of telecom standards (3GPP, IETF, ITU, TM Forum).
User: Explain the difference between eMBB, URLLC, and mMTC slice types in 5G network slicing."""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

### With vLLM

```python
from vllm import LLM, SamplingParams

# Same "System: ...\nUser: ..." prompt format as the Transformers example
prompt = (
    "System: You are an expert telecommunications engineer.\n"
    "User: Explain the difference between eMBB, URLLC, and mMTC slice types in 5G network slicing."
)

llm = LLM(
    model="AdaptKey/AdaptKey-Nemotron-30b",
    trust_remote_code=True,
    tensor_parallel_size=1,
    gpu_memory_utilization=0.90,
)
sampling_params = SamplingParams(temperature=0.7, max_tokens=512)
outputs = llm.generate([prompt], sampling_params)
print(outputs[0].outputs[0].text)  # text of the first completion
```

### Docker Compose (vLLM Server)

```yaml
services:
  vllm-adaptkey:
    image: vllm/vllm-openai:latest
    container_name: vllm-adaptkey-nemotron-30b
    runtime: nvidia
    environment:
      - NVIDIA_VISIBLE_DEVICES=0
    ports:
      - "8090:8000"
    volumes:
      - /opt/models:/models:ro
    command: >
      --model /models/AdaptKey-Nemotron-30b
      --trust-remote-code
      --max-model-len 8196
      --gpu-memory-utilization 0.90
      --tensor-parallel-size 1
    restart: unless-stopped
```

## Lessons Learned

1. **Anti-forgetting strategy works**: Conservative LoRA parameters (rank 64, alpha 128, dropout 0.1) with a 5e-5 learning rate preserved general capabilities
2. **Data quality matters more than quantity**: Improving weak-area examples had more impact than adding more data
3. **System prompt alignment**: Mismatched system prompts (e.g., "telco expert" for ethics questions) hurt performance
4. **Mixed datasets**: Combining diverse telecom subcategories prevents narrow specialization

## License

This model is derived from NVIDIA's Nemotron-3-Nano-30B and is subject to the [NVIDIA Open Model License Agreement](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license/). Please review the license terms before use in commercial applications.

## Citation

```bibtex
@misc{adaptkey_nemotron_30b_2026,
  title={AdaptKey-Nemotron-30b: A Telecom-Specialized Language Model},
  author={AdaptKey},
  year={2026},
  publisher={HuggingFace},
  url={https://huggingface.co/AdaptKey/AdaptKey-Nemotron-30b}
}
```