KU-DFI/TelecomGPT-R1

Safetensors

qwen3_5


wbhVince829 committed 24 days ago

Commit 8173abd (1 parent: d13b01d)

update quickstart and teletable


Files changed (1)

README.md +81 -66


@@ -14,7 +14,7 @@ license: apache-2.0
 - **Among open-source models**, TelecomGPT-R1 leads DeepSeek-V3-0324 (685B) by **+30.3**, LLaMA-3.3-70B by **+34.9**, and Qwen2.5-72B by **+35.6**, while operating at roughly **25× fewer active parameters than the next-best open entrant**.
 - **Among closed-source models**, TelecomGPT-R1 reaches SOTA performance across both the general-purpose frontier tier and the telecom-specialized tier, as detailed in the two bullets below.
 - **Among general-purpose frontier models**, TelecomGPT-R1 leads Gemini-3.1-Pro by **+14.0**, Claude-Opus-4.6 by **+16.3**, and GPT-5 by **+17.7**. These systems sit at the **trillion-parameter-class frontier** (active-parameter counts are not publicly disclosed but are widely reported as orders of magnitude larger than 27B), making the margin a parameter-efficiency result as much as an accuracy result.
-- **Among telecom-specialized models**, TelecomGPT-R1 is **on par with the leading closed operator-internal telecom model AT&T's OTel-LLM-8.3B-QnA**, and leads SoftBank LTM by **+16.0**, demonstrating that an open telecom reasoning model can reach SOTA performance alongside top operator-internal baselines on the GSMA Open Telco Leaderboard. *(On TeleTables, we follow the original paper's evaluation protocol by attaching the table content directly to the prompt — a table-grounded reasoning setup rather than retrieval without table id or content.)*
 **In one line: TelecomGPT-R1 demonstrates that an open 27B telecom reasoning model can reach SOTA performance across the full breadth of the GSMA Open Telco Leaderboard.**
@@ -23,6 +23,82 @@ license: apache-2.0
 **Figure 1 | TelecomGPT-R1 vs frontier closed-source models on the GSMA Open Telco Leaderboard.** *Each spoke is one benchmark (plus the overall average), normalized by its per-axis leaderboard best so that `1.0` = best score on that benchmark. Our 27B open-source policy reaches `1.0` on **five of eight axes** (3GPP-TSG, srsRANBench, TeleLogs, TeleTables, Average) and stays at or above `0.95` on every other axis, visibly tracing the outer edge of the radar where no other model, open or closed, matches it on all axes simultaneously.*
 ---
@@ -130,71 +206,6 @@ KU/DFI's role is to build that open commons. The program now spans the key layer
 - **Model weights.** [KU-DFI/TelecomGPT-R1](https://huggingface.co/KU-DFI/TelecomGPT-R1/tree/main)
 - **Unified benchmark.** [GSMA Open Telco Leaderboard](https://huggingface.co/spaces/GSMA/open-telco-leaderboard)
-### Quickstart
-Here is a code snippet demonstrating how to load TelecomGPT-R1 with `transformers` and generate a telecom-grounded response:
-```python
-from transformers import AutoModelForCausalLM, AutoTokenizer
-model_name = "KU-DFI/TelecomGPT-R1"
-model = AutoModelForCausalLM.from_pretrained(
-    model_name,
-    torch_dtype="auto",
-    device_map="auto",
-)
-tokenizer = AutoTokenizer.from_pretrained(model_name)
-prompt = (
-    "A 5G NR cell is observing repeated random-access failures from cell-edge UEs. "
-    "Drive-test capture shows: average RSRP = -108 dBm, average RSRQ = -16 dB, "
-    "PRACH preamble attempts averaging 8 with no Msg2 (RAR) received within "
-    "ra-ResponseWindow, UE timing-advance range 4-7 km, and PRACH configuration "
-    "uses preamble format A1 with zeroCorrelationZoneConfig = 8. "
-    "Diagnose the most likely root cause and propose a configuration change."
-)
-messages = [
-    {
-        "role": "system",
-        "content": (
-            "You are TelecomGPT-R1, an open 27B telecom reasoning model from "
-            "KU/DFI. Reason step-by-step over 3GPP standards, RAN logs, RF and "
-            "network derivations, and telecom code."
-        ),
-    },
-    {"role": "user", "content": prompt},
-]
-text = tokenizer.apply_chat_template(
-    messages,
-    tokenize=False,
-    add_generation_prompt=True,
-)
-model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
-generated_ids = model.generate(
-    **model_inputs,
-    max_new_tokens=2048,
-)
-generated_ids = [
-    output_ids[len(input_ids):]
-    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
-]
-response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
-print(response)
-```
-For production / batch serving on operator-confidential data, host with [vLLM](https://github.com/vllm-project/vllm):
-```bash
-vllm serve KU-DFI/TelecomGPT-R1 \
-    --tensor-parallel-size 4 \
-    --max-model-len 32768 \
-    --gpu-memory-utilization 0.90
-```
-**Hardware**: TelecomGPT-R1 (27B, bf16) fits on a single H100 80GB or MI300X; for high-throughput inference behind an operator firewall a single H100/MI300 node serves the model end-to-end.
 ### Citation
@@ -220,3 +231,7 @@ vllm serve KU-DFI/TelecomGPT-R1 \
 ### Acknowledgements
 This work was supported by the Digital Future Institute of Khalifa University; the College of Information Science and Electronic Engineering, Zhejiang University; the College of Computer Science and Technology, Zhejiang University; and the Research Computing team of Khalifa University.
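
The Figure 1 caption in this README describes each radar axis as normalized by its per-axis leaderboard best, so that `1.0` marks the top score on that benchmark. A minimal sketch of that normalization follows, assuming it is a plain division by the per-axis maximum. The function name `normalize_per_axis` and the model, axis names, and scores are made-up placeholders, not leaderboard data.

```python
def normalize_per_axis(scores):
    """Divide every raw score by the best score on the same axis.

    scores: {model_name: {axis_name: raw_score}}
    Returns the same shape with values in [0, 1] for non-negative inputs;
    the best model on each axis gets exactly 1.0.
    """
    # Collect the best (maximum) raw score seen on each axis.
    best = {}
    for per_model in scores.values():
        for axis, value in per_model.items():
            best[axis] = max(best.get(axis, value), value)

    # Normalize; an axis whose best score is 0 maps to 0.0 to avoid dividing by zero.
    return {
        model: {
            axis: (value / best[axis]) if best[axis] else 0.0
            for axis, value in per_model.items()
        }
        for model, per_model in scores.items()
    }


# Placeholder inputs for illustration only (NOT real leaderboard numbers).
example_scores = {
    "model_a": {"axis_1": 80.0, "axis_2": 30.0},
    "model_b": {"axis_1": 40.0, "axis_2": 60.0},
}
normalized = normalize_per_axis(example_scores)
```

Under this reading, a model "reaches `1.0`" on an axis exactly when it holds the best raw score there, which is how the caption's "five of eight axes" statement would be checked.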

 - **Among open-source models**, TelecomGPT-R1 leads DeepSeek-V3-0324 (685B) by **+30.3**, LLaMA-3.3-70B by **+34.9**, and Qwen2.5-72B by **+35.6**, while operating at roughly **25× fewer active parameters than the next-best open entrant**.
 - **Among closed-source models**, TelecomGPT-R1 reaches SOTA performance across both the general-purpose frontier tier and the telecom-specialized tier, as detailed in the two bullets below.
 - **Among general-purpose frontier models**, TelecomGPT-R1 leads Gemini-3.1-Pro by **+14.0**, Claude-Opus-4.6 by **+16.3**, and GPT-5 by **+17.7**. These systems sit at the **trillion-parameter-class frontier** (active-parameter counts are not publicly disclosed but are widely reported as orders of magnitude larger than 27B), making the margin a parameter-efficiency result as much as an accuracy result.
+- **Among telecom-specialized models**, TelecomGPT-R1 is **on par with the leading closed operator-internal telecom model AT&T's OTel-LLM-8.3B-QnA**, and leads SoftBank LTM by **+16.0**, demonstrating that an open telecom reasoning model can reach SOTA performance alongside top operator-internal baselines on the GSMA Open Telco Leaderboard.
 **In one line: TelecomGPT-R1 demonstrates that an open 27B telecom reasoning model can reach SOTA performance across the full breadth of the GSMA Open Telco Leaderboard.**
 **Figure 1 | TelecomGPT-R1 vs frontier closed-source models on the GSMA Open Telco Leaderboard.** *Each spoke is one benchmark (plus the overall average), normalized by its per-axis leaderboard best so that `1.0` = best score on that benchmark. Our 27B open-source policy reaches `1.0` on **five of eight axes** (3GPP-TSG, srsRANBench, TeleLogs, TeleTables, Average) and stays at or above `0.95` on every other axis, visibly tracing the outer edge of the radar where no other model, open or closed, matches it on all axes simultaneously.*
+---
+### Quickstart
+**Requirements:** `transformers >= 4.51.0`, `torch >= 2.1`.
+Here is a code snippet demonstrating how to load TelecomGPT-R1 with `transformers` and generate a telecom-grounded response:
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+model_name = "KU-DFI/TelecomGPT-R1"
+model = AutoModelForCausalLM.from_pretrained(
+    model_name,
+    torch_dtype="auto",
+    device_map="auto",
+)
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+prompt = (
+    "A 5G NR cell is observing repeated random-access failures from cell-edge UEs. "
+    "Drive-test capture shows: average RSRP = -108 dBm, average RSRQ = -16 dB, "
+    "PRACH preamble attempts averaging 8 with no Msg2 (RAR) received within "
+    "ra-ResponseWindow, UE timing-advance range 4-7 km, and PRACH configuration "
+    "uses preamble format A1 with zeroCorrelationZoneConfig = 8. "
+    "Diagnose the most likely root cause and propose a configuration change."
+)
+messages = [
+    {
+        "role": "system",
+        "content": (
+            "You are TelecomGPT-R1, an open 27B telecom reasoning model from "
+            "KU/DFI. Reason step-by-step over 3GPP standards, RAN logs, RF and "
+            "network derivations, and telecom code."
+        ),
+    },
+    {"role": "user", "content": prompt},
+]
+text = tokenizer.apply_chat_template(
+    messages,
+    tokenize=False,
+    add_generation_prompt=True,
+)
+model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
+generated_ids = model.generate(
+    **model_inputs,
+    max_new_tokens=2048,
+)
+generated_ids = [
+    output_ids[len(input_ids):]
+    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
+]
+response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
+print(response)
+```
+For production / batch serving on operator-confidential data, host with [vLLM](https://github.com/vllm-project/vllm):
+```bash
+vllm serve KU-DFI/TelecomGPT-R1 \
+    --tensor-parallel-size 1 \
+    --max-model-len 8192 \
+    --gpu-memory-utilization 0.85
+```
+(Scale `--tensor-parallel-size`, `--max-model-len`, and `--gpu-memory-utilization` up as needed for multi-GPU nodes or higher-throughput serving.)
+**Hardware**: TelecomGPT-R1 (27B, bf16) fits on a single H100 80GB or MI300X with the default settings above; multi-GPU nodes allow longer contexts and larger batches behind an operator firewall.
 ---
 - **Model weights.** [KU-DFI/TelecomGPT-R1](https://huggingface.co/KU-DFI/TelecomGPT-R1/tree/main)
 - **Unified benchmark.** [GSMA Open Telco Leaderboard](https://huggingface.co/spaces/GSMA/open-telco-leaderboard)
 ### Citation
 ### Acknowledgements
 This work was supported by the Digital Future Institute of Khalifa University; the College of Information Science and Electronic Engineering, Zhejiang University; the College of Computer Science and Technology, Zhejiang University; and the Research Computing team of Khalifa University.
+---
+[^teletables]: On TeleTables, we follow the original paper's evaluation protocol by attaching the table content directly to the prompt — a table-grounded reasoning setup rather than retrieval without table id or content.
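
The `vllm serve` command in the updated Quickstart starts an OpenAI-compatible HTTP server. A minimal client sketch follows, using only the Python standard library. It assumes the server is reachable at vLLM's default address `http://localhost:8000` and that no API key is configured; the helper names `build_chat_request` and `ask` are hypothetical and do not appear in the README.

```python
import json
import urllib.request

# Assumption: vLLM's default host and port; adjust if the server was started differently.
BASE_URL = "http://localhost:8000/v1"
MODEL = "KU-DFI/TelecomGPT-R1"


def build_chat_request(question, system_prompt=None, max_tokens=2048):
    """Build the JSON body for a /v1/chat/completions request."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": question})
    return {"model": MODEL, "messages": messages, "max_tokens": max_tokens}


def ask(question, base_url=BASE_URL, **kwargs):
    """Send one chat request to the running server and return the reply text."""
    payload = json.dumps(build_chat_request(question, **kwargs)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # OpenAI-style responses carry the reply under choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

With the server running, `print(ask("Summarize the 5G NR random-access procedure."))` would print the model's reply. Prompt plus `max_tokens` has to fit within the `--max-model-len 8192` set in the serve command, so long table-grounded prompts may need a smaller `max_tokens` or a larger context setting.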