KU-DFI/TelecomGPT-R1

Safetensors

qwen3_5

wbhVince829 committed 28 days ago

Commit 8064345 (1 parent: 0974402)

more diplomatic

Files changed (1)

README.md +7 -7

@@ -3,20 +3,20 @@ license: apache-2.0
 ---
 # TelecomGPT-R1: The Best Telecom-Specific Large Language Model
-> A 27B open model that ranks **#1 on the GSMA Open Telco Leaderboard** across **all 86 evaluated models** (open or closed, general-purpose or operator-specialized), with an average score of **89.0%**, ahead of every other model on the board.
 ---
 ## 1 — A New State of the Art for Telecom LLMs
-**TelecomGPT-R1 (27B) ranks #1 on the [GSMA Open Telco Leaderboard](https://huggingface.co/spaces/GSMA/open-telco-leaderboard) at 89.0% average, leading every open-source and closed-source entrant across both general-purpose and operator-specialized categories.** The leaderboard aggregates 7 benchmarks spanning 4 evaluation axes (telecom knowledge QA, 3GPP protocol comprehension, fault and log diagnosis, and RF/network modeling), as reported in Figure 1.
-- **Among open-source models**, TelecomGPT-R1 leads DeepSeek-V3-0324 (685B) by **+29.7**, LLaMA-3.3-70B by **+34.3**, and Qwen2.5-72B by **+35.0**, while operating at roughly **25× fewer active parameters than the next-best open entrant**.
-- **Among closed-source models**, TelecomGPT-R1 leads both the general-purpose frontier tier and the operator-specialized tier, as detailed in the two bullets below.
-- **Among general-purpose frontier models**, TelecomGPT-R1 leads Gemini-3.1-Pro by **+13.4**, Claude-Opus-4.6 by **+15.7**, and GPT-5 by **+17.1**. These systems sit at the **trillion-parameter-class frontier** (active-parameter counts are not publicly disclosed but are widely reported as orders of magnitude larger than 27B), making the margin a parameter-efficiency result as much as an accuracy result.
-- **Among operator-specialized telecom models**, TelecomGPT-R1 leads AT&T OTel-LLM-8.3B-QnA by **+3.0** (and OTel-LLM is narrow-task trained) and SoftBank LTM by **+15.4** — the **first model, open or closed, to outscore an operator-internal telecom baseline** on the GSMA Open Telco Leaderboard.
-**In one line: a 27B open specialist beats both trillion-parameter-class generalists and operator-locked verticals on the same public benchmark suite.**
 ![Figure 1. TelecomGPT-R1 vs frontier closed-source models on the GSMA Open Telco Leaderboard](https://cdn-uploads.huggingface.co/production/uploads/6882f57510e86d9f80580702/1jpJq-UoSFK1GYhwjLA9y.png)

 ---
 # TelecomGPT-R1: The Best Telecom-Specific Large Language Model
+> A 27B open model that reaches **SOTA on the GSMA Open Telco Leaderboard** across **all models** (open or closed, general-purpose or telecom-specialized), with an average score of **89.6%**, demonstrating that an open telecom reasoning model can match top performance on telecom benchmarks.
 ---
 ## 1 — A New State of the Art for Telecom LLMs
+**TelecomGPT-R1 (27B) reaches state-of-the-art (SOTA) performance on the [GSMA Open Telco Leaderboard](https://huggingface.co/spaces/GSMA/open-telco-leaderboard) at 89.6% average, matching or leading every open-source and closed-source entrant across both general-purpose and telecom-specialized categories.** The leaderboard aggregates 7 benchmarks spanning 4 evaluation axes (telecom knowledge QA, 3GPP protocol comprehension, fault and log diagnosis, and RF/network modeling), as reported in Figure 1.
+- **Among open-source models**, TelecomGPT-R1 leads DeepSeek-V3-0324 (685B) by **+30.3**, LLaMA-3.3-70B by **+34.9**, and Qwen2.5-72B by **+35.6**, while operating at roughly **25× fewer active parameters than the next-best open entrant**.
+- **Among closed-source models**, TelecomGPT-R1 reaches SOTA performance across both the general-purpose frontier tier and the telecom-specialized tier, as detailed in the two bullets below.
+- **Among general-purpose frontier models**, TelecomGPT-R1 leads Gemini-3.1-Pro by **+14.0**, Claude-Opus-4.6 by **+16.3**, and GPT-5 by **+17.7**. These systems sit at the **trillion-parameter-class frontier** (active-parameter counts are not publicly disclosed but are widely reported as orders of magnitude larger than 27B), making the margin a parameter-efficiency result as much as an accuracy result.
+- **Among telecom-specialized models**, TelecomGPT-R1 is **on par with the leading closed operator-internal telecom model AT&T's OTel-LLM-8.3B-QnA**, and leads SoftBank LTM by **+16.0**, demonstrating that an open telecom reasoning model can reach SOTA performance alongside top operator-internal baselines on the GSMA Open Telco Leaderboard. *(On TeleTables, we follow the original paper's evaluation protocol by attaching the table content directly to the prompt — a table-grounded reasoning setup rather than retrieval without table id or content.)*
+**In one line: TelecomGPT-R1 demonstrates that an open 27B telecom reasoning model can reach SOTA performance across the full breadth of the GSMA Open Telco Leaderboard.**
 ![Figure 1. TelecomGPT-R1 vs frontier closed-source models on the GSMA Open Telco Leaderboard](https://cdn-uploads.huggingface.co/production/uploads/6882f57510e86d9f80580702/1jpJq-UoSFK1GYhwjLA9y.png)
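
A quick consistency check on the numbers in this diff may help a reviewer: the headline average moves from 89.0% to 89.6%, and every stated margin moves with it. The sketch below takes only the figures quoted on the removed and added lines, recomputes each baseline's implied score (headline average minus stated margin), and confirms that the old and new values agree. It also computes the ratio of the two parameter counts quoted in the README (685B and 27B) that sits behind the "roughly 25×" figure. The names `MARGINS`, `implied_scores`, and `param_ratio` are illustrative only and are not part of the repository.

```python
from decimal import Decimal

# Headline averages from the removed (-) and added (+) lines of the diff.
OLD_AVG, NEW_AVG = Decimal("89.0"), Decimal("89.6")

# (old margin, new margin) exactly as stated in the README before and after.
MARGINS = {
    "DeepSeek-V3-0324": ("29.7", "30.3"),
    "LLaMA-3.3-70B": ("34.3", "34.9"),
    "Qwen2.5-72B": ("35.0", "35.6"),
    "Gemini-3.1-Pro": ("13.4", "14.0"),
    "Claude-Opus-4.6": ("15.7", "16.3"),
    "GPT-5": ("17.1", "17.7"),
    "SoftBank LTM": ("15.4", "16.0"),
}


def implied_scores(margins, old_avg, new_avg):
    """Return {model: (old_implied, new_implied)} baseline scores.

    Each implied score is the headline average minus the stated margin.
    Decimal avoids binary floating-point noise on one-decimal figures.
    """
    return {
        name: (old_avg - Decimal(old), new_avg - Decimal(new))
        for name, (old, new) in margins.items()
    }


scores = implied_scores(MARGINS, OLD_AVG, NEW_AVG)
for name, (old, new) in scores.items():
    status = "consistent" if old == new else "MISMATCH"
    print(f"{name:18s} implied score {new}  ({status})")

# Ratio of the parameter counts quoted in the README (in billions).
param_ratio = 685 / 27
print(f"685B / 27B = {param_ratio:.1f}x")
```

If every row reports `consistent`, the commit changed the margins only by the same 0.6-point shift as the headline average, leaving the competitors' implied scores untouched. The AT&T OTel-LLM-8.3B-QnA comparison is left out because the new wording ("on par") no longer states a numeric margin.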