wbhVince829 committed on
Commit 8064345 · 1 Parent(s): 0974402

more diplomatic

Files changed (1)
  1. README.md +7 -7
README.md CHANGED
@@ -3,20 +3,20 @@ license: apache-2.0
  ---
  # TelecomGPT-R1: The Best Telecom-Specific Large Language Model
 
- > A 27B open model that ranks **#1 on the GSMA Open Telco Leaderboard** across **all 86 evaluated models** (open or closed, general-purpose or operator-specialized), with an average score of **89.0%**, ahead of every other model on the board.
 
  ---
 
  ## 1 — A New State of the Art for Telecom LLMs
 
- **TelecomGPT-R1 (27B) ranks #1 on the [GSMA Open Telco Leaderboard](https://huggingface.co/spaces/GSMA/open-telco-leaderboard) at 89.0% average, leading every open-source and closed-source entrant across both general-purpose and operator-specialized categories.** The leaderboard aggregates 7 benchmarks spanning 4 evaluation axes (telecom knowledge QA, 3GPP protocol comprehension, fault and log diagnosis, and RF/network modeling), as reported in Figure 1.
 
- - **Among open-source models**, TelecomGPT-R1 leads DeepSeek-V3-0324 (685B) by **+29.7**, LLaMA-3.3-70B by **+34.3**, and Qwen2.5-72B by **+35.0**, while operating at roughly **25× fewer active parameters than the next-best open entrant**.
- - **Among closed-source models**, TelecomGPT-R1 leads both the general-purpose frontier tier and the operator-specialized tier, as detailed in the two bullets below.
- - **Among general-purpose frontier models**, TelecomGPT-R1 leads Gemini-3.1-Pro by **+13.4**, Claude-Opus-4.6 by **+15.7**, and GPT-5 by **+17.1**. These systems sit at the **trillion-parameter-class frontier** (active-parameter counts are not publicly disclosed but are widely reported as orders of magnitude larger than 27B), making the margin a parameter-efficiency result as much as an accuracy result.
- - **Among operator-specialized telecom models**, TelecomGPT-R1 leads AT&T OTel-LLM-8.3B-QnA by **+3.0** (and OTel-LLM is narrow-task trained) and SoftBank LTM by **+15.4** the **first model, open or closed, to outscore an operator-internal telecom baseline** on the GSMA Open Telco Leaderboard.
 
- **In one line: a 27B open specialist beats both trillion-parameter-class generalists and operator-locked verticals on the same public benchmark suite.**
 
 
  ![Figure 1. TelecomGPT-R1 vs frontier closed-source models on the GSMA Open Telco Leaderboard](https://cdn-uploads.huggingface.co/production/uploads/6882f57510e86d9f80580702/1jpJq-UoSFK1GYhwjLA9y.png)
 
  ---
  # TelecomGPT-R1: The Best Telecom-Specific Large Language Model
 
+ > A 27B open model that reaches **SOTA on the GSMA Open Telco Leaderboard** across **all models** (open or closed, general-purpose or telecom-specialized), with an average score of **89.6%**, demonstrating that an open telecom reasoning model can match top performance on telecom benchmarks.
 
  ---
 
  ## 1 — A New State of the Art for Telecom LLMs
 
+ **TelecomGPT-R1 (27B) reaches state-of-the-art (SOTA) performance on the [GSMA Open Telco Leaderboard](https://huggingface.co/spaces/GSMA/open-telco-leaderboard) at 89.6% average, matching or leading every open-source and closed-source entrant across both general-purpose and telecom-specialized categories.** The leaderboard aggregates 7 benchmarks spanning 4 evaluation axes (telecom knowledge QA, 3GPP protocol comprehension, fault and log diagnosis, and RF/network modeling), as reported in Figure 1.
 
+ - **Among open-source models**, TelecomGPT-R1 leads DeepSeek-V3-0324 (685B) by **+30.3**, LLaMA-3.3-70B by **+34.9**, and Qwen2.5-72B by **+35.6**, while operating at roughly **25× fewer active parameters than the next-best open entrant**.
+ - **Among closed-source models**, TelecomGPT-R1 reaches SOTA performance across both the general-purpose frontier tier and the telecom-specialized tier, as detailed in the two bullets below.
+ - **Among general-purpose frontier models**, TelecomGPT-R1 leads Gemini-3.1-Pro by **+14.0**, Claude-Opus-4.6 by **+16.3**, and GPT-5 by **+17.7**. These systems sit at the **trillion-parameter-class frontier** (active-parameter counts are not publicly disclosed but are widely reported as orders of magnitude larger than 27B), making the margin a parameter-efficiency result as much as an accuracy result.
+ - **Among telecom-specialized models**, TelecomGPT-R1 is **on par with the leading closed operator-internal telecom model AT&T's OTel-LLM-8.3B-QnA**, and leads SoftBank LTM by **+16.0**, demonstrating that an open telecom reasoning model can reach SOTA performance alongside top operator-internal baselines on the GSMA Open Telco Leaderboard. *(On TeleTables, we follow the original paper's evaluation protocol by attaching the table content directly to the prompt — a table-grounded reasoning setup rather than retrieval without table id or content.)*
 
+ **In one line: TelecomGPT-R1 demonstrates that an open 27B telecom reasoning model can reach SOTA performance across the full breadth of the GSMA Open Telco Leaderboard.**
 
 
  ![Figure 1. TelecomGPT-R1 vs frontier closed-source models on the GSMA Open Telco Leaderboard](https://cdn-uploads.huggingface.co/production/uploads/6882f57510e86d9f80580702/1jpJq-UoSFK1GYhwjLA9y.png)
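
A quick consistency check on the numbers in this diff: if each margin is read as TelecomGPT-R1's average minus the other model's average, the old and new README text should imply the same baseline scores, since only TelecomGPT-R1's own figure moved (89.0% to 89.6%). The sketch below is Python written for this check and is not part of the README. The names `MARGINS`, `implied_scores`, and `PARAM_RATIO` are illustrative, and reading each margin as a simple difference of averages is an assumption. It also computes the 685B / 27B ratio behind the "25×" claim. Note that 685B is the size quoted in the bullet, which looks like a total rather than an active-parameter count.

```python
# All figures are copied from the README diff above; nothing is taken
# from the leaderboard itself.
OLD_AVG, NEW_AVG = 89.0, 89.6

# model -> (margin in old README, margin in new README), percentage points
MARGINS = {
    "DeepSeek-V3-0324": (29.7, 30.3),
    "LLaMA-3.3-70B": (34.3, 34.9),
    "Qwen2.5-72B": (35.0, 35.6),
    "Gemini-3.1-Pro": (13.4, 14.0),
    "Claude-Opus-4.6": (15.7, 16.3),
    "GPT-5": (17.1, 17.7),
    "SoftBank LTM": (15.4, 16.0),
}


def tenths(x: float) -> int:
    """Convert a one-decimal figure to integer tenths to avoid float drift."""
    return round(x * 10)


def implied_scores(avg: float, margins: dict) -> dict:
    """Assumed reading: baseline score = TelecomGPT-R1 average - margin."""
    return {m: (tenths(avg) - tenths(d)) / 10 for m, d in margins.items()}


old = implied_scores(OLD_AVG, {m: v[0] for m, v in MARGINS.items()})
new = implied_scores(NEW_AVG, {m: v[1] for m, v in MARGINS.items()})

# Ratio of the parameter counts quoted in the open-source bullet.
PARAM_RATIO = 685 / 27

if __name__ == "__main__":
    for model in MARGINS:
        print(f"{model:20s} old={old[model]:5.1f} new={new[model]:5.1f}")
    print(f"685B / 27B = {PARAM_RATIO:.1f}x")
```

Under that reading, both versions imply identical baseline scores, so the commit changes the wording and TelecomGPT-R1's own average rather than the competitors' figures.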