dqy08 committed on
Commit
c6aeab1
·
1 Parent(s): cbb60f2

Update Dockerfile to use the Qwen3-0.6B model; update README and HTML files; update example data to match the new model.

Dockerfile CHANGED
@@ -55,4 +55,6 @@ ENV FORCE_INT8=1
55
 
56
  EXPOSE 7860
57
 
58
- CMD ["python", "server.py", "--model", "qwen3.0-14b", "--address", "0.0.0.0", "--port", "7860"]
 
 
 
55
 
56
  EXPOSE 7860
57
 
58
+ # CMD ["python", "server.py", "--model", "qwen3.0-14b", "--address", "0.0.0.0", "--port", "7860"]
59
+ CMD ["python", "server.py", "--model", "qwen3.0-0.6b", "--address", "0.0.0.0", "--port", "7860"]
60
+ ENV FORCE_INT8=0
README.md CHANGED
@@ -1,9 +1,10 @@
1
  ---
2
- title: InfoRadar
3
  emoji: 📡
4
  colorFrom: blue
5
- colorTo: indigo
6
  sdk: docker
 
7
  app_port: 7860
8
  pinned: false
9
  license: apache-2.0
@@ -13,24 +14,21 @@ license: apache-2.0
13
 
14
  # InfoRadar (Information Radar)
15
 
16
- **InfoRadar** is a visual tool for **analyzing text information density and efficiency**.
17
 
18
- Unlike traditional "AI detectors" that simply classify text as "Human vs. AI", InfoRadar focuses on evaluating the **quality of information**. By visualizing "surprisal" (information content), it intuitively reveals information flow patterns, helping to identify low-information "nonsense" (whether AI hallucinations or human verbosity) and highlighting high-density core information.
19
 
20
  ## 🚀 Core Features
21
 
22
- - **Information Density Visualization**: Color-coded analysis based on token-level surprisal (`-log p`).
23
- - ⚪ **Transparent**: High predictability (Low information / Common phrases / "Filler")
24
- - 🔴 **Red**: High information content (Surprising / Specific / "Core Content")
25
 
26
  ## 💡 Tribute
27
 
28
  InfoRadar is engineered based on the classic project [GLTR.io](http://gltr.io) developed by Hendrik Strobelt et al. in 2019. GLTR was a web demo that pioneered the use of GPT-2 prediction probabilities to detect generated text.
29
 
30
- The difference lies in the goal of this project: **not to "detect AI text", but to "evaluate text quality"**:
31
-
32
- 1. **From "Detection" to "Evaluation"**: Shifting focus from "Is this written by AI?" to "Is this content efficient and valuable?"
33
- 2. **Information Theoretic Perspective**: Introducing cognitive linguistics concepts (such as Surprisal Theory, UID) to measure text quality from first principles.
34
 
35
  ## 📦 Quick Start
36
 
 
1
  ---
2
+ title: InfoRadar – Visualize Text Information Density
3
  emoji: 📡
4
  colorFrom: blue
5
+ colorTo: red
6
  sdk: docker
7
+ short_description: analyzes text to visualize token-level information density
8
  app_port: 7860
9
  pinned: false
10
  license: apache-2.0
 
14
 
15
  # InfoRadar (Information Radar)
16
 
17
+ Tired of low-quality articles? Struggling to find key points in long texts? Want to skip redundancy and fluff at a glance? Or just curious about the information-theoretic nature of language?
18
 
19
+ **Try InfoRadar.** It uses large language models to analyze text information density and visualizes where the important parts are. The color intensity of each character indicates how much information it carries.
20
 
21
  ## 🚀 Core Features
22
 
23
+ - **Information Density Visualization**: Color-coded analysis based on token-level surprisal (`-log p`).
24
+ - ⚪ **Transparent**: High predictability (low information / common phrases / filler)
25
+ - 🔴 **Red**: High information content (surprising / specific / core content)
26
 
27
  ## 💡 Tribute
28
 
29
  InfoRadar is engineered based on the classic project [GLTR.io](http://gltr.io) developed by Hendrik Strobelt et al. in 2019. GLTR was a web demo that pioneered the use of GPT-2 prediction probabilities to detect generated text.
30
 
31
+ The difference lies in the goal: **not to "detect AI text", but to "evaluate text quality"**. When we dislike AI text, we actually dislike low-quality text; the key is information quality. InfoRadar focuses on "information quality" rather than "AI signs", though it can help spot AI-generated nonsense with no information content. Currently **Qwen3-14B-Base** is used for analysis.
 
 
 
32
 
33
  ## 📦 Quick Start
34
 
README.zh-CN.md CHANGED
@@ -1,13 +1,3 @@
1
- ---
2
- title: InfoRadar
3
- emoji: 📡
4
- colorFrom: blue
5
- colorTo: indigo
6
- sdk: docker
7
- app_port: 7860
8
- license: apache-2.0
9
- ---
10
-
11
  **[English](README.md)** | 简体中文
12
 
13
  # InfoRadar (信息雷达)
 
 
 
 
 
 
 
 
 
 
 
1
  **[English](README.md)** | 简体中文
2
 
3
  # InfoRadar (信息雷达)
backend/language_checker.py CHANGED
@@ -120,11 +120,11 @@ class AbstractLanguageChecker:
120
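  Get the compute device
121
 
122
  Priority:
123
- 1. Explicitly force CPU (FORCE_CPU environment variable)
124
  2. Auto-detect the best device (cuda > mps > cpu)
125
  """
126
  # If CPU is explicitly requested, return it directly (the only meaningful forced case)
127
- if os.environ.get('FORCE_CPU'):
128
  return torch.device("cpu")
129
 
130
  # Automatically select the best device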
@@ -210,10 +210,10 @@ class QwenLM(AbstractLanguageChecker):
210
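  load_description = "model"
211
 
212
  # Environment variable configuration
213
- # FORCE_INT8: enable INT8 quantization (for CPU and CUDA; experimental, may reduce performance in some cases)
214
- # CPU_FORCE_BFLOAT16: enable bfloat16 (CPU only; requires hardware acceleration support, otherwise performance drops)
215
- force_int8 = os.environ.get('FORCE_INT8')
216
- force_bfloat16 = os.environ.get('CPU_FORCE_BFLOAT16')
217
 
218
  # Detect whether this is an AWQ model (automatic detection)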
219
  is_awq_model = self._is_awq_model(model_path)
@@ -239,12 +239,12 @@ class QwenLM(AbstractLanguageChecker):
239
  use_int8 = True
240
  device_map = "cpu"
241
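  load_description = "model (INT8 quantized)"
242
- print("⚠️ Enabling INT8 quantization (experimental, may reduce performance in some cases)")
243
 
244
  elif force_bfloat16:
245
  dtype = torch.bfloat16
246
  use_low_cpu_mem = True
247
- print("⚠️ Enabling bfloat16 (requires hardware acceleration support, otherwise performance drops)")
248
 
249
  else:
250
  # Default: float32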
@@ -260,7 +260,7 @@ class QwenLM(AbstractLanguageChecker):
260
  if force_int8:
261
  use_int8 = True
262
  load_description = "model (INT8 quantized)"
263
- print("⚠️ Enabling INT8 quantization")
264
  else:
265
  dtype = torch.float16
266
  print("🔧 dtype: float16")
@@ -271,7 +271,7 @@ class QwenLM(AbstractLanguageChecker):
271
  print(f"🔧 {self.device.type.upper()} mode: automatic device allocation")
272
 
273
  if force_int8:
274
- print("⚠️ MPS does not support INT8 quantization; ignoring the FORCE_INT8 environment variable")
275
 
276
  device_map = "auto"
277
  dtype = torch.float16
@@ -351,6 +351,9 @@ class QwenLM(AbstractLanguageChecker):
351
 
352
  device_name = DeviceManager.get_device_name(self.device)
353
  print(f"✓ {model_display_name} model loaded ({device_name})")
 
 
 
354
 
355
  def _load_model_with_int8_cuda(
356
  self,
@@ -787,8 +790,11 @@ class QwenLM(AbstractLanguageChecker):
787
  DeviceManager.clear_cache(self.device)
788
  gc.collect()
789
 
790
- # Print memory statistics after an analysis task completes
791
- if self.device.type == "cuda":
 
 
 
792
  device_idx = self.device.index if self.device.index is not None else 0
793
  DeviceManager.print_cuda_memory_summary(device=device_idx)
794
 
 
120
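  Get the compute device
121
 
122
  Priority:
123
+ 1. Explicitly force CPU (FORCE_CPU=1 environment variable)
124
  2. Auto-detect the best device (cuda > mps > cpu)
125
  """
126
  # If CPU is explicitly requested, return it directly (the only meaningful forced case)
127
+ if os.environ.get('FORCE_CPU') == '1':
128
  return torch.device("cpu")
129
 
130
  # Automatically select the best device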
 
210
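  load_description = "model"
211
 
212
  # Environment variable configuration
213
+ # FORCE_INT8=1: enable INT8 quantization (for CPU and CUDA; experimental, may reduce performance in some cases)
214
+ # CPU_FORCE_BFLOAT16=1: enable bfloat16 (CPU only; requires hardware acceleration support, otherwise performance drops)
215
+ force_int8 = os.environ.get('FORCE_INT8') == '1'
216
+ force_bfloat16 = os.environ.get('CPU_FORCE_BFLOAT16') == '1'
217
 
218
  # Detect whether this is an AWQ model (automatic detection)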
219
  is_awq_model = self._is_awq_model(model_path)
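
The switch from a truthiness check to `== '1'` matters because environment variables are strings, so a value such as `"0"` is still truthy. The sketch below illustrates that difference; `env_flag` is a hypothetical helper introduced only for this example and is not part of the repository.

```python
import os


def env_flag(name: str) -> bool:
    """Hypothetical helper: True only when the variable is exactly '1'."""
    return os.environ.get(name) == "1"


# Simulate the Dockerfile setting ENV FORCE_INT8=0.
os.environ["FORCE_INT8"] = "0"

# Old-style check: any non-empty string, including "0", counts as enabled.
legacy_enabled = bool(os.environ.get("FORCE_INT8"))

# Strict check, matching the comparison used in the updated code.
strict_enabled = env_flag("FORCE_INT8")

print(legacy_enabled, strict_enabled)
```

Under the old check, setting `FORCE_INT8=0` in the Dockerfile would still have enabled quantization; with the strict comparison it does not.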
 
239
  use_int8 = True
240
  device_map = "cpu"
241
  load_description = "model (INT8 quantized)"
242
+ print("⚠️ Enabling INT8 quantization (FORCE_INT8=1, experimental, may reduce performance in some cases)")
243
 
244
  elif force_bfloat16:
245
  dtype = torch.bfloat16
246
  use_low_cpu_mem = True
247
+ print("⚠️ Enabling bfloat16 (CPU_FORCE_BFLOAT16=1, requires hardware acceleration support, otherwise performance drops)")
248
 
249
  else:
250
  # Default: float32
 
260
  if force_int8:
261
  use_int8 = True
262
  load_description = "model (INT8 quantized)"
263
+ print("⚠️ Enabling INT8 quantization (FORCE_INT8=1)")
264
  else:
265
  dtype = torch.float16
266
  print("🔧 dtype: float16")
 
271
  print(f"🔧 {self.device.type.upper()} mode: automatic device allocation")
272
 
273
  if force_int8:
274
+ print("⚠️ MPS does not support INT8 quantization; ignoring the FORCE_INT8=1 environment variable")
275
 
276
  device_map = "auto"
277
  dtype = torch.float16
 
351
 
352
  device_name = DeviceManager.get_device_name(self.device)
353
  print(f"✓ {model_display_name} model loaded ({device_name})")
354
+
355
+ # Initialize the analysis counter (used to limit how often GPU memory statistics are printed)
356
+ self._analysis_count = 0
357
 
358
  def _load_model_with_int8_cuda(
359
  self,
 
790
  DeviceManager.clear_cache(self.device)
791
  gc.collect()
792
 
793
+ # Update the analysis counter
794
+ self._analysis_count += 1
795
+
796
+ # Print memory statistics after an analysis task completes (after the 1st, 11th, 21st, ... analysis)
797
+ if self.device.type == "cuda" and (self._analysis_count - 1) % 10 == 0:
798
  device_idx = self.device.index if self.device.index is not None else 0
799
  DeviceManager.print_cuda_memory_summary(device=device_idx)
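
The throttling condition `(self._analysis_count - 1) % 10 == 0` can be checked in isolation. The following is a minimal sketch that reproduces only the counter arithmetic; the function name `should_print_memory_summary` and the `every` parameter are illustrative assumptions, and the CUDA device check from the diff is left out.

```python
def should_print_memory_summary(analysis_count: int, every: int = 10) -> bool:
    """Mirror the (count - 1) % 10 == 0 condition; the device check is omitted."""
    return (analysis_count - 1) % every == 0


# The counter is incremented before the check, so the first analysis has count 1.
printed = [n for n in range(1, 32) if should_print_memory_summary(n)]
print(printed)  # [1, 11, 21, 31]
```

Subtracting 1 before taking the modulus is what makes the very first analysis print a summary, rather than the tenth.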
800
 
backend/runtime_config.py CHANGED
@@ -116,7 +116,7 @@ def detect_platform(verbose: bool = True) -> str:
116
  Platform ID string (e.g. 'local_mps', 'cloud_cuda', 'cloud_cpu_16g', 'default_cpu_machine')
117
  """
118
  # 1. Explicitly force CPU
119
- if os.environ.get("FORCE_CPU"):
120
  print(f"🔧 Forced CPU mode")
121
  return _detect_cpu_variant()
122
 
 
116
  Platform ID string (e.g. 'local_mps', 'cloud_cuda', 'cloud_cpu_16g', 'default_cpu_machine')
117
  """
118
  # 1. Explicitly force CPU
119
+ if os.environ.get("FORCE_CPU") == "1":
120
  print(f"🔧 Forced CPU mode")
121
  return _detect_cpu_variant()
122
 
client/src/content/home.en.html CHANGED
@@ -75,8 +75,9 @@
75
  AI-generated nonsense with no information content.</p>
76
 
77
  <p><strong>What LLM is currently used?</strong></p>
78
- <p>Currently <strong>Qwen3-14B-Base</strong> is used, which gives pretty good results among the
79
- models the author has tested.</p>
 
80
 
81
  <p><strong>Why does information content affect text quality?</strong></p>
82
  <p>Low information content means the LLM can easily predict it from context. If even a machine can predict it,
 
75
  AI-generated nonsense with no information content.</p>
76
 
77
  <p><strong>What LLM is currently used?</strong></p>
78
+ <p>Currently the open-source <strong>Qwen3-14B-Base</strong> is used, which gives pretty good results among the
79
+ models the author has tested. When hardware credits are insufficient, <strong>Qwen3-0.6B-Base</strong> is used
80
+ instead; it is smaller and faster, but performs somewhat worse than Qwen3-14B-Base (by about 30%).</p>
81
 
82
  <p><strong>Why does information content affect text quality?</strong></p>
83
  <p>Low information content means the LLM can easily predict it from context. If even a machine can predict it,
client/src/content/home.zh.html CHANGED
@@ -53,7 +53,7 @@
53
  </p>
54
 
55
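  <p><strong>What LLM is currently used?</strong></p>
56
- <p>Currently the open-source <strong>Qwen3-14B-Base</strong> is used; it gives fairly good results among the models the author has tested.</p>
57
 
58
  <p><strong>Ultimately, why does information content affect text quality?</strong></p>
59
  <p>A word with low information content is one the LLM can easily predict from the preceding context. If even a machine can predict it, how important can it be? Conversely, a word with high information content is one the LLM finds hard to predict from the preceding context. Unless it is an error, it represents key information that the author wants to convey and the machine does not know.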
 
53
  </p>
54
 
55
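  <p><strong>What LLM is currently used?</strong></p>
56
+ <p>Currently the open-source <strong>Qwen3-14B-Base</strong> is used; it gives fairly good results among the models the author has tested. When hardware credits are insufficient, the Qwen3-0.6B-Base model is used instead; it is smaller and faster, but performs somewhat worse than Qwen3-14B-Base (by about 30%).</p>
57
 
58
  <p><strong>Ultimately, why does information content affect text quality?</strong></p>
59
  <p>A word with low information content is one the LLM can easily predict from the preceding context. If even a machine can predict it, how important can it be? Conversely, a word with high information content is one the LLM finds hard to predict from the preceding context. Unless it is an error, it represents key information that the author wants to convey and the machine does not know.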
client/src/index.html CHANGED
@@ -3,7 +3,9 @@
3
 
4
  <head>
5
  <meta charset="UTF-8">
6
- <title>InfoRadar(信息雷达)</title>
 
 
7
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
8
  <link rel="stylesheet" type="text/css" href="start.css">
9
  </head>
 
3
 
4
  <head>
5
  <meta charset="UTF-8">
6
+ <title>InfoRadar — Analyze Text Information Density</title>
7
+ <meta name="description"
8
+ content="InfoRadar visualizes token-level information density in text using LLMs, helping you quickly find key content and skip redundancy.">
9
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
10
  <link rel="stylesheet" type="text/css" href="start.css">
11
  </head>
data/demo/public/CN/GPT-2 large unicorn text(中文翻译).json ADDED
The diff for this file is too large to render. See raw diff
 
data/demo/public/CN/GPT-2 small top_k 40 temp .7 (中文翻译).json ADDED
The diff for this file is too large to render. See raw diff
 
data/demo/public/CN/GPT-2 small top_k 5 temp 1 (中文翻译).json ADDED
The diff for this file is too large to render. See raw diff
 
data/demo/public/CN/human_ NYTimes article (中文翻译).json ADDED
The diff for this file is too large to render. See raw diff
 
data/demo/public/CN/human_ academic text (中文翻译).json ADDED
The diff for this file is too large to render. See raw diff
 
data/demo/public/GPT-2 large unicorn text ADDED
The diff for this file is too large to render. See raw diff
 
data/demo/public/GPT-2 small top_k 5 temp 1.json ADDED
@@ -0,0 +1,911 @@
1
+ {
2
+ "request": {
3
+ "text": "GPT-2 small top_k 5 temp 1_qwen2.5"
4
+ },
5
+ "result": {
6
+ "model": "qwen3.0-14b",
7
+ "bpe_strings": [
8
+ {
9
+ "offset": [
10
+ 0,
11
+ 1
12
+ ],
13
+ "raw": "G",
14
+ "real_topk": [
15
+ 0,
16
+ 0.00034737586975097656
17
+ ],
18
+ "pred_topk": [
19
+ [
20
+ "Human",
21
+ 0.31103515625
22
+ ],
23
+ [
24
+ "What",
25
+ 0.09051513671875
26
+ ],
27
+ [
28
+ "import",
29
+ 0.0360107421875
30
+ ],
31
+ [
32
+ "How",
33
+ 0.033294677734375
34
+ ],
35
+ [
36
+ "#",
37
+ 0.0271759033203125
38
+ ],
39
+ [
40
+ "**",
41
+ 0.0267486572265625
42
+ ],
43
+ [
44
+ "package",
45
+ 0.0237884521484375
46
+ ],
47
+ [
48
+ "Question",
49
+ 0.0176849365234375
50
+ ],
51
+ [
52
+ "标题",
53
+ 0.014892578125
54
+ ],
55
+ [
56
+ "The",
57
+ 0.01324462890625
58
+ ]
59
+ ]
60
+ },
61
+ {
62
+ "offset": [
63
+ 1,
64
+ 3
65
+ ],
66
+ "raw": "PT",
67
+ "real_topk": [
68
+ 0,
69
+ 0.0007691383361816406
70
+ ],
71
+ "pred_topk": [
72
+ [
73
+ " =",
74
+ 0.14892578125
75
+ ],
76
+ [
77
+ "ing",
78
+ 0.12152099609375
79
+ ],
80
+ [
81
+ ":",
82
+ 0.067626953125
83
+ ],
84
+ [
85
+ "2",
86
+ 0.030487060546875
87
+ ],
88
+ [
89
+ ".",
90
+ 0.0248870849609375
91
+ ],
92
+ [
93
+ "0",
94
+ 0.02301025390625
95
+ ],
96
+ [
97
+ "1",
98
+ 0.018341064453125
99
+ ],
100
+ [
101
+ "4",
102
+ 0.0172271728515625
103
+ ],
104
+ [
105
+ "and",
106
+ 0.0091552734375
107
+ ],
108
+ [
109
+ "3",
110
+ 0.00724029541015625
111
+ ]
112
+ ]
113
+ },
114
+ {
115
+ "offset": [
116
+ 3,
117
+ 4
118
+ ],
119
+ "raw": "-",
120
+ "real_topk": [
121
+ 0,
122
+ 0.274169921875
123
+ ],
124
+ "pred_topk": [
125
+ [
126
+ "-",
127
+ 0.274169921875
128
+ ],
129
+ [
130
+ "4",
131
+ 0.0726318359375
132
+ ],
133
+ [
134
+ "模型",
135
+ 0.0626220703125
136
+ ],
137
+ [
138
+ ":",
139
+ 0.030517578125
140
+ ],
141
+ [
142
+ ",",
143
+ 0.0251007080078125
144
+ ],
145
+ [
146
+ "语言",
147
+ 0.0164642333984375
148
+ ],
149
+ [
150
+ " ",
151
+ 0.01546478271484375
152
+ ],
153
+ [
154
+ "3",
155
+ 0.0140838623046875
156
+ ],
157
+ [
158
+ "2",
159
+ 0.012237548828125
160
+ ],
161
+ [
162
+ ",",
163
+ 0.0115814208984375
164
+ ]
165
+ ]
166
+ },
167
+ {
168
+ "offset": [
169
+ 4,
170
+ 5
171
+ ],
172
+ "raw": "2",
173
+ "real_topk": [
174
+ 0,
175
+ 0.07470703125
176
+ ],
177
+ "pred_topk": [
178
+ [
179
+ "4",
180
+ 0.53515625
181
+ ],
182
+ [
183
+ "3",
184
+ 0.291015625
185
+ ],
186
+ [
187
+ "2",
188
+ 0.07470703125
189
+ ],
190
+ [
191
+ "5",
192
+ 0.027069091796875
193
+ ],
194
+ [
195
+ "So",
196
+ 0.01541900634765625
197
+ ],
198
+ [
199
+ "1",
200
+ 0.01471710205078125
201
+ ],
202
+ [
203
+ "6",
204
+ 0.0089263916015625
205
+ ],
206
+ [
207
+ "Chat",
208
+ 0.00283050537109375
209
+ ],
210
+ [
211
+ "Neo",
212
+ 0.0017576217651367188
213
+ ],
214
+ [
215
+ "Style",
216
+ 0.0013055801391601562
217
+ ]
218
+ ]
219
+ },
220
+ {
221
+ "offset": [
222
+ 5,
223
+ 11
224
+ ],
225
+ "raw": " small",
226
+ "real_topk": [
227
+ 0,
228
+ 0.00011938810348510742
229
+ ],
230
+ "pred_topk": [
231
+ [
232
+ ":",
233
+ 0.09356689453125
234
+ ],
235
+ [
236
+ " is",
237
+ 0.054595947265625
238
+ ],
239
+ [
240
+ "模型",
241
+ 0.033111572265625
242
+ ],
243
+ [
244
+ "的",
245
+ 0.032073974609375
246
+ ],
247
+ [
248
+ ",",
249
+ 0.019927978515625
250
+ ],
251
+ [
252
+ " ",
253
+ 0.017852783203125
254
+ ],
255
+ [
256
+ "在",
257
+ 0.0177154541015625
258
+ ],
259
+ [
260
+ " and",
261
+ 0.0177154541015625
262
+ ],
263
+ [
264
+ "和",
265
+ 0.0174407958984375
266
+ ],
267
+ [
268
+ "-",
269
+ 0.01480865478515625
270
+ ]
271
+ ]
272
+ },
273
+ {
274
+ "offset": [
275
+ 11,
276
+ 15
277
+ ],
278
+ "raw": " top",
279
+ "real_topk": [
280
+ 0,
281
+ 0.00008863210678100586
282
+ ],
283
+ "pred_topk": [
284
+ [
285
+ " model",
286
+ 0.15966796875
287
+ ],
288
+ [
289
+ "模型",
290
+ 0.06201171875
291
+ ],
292
+ [
293
+ ",",
294
+ 0.0394287109375
295
+ ],
296
+ [
297
+ "\n",
298
+ 0.03704833984375
299
+ ],
300
+ [
301
+ " and",
302
+ 0.03094482421875
303
+ ],
304
+ [
305
+ " (",
306
+ 0.0279541015625
307
+ ],
308
+ [
309
+ ":",
310
+ 0.0264739990234375
311
+ ],
312
+ [
313
+ " vs",
314
+ 0.0231781005859375
315
+ ],
316
+ [
317
+ "和",
318
+ 0.021942138671875
319
+ ],
320
+ [
321
+ " is",
322
+ 0.020294189453125
323
+ ]
324
+ ]
325
+ },
326
+ {
327
+ "offset": [
328
+ 15,
329
+ 17
330
+ ],
331
+ "raw": "_k",
332
+ "real_topk": [
333
+ 0,
334
+ 0.00496673583984375
335
+ ],
336
+ "pred_topk": [
337
+ [
338
+ "-p",
339
+ 0.10369873046875
340
+ ],
341
+ [
342
+ " ",
343
+ 0.07769775390625
344
+ ],
345
+ [
346
+ "-",
347
+ 0.045318603515625
348
+ ],
349
+ [
350
+ "-level",
351
+ 0.03814697265625
352
+ ],
353
+ [
354
+ "-k",
355
+ 0.03668212890625
356
+ ],
357
+ [
358
+ " layer",
359
+ 0.033416748046875
360
+ ],
361
+ [
362
+ " model",
363
+ 0.0285797119140625
364
+ ],
365
+ [
366
+ "-per",
367
+ 0.0193328857421875
368
+ ],
369
+ [
370
+ " level",
371
+ 0.0193328857421875
372
+ ],
373
+ [
374
+ "\n",
375
+ 0.0190277099609375
376
+ ]
377
+ ]
378
+ },
379
+ {
380
+ "offset": [
381
+ 17,
382
+ 18
383
+ ],
384
+ "raw": " ",
385
+ "real_topk": [
386
+ 0,
387
+ 0.10272216796875
388
+ ],
389
+ "pred_topk": [
390
+ [
391
+ "=",
392
+ 0.11822509765625
393
+ ],
394
+ [
395
+ " ",
396
+ 0.10272216796875
397
+ ],
398
+ [
399
+ " top",
400
+ 0.09429931640625
401
+ ],
402
+ [
403
+ " and",
404
+ 0.0858154296875
405
+ ],
406
+ [
407
+ "和",
408
+ 0.047027587890625
409
+ ],
410
+ [
411
+ ",",
412
+ 0.040863037109375
413
+ ],
414
+ [
415
+ " =",
416
+ 0.037200927734375
417
+ ],
418
+ [
419
+ ":",
420
+ 0.024017333984375
421
+ ],
422
+ [
423
+ " is",
424
+ 0.0184173583984375
425
+ ],
426
+ [
427
+ "\n",
428
+ 0.0179901123046875
429
+ ]
430
+ ]
431
+ },
432
+ {
433
+ "offset": [
434
+ 18,
435
+ 19
436
+ ],
437
+ "raw": "5",
438
+ "real_topk": [
439
+ 0,
440
+ 0.200927734375
441
+ ],
442
+ "pred_topk": [
443
+ [
444
+ "4",
445
+ 0.274658203125
446
+ ],
447
+ [
448
+ "5",
449
+ 0.200927734375
450
+ ],
451
+ [
452
+ "1",
453
+ 0.1978759765625
454
+ ],
455
+ [
456
+ "2",
457
+ 0.114501953125
458
+ ],
459
+ [
460
+ "0",
461
+ 0.0799560546875
462
+ ],
463
+ [
464
+ "3",
465
+ 0.0236358642578125
466
+ ],
467
+ [
468
+ "8",
469
+ 0.0159912109375
470
+ ],
471
+ [
472
+ "6",
473
+ 0.01389312744140625
474
+ ],
475
+ [
476
+ " top",
477
+ 0.00569915771484375
478
+ ],
479
+ [
480
+ "7",
481
+ 0.0053558349609375
482
+ ]
483
+ ]
484
+ },
485
+ {
486
+ "offset": [
487
+ 19,
488
+ 24
489
+ ],
490
+ "raw": " temp",
491
+ "real_topk": [
492
+ 0,
493
+ 0.0005908012390136719
494
+ ],
495
+ "pred_topk": [
496
+ [
497
+ "0",
498
+ 0.734375
499
+ ],
500
+ [
501
+ " top",
502
+ 0.0963134765625
503
+ ],
504
+ [
505
+ ",",
506
+ 0.0251312255859375
507
+ ],
508
+ [
509
+ " ",
510
+ 0.01015472412109375
511
+ ],
512
+ [
513
+ " and",
514
+ 0.00815582275390625
515
+ ],
516
+ [
517
+ "\n",
518
+ 0.006870269775390625
519
+ ],
520
+ [
521
+ "1",
522
+ 0.004795074462890625
523
+ ],
524
+ [
525
+ ",",
526
+ 0.004299163818359375
527
+ ],
528
+ [
529
+ " 和",
530
+ 0.0035648345947265625
531
+ ],
532
+ [
533
+ "和",
534
+ 0.0032444000244140625
535
+ ]
536
+ ]
537
+ },
538
+ {
539
+ "offset": [
540
+ 24,
541
+ 25
542
+ ],
543
+ "raw": " ",
544
+ "real_topk": [
545
+ 0,
546
+ 0.9365234375
547
+ ],
548
+ "pred_topk": [
549
+ [
550
+ " ",
551
+ 0.9365234375
552
+ ],
553
+ [
554
+ "=",
555
+ 0.00785064697265625
556
+ ],
557
+ [
558
+ "0",
559
+ 0.006931304931640625
560
+ ],
561
+ [
562
+ " .",
563
+ 0.0048370361328125
564
+ ],
565
+ [
566
+ ".",
567
+ 0.0025501251220703125
568
+ ],
569
+ [
570
+ "1",
571
+ 0.0025501251220703125
572
+ ],
573
+ [
574
+ ":",
575
+ 0.002216339111328125
576
+ ],
577
+ [
578
+ "_",
579
+ 0.0018949508666992188
580
+ ],
581
+ [
582
+ " =",
583
+ 0.001865386962890625
584
+ ],
585
+ [
586
+ "erture",
587
+ 0.0013437271118164062
588
+ ]
589
+ ]
590
+ },
591
+ {
592
+ "offset": [
593
+ 25,
594
+ 26
595
+ ],
596
+ "raw": "1",
597
+ "real_topk": [
598
+ 0,
599
+ 0.163818359375
600
+ ],
601
+ "pred_topk": [
602
+ [
603
+ "0",
604
+ 0.80615234375
605
+ ],
606
+ [
607
+ "1",
608
+ 0.163818359375
609
+ ],
610
+ [
611
+ "2",
612
+ 0.00998687744140625
613
+ ],
614
+ [
615
+ "5",
616
+ 0.007083892822265625
617
+ ],
618
+ [
619
+ " ",
620
+ 0.00324249267578125
621
+ ],
622
+ [
623
+ "3",
624
+ 0.002410888671875
625
+ ],
626
+ [
627
+ "9",
628
+ 0.0019369125366210938
629
+ ],
630
+ [
631
+ "4",
632
+ 0.001605987548828125
633
+ ],
634
+ [
635
+ "8",
636
+ 0.0011568069458007812
637
+ ],
638
+ [
639
+ "6",
640
+ 0.0011386871337890625
641
+ ]
642
+ ]
643
+ },
644
+ {
645
+ "offset": [
646
+ 26,
647
+ 28
648
+ ],
649
+ "raw": "_q",
650
+ "real_topk": [
651
+ 0,
652
+ 5.960464477539063e-8
653
+ ],
654
+ "pred_topk": [
655
+ [
656
+ ".",
657
+ 0.3056640625
658
+ ],
659
+ [
660
+ "\n",
661
+ 0.0784912109375
662
+ ],
663
+ [
664
+ " top",
665
+ 0.06304931640625
666
+ ],
667
+ [
668
+ "\n\n",
669
+ 0.046875
670
+ ],
671
+ [
672
+ " ",
673
+ 0.037078857421875
674
+ ],
675
+ [
676
+ "0",
677
+ 0.0195465087890625
678
+ ],
679
+ [
680
+ "的",
681
+ 0.01451873779296875
682
+ ],
683
+ [
684
+ " 的",
685
+ 0.01096343994140625
686
+ ],
687
+ [
688
+ ",",
689
+ 0.01045989990234375
690
+ ],
691
+ [
692
+ " prompt",
693
+ 0.00937652587890625
694
+ ]
695
+ ]
696
+ },
697
+ {
698
+ "offset": [
699
+ 28,
700
+ 31
701
+ ],
702
+ "raw": "wen",
703
+ "real_topk": [
704
+ 0,
705
+ 0.46435546875
706
+ ],
707
+ "pred_topk": [
708
+ [
709
+ "wen",
710
+ 0.46435546875
711
+ ],
712
+ [
713
+ "a",
714
+ 0.126953125
715
+ ],
716
+ [
717
+ "w",
718
+ 0.048553466796875
719
+ ],
720
+ [
721
+ "1",
722
+ 0.02239990234375
723
+ ],
724
+ [
725
+ "2",
726
+ 0.02154541015625
727
+ ],
728
+ [
729
+ "q",
730
+ 0.02154541015625
731
+ ],
732
+ [
733
+ "qq",
734
+ 0.015045166015625
735
+ ],
736
+ [
737
+ "3",
738
+ 0.01380157470703125
739
+ ],
740
+ [
741
+ "0",
742
+ 0.01247406005859375
743
+ ],
744
+ [
745
+ "4",
746
+ 0.010498046875
747
+ ]
748
+ ]
749
+ },
750
+ {
751
+ "offset": [
752
+ 31,
753
+ 32
754
+ ],
755
+ "raw": "2",
756
+ "real_topk": [
757
+ 0,
758
+ 0.40673828125
759
+ ],
760
+ "pred_topk": [
761
+ [
762
+ "2",
763
+ 0.40673828125
764
+ ],
765
+ [
766
+ "-",
767
+ 0.08660888671875
768
+ ],
769
+ [
770
+ "_",
771
+ 0.07293701171875
772
+ ],
773
+ [
774
+ "\n",
775
+ 0.027252197265625
776
+ ],
777
+ [
778
+ "1",
779
+ 0.0244293212890625
780
+ ],
781
+ [
782
+ "是什么",
783
+ 0.0200958251953125
784
+ ],
785
+ [
786
+ "\n\n",
787
+ 0.01505279541015625
788
+ ],
789
+ [
790
+ " ",
791
+ 0.01100921630859375
792
+ ],
793
+ [
794
+ "的",
795
+ 0.00905609130859375
796
+ ],
797
+ [
798
+ "7",
799
+ 0.007110595703125
800
+ ]
801
+ ]
802
+ },
803
+ {
804
+ "offset": [
805
+ 32,
806
+ 33
807
+ ],
808
+ "raw": ".",
809
+ "real_topk": [
810
+ 0,
811
+ 0.326416015625
812
+ ],
813
+ "pred_topk": [
814
+ [
815
+ ".",
816
+ 0.326416015625
817
+ ],
818
+ [
819
+ "-",
820
+ 0.204345703125
821
+ ],
822
+ [
823
+ "_",
824
+ 0.0775146484375
825
+ ],
826
+ [
827
+ "-m",
828
+ 0.0325927734375
829
+ ],
830
+ [
831
+ "-in",
832
+ 0.0285186767578125
833
+ ],
834
+ [
835
+ "\n",
836
+ 0.01538848876953125
837
+ ],
838
+ [
839
+ " ",
840
+ 0.01412200927734375
841
+ ],
842
+ [
843
+ "数学",
844
+ 0.0099334716796875
845
+ ],
846
+ [
847
+ "-v",
848
+ 0.008697509765625
849
+ ],
850
+ [
851
+ ":",
852
+ 0.00749969482421875
853
+ ]
854
+ ]
855
+ },
856
+ {
857
+ "offset": [
858
+ 33,
859
+ 34
860
+ ],
861
+ "raw": "5",
862
+ "real_topk": [
863
+ 0,
864
+ 0.966796875
865
+ ],
866
+ "pred_topk": [
867
+ [
868
+ "5",
869
+ 0.966796875
870
+ ],
871
+ [
872
+ "0",
873
+ 0.0200653076171875
874
+ ],
875
+ [
876
+ "7",
877
+ 0.00704193115234375
878
+ ],
879
+ [
880
+ " ",
881
+ 0.0005345344543457031
882
+ ],
883
+ [
884
+ "1",
885
+ 0.00052642822265625
886
+ ],
887
+ [
888
+ "请",
889
+ 0.00039124488830566406
890
+ ],
891
+ [
892
+ "翻译",
893
+ 0.00021946430206298828
894
+ ],
895
+ [
896
+ "2",
897
+ 0.0001531839370727539
898
+ ],
899
+ [
900
+ "中文",
901
+ 0.00014388561248779297
902
+ ],
903
+ [
904
+ "3",
905
+ 0.0001373291015625
906
+ ]
907
+ ]
908
+ }
909
+ ]
910
+ }
911
+ }
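
To relate this demo record to the README's token-level surprisal (`-log p`), here is a minimal sketch that reads two entries copied from the JSON above and converts their probabilities to surprisal. It assumes that the second element of `real_topk` is the model's probability for the actual token (in the record, `"-"` has the same value in `real_topk` and in its `pred_topk` entry) and that the logarithm is base 2; the base is an assumption, since the README does not state one.

```python
import json
import math

# Two entries copied from the demo record above; other fields are omitted.
record = json.loads("""
{
  "result": {
    "bpe_strings": [
      {"raw": "_q", "real_topk": [0, 5.960464477539063e-8]},
      {"raw": "5", "real_topk": [0, 0.966796875]}
    ]
  }
}
""")


def surprisal_bits(p: float) -> float:
    """Surprisal -log2(p), in bits (base 2 is an assumption of this sketch)."""
    return -math.log2(p)


# Map each token to its surprisal, using the assumed probability slot.
surprisals = {
    tok["raw"]: surprisal_bits(tok["real_topk"][1])
    for tok in record["result"]["bpe_strings"]
}

for raw, bits in surprisals.items():
    print(repr(raw), round(bits, 2))
```

The unexpected `_q` token carries far more information than the highly predictable `5`, which is the contrast the red and transparent colors are meant to show.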
data/demo/public/Wiki - Cristiano Ronaldo.json CHANGED
The diff for this file is too large to render. See raw diff
 
data/demo/public/human_ NYTimes article.json ADDED
The diff for this file is too large to render. See raw diff
 
data/demo/public/human_ academic text.json ADDED
The diff for this file is too large to render. See raw diff
 
data/demo/public/human_ woodchuck.json ADDED
@@ -0,0 +1,964 @@
1
+ {
2
+ "request": {
3
+ "text": "How much wood would a woodchuck chuck if a woodchuck could chuck wood?"
4
+ },
5
+ "result": {
6
+ "model": "qwen3.0-14b",
7
+ "bpe_strings": [
8
+ {
9
+ "offset": [
10
+ 0,
11
+ 3
12
+ ],
13
+ "raw": "How",
14
+ "real_topk": [
15
+ 0,
16
+ 0.033843994140625
17
+ ],
18
+ "pred_topk": [
19
+ [
20
+ "Human",
21
+ 0.303955078125
22
+ ],
23
+ [
24
+ "What",
25
+ 0.09124755859375
26
+ ],
27
+ [
28
+ "import",
29
+ 0.0360107421875
30
+ ],
31
+ [
32
+ "How",
33
+ 0.033843994140625
34
+ ],
35
+ [
36
+ "**",
37
+ 0.027618408203125
38
+ ],
39
+ [
40
+ "#",
41
+ 0.02740478515625
42
+ ],
43
+ [
44
+ "package",
45
+ 0.02362060546875
46
+ ],
47
+ [
48
+ "Question",
49
+ 0.016876220703125
50
+ ],
51
+ [
52
+ "标题",
53
+ 0.01490020751953125
54
+ ],
55
+ [
56
+ "The",
57
+ 0.01325225830078125
58
+ ]
59
+ ]
60
+ },
61
+ {
62
+ "offset": [
63
+ 3,
64
+ 8
65
+ ],
66
+ "raw": " much",
67
+ "real_topk": [
68
+ 0,
69
+ 0.0165252685546875
70
+ ],
71
+ "pred_topk": [
72
+ [
73
+ " do",
74
+ 0.38818359375
75
+ ],
76
+ [
77
+ " can",
78
+ 0.258544921875
79
+ ],
80
+ [
81
+ " does",
82
+ 0.08935546875
83
+ ],
84
+ [
85
+ " many",
86
+ 0.07122802734375
87
+ ],
88
+ [
89
+ " long",
90
+ 0.05462646484375
91
+ ],
92
+ [
93
+ " is",
94
+ 0.0250091552734375
95
+ ],
96
+ [
97
+ " did",
98
+ 0.0178680419921875
99
+ ],
100
+ [
101
+ " much",
102
+ 0.0165252685546875
103
+ ],
104
+ [
105
+ " would",
106
+ 0.00787353515625
107
+ ],
108
+ [
109
+ " to",
110
+ 0.007568359375
111
+ ]
112
+ ]
113
+ },
114
+ {
115
+ "offset": [
116
+ 8,
117
+ 13
118
+ ],
119
+ "raw": " wood",
120
+ "real_topk": [
121
+ 0,
122
+ 0.0013179779052734375
123
+ ],
124
+ "pred_topk": [
125
+ [
126
+ " does",
127
+ 0.17333984375
128
+ ],
129
+ [
130
+ " money",
131
+ 0.062286376953125
132
+ ],
133
+ [
134
+ " is",
135
+ 0.0516357421875
136
+ ],
137
+ [
138
+ " water",
139
+ 0.039581298828125
140
+ ],
141
+ [
142
+ " will",
143
+ 0.034393310546875
144
+ ],
145
+ [
146
+ " do",
147
+ 0.033599853515625
148
+ ],
149
+ [
150
+ " of",
151
+ 0.0330810546875
152
+ ],
153
+ [
154
+ " would",
155
+ 0.03106689453125
156
+ ],
157
+ [
158
+ " should",
159
+ 0.024200439453125
160
+ ],
161
+ [
162
+ " time",
163
+ 0.022735595703125
164
+ ]
165
+ ]
166
+ },
167
+ {
168
+ "offset": [
169
+ 13,
170
+ 19
171
+ ],
172
+ "raw": " would",
173
+ "real_topk": [
174
+ 0,
175
+ 0.35009765625
176
+ ],
177
+ "pred_topk": [
178
+ [
179
+ " would",
180
+ 0.35009765625
181
+ ],
182
+ [
183
+ " could",
184
+ 0.252197265625
185
+ ],
186
+ [
187
+ " is",
188
+ 0.10845947265625
189
+ ],
190
+ [
191
+ " can",
192
+ 0.076904296875
193
+ ],
194
+ [
195
+ " does",
196
+ 0.05987548828125
197
+ ],
198
+ [
199
+ " will",
200
+ 0.0261688232421875
201
+ ],
202
+ [
203
+ " did",
204
+ 0.019744873046875
205
+ ],
206
+ [
207
+ " do",
208
+ 0.019134521484375
209
+ ],
210
+ [
211
+ " was",
212
+ 0.0182647705078125
213
+ ],
214
+ [
215
+ " should",
216
+ 0.00823211669921875
217
+ ]
218
+ ]
219
+ },
220
+ {
221
+ "offset": [
222
+ 19,
223
+ 21
224
+ ],
225
+ "raw": " a",
226
+ "real_topk": [
227
+ 0,
228
+ 0.9658203125
229
+ ],
230
+ "pred_topk": [
231
+ [
232
+ " a",
233
+ 0.9658203125
234
+ ],
235
+ [
236
+ " the",
237
+ 0.005565643310546875
238
+ ],
239
+ [
240
+ " be",
241
+ 0.0046844482421875
242
+ ],
243
+ [
244
+ " you",
245
+ 0.004009246826171875
246
+ ],
247
+ [
248
+ " an",
249
+ 0.00359344482421875
250
+ ],
251
+ [
252
+ " ",
253
+ 0.0019235610961914062
254
+ ],
255
+ [
256
+ " I",
257
+ 0.0009374618530273438
258
+ ],
259
+ [
260
+ " fit",
261
+ 0.0008144378662109375
262
+ ],
263
+ [
264
+ " three",
265
+ 0.0007648468017578125
266
+ ],
267
+ [
268
+ " wood",
269
+ 0.0007648468017578125
270
+ ]
271
+ ]
272
+ },
273
+ {
274
+ "offset": [
275
+ 21,
276
+ 26
277
+ ],
278
+ "raw": " wood",
279
+ "real_topk": [
280
+ 0,
281
+ 0.9736328125
282
+ ],
283
+ "pred_topk": [
284
+ [
285
+ " wood",
286
+ 0.9736328125
287
+ ],
288
+ [
289
+ " lumber",
290
+ 0.0018215179443359375
291
+ ],
292
+ [
293
+ " man",
294
+ 0.00179290771484375
295
+ ],
296
+ [
297
+ " wooden",
298
+ 0.0008020401000976562
299
+ ],
300
+ [
301
+ " mocking",
302
+ 0.0007653236389160156
303
+ ],
304
+ [
305
+ " woo",
306
+ 0.000759124755859375
307
+ ],
308
+ [
309
+ " woodworking",
310
+ 0.0007190704345703125
311
+ ],
312
+ [
313
+ " would",
314
+ 0.000675201416015625
315
+ ],
316
+ [
317
+ " wo",
318
+ 0.0006546974182128906
319
+ ],
320
+ [
321
+ " saw",
322
+ 0.0006494522094726562
323
+ ]
324
+ ]
325
+ },
326
+ {
327
+ "offset": [
328
+ 26,
329
+ 28
330
+ ],
331
+ "raw": "ch",
332
+ "real_topk": [
333
+ 0,
334
+ 0.95458984375
335
+ ],
336
+ "pred_topk": [
337
+ [
338
+ "ch",
339
+ 0.95458984375
340
+ ],
341
+ [
342
+ " chuck",
343
+ 0.031646728515625
344
+ ],
345
+ [
346
+ "-ch",
347
+ 0.003337860107421875
348
+ ],
349
+ [
350
+ " ch",
351
+ 0.00240325927734375
352
+ ],
353
+ [
354
+ "pe",
355
+ 0.0012273788452148438
356
+ ],
357
+ [
358
+ "c",
359
+ 0.0010833740234375
360
+ ],
361
+ [
362
+ " chip",
363
+ 0.0005040168762207031
364
+ ],
365
+ [
366
+ " if",
367
+ 0.0003986358642578125
368
+ ],
369
+ [
370
+ " duck",
371
+ 0.00032520294189453125
372
+ ],
373
+ [
374
+ "chip",
375
+ 0.0003104209899902344
376
+ ]
377
+ ]
378
+ },
379
+ {
380
+ "offset": [
381
+ 28,
382
+ 31
383
+ ],
384
+ "raw": "uck",
385
+ "real_topk": [
386
+ 0,
387
+ 0.99560546875
388
+ ],
389
+ "pred_topk": [
390
+ [
391
+ "uck",
392
+ 0.99560546875
393
+ ],
394
+ [
395
+ "ucker",
396
+ 0.002246856689453125
397
+ ],
398
+ [
399
+ "opper",
400
+ 0.0007071495056152344
401
+ ],
402
+ [
403
+ "ucking",
404
+ 0.0003447532653808594
405
+ ],
406
+ [
407
+ "uk",
408
+ 0.00033926963806152344
409
+ ],
410
+ [
411
+ "ucks",
412
+ 0.00030422210693359375
413
+ ],
414
+ [
415
+ "ick",
416
+ 0.000025987625122070312
417
+ ],
418
+ [
419
+ "ew",
420
+ 0.000019729137420654297
421
+ ],
422
+ [
423
+ "ucky",
424
+ 0.00001811981201171875
425
+ ],
426
+ [
427
+ "ug",
428
+ 0.00001728534698486328
429
+ ]
430
+ ]
431
+ },
432
+ {
433
+ "offset": [
434
+ 31,
435
+ 37
436
+ ],
437
+ "raw": " chuck",
438
+ "real_topk": [
439
+ 0,
440
+ 0.99609375
441
+ ],
442
+ "pred_topk": [
443
+ [
444
+ " chuck",
445
+ 0.99609375
446
+ ],
447
+ [
448
+ ",",
449
+ 0.0004029273986816406
450
+ ],
451
+ [
452
+ " really",
453
+ 0.0003502368927001953
454
+ ],
455
+ [
456
+ " Chuck",
457
+ 0.000339508056640625
458
+ ],
459
+ [
460
+ " actually",
461
+ 0.000308990478515625
462
+ ],
463
+ [
464
+ " if",
465
+ 0.00028586387634277344
466
+ ],
467
+ [
468
+ " chew",
469
+ 0.0002770423889160156
470
+ ],
471
+ [
472
+ " chunk",
473
+ 0.00021910667419433594
474
+ ],
475
+ [
476
+ " wood",
477
+ 0.00015532970428466797
478
+ ],
479
+ [
480
+ " chop",
481
+ 0.00013709068298339844
482
+ ]
483
+ ]
484
+ },
485
+ {
486
+ "offset": [
487
+ 37,
488
+ 40
489
+ ],
490
+ "raw": " if",
491
+ "real_topk": [
492
+ 0,
493
+ 0.78564453125
494
+ ],
495
+ "pred_topk": [
496
+ [
497
+ " if",
498
+ 0.78564453125
499
+ ],
500
+ [
501
+ ",",
502
+ 0.140869140625
503
+ ],
504
+ [
505
+ "\n",
506
+ 0.016815185546875
507
+ ],
508
+ [
509
+ " If",
510
+ 0.012115478515625
511
+ ],
512
+ [
513
+ "?",
514
+ 0.01120758056640625
515
+ ],
516
+ [
517
+ "?\n",
518
+ 0.0088653564453125
519
+ ],
520
+ [
521
+ "?\n\n",
522
+ 0.0050506591796875
523
+ ],
524
+ [
525
+ ",\n",
526
+ 0.001857757568359375
527
+ ],
528
+ [
529
+ "\n\n",
530
+ 0.0015163421630859375
531
+ ],
532
+ [
533
+ " in",
534
+ 0.0011625289916992188
535
+ ]
536
+ ]
537
+ },
538
+ {
539
+ "offset": [
540
+ 40,
541
+ 42
542
+ ],
543
+ "raw": " a",
544
+ "real_topk": [
545
+ 0,
546
+ 0.94970703125
547
+ ],
548
+ "pred_topk": [
549
+ [
550
+ " a",
551
+ 0.94970703125
552
+ ],
553
+ [
554
+ " he",
555
+ 0.018524169921875
556
+ ],
557
+ [
558
+ " it",
559
+ 0.01233673095703125
560
+ ],
561
+ [
562
+ " the",
563
+ 0.008087158203125
564
+ ],
565
+ [
566
+ " wood",
567
+ 0.0017223358154296875
568
+ ],
569
+ [
570
+ " ",
571
+ 0.0014734268188476562
572
+ ],
573
+ [
574
+ " one",
575
+ 0.0006437301635742188
576
+ ],
577
+ [
578
+ ",",
579
+ 0.0004494190216064453
580
+ ],
581
+ [
582
+ " you",
583
+ 0.0003554821014404297
584
+ ],
585
+ [
586
+ " an",
587
+ 0.00029015541076660156
588
+ ]
589
+ ]
590
+ },
591
+ {
592
+ "offset": [
593
+ 42,
594
+ 47
595
+ ],
596
+ "raw": " wood",
597
+ "real_topk": [
598
+ 0,
599
+ 0.99267578125
600
+ ],
601
+ "pred_topk": [
602
+ [
603
+ " wood",
604
+ 0.99267578125
605
+ ],
606
+ [
607
+ " chuck",
608
+ 0.0004410743713378906
609
+ ],
610
+ [
611
+ "\n",
612
+ 0.0004410743713378906
613
+ ],
614
+ [
615
+ " wooden",
616
+ 0.0003032684326171875
617
+ ],
618
+ [
619
+ " ",
620
+ 0.00027179718017578125
621
+ ],
622
+ [
623
+ " tree",
624
+ 0.00025534629821777344
625
+ ],
626
+ [
627
+ " woods",
628
+ 0.00018978118896484375
629
+ ],
630
+ [
631
+ " Wood",
632
+ 0.00017547607421875
633
+ ],
634
+ [
635
+ " w",
636
+ 0.00013566017150878906
637
+ ],
638
+ [
639
+ " ch",
640
+ 0.00011688470840454102
641
+ ]
642
+ ]
643
+ },
644
+ {
645
+ "offset": [
646
+ 47,
647
+ 49
648
+ ],
649
+ "raw": "ch",
650
+ "real_topk": [
651
+ 0,
652
+ 0.99609375
653
+ ],
654
+ "pred_topk": [
655
+ [
656
+ "ch",
657
+ 0.99609375
658
+ ],
659
+ [
660
+ " chuck",
661
+ 0.0021800994873046875
662
+ ],
663
+ [
664
+ "-ch",
665
+ 0.0008945465087890625
666
+ ],
667
+ [
668
+ " ch",
669
+ 0.00017893314361572266
670
+ ],
671
+ [
672
+ "c",
673
+ 0.00009572505950927734
674
+ ],
675
+ [
676
+ " could",
677
+ 0.00005900859832763672
678
+ ],
679
+ [
680
+ " would",
681
+ 0.00003522634506225586
682
+ ],
683
+ [
684
+ "cock",
685
+ 0.00002205371856689453
686
+ ],
687
+ [
688
+ "chu",
689
+ 0.000010251998901367188
690
+ ],
691
+ [
692
+ "ck",
693
+ 0.000008881092071533203
694
+ ]
695
+ ]
696
+ },
697
+ {
698
+ "offset": [
699
+ 49,
700
+ 52
701
+ ],
702
+ "raw": "uck",
703
+ "real_topk": [
704
+ 0,
705
+ 0.99951171875
706
+ ],
707
+ "pred_topk": [
708
+ [
709
+ "uck",
710
+ 0.99951171875
711
+ ],
712
+ [
713
+ "ucks",
714
+ 0.0003459453582763672
715
+ ],
716
+ [
717
+ "ucker",
718
+ 0.0000393986701965332
719
+ ],
720
+ [
721
+ "ick",
722
+ 0.00002467632293701172
723
+ ],
724
+ [
725
+ "ck",
726
+ 0.00000476837158203125
727
+ ],
728
+ [
729
+ "uk",
730
+ 0.000004589557647705078
731
+ ],
732
+ [
733
+ "ump",
734
+ 0.0000023245811462402344
735
+ ],
736
+ [
737
+ "uckle",
738
+ 0.0000016689300537109375
739
+ ],
740
+ [
741
+ "ock",
742
+ 0.0000010132789611816406
743
+ ],
744
+ [
745
+ "ack",
746
+ 8.344650268554688e-7
747
+ ]
748
+ ]
749
+ },
750
+ {
751
+ "offset": [
752
+ 52,
753
+ 58
754
+ ],
755
+ "raw": " could",
756
+ "real_topk": [
757
+ 0,
758
+ 0.97900390625
759
+ ],
760
+ "pred_topk": [
761
+ [
762
+ " could",
763
+ 0.97900390625
764
+ ],
765
+ [
766
+ " would",
767
+ 0.0113983154296875
768
+ ],
769
+ [
770
+ " can",
771
+ 0.0020122528076171875
772
+ ],
773
+ [
774
+ " ch",
775
+ 0.001979827880859375
776
+ ],
777
+ [
778
+ " were",
779
+ 0.0010938644409179688
780
+ ],
781
+ [
782
+ " chuck",
783
+ 0.000980377197265625
784
+ ],
785
+ [
786
+ " was",
787
+ 0.0005583763122558594
788
+ ],
789
+ [
790
+ " couldn",
791
+ 0.0002853870391845703
792
+ ],
793
+ [
794
+ " had",
795
+ 0.00023281574249267578
796
+ ],
797
+ [
798
+ ",",
799
+ 0.00022923946380615234
800
+ ]
801
+ ]
802
+ },
803
+ {
804
+ "offset": [
805
+ 58,
806
+ 64
807
+ ],
808
+ "raw": " chuck",
809
+ "real_topk": [
810
+ 0,
811
+ 0.98974609375
812
+ ],
813
+ "pred_topk": [
814
+ [
815
+ " chuck",
816
+ 0.98974609375
817
+ ],
818
+ [
819
+ " actually",
820
+ 0.004444122314453125
821
+ ],
822
+ [
823
+ " indeed",
824
+ 0.0013980865478515625
825
+ ],
826
+ [
827
+ " really",
828
+ 0.0011587142944335938
829
+ ],
830
+ [
831
+ "?",
832
+ 0.00034809112548828125
833
+ ],
834
+ [
835
+ " Chuck",
836
+ 0.00030231475830078125
837
+ ],
838
+ [
839
+ "?\n\n",
840
+ 0.0002753734588623047
841
+ ],
842
+ [
843
+ "?\n",
844
+ 0.00025463104248046875
845
+ ],
846
+ [
847
+ ",",
848
+ 0.00017499923706054688
849
+ ],
850
+ [
851
+ " in",
852
+ 0.00009363889694213867
853
+ ]
854
+ ]
855
+ },
856
+ {
857
+ "offset": [
858
+ 64,
859
+ 69
860
+ ],
861
+ "raw": " wood",
862
+ "real_topk": [
863
+ 0,
864
+ 0.9873046875
865
+ ],
866
+ "pred_topk": [
867
+ [
868
+ " wood",
869
+ 0.9873046875
870
+ ],
871
+ [
872
+ "?",
873
+ 0.0028171539306640625
874
+ ],
875
+ [
876
+ "?\n\n",
877
+ 0.001483917236328125
878
+ ],
879
+ [
880
+ "?\n",
881
+ 0.0013103485107421875
882
+ ],
883
+ [
884
+ ",",
885
+ 0.0010204315185546875
886
+ ],
887
+ [
888
+ " like",
889
+ 0.0007944107055664062
890
+ ],
891
+ [
892
+ " if",
893
+ 0.0005459785461425781
894
+ ],
895
+ [
896
+ " some",
897
+ 0.0003695487976074219
898
+ ],
899
+ [
900
+ " a",
901
+ 0.0002923011779785156
902
+ ],
903
+ [
904
+ " would",
905
+ 0.000274658203125
906
+ ]
907
+ ]
908
+ },
909
+ {
910
+ "offset": [
911
+ 69,
912
+ 70
913
+ ],
914
+ "raw": "?",
915
+ "real_topk": [
916
+ 0,
917
+ 0.40771484375
918
+ ],
919
+ "pred_topk": [
920
+ [
921
+ "?",
922
+ 0.40771484375
923
+ ],
924
+ [
925
+ ",",
926
+ 0.2359619140625
927
+ ],
928
+ [
929
+ "?\n\n",
930
+ 0.2181396484375
931
+ ],
932
+ [
933
+ "?\n",
934
+ 0.0938720703125
935
+ ],
936
+ [
937
+ " and",
938
+ 0.0085906982421875
939
+ ],
940
+ [
941
+ "\n",
942
+ 0.003696441650390625
943
+ ],
944
+ [
945
+ ".",
946
+ 0.00341796875
947
+ ],
948
+ [
949
+ "\n\n",
950
+ 0.002140045166015625
951
+ ],
952
+ [
953
+ "????",
954
+ 0.0019178390502929688
955
+ ],
956
+ [
957
+ " in",
958
+ 0.0014934539794921875
959
+ ]
960
+ ]
961
+ }
962
+ ]
963
+ }
964
+ }
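The README describes the coloring as based on token-level surprisal (`-log p`). As a minimal sketch of how records like the ones above could be turned into surprisal values, the snippet below copies three records from this example data (with `pred_topk` omitted) and assumes that the second element of `real_topk` is the model's probability for the actual token. The helper name `surprisal_bits` and the choice of log base 2 are illustrative assumptions, not taken from the project code.

```python
import math

# Three records copied from the example data above (pred_topk omitted for brevity).
tokens = [
    {"offset": [8, 13], "raw": " wood", "real_topk": [0, 0.0013179779052734375]},
    {"offset": [13, 19], "raw": " would", "real_topk": [0, 0.35009765625]},
    {"offset": [19, 21], "raw": " a", "real_topk": [0, 0.9658203125]},
]


def surprisal_bits(token):
    """Return -log2(p) for one record.

    Assumption: real_topk[1] holds the probability of the actual token.
    The log base (2, i.e. bits) is an illustrative choice.
    """
    p = token["real_topk"][1]
    return -math.log2(p)


# Pair each raw token string with its surprisal.
surprisals = [(t["raw"], surprisal_bits(t)) for t in tokens]

for raw, s in surprisals:
    print(f"{raw!r}: {s:.2f} bits")
```

Under these assumptions, the first ` wood` (probability about 0.0013) carries far more surprisal than the near-certain ` a` (probability about 0.97), which matches the red versus transparent coloring described in the README.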