Jellyfish042 committed on
Commit
8f8c689
·
1 Parent(s): 44c2c6d

bug fix and improvements

precomputed/example_metadata.json CHANGED
@@ -1,7 +1,7 @@
1
  {
2
  "example_text": "The Bitter Lesson\nRich Sutton\nMarch 13, 2019\nThe biggest lesson that can be read from 70 years of AI research is that general methods that leverage computation are ultimately the most effective, and by a large margin. The ultimate reason for this is Moore's law, or rather its generalization of continued exponentially falling cost per unit of computation. Most AI research has been conducted as if the computation available to the agent were constant (in which case leveraging human knowledge would be one of the only ways to improve performance) but, over a slightly longer time than a typical research project, massively more computation inevitably becomes available. Seeking an improvement that makes a difference in the shorter term, researchers seek to leverage their human knowledge of the domain, but the only thing that matters in the long run is the leveraging of computation. These two need not run counter to each other, but in practice they tend to. Time spent on one is time not spent on the other. There are psychological commitments to investment in one approach or the other. And the human-knowledge approach tends to complicate methods in ways that make them less suited to taking advantage of general methods leveraging computation. There were many examples of AI researchers' belated learning of this bitter lesson, and it is instructive to review some of the most prominent.\n\nIn computer chess, the methods that defeated the world champion, Kasparov, in 1997, were based on massive, deep search. At the time, this was looked upon with dismay by the majority of computer-chess researchers who had pursued methods that leveraged human understanding of the special structure of chess. When a simpler, search-based approach with special hardware and software proved vastly more effective, these human-knowledge-based chess researchers were not good losers. 
They said that ``brute force\" search may have won this time, but it was not a general strategy, and anyway it was not how people played chess. These researchers wanted methods based on human input to win and were disappointed when they did not.\n\nA similar pattern of research progress was seen in computer Go, only delayed by a further 20 years. Enormous initial efforts went into avoiding search by taking advantage of human knowledge, or of the special features of the game, but all those efforts proved irrelevant, or worse, once search was applied effectively at scale. Also important was the use of learning by self play to learn a value function (as it was in many other games and even in chess, although learning did not play a big role in the 1997 program that first beat a world champion). Learning by self play, and learning in general, is like search in that it enables massive computation to be brought to bear. Search and learning are the two most important classes of techniques for utilizing massive amounts of computation in AI research. In computer Go, as in computer chess, researchers' initial effort was directed towards utilizing human understanding (so that less search was needed) and only much later was much greater success had by embracing search and learning.\n\nIn speech recognition, there was an early competition, sponsored by DARPA, in the 1970s. Entrants included a host of special methods that took advantage of human knowledge---knowledge of words, of phonemes, of the human vocal tract, etc. On the other side were newer methods that were more statistical in nature and did much more computation, based on hidden Markov models (HMMs). Again, the statistical methods won out over the human-knowledge-based methods. This led to a major change in all of natural language processing, gradually over decades, where statistics and computation came to dominate the field. 
The recent rise of deep learning in speech recognition is the most recent step in this consistent direction. Deep learning methods rely even less on human knowledge, and use even more computation, together with learning on huge training sets, to produce dramatically better speech recognition systems. As in the games, researchers always tried to make systems that worked the way the researchers thought their own minds worked---they tried to put that knowledge in their systems---but it proved ultimately counterproductive, and a colossal waste of researcher's time, when, through Moore's law, massive computation became available and a means was found to put it to good use.\n\nIn computer vision, there has been a similar pattern. Early methods conceived of vision as searching for edges, or generalized cylinders, or in terms of SIFT features. But today all this is discarded. Modern deep-learning neural networks use only the notions of convolution and certain kinds of invariances, and perform much better.\n\nThis is a big lesson. As a field, we still have not thoroughly learned it, as we are continuing to make the same kind of mistakes. To see this, and to effectively resist it, we have to understand the appeal of these mistakes. We have to learn the bitter lesson that building in how we think we think does not work in the long run. The bitter lesson is based on the historical observations that 1) AI researchers have often tried to build knowledge into their agents, 2) this always helps in the short term, and is personally satisfying to the researcher, but 3) in the long run it plateaus and even inhibits further progress, and 4) breakthrough progress eventually arrives by an opposing approach based on scaling computation by search and learning. 
The eventual success is tinged with bitterness, and often incompletely digested, because it is success over a favored, human-centric approach.\n\nOne thing that should be learned from the bitter lesson is the great power of general purpose methods, of methods that continue to scale with increased computation even as the available computation becomes very great. The two methods that seem to scale arbitrarily in this way are search and learning.\n\nThe second general point to be learned from the bitter lesson is that the actual contents of minds are tremendously, irredeemably complex; we should stop trying to find simple ways to think about the contents of minds, such as simple ways to think about space, objects, multiple agents, or symmetries. All these are part of the arbitrary, intrinsically-complex, outside world. They are not what should be built in, as their complexity is endless; instead we should build in only the meta-methods that can find and capture this arbitrary complexity. Essential to these methods is that they can find good approximations, but the search for them should be by our methods, not by us. We want AI agents that can discover like we can, not which contain what we have discovered. Building in our discoveries only makes it harder to see how the discovering process can be done.\n",
3
- "qwen_inference_time": 20.516680479049683,
4
- "rwkv_inference_time": 31.14354944229126,
5
  "qwen_compression_rate": 48.14428559434192,
6
  "rwkv_compression_rate": 47.62502588510778
7
  }
 
1
  {
2
  "example_text": "The Bitter Lesson\nRich Sutton\nMarch 13, 2019\nThe biggest lesson that can be read from 70 years of AI research is that general methods that leverage computation are ultimately the most effective, and by a large margin. The ultimate reason for this is Moore's law, or rather its generalization of continued exponentially falling cost per unit of computation. Most AI research has been conducted as if the computation available to the agent were constant (in which case leveraging human knowledge would be one of the only ways to improve performance) but, over a slightly longer time than a typical research project, massively more computation inevitably becomes available. Seeking an improvement that makes a difference in the shorter term, researchers seek to leverage their human knowledge of the domain, but the only thing that matters in the long run is the leveraging of computation. These two need not run counter to each other, but in practice they tend to. Time spent on one is time not spent on the other. There are psychological commitments to investment in one approach or the other. And the human-knowledge approach tends to complicate methods in ways that make them less suited to taking advantage of general methods leveraging computation. There were many examples of AI researchers' belated learning of this bitter lesson, and it is instructive to review some of the most prominent.\n\nIn computer chess, the methods that defeated the world champion, Kasparov, in 1997, were based on massive, deep search. At the time, this was looked upon with dismay by the majority of computer-chess researchers who had pursued methods that leveraged human understanding of the special structure of chess. When a simpler, search-based approach with special hardware and software proved vastly more effective, these human-knowledge-based chess researchers were not good losers. 
They said that ``brute force\" search may have won this time, but it was not a general strategy, and anyway it was not how people played chess. These researchers wanted methods based on human input to win and were disappointed when they did not.\n\nA similar pattern of research progress was seen in computer Go, only delayed by a further 20 years. Enormous initial efforts went into avoiding search by taking advantage of human knowledge, or of the special features of the game, but all those efforts proved irrelevant, or worse, once search was applied effectively at scale. Also important was the use of learning by self play to learn a value function (as it was in many other games and even in chess, although learning did not play a big role in the 1997 program that first beat a world champion). Learning by self play, and learning in general, is like search in that it enables massive computation to be brought to bear. Search and learning are the two most important classes of techniques for utilizing massive amounts of computation in AI research. In computer Go, as in computer chess, researchers' initial effort was directed towards utilizing human understanding (so that less search was needed) and only much later was much greater success had by embracing search and learning.\n\nIn speech recognition, there was an early competition, sponsored by DARPA, in the 1970s. Entrants included a host of special methods that took advantage of human knowledge---knowledge of words, of phonemes, of the human vocal tract, etc. On the other side were newer methods that were more statistical in nature and did much more computation, based on hidden Markov models (HMMs). Again, the statistical methods won out over the human-knowledge-based methods. This led to a major change in all of natural language processing, gradually over decades, where statistics and computation came to dominate the field. 
The recent rise of deep learning in speech recognition is the most recent step in this consistent direction. Deep learning methods rely even less on human knowledge, and use even more computation, together with learning on huge training sets, to produce dramatically better speech recognition systems. As in the games, researchers always tried to make systems that worked the way the researchers thought their own minds worked---they tried to put that knowledge in their systems---but it proved ultimately counterproductive, and a colossal waste of researcher's time, when, through Moore's law, massive computation became available and a means was found to put it to good use.\n\nIn computer vision, there has been a similar pattern. Early methods conceived of vision as searching for edges, or generalized cylinders, or in terms of SIFT features. But today all this is discarded. Modern deep-learning neural networks use only the notions of convolution and certain kinds of invariances, and perform much better.\n\nThis is a big lesson. As a field, we still have not thoroughly learned it, as we are continuing to make the same kind of mistakes. To see this, and to effectively resist it, we have to understand the appeal of these mistakes. We have to learn the bitter lesson that building in how we think we think does not work in the long run. The bitter lesson is based on the historical observations that 1) AI researchers have often tried to build knowledge into their agents, 2) this always helps in the short term, and is personally satisfying to the researcher, but 3) in the long run it plateaus and even inhibits further progress, and 4) breakthrough progress eventually arrives by an opposing approach based on scaling computation by search and learning. 
The eventual success is tinged with bitterness, and often incompletely digested, because it is success over a favored, human-centric approach.\n\nOne thing that should be learned from the bitter lesson is the great power of general purpose methods, of methods that continue to scale with increased computation even as the available computation becomes very great. The two methods that seem to scale arbitrarily in this way are search and learning.\n\nThe second general point to be learned from the bitter lesson is that the actual contents of minds are tremendously, irredeemably complex; we should stop trying to find simple ways to think about the contents of minds, such as simple ways to think about space, objects, multiple agents, or symmetries. All these are part of the arbitrary, intrinsically-complex, outside world. They are not what should be built in, as their complexity is endless; instead we should build in only the meta-methods that can find and capture this arbitrary complexity. Essential to these methods is that they can find good approximations, but the search for them should be by our methods, not by us. We want AI agents that can discover like we can, not which contain what we have discovered. Building in our discoveries only makes it harder to see how the discovering process can be done.\n",
3
+ "qwen_inference_time": 20.116603136062622,
4
+ "rwkv_inference_time": 31.04107141494751,
5
  "qwen_compression_rate": 48.14428559434192,
6
  "rwkv_compression_rate": 47.62502588510778
7
  }
precomputed/example_visualization.html CHANGED
The diff for this file is too large to render.
 
visualization/html_generator.py CHANGED
@@ -166,9 +166,13 @@ def generate_comparison_html(
166
  HTML string with interactive visualization
167
  """
168
 
169
- def decode_token(token_id: int, tokenizer, model_type: str) -> str:
170
- """Decode a single token ID to text using the appropriate tokenizer."""
 
 
171
  def bytes_to_hex_str(byte_values) -> str:
 
 
172
  return "".join([f"\\x{b:02x}" for b in byte_values])
173
 
174
  def get_bytes_converter(tokenizer):
@@ -189,7 +193,7 @@ def generate_comparison_html(
189
  return _token_bytes_converter_cache.get(key)
190
 
191
  if tokenizer is None:
192
- return f"[{token_id}]"
193
  try:
194
  if model_type in ["rwkv", "rwkv7"]:
195
  # RWKV tokenizer provides raw bytes
@@ -197,10 +201,10 @@ def generate_comparison_html(
197
  if token_bytes:
198
  try:
199
  decoded = token_bytes.decode("utf-8")
200
- return decoded if decoded else f"[{token_id}]"
201
  except UnicodeDecodeError:
202
- return bytes_to_hex_str(token_bytes)
203
- return f"[{token_id}]"
204
  else:
205
  # HuggingFace tokenizer: prefer raw bytes when possible
206
  converter = get_bytes_converter(tokenizer)
@@ -213,17 +217,17 @@ def generate_comparison_html(
213
  if token_bytes:
214
  try:
215
  decoded = bytes(token_bytes).decode("utf-8")
216
- return decoded if decoded else f"[{token_id}]"
217
  except UnicodeDecodeError:
218
- return bytes_to_hex_str(token_bytes)
219
 
220
  decoded = tokenizer.decode([token_id])
221
  if decoded and "�" not in decoded:
222
- return decoded
223
- return decoded if decoded else f"[{token_id}]"
224
  except Exception as e:
225
  print(f"Warning: Failed to decode token {token_id} ({model_type}): {e}")
226
- return f"[{token_id}]"
227
 
228
  def build_byte_to_token_map(text: str, tokenizer, model_type: str):
229
  """Build mapping from byte position to token index using the correct tokenizer.
@@ -384,30 +388,30 @@ def generate_comparison_html(
384
  model_b_token_idx = find_token_for_byte(byte_start, model_b_token_ranges)
385
 
386
  # Build token info strings showing all tokens in this byte range
387
- def token_bytes_to_display_text(token_bytes: bytes) -> str:
388
  if token_bytes is None:
389
- return ""
390
  if isinstance(token_bytes, list):
391
  token_bytes = bytes(token_bytes)
392
  if isinstance(token_bytes, str):
393
- return token_bytes
394
  if len(token_bytes) == 0:
395
- return ""
396
  try:
397
- return token_bytes.decode("utf-8")
398
  except UnicodeDecodeError:
399
- return "".join([f"\\x{b:02x}" for b in token_bytes])
400
 
401
  # Model A (RWKV7) - tokens overlapping this byte range
402
  model_a_info = ""
403
  if token["rwkv_tokens"]:
404
- model_a_list = [[tid, token_bytes_to_display_text(tb)] for tid, tb in token["rwkv_tokens"]]
405
  model_a_info = base64.b64encode(json.dumps(model_a_list, ensure_ascii=False).encode("utf-8")).decode("ascii")
406
 
407
  # Model B (Qwen3) - tokens overlapping this byte range
408
  model_b_info = ""
409
  if token["qwen_tokens"]:
410
- model_b_list = [[tid, token_bytes_to_display_text(tb)] for tid, tb in token["qwen_tokens"]]
411
  model_b_info = base64.b64encode(json.dumps(model_b_list, ensure_ascii=False).encode("utf-8")).decode("ascii")
412
 
413
  raw_bytes = list(text_bytes[byte_start:byte_end])
@@ -435,13 +439,13 @@ def generate_comparison_html(
435
  actual_id,
436
  rank,
437
  actual_prob,
438
- [[tid, prob, decode_token(tid, tokenizer_a, model_type_a)] for tid, prob in topk_list],
439
  ]
440
  else:
441
  decoded_pred = [
442
  pred[0],
443
  pred[1],
444
- [[tid, prob, decode_token(tid, tokenizer_a, model_type_a)] for tid, prob in pred[2]],
445
  ]
446
  topk_a_json = base64.b64encode(json.dumps(decoded_pred, ensure_ascii=False).encode("utf-8")).decode("ascii")
447
  except Exception as e:
@@ -457,10 +461,10 @@ def generate_comparison_html(
457
  actual_id,
458
  rank,
459
  actual_prob,
460
- [[tid, prob, decode_token(tid, tokenizer_b, model_type_b)] for tid, prob in topk_list],
461
  ]
462
  else:
463
- decoded_pred = [pred[0], pred[1], [[tid, prob, decode_token(tid, tokenizer_b, model_type_b)] for tid, prob in pred[2]]]
464
  topk_b_json = base64.b64encode(json.dumps(decoded_pred, ensure_ascii=False).encode("utf-8")).decode("ascii")
465
  except Exception as e:
466
  pass
@@ -734,6 +738,12 @@ def generate_comparison_html(
734
  display: inline-block;
735
  max-width: 100%;
736
  }}
 
 
 
 
 
 
737
  #tooltip .topk-prob {{
738
  color: #86efac;
739
  min-width: 45px;
@@ -800,7 +810,30 @@ def generate_comparison_html(
800
  words.forEach(w => w.classList.remove('highlighted'));
801
  }}
802
 
803
- function drawLines(hoveredWord) {{
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
804
  clearLines();
805
 
806
  const wordText = hoveredWord.getAttribute('data-word');
@@ -816,12 +849,13 @@ def generate_comparison_html(
816
 
817
  sameWords.forEach(w => w.classList.add('highlighted'));
818
 
819
- const hoveredRect = hoveredWord.getBoundingClientRect();
 
820
  const hoveredX = hoveredRect.left + hoveredRect.width / 2;
821
  const hoveredY = hoveredRect.top + hoveredRect.height / 2;
822
 
823
  previousWords.forEach(prevWord => {{
824
- const prevRect = prevWord.getBoundingClientRect();
825
  const prevX = prevRect.left + prevRect.width / 2;
826
  const prevY = prevRect.top + prevRect.height / 2;
827
 
@@ -850,7 +884,7 @@ def generate_comparison_html(
850
  }}
851
 
852
  words.forEach(word => {{
853
- word.addEventListener('mouseenter', () => drawLines(word));
854
  word.addEventListener('mouseleave', clearLines);
855
  }});
856
 
@@ -904,6 +938,14 @@ def generate_comparison_html(
904
  return out;
905
  }}
906
 
 
 
 
 
 
 
 
 
907
  function formatTopkColumn(topkBase64, modelName, titleClass) {{
908
  if (!topkBase64) return '<div class="topk-column"><div class="topk-title ' + titleClass + '">' + modelName + '</div><div class="topk-list">N/A</div></div>';
909
  try {{
@@ -921,19 +963,30 @@ def generate_comparison_html(
921
  html += '<div class="topk-title ' + titleClass + '">' + modelName + '</div>';
922
  html += '<div class="topk-list">';
923
  topkList.forEach((item, idx) => {{
924
- const [tokenId, prob, tokenText] = item;
 
 
 
925
  const isHit = tokenId === actualId;
926
  const rankClass = isHit ? 'topk-rank hit' : 'topk-rank';
927
  const rawText = (tokenText !== undefined && tokenText !== null) ? tokenText : '';
928
- const visibleText = escapeControlChars(rawText);
929
- const displayText = (visibleText !== '') ? visibleText : ('[' + tokenId + ']');
930
- const escapedText = displayText
931
- .replace(/&/g, '&amp;')
932
- .replace(/</g, '&lt;')
933
- .replace(/>/g, '&gt;');
 
 
 
 
 
 
 
 
934
  html += '<div class="topk-item">';
935
  html += '<span class="' + rankClass + '">' + (idx + 1) + '.</span>';
936
- html += '<span class="topk-token" title="ID: ' + tokenId + '">' + escapedText + '</span>';
937
  html += '<span class="topk-prob">' + (prob * 100).toFixed(1) + '%</span>';
938
  if (isHit) html += '<span class="topk-hit">✓</span>';
939
  html += '</div>';
@@ -967,15 +1020,24 @@ def generate_comparison_html(
967
  tokenList.forEach((item) => {{
968
  const tokenId = item[0];
969
  const tokenText = item[1];
970
- const visible = escapeControlChars(tokenText || '');
971
- const displayText = (visible !== '') ? visible : '';
972
- const escapedText = displayText
973
- .replace(/&/g, '&amp;')
974
- .replace(/</g, '&lt;')
975
- .replace(/>/g, '&gt;');
 
 
 
 
 
 
 
 
 
976
  html += '<span class="token-chip-group" title="ID: ' + tokenId + '">';
977
  html += '<span class="token-id">[' + tokenId + ']</span>';
978
- html += '<span class="topk-token token-chip">' + escapedText + '</span>';
979
  html += '</span>';
980
  }});
981
  html += '</div></div>';
 
166
  HTML string with interactive visualization
167
  """
168
 
169
+ def decode_token(token_id: int, tokenizer, model_type: str) -> Tuple[str, bool]:
170
+ """Decode a single token ID to text using the appropriate tokenizer.
171
+ Returns (text, is_raw_bytes).
172
+ """
173
  def bytes_to_hex_str(byte_values) -> str:
174
+ if isinstance(byte_values, list):
175
+ byte_values = bytes(byte_values)
176
  return "".join([f"\\x{b:02x}" for b in byte_values])
177
 
178
  def get_bytes_converter(tokenizer):
 
193
  return _token_bytes_converter_cache.get(key)
194
 
195
  if tokenizer is None:
196
+ return f"[{token_id}]", False
197
  try:
198
  if model_type in ["rwkv", "rwkv7"]:
199
  # RWKV tokenizer provides raw bytes
 
201
  if token_bytes:
202
  try:
203
  decoded = token_bytes.decode("utf-8")
204
+ return (decoded if decoded else f"[{token_id}]"), False
205
  except UnicodeDecodeError:
206
+ return bytes_to_hex_str(token_bytes), True
207
+ return f"[{token_id}]", False
208
  else:
209
  # HuggingFace tokenizer: prefer raw bytes when possible
210
  converter = get_bytes_converter(tokenizer)
 
217
  if token_bytes:
218
  try:
219
  decoded = bytes(token_bytes).decode("utf-8")
220
+ return (decoded if decoded else f"[{token_id}]"), False
221
  except UnicodeDecodeError:
222
+ return bytes_to_hex_str(token_bytes), True
223
 
224
  decoded = tokenizer.decode([token_id])
225
  if decoded and "�" not in decoded:
226
+ return decoded, False
227
+ return (decoded if decoded else f"[{token_id}]"), False
228
  except Exception as e:
229
  print(f"Warning: Failed to decode token {token_id} ({model_type}): {e}")
230
+ return f"[{token_id}]", False
231
 
232
  def build_byte_to_token_map(text: str, tokenizer, model_type: str):
233
  """Build mapping from byte position to token index using the correct tokenizer.
 
388
  model_b_token_idx = find_token_for_byte(byte_start, model_b_token_ranges)
389
 
390
  # Build token info strings showing all tokens in this byte range
391
+ def token_bytes_to_display_text(token_bytes: bytes) -> Tuple[str, bool]:
392
  if token_bytes is None:
393
+ return "", False
394
  if isinstance(token_bytes, list):
395
  token_bytes = bytes(token_bytes)
396
  if isinstance(token_bytes, str):
397
+ return token_bytes, False
398
  if len(token_bytes) == 0:
399
+ return "", False
400
  try:
401
+ return token_bytes.decode("utf-8"), False
402
  except UnicodeDecodeError:
403
+ return "".join([f"\\x{b:02x}" for b in token_bytes]), True
404
 
405
  # Model A (RWKV7) - tokens overlapping this byte range
406
  model_a_info = ""
407
  if token["rwkv_tokens"]:
408
+ model_a_list = [[tid, *token_bytes_to_display_text(tb)] for tid, tb in token["rwkv_tokens"]]
409
  model_a_info = base64.b64encode(json.dumps(model_a_list, ensure_ascii=False).encode("utf-8")).decode("ascii")
410
 
411
  # Model B (Qwen3) - tokens overlapping this byte range
412
  model_b_info = ""
413
  if token["qwen_tokens"]:
414
+ model_b_list = [[tid, *token_bytes_to_display_text(tb)] for tid, tb in token["qwen_tokens"]]
415
  model_b_info = base64.b64encode(json.dumps(model_b_list, ensure_ascii=False).encode("utf-8")).decode("ascii")
416
 
417
  raw_bytes = list(text_bytes[byte_start:byte_end])
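
The hunk above changes `token_bytes_to_display_text` to return a `(text, is_raw)` pair and star-unpacks that pair into each per-token list entry. The following is a minimal standalone sketch of that pattern, not the repository's code: the function name `to_display_text` and the sample token data are made up for illustration, and the real helper lives inside `generate_comparison_html`.

```python
import base64
import json
from typing import List, Tuple, Union


def to_display_text(token_bytes: Union[bytes, List[int], str, None]) -> Tuple[str, bool]:
    """Return (text, is_raw); is_raw is True only when the hex fallback is used."""
    if token_bytes is None:
        return "", False
    if isinstance(token_bytes, str):
        return token_bytes, False
    if isinstance(token_bytes, list):
        token_bytes = bytes(token_bytes)
    if len(token_bytes) == 0:
        return "", False
    try:
        return token_bytes.decode("utf-8"), False
    except UnicodeDecodeError:
        # Bytes that are not valid UTF-8 on their own are shown as \xNN escapes.
        return "".join(f"\\x{b:02x}" for b in token_bytes), True


# Hypothetical (token_id, token_bytes) pairs, not taken from a real tokenizer.
sample_tokens = [(101, b"hello"), (202, [0xE4, 0xBD]), (303, b"")]

# Star-unpacking turns each entry into [token_id, text, is_raw].
entries = [[tid, *to_display_text(tb)] for tid, tb in sample_tokens]
payload = base64.b64encode(
    json.dumps(entries, ensure_ascii=False).encode("utf-8")
).decode("ascii")
```

Carrying the flag next to the text lets the consumer tell a hex dump of undecodable bytes apart from ordinary text that merely looks like an escape sequence.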
 
439
  actual_id,
440
  rank,
441
  actual_prob,
442
+ [[tid, prob, *decode_token(tid, tokenizer_a, model_type_a)] for tid, prob in topk_list],
443
  ]
444
  else:
445
  decoded_pred = [
446
  pred[0],
447
  pred[1],
448
+ [[tid, prob, *decode_token(tid, tokenizer_a, model_type_a)] for tid, prob in pred[2]],
449
  ]
450
  topk_a_json = base64.b64encode(json.dumps(decoded_pred, ensure_ascii=False).encode("utf-8")).decode("ascii")
451
  except Exception as e:
 
461
  actual_id,
462
  rank,
463
  actual_prob,
464
+ [[tid, prob, *decode_token(tid, tokenizer_b, model_type_b)] for tid, prob in topk_list],
465
  ]
466
  else:
467
+ decoded_pred = [pred[0], pred[1], [[tid, prob, *decode_token(tid, tokenizer_b, model_type_b)] for tid, prob in pred[2]]]
468
  topk_b_json = base64.b64encode(json.dumps(decoded_pred, ensure_ascii=False).encode("utf-8")).decode("ascii")
469
  except Exception as e:
470
  pass
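
After this change each top-k entry is `[tid, prob, text, is_raw]` instead of `[tid, prob, text]`, and the JavaScript consumer reads the fourth element only when it is present (`item.length > 3`). The sketch below mirrors that round trip in Python to show the backward-compatible read. The helper names `encode_topk` and `read_topk_item` and the sample prediction are hypothetical.

```python
import base64
import json
from typing import Any, List, Tuple


def encode_topk(pred: List[Any]) -> str:
    """Serialize a prediction the same way the diff does: JSON, UTF-8, base64."""
    return base64.b64encode(
        json.dumps(pred, ensure_ascii=False).encode("utf-8")
    ).decode("ascii")


def read_topk_item(item: List[Any]) -> Tuple[int, float, str, bool]:
    """Read one top-k entry, treating a missing fourth element as is_raw=False."""
    token_id, prob, text = item[0], item[1], item[2]
    is_raw = item[3] if len(item) > 3 else False
    return token_id, prob, text, is_raw


# Hypothetical prediction: [actual_id, rank, actual_prob, topk_entries].
# The last entry uses the old three-element shape on purpose.
pred = [42, 1, 0.61, [[42, 0.61, " the", False], [7, 0.2, "\\xe4", True], [9, 0.05, "a"]]]

decoded = json.loads(base64.b64decode(encode_topk(pred)).decode("utf-8"))
items = [read_topk_item(entry) for entry in decoded[3]]
```

Reading the flag defensively means payloads produced before this commit, such as an older precomputed visualization, should still render, just without raw-byte highlighting.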
 
738
  display: inline-block;
739
  max-width: 100%;
740
  }}
741
+ #tooltip .esc-control {{
742
+ color: #fbbf24;
743
+ }}
744
+ #tooltip .esc-raw {{
745
+ color: #fb7185;
746
+ }}
747
  #tooltip .topk-prob {{
748
  color: #86efac;
749
  min-width: 45px;
 
810
  words.forEach(w => w.classList.remove('highlighted'));
811
  }}
812
 
813
+ function pickRectByY(rects, targetY) {{
814
+ if (!rects || rects.length === 0) return null;
815
+ let best = rects[0];
816
+ let bestDist = Infinity;
817
+ rects.forEach(r => {{
818
+ const cy = r.top + r.height / 2;
819
+ const dist = Math.abs(cy - targetY);
820
+ if (dist < bestDist) {{
821
+ best = r;
822
+ bestDist = dist;
823
+ }}
824
+ }});
825
+ return best;
826
+ }}
827
+
828
+ function getAnchorRect(element, targetY) {{
829
+ const rects = Array.from(element.getClientRects());
830
+ if (rects.length === 0) return element.getBoundingClientRect();
831
+ if (rects.length === 1) return rects[0];
832
+ const picked = pickRectByY(rects, targetY);
833
+ return picked || rects[0];
834
+ }}
835
+
836
+ function drawLines(hoveredWord, evt) {{
837
  clearLines();
838
 
839
  const wordText = hoveredWord.getAttribute('data-word');
 
849
 
850
  sameWords.forEach(w => w.classList.add('highlighted'));
851
 
852
+ const targetY = evt ? evt.clientY : (hoveredWord.getBoundingClientRect().top + hoveredWord.getBoundingClientRect().height / 2);
853
+ const hoveredRect = getAnchorRect(hoveredWord, targetY);
854
  const hoveredX = hoveredRect.left + hoveredRect.width / 2;
855
  const hoveredY = hoveredRect.top + hoveredRect.height / 2;
856
 
857
  previousWords.forEach(prevWord => {{
858
+ const prevRect = getAnchorRect(prevWord, hoveredY);
859
  const prevX = prevRect.left + prevRect.width / 2;
860
  const prevY = prevRect.top + prevRect.height / 2;
861
 
 
884
  }}
885
 
886
  words.forEach(word => {{
887
+ word.addEventListener('mouseenter', (e) => drawLines(word, e));
888
  word.addEventListener('mouseleave', clearLines);
889
  }});
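
The `pickRectByY` / `getAnchorRect` helpers added above address inline elements that wrap across lines: `getClientRects()` returns one rectangle per line fragment, while `getBoundingClientRect()` returns their union, whose center may not lie on the word at all. The selection rule is sketched below in Python for clarity. The `Rect` dataclass and the sample coordinates are made up for illustration; the real code operates on browser `DOMRect` objects.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Rect:
    left: float
    top: float
    width: float
    height: float


def pick_rect_by_y(rects: Sequence[Rect], target_y: float) -> Optional[Rect]:
    """Pick the rect whose vertical center is closest to target_y (first wins on ties)."""
    if not rects:
        return None
    return min(rects, key=lambda r: abs((r.top + r.height / 2) - target_y))


# Hypothetical word wrapped over two lines: the end of one line, the start of the next.
fragments = [Rect(left=300, top=100, width=40, height=20),
             Rect(left=0, top=120, width=30, height=20)]

# A pointer at y=128 is over the second fragment, so the line anchors there.
anchor = pick_rect_by_y(fragments, target_y=128)
anchor_x = anchor.left + anchor.width / 2
anchor_y = anchor.top + anchor.height / 2
```

In the diff the hovered word is anchored at the pointer's `clientY`, and each previous occurrence is anchored at the fragment nearest the hovered anchor's Y.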
890
 
 
938
  return out;
939
  }}
940
 
941
+ function renderEscapedWithControlColor(text) {{
942
+ const escaped = (text || '')
943
+ .replace(/&/g, '&amp;')
944
+ .replace(/</g, '&lt;')
945
+ .replace(/>/g, '&gt;');
946
+ return escaped.replace(/\\\\(x[0-9a-fA-F]{2}|[nrt])/g, '<span class="esc-control">\\\\$1</span>');
947
+ }}
948
+
949
  function formatTopkColumn(topkBase64, modelName, titleClass) {{
950
  if (!topkBase64) return '<div class="topk-column"><div class="topk-title ' + titleClass + '">' + modelName + '</div><div class="topk-list">N/A</div></div>';
951
  try {{
 
963
  html += '<div class="topk-title ' + titleClass + '">' + modelName + '</div>';
964
  html += '<div class="topk-list">';
965
  topkList.forEach((item, idx) => {{
966
+ const tokenId = item[0];
967
+ const prob = item[1];
968
+ const tokenText = item[2];
969
+ const isRaw = item.length > 3 ? item[3] : false;
970
  const isHit = tokenId === actualId;
971
  const rankClass = isHit ? 'topk-rank hit' : 'topk-rank';
972
  const rawText = (tokenText !== undefined && tokenText !== null) ? tokenText : '';
973
+ let displayText = '';
974
+ let htmlText = '';
975
+ if (isRaw) {{
976
+ displayText = (rawText !== '') ? rawText : ('[' + tokenId + ']');
977
+ const escapedText = displayText
978
+ .replace(/&/g, '&amp;')
979
+ .replace(/</g, '&lt;')
980
+ .replace(/>/g, '&gt;');
981
+ htmlText = '<span class="esc-raw">' + escapedText + '</span>';
982
+ }} else {{
983
+ const visibleText = escapeControlChars(rawText);
984
+ displayText = (visibleText !== '') ? visibleText : ('[' + tokenId + ']');
985
+ htmlText = renderEscapedWithControlColor(displayText);
986
+ }}
987
  html += '<div class="topk-item">';
988
  html += '<span class="' + rankClass + '">' + (idx + 1) + '.</span>';
989
+ html += '<span class="topk-token" title="ID: ' + tokenId + '">' + htmlText + '</span>';
990
  html += '<span class="topk-prob">' + (prob * 100).toFixed(1) + '%</span>';
991
  if (isHit) html += '<span class="topk-hit">✓</span>';
992
  html += '</div>';
 
1020
  tokenList.forEach((item) => {{
1021
  const tokenId = item[0];
1022
  const tokenText = item[1];
1023
+ const isRaw = item.length > 2 ? item[2] : false;
1024
+ let displayText = '';
1025
+ let htmlText = '';
1026
+ if (isRaw) {{
1027
+ displayText = tokenText || '';
1028
+ const escapedText = displayText
1029
+ .replace(/&/g, '&amp;')
1030
+ .replace(/</g, '&lt;')
1031
+ .replace(/>/g, '&gt;');
1032
+ htmlText = '<span class="esc-raw">' + escapedText + '</span>';
1033
+ }} else {{
1034
+ const visible = escapeControlChars(tokenText || '');
1035
+ displayText = (visible !== '') ? visible : '';
1036
+ htmlText = renderEscapedWithControlColor(displayText);
1037
+ }}
1038
  html += '<span class="token-chip-group" title="ID: ' + tokenId + '">';
1039
  html += '<span class="token-id">[' + tokenId + ']</span>';
1040
+ html += '<span class="topk-token token-chip">' + htmlText + '</span>';
1041
  html += '</span>';
1042
  }});
1043
  html += '</div></div>';
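
`renderEscapedWithControlColor` HTML-escapes the text first and only then wraps visible escape sequences (`\n`, `\r`, `\t`, `\xNN`) in a `<span class="esc-control">`, so the inserted markup is never escaped itself. Assuming the JavaScript sits inside a non-raw Python f-string (the doubled braces suggest this), the four backslashes in the source reach the browser as two, which a JavaScript regex reads as one literal backslash. The same two-step logic is sketched below in Python; the function name is a transliteration, not an API of the project.

```python
import html
import re

# One literal backslash followed by xNN (two hex digits) or one of n, r, t.
_ESCAPE_SEQ = re.compile(r"\\(x[0-9a-fA-F]{2}|[nrt])")


def render_escaped_with_control_color(text: str) -> str:
    """HTML-escape text, then wrap visible escape sequences in a styled span."""
    # quote=False escapes only &, < and >, matching the three replace() calls in the diff.
    escaped = html.escape(text or "", quote=False)
    return _ESCAPE_SEQ.sub(
        lambda m: '<span class="esc-control">' + m.group(0) + "</span>",
        escaped,
    )


# Input holds a literal backslash followed by "n", as produced by control-char escaping.
rendered = render_escaped_with_control_color("a\\n<b>")
```

The order matters: wrapping before escaping would turn the span's own angle brackets into `&lt;` and `&gt;`.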