ModerRAS committed
Commit 359ff82 · 1 Parent(s): be6a29a

Train virtual-shard anime parser

.gitignore CHANGED
@@ -18,3 +18,5 @@ data/**/*.db
18
  data/**/*.sqlite
19
  data/generated/
20
  reports/generated/
 
 
 
18
  data/**/*.sqlite
19
  data/generated/
20
  reports/generated/
21
+ target/
22
+ **/target/
README.md CHANGED
@@ -26,7 +26,7 @@ model-index:
26
  metrics:
27
  - type: accuracy
28
  name: Fixed parser model-only full-match accuracy
29
- value: 0.9615
30
  - type: accuracy
31
  name: Fixed parser thin-runtime full-match accuracy
32
  value: 1.0
@@ -140,17 +140,17 @@ Current published checkpoint:
140
 
141
  | Metric / 指标 | Value / 数值 |
142
  | --- | --- |
143
- | Fixed regression, model-only / 固定回归,纯模型聚合 | 25/26 full match = `96.15%` |
144
  | Fixed regression, default thin runtime / 固定回归,默认薄层运行时 | 26/26 full match = `100%` |
145
- | Held-out parse, model-only / held-out 解析,纯模型聚合 | 1947/2048 full match = `95.07%` |
146
- | Held-out parse, default thin runtime / held-out 解析,默认薄层运行时 | 1966/2048 full match = `96.00%` |
147
- | Token/entity eval / token/entity 评估 | F1 `0.9847`, token accuracy `0.9962` |
148
- | ONNX parity / ONNX 误差 | max abs diff `1.9073e-05` |
149
- | CPU thin-runtime latency / CPU 薄层运行时延迟 | ONNX avg `11.61 ms`, P95 `13.52 ms` |
150
 
151
- **中文**:当前发布模型是“两阶段训练”产物:先在 `datasets/AnimeName/dmhy_weak_char.jsonl` 上做 10 epoch CUDA 全量重训,训练时动态生成不完整文件名、BIO 块重排/子集和 special 片段样本;再做 thin hard-case focus 微调。细节见 `reports/training_lineage.json`。README 主指标以 `model-only` 和默认薄层 `normalized-only` 为准;旧版结构规则辅助层已移除,不再作为运行时或质量对照。
152
 
153
- **English**: The published checkpoint was trained in two stages: a 10-epoch CUDA fine-tune on `datasets/AnimeName/dmhy_weak_char.jsonl` with dynamic in-memory augmentation for incomplete filenames, BIO-block subsets/permutations, and special-code fragments, followed by a thin hard-case focus fine-tune. See `reports/training_lineage.json` for details. README quality numbers prioritize `model-only` and the default thin `normalized-only` runtime; structural filename assists have been removed from the runtime and quality reports.
154
 
155
  Run regression:
156
 
@@ -177,8 +177,8 @@ decoding, entity aggregation, and light string/number normalization:
177
 
178
  | Backend / 后端 | Load ms / 加载 ms | Avg ms / 平均 ms | P50 ms | P95 ms | P99 ms | files/s |
179
  | --- | ---: | ---: | ---: | ---: | ---: | ---: |
180
- | PyTorch | 44.84 | 16.42 | 14.77 | 26.31 | 32.62 | 60.9 |
181
- | ONNX Runtime | 40.70 | 11.61 | 11.43 | 13.52 | 15.20 | 86.2 |
182
 
183
  **中文**:这是完整薄层 parser 的端到端延迟,不是只测模型 forward。移动端实现应复用 ONNX session,并保持 tokenizer/BIO/薄规范化逻辑一致。
184
 
@@ -188,33 +188,58 @@ decoding, entity aggregation, and light string/number normalization:
188
 
189
  Training uses the dataset submodule at `datasets/AnimeName`.
190
 
191
- Recommended optimized character-token run on the Windows RTX 5070 Ti worker:
192
 
193
  ```powershell
194
  .\.venv\Scripts\python.exe -m anifilebert.train --tokenizer char `
195
  --data-file datasets/AnimeName/dmhy_weak_char.jsonl `
196
- --vocab-file vocab.json `
197
- --save-dir checkpoints/dmhy-char-aug-fragments-optimized-10epoch `
 
198
  --init-model-dir . `
199
  --epochs 10 `
200
  --batch-size 1792 `
201
- --learning-rate 0.00002 `
202
- --warmup-steps 500 `
203
  --max-seq-length 128 `
204
  --train-split 0.98 `
205
- --num-workers 0 `
206
- --checkpoint-steps 1000 `
 
 
207
  --save-total-limit 3 `
208
  --parse-eval-limit 2048 `
209
  --case-eval-file data/parser_regression_cases.json `
210
- --augment-partial-samples 200000 `
211
- --augment-permutation-samples 400000 `
212
- --augment-special-samples 80000 `
213
  --bf16 `
214
  --no-periodic-eval `
215
- --perf-log-steps 200 `
 
216
  --seed 105 `
217
- --experiment-name dmhy-char-aug-fragments-optimized-10epoch
218
  ```
219
 
220
  `python -m anifilebert.train` writes:
 
26
  metrics:
27
  - type: accuracy
28
  name: Fixed parser model-only full-match accuracy
29
+ value: 0.9231
30
  - type: accuracy
31
  name: Fixed parser thin-runtime full-match accuracy
32
  value: 1.0
 
140
 
141
  | Metric / 指标 | Value / 数值 |
142
  | --- | --- |
143
+ | Fixed regression, model-only / 固定回归,纯模型聚合 | 24/26 full match = `92.31%` |
144
  | Fixed regression, default thin runtime / 固定回归,默认薄层运行时 | 26/26 full match = `100%` |
145
+ | Held-out parse, model-only / held-out 解析,纯模型聚合 | 1962/2048 full match = `95.80%` |
146
+ | Held-out parse, default thin runtime / held-out 解析,默认薄层运行时 | 1988/2048 full match = `97.07%` |
147
+ | Token/entity eval / token/entity 评估 | F1 `0.9844`, token accuracy `0.9961` |
148
+ | ONNX parity / ONNX 误差 | max abs diff `4.0054e-05` |
149
+ | CPU thin-runtime latency / CPU 薄层运行时延迟 | ONNX avg `12.04 ms`, P95 `13.81 ms` |
150
 
151
+ **中文**:当前发布模型是“两阶段训练”产物:先用 Rust 预生成 `20,439,848` 行虚拟 BIO shard,在 RTX 5070 Ti 上完整训练 10 epoch / `114,070` optimizer steps;再做 1 epoch light hard-case focus 微调。细节见 `reports/training_lineage.json`。README 主指标以 `model-only` 和默认薄层 `normalized-only` 为准;旧版结构规则辅助层已移除,不再作为运行时或质量对照。
152
 
153
+ **English**: The published checkpoint was trained in two stages: a 10-epoch CUDA fine-tune over `20,439,848` Rust-generated virtual BIO shard rows (`114,070` optimizer steps) on the RTX 5070 Ti, followed by a 1-epoch light hard-case focus fine-tune. See `reports/training_lineage.json` for details. README quality numbers prioritize `model-only` and the default thin `normalized-only` runtime; structural filename assists have been removed from the runtime and quality reports.
154
 
155
  Run regression:
156
 
 
177
 
178
  | Backend / 后端 | Load ms / 加载 ms | Avg ms / 平均 ms | P50 ms | P95 ms | P99 ms | files/s |
179
  | --- | ---: | ---: | ---: | ---: | ---: | ---: |
180
+ | PyTorch | 46.35 | 15.36 | 14.25 | 22.27 | 29.75 | 65.1 |
181
+ | ONNX Runtime | 50.92 | 12.04 | 11.90 | 13.81 | 15.38 | 83.1 |
182
 
183
  **中文**:这是完整薄层 parser 的端到端延迟,不是只测模型 forward。移动端实现应复用 ONNX session,并保持 tokenizer/BIO/薄规范化逻辑一致。
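The `files/s` column in the latency table appears to be the reciprocal of the average end-to-end latency for a single sequential stream. The sketch below reproduces the two current rows under that assumption; `files_per_second` is a hypothetical helper, not a function from the repository.

```python
def files_per_second(avg_latency_ms: float) -> float:
    """Throughput of one sequential stream, given average per-file latency in ms."""
    return 1000.0 / avg_latency_ms


# Average latencies taken from the table above.
for backend, avg_ms in [("PyTorch", 15.36), ("ONNX Runtime", 12.04)]:
    print(f"{backend}: {files_per_second(avg_ms):.1f} files/s")
```

Batching or parallel sessions would change the relationship, so this only holds for the one-file-at-a-time benchmark described here.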
184
 
 
188
 
189
  Training uses the dataset submodule at `datasets/AnimeName`.
190
 
191
+ Recommended virtual-shard character-token run on the Windows RTX 5070 Ti worker:
192
 
193
  ```powershell
194
+ @'
195
+ import random
196
+ from pathlib import Path
197
+
198
+ source = Path("datasets/AnimeName/dmhy_weak_char.jsonl")
199
+ target = Path("data/generated/virtual_source_train_seed105.jsonl")
200
+ rows = [line for line in source.read_text(encoding="utf-8").splitlines() if line]
201
+ random.Random(105).shuffle(rows)
202
+ target.parent.mkdir(parents=True, exist_ok=True)
203
+ target.write_text("\n".join(rows[: int(len(rows) * 0.98)]) + "\n", encoding="utf-8")
204
+ '@ | .\.venv\Scripts\python.exe -
205
+
206
+ cargo build --release --manifest-path tools/virtual_dataset_generator/Cargo.toml
207
+ .\tools\virtual_dataset_generator\target\release\anifilebert-virtual-dataset-generator.exe `
208
+ --input data/generated/virtual_source_train_seed105.jsonl `
209
+ --vocab-file datasets/AnimeName/vocab.char.json `
210
+ --output-dir data/generated/virtual_char_sps32_seed105 `
211
+ --max-length 128 `
212
+ --samples-per-source 32 `
213
+ --seed 105 `
214
+ --threads 20 `
215
+ --separator-mode per-gap `
216
+ --bracket-mode per-part
217
+
218
  .\.venv\Scripts\python.exe -m anifilebert.train --tokenizer char `
219
  --data-file datasets/AnimeName/dmhy_weak_char.jsonl `
220
+ --vocab-file datasets/AnimeName/vocab.char.json `
221
+ --virtual-dataset-dir data/generated/virtual_char_sps32_seed105 `
222
+ --save-dir checkpoints/dmhy-char-virtual-sps32-10epoch-lr1e5 `
223
  --init-model-dir . `
224
  --epochs 10 `
225
  --batch-size 1792 `
226
+ --learning-rate 0.00001 `
227
+ --warmup-steps 2000 `
228
  --max-seq-length 128 `
229
  --train-split 0.98 `
230
+ --num-workers 4 `
231
+ --prefetch-factor 4 `
232
+ --persistent-workers `
233
+ --checkpoint-steps 5000 `
234
  --save-total-limit 3 `
235
  --parse-eval-limit 2048 `
236
  --case-eval-file data/parser_regression_cases.json `
 
 
 
237
  --bf16 `
238
  --no-periodic-eval `
239
+ --perf-log-steps 1000 `
240
+ --perf-sample-interval 0.5 `
241
  --seed 105 `
242
+ --experiment-name dmhy-char-virtual-sps32-10epoch-lr1e5
243
  ```
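The inline Python step at the top of the command block writes a seeded 98% slice of the source rows for the Rust generator. A minimal sketch of the same idea as a reusable function follows. `seeded_train_split` is a hypothetical name, and this diff does not show whether the trainer's own `--train-split 0.98 --seed 105` split shuffles identically, so treat the correspondence as an assumption to verify.

```python
import random
from typing import List, Tuple


def seeded_train_split(
    rows: List[str], train_fraction: float = 0.98, seed: int = 105
) -> Tuple[List[str], List[str]]:
    """Shuffle a copy of the rows with a fixed seed and cut it into two parts."""
    shuffled = [row for row in rows if row]  # drop blank lines, as the inline script does
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Toy JSONL rows stand in for dmhy_weak_char.jsonl.
rows = [f'{{"id": {i}}}' for i in range(1000)]
train, held_out = seeded_train_split(rows)
print(len(train), len(held_out))
```

Because the shuffle uses its own `random.Random(seed)` instance, repeated calls with the same input yield the same split regardless of global random state.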
244
 
245
  `python -m anifilebert.train` writes:
anifilebert/inference.py CHANGED
@@ -41,6 +41,15 @@ STANDALONE_SPECIAL_RE = re.compile(
41
  re.I,
42
  )
43
 
44
 
45
  def extract_season_number(text: str) -> Optional[int]:
46
  """
@@ -132,6 +141,24 @@ def normalize_standalone_special(text: str) -> Optional[str]:
132
  return special if STANDALONE_SPECIAL_RE.fullmatch(special) else None
133
 
134
 
135
  def display_token(token: str) -> str:
136
  """Make whitespace tokens visible in debug output."""
137
  if token == " ":
@@ -324,6 +351,19 @@ def postprocess(
324
  result["source"] = choose_thin_source(grouped_entities.get("SOURCE", []))
325
 
326
  whole_text = join_entity_tokens(tokens, tokenizer)
327
  standalone_special = normalize_standalone_special(whole_text)
328
  if standalone_special is not None:
329
  result.update(
 
41
  re.I,
42
  )
43
 
44
+ BRACKETED_SEARCH_SPECIAL_RE = re.compile(
45
+ r"[\[【((]\s*((?:檢索|检索|検索)\s*[::][^\]】))]+?)\s*[\]】))]"
46
+ )
47
+
48
+ NEW_SHOW_BRACKET_TITLE_RE = re.compile(
49
+ r"[★☆][^★☆\[\]【】()()]{0,24}(?:新番|月番)[^★☆\[\]【】()()]{0,24}[★☆]"
50
+ r"\s*[\[【((]\s*([^\]】))]+?)\s*[\]】))]"
51
+ )
52
+
53
 
54
  def extract_season_number(text: str) -> Optional[int]:
55
  """
 
141
  return special if STANDALONE_SPECIAL_RE.fullmatch(special) else None
142
 
143
 
144
+ def extract_bracketed_search_special(text: str) -> Optional[str]:
145
+ """Return bracketed search-note tags such as [檢索:...]."""
146
+ for match in BRACKETED_SEARCH_SPECIAL_RE.finditer(text):
147
+ special = normalize_field_text(match.group(1))
148
+ if special:
149
+ return special
150
+ return None
151
+
152
+
153
+ def extract_new_show_bracket_title(text: str) -> Optional[str]:
154
+ """Return title from release-promo layouts like ★04月新番★[葬送的芙莉莲]."""
155
+ for match in NEW_SHOW_BRACKET_TITLE_RE.finditer(text):
156
+ title = normalize_field_text(match.group(1))
157
+ if title:
158
+ return title
159
+ return None
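The two extractors added above can be exercised in isolation. This is a minimal sketch: the patterns are modelled on the regexes in this diff, accepting both ASCII and full-width brackets and colons is an assumption of the sketch, `first_group` is a hypothetical helper, and `str.strip()` stands in for the repository's `normalize_field_text`.

```python
import re
from typing import Optional

# Bracketed search-note tag, e.g. [檢索:...]; group 1 keeps the keyword and note.
SEARCH_RE = re.compile(
    r"[\[【((]\s*((?:檢索|检索|検索)\s*[::][^\]】))]+?)\s*[\]】))]"
)
# Promo banner such as ★04月新番★ followed by a bracketed title; group 1 is the title.
NEW_SHOW_RE = re.compile(
    r"[★☆][^★☆\[\]【】()()]{0,24}(?:新番|月番)[^★☆\[\]【】()()]{0,24}[★☆]"
    r"\s*[\[【((]\s*([^\]】))]+?)\s*[\]】))]"
)


def first_group(pattern: re.Pattern, text: str) -> Optional[str]:
    """Return the first non-empty capture of group 1, or None."""
    for match in pattern.finditer(text):
        value = match.group(1).strip()  # stand-in for normalize_field_text
        if value:
            return value
    return None


name = "★04月新番★[葬送的芙莉莲][01][檢索:Sousou no Frieren].mp4"
print(first_group(NEW_SHOW_RE, name))
print(first_group(SEARCH_RE, name))
```

In `postprocess`, the extracted title only replaces the model's title when that title is missing or still looks like the promo banner, so the regex acts as a narrow repair rather than a general override.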
160
+
161
+
162
  def display_token(token: str) -> str:
163
  """Make whitespace tokens visible in debug output."""
164
  if token == " ":
 
351
  result["source"] = choose_thin_source(grouped_entities.get("SOURCE", []))
352
 
353
  whole_text = join_entity_tokens(tokens, tokenizer)
354
+ new_show_title = extract_new_show_bracket_title(whole_text)
355
+ if new_show_title is not None and (
356
+ result["title"] is None
357
+ or result["title"].startswith(("★", "☆"))
358
+ or "新番" in result["title"]
359
+ or "月番" in result["title"]
360
+ ):
361
+ result["title"] = new_show_title
362
+
363
+ search_special = extract_bracketed_search_special(whole_text)
364
+ if search_special is not None:
365
+ result["special"] = search_special
366
+
367
  standalone_special = normalize_standalone_special(whole_text)
368
  if standalone_special is not None:
369
  result.update(
anifilebert/train.py CHANGED
@@ -22,6 +22,7 @@ from typing import Dict, List, Optional, Sequence
22
 
23
  import numpy as np
24
  import torch
 
25
  from transformers import (
26
  Trainer,
27
  TrainingArguments,
@@ -35,6 +36,7 @@ from .tokenizer import AnimeTokenizer, create_tokenizer, load_tokenizer
35
  from .model import create_model, print_model_summary, count_parameters
36
  from .dataset import AnimeItemsDataset, EncodedAnimeDataset, labels_for_tokenizer
37
  from .inference import parse_filename, postprocess
 
38
 
39
 
40
  def compute_metrics(p):
@@ -76,6 +78,8 @@ def parse_args() -> argparse.Namespace:
76
  help="Additional training JSONL file. Can be passed multiple times.")
77
  parser.add_argument("--extra-data-repeat", type=int, default=1,
78
  help="Repeat each extra dataset this many times after loading")
 
 
79
  parser.add_argument("--vocab-file", default=None,
80
  help="Tokenizer vocab JSON. Defaults to data/vocab.json or data/vocab.char.json")
81
  parser.add_argument("--save-dir", default=None, help="Checkpoint output directory")
@@ -853,10 +857,27 @@ class FastTokenClassificationCollator:
853
  """Stack already padded token-classification tensors without extra work."""
854
 
855
  def __call__(self, features: List[Dict[str, torch.Tensor]]) -> Dict[str, torch.Tensor]:
856
- return {
857
  key: torch.stack([feature[key] for feature in features])
858
  for key in features[0].keys()
859
  }
860
 
861
 
862
  def augment_training_data(
@@ -1264,7 +1285,28 @@ def main():
1264
  eval_data = all_data[split_idx:]
1265
 
1266
  encode_started_at = time.perf_counter()
1267
- if args.lazy_dataset:
1268
  train_dataset = AnimeItemsDataset(
1269
  data=train_data,
1270
  tokenizer=tokenizer,
@@ -1381,7 +1423,7 @@ def main():
1381
  log_steps=args.perf_log_steps,
1382
  sample_interval=args.perf_sample_interval,
1383
  )
1384
- trainer = Trainer(
1385
  model=model,
1386
  args=training_args,
1387
  train_dataset=train_dataset,
@@ -1418,6 +1460,7 @@ def main():
1418
  "data_sources": data_sources,
1419
  "augmentation": augmentation_metadata,
1420
  "dataset_mode": dataset_mode,
 
1421
  "apply_label_repairs": args.apply_label_repairs,
1422
  "keep_raw_dataset": args.keep_raw_dataset,
1423
  "tokenizer_variant": tokenizer_variant,
 
22
 
23
  import numpy as np
24
  import torch
25
+ from torch.utils.data import SequentialSampler
26
  from transformers import (
27
  Trainer,
28
  TrainingArguments,
 
36
  from .model import create_model, print_model_summary, count_parameters
37
  from .dataset import AnimeItemsDataset, EncodedAnimeDataset, labels_for_tokenizer
38
  from .inference import parse_filename, postprocess
39
+ from .virtual_dataset import DatasetRangeView, ShardedEncodedDataset
40
 
41
 
42
  def compute_metrics(p):
 
78
  help="Additional training JSONL file. Can be passed multiple times.")
79
  parser.add_argument("--extra-data-repeat", type=int, default=1,
80
  help="Repeat each extra dataset this many times after loading")
81
+ parser.add_argument("--virtual-dataset-dir", default=None,
82
+ help="Pre-encoded shard directory generated by tools/virtual_dataset_generator")
83
  parser.add_argument("--vocab-file", default=None,
84
  help="Tokenizer vocab JSON. Defaults to data/vocab.json or data/vocab.char.json")
85
  parser.add_argument("--save-dir", default=None, help="Checkpoint output directory")
 
857
  """Stack already padded token-classification tensors without extra work."""
858
 
859
  def __call__(self, features: List[Dict[str, torch.Tensor]]) -> Dict[str, torch.Tensor]:
860
+ batch = {
861
  key: torch.stack([feature[key] for feature in features])
862
  for key in features[0].keys()
863
  }
864
+ if "input_ids" in batch:
865
+ batch["input_ids"] = batch["input_ids"].long()
866
+ if "labels" in batch:
867
+ batch["labels"] = batch["labels"].long()
868
+ if "attention_mask" in batch:
869
+ batch["attention_mask"] = batch["attention_mask"].to(dtype=torch.bool)
870
+ return batch
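The casts in the collator are what allow shards to stay compact on disk and be widened only per batch. A rough worked calculation follows, assuming the three arrays use the 2-, 2- and 1-byte element widths named in the `ShardedEncodedDataset` docstring, and taking the row count and sequence length from the README. `.npy` header overhead is ignored.

```python
ROWS = 20_439_848   # virtual shard rows reported for the current release
MAX_LENGTH = 128    # --max-length passed to the generator

BYTES_ON_DISK = 2 + 2 + 1    # uint16 + int16 + uint8 per token position
BYTES_IN_BATCH = 8 + 8 + 1   # int64 input_ids + int64 labels + bool mask after the collator

tokens = ROWS * MAX_LENGTH
compact_bytes = tokens * BYTES_ON_DISK
widened_bytes = tokens * BYTES_IN_BATCH

print(f"compact: {compact_bytes / 2**30:.1f} GiB")
print(f"widened: {widened_bytes / 2**30:.1f} GiB")
```

Only one batch is ever widened at a time, so the larger figure is never materialized; it just shows what pre-widening the whole dataset would cost.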
871
+
872
+
873
+ class OrderedTrainer(Trainer):
874
+ """Trainer variant that preserves pre-shuffled order for virtual datasets."""
875
+
876
+ def _get_train_sampler(self, train_dataset=None):
877
+ dataset = train_dataset if train_dataset is not None else self.train_dataset
878
+ if getattr(dataset, "preserve_order", False):
879
+ return SequentialSampler(dataset)
880
+ return super()._get_train_sampler(train_dataset)
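Preserving order matters because `ShardedEncodedDataset` caches a single shard at a time. The simulation below is a sketch with made-up shard sizes, and `count_shard_loads` is a hypothetical helper. It counts how often a one-shard cache would have to switch shards under sequential and shuffled index orders.

```python
import bisect
import random
from typing import Iterable, List


def count_shard_loads(starts: List[int], indices: Iterable[int]) -> int:
    """Count how often a one-shard cache must load a different shard."""
    loads = 0
    cached = None
    for idx in indices:
        shard = bisect.bisect_right(starts, idx) - 1
        if shard != cached:
            loads += 1
            cached = shard
    return loads


rows_per_shard, n_shards = 1000, 8  # toy sizes, not the real shard layout
starts = [i * rows_per_shard for i in range(n_shards)]
order = list(range(rows_per_shard * n_shards))

sequential_loads = count_shard_loads(starts, order)
random.Random(0).shuffle(order)
shuffled_loads = count_shard_loads(starts, order)
print(sequential_loads, shuffled_loads)
```

With sequential access each shard is loaded once per pass, whereas a random sampler forces a reload on most steps. This is why randomness is expected to come from the generator's pre-shuffled shards rather than from the sampler.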
881
 
882
 
883
  def augment_training_data(
 
1285
  eval_data = all_data[split_idx:]
1286
 
1287
  encode_started_at = time.perf_counter()
1288
+ if args.virtual_dataset_dir:
1289
+ virtual_dataset = ShardedEncodedDataset(args.virtual_dataset_dir)
1290
+ if virtual_dataset.max_length != config.max_seq_length:
1291
+ raise ValueError(
1292
+ f"Virtual dataset max_length {virtual_dataset.max_length} does not match "
1293
+ f"configured max_seq_length {config.max_seq_length}"
1294
+ )
1295
+ train_dataset = virtual_dataset
1296
+ eval_dataset = EncodedAnimeDataset(
1297
+ data=eval_data,
1298
+ tokenizer=tokenizer,
1299
+ label2id=config.label2id,
1300
+ max_length=config.max_seq_length,
1301
+ device=torch.device("cpu"),
1302
+ apply_label_repairs=args.apply_label_repairs,
1303
+ )
1304
+ dataset_mode = "virtual-sharded"
1305
+ if not args.keep_raw_dataset:
1306
+ train_data = []
1307
+ all_data = []
1308
+ gc.collect()
1309
+ elif args.lazy_dataset:
1310
  train_dataset = AnimeItemsDataset(
1311
  data=train_data,
1312
  tokenizer=tokenizer,
 
1423
  log_steps=args.perf_log_steps,
1424
  sample_interval=args.perf_sample_interval,
1425
  )
1426
+ trainer = OrderedTrainer(
1427
  model=model,
1428
  args=training_args,
1429
  train_dataset=train_dataset,
 
1460
  "data_sources": data_sources,
1461
  "augmentation": augmentation_metadata,
1462
  "dataset_mode": dataset_mode,
1463
+ "virtual_dataset_dir": args.virtual_dataset_dir,
1464
  "apply_label_repairs": args.apply_label_repairs,
1465
  "keep_raw_dataset": args.keep_raw_dataset,
1466
  "tokenizer_variant": tokenizer_variant,
anifilebert/virtual_dataset.py ADDED
@@ -0,0 +1,114 @@
1
+ """Pre-encoded sharded datasets generated by the Rust virtual dataset tool."""
2
+
3
+ from __future__ import annotations
4
+
5
+ import bisect
6
+ import json
7
+ from pathlib import Path
8
+ from typing import Dict, List, Optional
9
+
10
+ import numpy as np
11
+ import torch
12
+ from torch.utils.data import Dataset
13
+
14
+
15
+ class ShardedEncodedDataset(Dataset):
16
+ """Map-style dataset backed by pre-encoded `.npy` shards.
17
+
18
+ The Rust generator writes compact uint16/int16/uint8 arrays. This class loads
19
+ one shard at a time and relies on sequential sampling over pre-shuffled
20
+ shards, so Python does no tokenization or BIO permutation during training.
21
+ """
22
+
23
+ preserve_order = True
24
+
25
+ def __init__(self, dataset_dir: str | Path, manifest_name: str = "manifest.json"):
26
+ self.dataset_dir = Path(dataset_dir)
27
+ self.manifest_path = self.dataset_dir / manifest_name
28
+ self.manifest = json.loads(self.manifest_path.read_text(encoding="utf-8"))
29
+ if self.manifest.get("format") != "anifilebert.virtual_dataset.shards.v1":
30
+ raise ValueError(f"Unsupported virtual dataset manifest: {self.manifest_path}")
31
+
32
+ self.max_length = int(self.manifest["max_length"])
33
+ self.shards: List[Dict] = list(self.manifest.get("shards") or [])
34
+ if not self.shards:
35
+ raise ValueError(f"Virtual dataset has no shards: {self.manifest_path}")
36
+
37
+ self._starts: List[int] = []
38
+ total = 0
39
+ for shard in self.shards:
40
+ self._starts.append(total)
41
+ total += int(shard["rows"])
42
+ self.total_rows = total
43
+
44
+ declared_total = int(self.manifest.get("total_rows", total))
45
+ if declared_total != total:
46
+ raise ValueError(
47
+ f"Virtual dataset row count mismatch: manifest total_rows={declared_total}, "
48
+ f"shard rows={total}"
49
+ )
50
+
51
+ self._cache_index: Optional[int] = None
52
+ self._cache: Optional[Dict[str, np.ndarray]] = None
53
+
54
+ def __len__(self) -> int:
55
+ return self.total_rows
56
+
57
+ def __getitem__(self, idx: int) -> Dict[str, torch.Tensor]:
58
+ if idx < 0:
59
+ idx += self.total_rows
60
+ if idx < 0 or idx >= self.total_rows:
61
+ raise IndexError(idx)
62
+
63
+ shard_idx = bisect.bisect_right(self._starts, idx) - 1
64
+ shard_start = self._starts[shard_idx]
65
+ row_idx = idx - shard_start
66
+ cache = self._load_shard(shard_idx)
67
+ return {
68
+ "input_ids": torch.from_numpy(cache["input_ids"][row_idx]),
69
+ "attention_mask": torch.from_numpy(cache["attention_mask"][row_idx]),
70
+ "labels": torch.from_numpy(cache["labels"][row_idx]),
71
+ }
72
+
73
+ def _load_shard(self, shard_idx: int) -> Dict[str, np.ndarray]:
74
+ if self._cache_index == shard_idx and self._cache is not None:
75
+ return self._cache
76
+
77
+ shard = self.shards[shard_idx]
78
+ cache = {
79
+ "input_ids": np.load(self.dataset_dir / shard["input_ids"], allow_pickle=False),
80
+ "attention_mask": np.load(self.dataset_dir / shard["attention_mask"], allow_pickle=False),
81
+ "labels": np.load(self.dataset_dir / shard["labels"], allow_pickle=False),
82
+ }
83
+ expected_shape = (int(shard["rows"]), self.max_length)
84
+ for key, array in cache.items():
85
+ if array.shape != expected_shape:
86
+ raise ValueError(
87
+ f"Shard {shard_idx} {key} has shape {array.shape}, expected {expected_shape}"
88
+ )
89
+ self._cache_index = shard_idx
90
+ self._cache = cache
91
+ return cache
92
+
93
+
94
+ class DatasetRangeView(Dataset):
95
+ """A contiguous range view over another dataset."""
96
+
97
+ preserve_order = True
98
+
99
+ def __init__(self, dataset: Dataset, start: int, end: int):
100
+ if start < 0 or end < start or end > len(dataset):
101
+ raise ValueError(f"Invalid dataset range [{start}, {end}) for length {len(dataset)}")
102
+ self.dataset = dataset
103
+ self.start = start
104
+ self.end = end
105
+
106
+ def __len__(self) -> int:
107
+ return self.end - self.start
108
+
109
+ def __getitem__(self, idx: int):
110
+ if idx < 0:
111
+ idx += len(self)
112
+ if idx < 0 or idx >= len(self):
113
+ raise IndexError(idx)
114
+ return self.dataset[self.start + idx]
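The global-to-shard index mapping used by `ShardedEncodedDataset.__getitem__` can be checked without any array files. This is a minimal sketch with a toy manifest: only the `format`, `max_length`, `total_rows` and per-shard `rows` keys come from the code above, and `shard_starts` and `locate` are hypothetical helper names.

```python
import bisect
from typing import Dict, List, Tuple

manifest: Dict = {
    "format": "anifilebert.virtual_dataset.shards.v1",
    "max_length": 128,
    "total_rows": 7,
    "shards": [{"rows": 3}, {"rows": 2}, {"rows": 2}],  # array file names omitted
}


def shard_starts(manifest: Dict) -> List[int]:
    """Cumulative start offset of every shard, with the row-count consistency check."""
    starts, total = [], 0
    for shard in manifest["shards"]:
        starts.append(total)
        total += int(shard["rows"])
    if total != int(manifest.get("total_rows", total)):
        raise ValueError("manifest total_rows does not match the shard rows")
    return starts


def locate(starts: List[int], total_rows: int, idx: int) -> Tuple[int, int]:
    """Map a global (possibly negative) row index to (shard index, row in shard)."""
    if idx < 0:
        idx += total_rows
    if not 0 <= idx < total_rows:
        raise IndexError(idx)
    shard_idx = bisect.bisect_right(starts, idx) - 1
    return shard_idx, idx - starts[shard_idx]


starts = shard_starts(manifest)
print(starts)
print([locate(starts, 7, i) for i in (0, 2, 3, 6, -1)])
```

`bisect_right` minus one selects the last shard whose start offset is less than or equal to the index, so a row that sits exactly on a boundary belongs to the shard that begins there.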
docs/maintenance.md CHANGED
@@ -73,27 +73,14 @@ For full details, see [`training.md`](training.md).
73
 
74
  完整流程见 [`training.md`](training.md)。
75
 
76
- Recommended full training command / 推荐全量训练命令:
 
 
77
 
78
  ```powershell
79
- uv run python -m anifilebert.train --tokenizer char `
80
- --data-file datasets/AnimeName/dmhy_weak_char.jsonl `
81
- --vocab-file datasets/AnimeName/vocab.char.json `
82
- --save-dir checkpoints/dmhy-char-full `
83
- --init-model-dir . `
84
- --epochs 2 `
85
- --batch-size 256 `
86
- --learning-rate 0.00008 `
87
- --warmup-steps 300 `
88
- --max-seq-length 128 `
89
- --train-split 0.98 `
90
- --num-workers 4 `
91
- --checkpoint-steps 1000 `
92
- --save-total-limit 3 `
93
- --parse-eval-limit 2048 `
94
- --case-eval-file data/parser_regression_cases.json `
95
- --seed 52 `
96
- --experiment-name dmhy-char-full
97
  ```
98
 
99
  ## Publish a New Checkpoint / 发布新 checkpoint
@@ -103,7 +90,7 @@ Copy final files to the repository root:
103
  把 `final` 文件复制到仓库根目录:
104
 
105
  ```powershell
106
- $final = "checkpoints/dmhy-char-aug-fragments-10epoch-hardfocus/final"
107
  Copy-Item "$final/config.json" . -Force
108
  Copy-Item "$final/model.safetensors" . -Force
109
  Copy-Item "$final/tokenizer_config.json" . -Force
 
73
 
74
  完整流程见 [`training.md`](training.md)。
75
 
76
+ Current release training uses the virtual-shard flow in [`training.md`](training.md):
77
+
78
+ 当前发布训练使用 [`training.md`](training.md) 中的 virtual-shard 流程:
79
 
80
  ```powershell
81
+ uv run python -m compileall -q anifilebert tools
82
+ cargo build --release --manifest-path tools/virtual_dataset_generator/Cargo.toml
83
+ # Then follow docs/training.md section "Full Training with Virtual BIO Shards".
84
  ```
85
 
86
  ## Publish a New Checkpoint / 发布新 checkpoint
 
90
  把 `final` 文件复制到仓库根目录:
91
 
92
  ```powershell
93
+ $final = "checkpoints/dmhy-char-virtual-sps32-10epoch-lightfocus/final"
94
  Copy-Item "$final/config.json" . -Force
95
  Copy-Item "$final/model.safetensors" . -Force
96
  Copy-Item "$final/tokenizer_config.json" . -Force
docs/onnx.md CHANGED
@@ -175,8 +175,8 @@ the default thin runtime:
175
 
176
  | Backend / 后端 | Load ms / 加载 ms | Avg ms / 平均 ms | P50 ms | P95 ms | P99 ms | files/s |
177
  | --- | ---: | ---: | ---: | ---: | ---: | ---: |
178
- | PyTorch | 49.07 | 15.16 | 14.87 | 18.50 | 21.91 | 66.0 |
179
- | ONNX Runtime | 568.85 | 13.08 | 12.82 | 15.95 | 20.19 | 76.5 |
180
 
181
  The benchmark includes tokenization, model/session forward, constrained BIO
182
  decode, entity aggregation, and thin normalization. It does not include
 
175
 
176
  | Backend / 后端 | Load ms / 加载 ms | Avg ms / 平均 ms | P50 ms | P95 ms | P99 ms | files/s |
177
  | --- | ---: | ---: | ---: | ---: | ---: | ---: |
178
+ | PyTorch | 46.35 | 15.36 | 14.25 | 22.27 | 29.75 | 65.1 |
179
+ | ONNX Runtime | 50.92 | 12.04 | 11.90 | 13.81 | 15.38 | 83.1 |
180
 
181
  The benchmark includes tokenization, model/session forward, constrained BIO
182
  decode, entity aggregation, and thin normalization. It does not include
docs/training.md CHANGED
@@ -20,9 +20,9 @@ Recommended GPU configuration:
20
  推荐 GPU 配置:
21
 
22
  - RTX 3080 class GPU or better; current release training used an RTX 5070 Ti
23
- - batch size `1792` with the encoded dataset path on the 5070 Ti
24
  - `bf16`/TF32 on Ada/Blackwell-class CUDA devices when available
25
- - `--num-workers 0` with the encoded dataset path, because samples are pre-encoded into tensors
26
 
27
  ## 2. Dataset / 数据集
28
 
@@ -88,48 +88,74 @@ uv run python -m tools.convert_to_char_dataset `
88
  --progress 50000
89
  ```
90
 
91
- ## 5. Full Training with Dynamic Augmentation / 动态增强全量训练
92
 
93
  Recommended RTX 5070 Ti run:
94
 
95
  推荐 RTX 5070 Ti 训练命令:
96
 
97
  ```powershell
98
  .\.venv\Scripts\python.exe -m anifilebert.train --tokenizer char `
99
  --data-file datasets/AnimeName/dmhy_weak_char.jsonl `
100
- --vocab-file vocab.json `
101
- --save-dir checkpoints/dmhy-char-aug-fragments-optimized-10epoch `
 
102
  --init-model-dir . `
103
  --epochs 10 `
104
  --batch-size 1792 `
105
- --learning-rate 0.00002 `
106
- --warmup-steps 500 `
107
  --max-seq-length 128 `
108
  --train-split 0.98 `
109
- --num-workers 0 `
110
- --checkpoint-steps 1000 `
 
 
111
  --save-total-limit 3 `
112
  --parse-eval-limit 2048 `
113
  --case-eval-file data/parser_regression_cases.json `
114
- --augment-partial-samples 200000 `
115
- --augment-permutation-samples 400000 `
116
- --augment-special-samples 80000 `
117
  --bf16 `
118
  --no-periodic-eval `
119
- --perf-log-steps 200 `
 
120
  --seed 105 `
121
- --experiment-name dmhy-char-aug-fragments-optimized-10epoch
122
  ```
123
 
124
- Dynamic augmentation is generated in memory from BIO-labeled source rows and
125
- does not modify the authoritative DMHY JSONL files. The current release used
126
- partial/incomplete filename fragments, BIO entity block subsets and
127
- permutations, title-only/title+season directory-style examples, and standalone
128
- special fragments such as `Menu01`, `OP02`, `ED E07`, and `NCED03`.
129
 
130
- 动态增强从已有 BIO 标注行在内存中生成,不修改权威 DMHY JSONL。当前发布使用了
131
- 不完整文件名片段、BIO 实体块子集和重排、只有 title 或 title+season 的目录样式
132
- 样本,以及 `Menu01`、`OP02`、`ED E07`、`NCED03` 等 standalone special 片段。
 
133
 
134
  Training outputs:
135
 
@@ -156,21 +182,21 @@ been confirmed, fixed in the weak labels, and added to
156
  ```powershell
157
  uv run python -m tools.build_repair_focus_dataset `
158
  --input datasets/AnimeName/dmhy_weak_char.jsonl `
159
- --output data/generated/focus_after_10epoch_char.jsonl `
160
- --context-samples 100000 `
161
  --repeat-focus 3 `
162
- --repeat-manual 400 `
163
- --seed 106
164
 
165
  .\.venv\Scripts\python.exe -m anifilebert.train --tokenizer char `
166
- --data-file data/generated/focus_after_10epoch_char.jsonl `
167
- --vocab-file vocab.json `
168
- --save-dir checkpoints/dmhy-char-aug-fragments-10epoch-hardfocus `
169
- --init-model-dir checkpoints/dmhy-char-aug-fragments-optimized-10epoch/final `
170
- --epochs 2 `
171
  --batch-size 1792 `
172
- --learning-rate 0.000008 `
173
- --warmup-steps 50 `
174
  --max-seq-length 128 `
175
  --train-split 0.95 `
176
  --num-workers 0 `
@@ -178,14 +204,12 @@ uv run python -m tools.build_repair_focus_dataset `
178
  --save-total-limit 2 `
179
  --parse-eval-limit 2048 `
180
  --case-eval-file data/parser_regression_cases.json `
181
- --augment-partial-samples 30000 `
182
- --augment-permutation-samples 60000 `
183
- --augment-special-samples 20000 `
184
  --bf16 `
185
  --no-periodic-eval `
186
  --perf-log-steps 50 `
187
- --seed 107 `
188
- --experiment-name dmhy-char-aug-fragments-10epoch-hardfocus
 
189
  ```
190
 
191
  The default quality gate is model-led parsing:
@@ -209,7 +233,7 @@ The repository root is the Hugging Face checkpoint surface.
209
  仓库根目录就是 Hugging Face checkpoint 发布面。
210
 
211
  ```powershell
212
- $final = "checkpoints/dmhy-char-aug-fragments-10epoch-hardfocus/final"
213
  Copy-Item "$final/config.json" . -Force
214
  Copy-Item "$final/model.safetensors" . -Force
215
  Copy-Item "$final/tokenizer_config.json" . -Force
@@ -239,7 +263,7 @@ Run these before committing:
239
  提交前执行:
240
 
241
  ```powershell
242
- uv run python -m py_compile anifilebert/*.py tools/*.py
243
  uv run python -m tools.evaluate_parser_cases --model-dir . --case-file data/parser_regression_cases.json --output reports/case_metrics.json
244
  uv run python -m anifilebert.inference --model-dir . "[GM-Team][国漫][神印王座][Throne of Seal][2022][200][AVC][GB][1080P].mp4"
245
  uv run python -m tools.onnx_inference "[YYDM&VCB-Studio] Shinsekai Yori [NCED02][Ma10p_1080p][x265_flac].mkv"
 
20
  推荐 GPU 配置:
21
 
22
  - RTX 3080 class GPU or better; current release training used an RTX 5070 Ti
23
+ - batch size `1792` with the virtual-sharded dataset path on the 5070 Ti
24
  - `bf16`/TF32 on Ada/Blackwell-class CUDA devices when available
25
+ - `--num-workers 4 --persistent-workers` with the virtual-sharded dataset path
26
 
27
  ## 2. Dataset / 数据集
28
 
 
88
  --progress 50000
89
  ```
90
 
91
+ ## 5. Full Training with Virtual BIO Shards / 虚拟 BIO shard 全量训练
92
 
93
  Recommended RTX 5070 Ti run:
94
 
95
  推荐 RTX 5070 Ti 训练命令:
96
 
97
  ```powershell
98
+ @'
99
+ import random
100
+ from pathlib import Path
101
+
102
+ source = Path("datasets/AnimeName/dmhy_weak_char.jsonl")
103
+ target = Path("data/generated/virtual_source_train_seed105.jsonl")
104
+ rows = [line for line in source.read_text(encoding="utf-8").splitlines() if line]
105
+ random.Random(105).shuffle(rows)
106
+ target.parent.mkdir(parents=True, exist_ok=True)
107
+ target.write_text("\n".join(rows[: int(len(rows) * 0.98)]) + "\n", encoding="utf-8")
108
+ '@ | .\.venv\Scripts\python.exe -
109
+
110
+ cargo build --release --manifest-path tools/virtual_dataset_generator/Cargo.toml
111
+ .\tools\virtual_dataset_generator\target\release\anifilebert-virtual-dataset-generator.exe `
112
+ --input data/generated/virtual_source_train_seed105.jsonl `
113
+ --vocab-file datasets/AnimeName/vocab.char.json `
114
+ --output-dir data/generated/virtual_char_sps32_seed105 `
115
+ --max-length 128 `
116
+ --samples-per-source 32 `
117
+ --seed 105 `
118
+ --threads 20 `
119
+ --separator-mode per-gap `
120
+ --bracket-mode per-part
121
+
122
  .\.venv\Scripts\python.exe -m anifilebert.train --tokenizer char `
123
  --data-file datasets/AnimeName/dmhy_weak_char.jsonl `
124
+ --vocab-file datasets/AnimeName/vocab.char.json `
125
+ --virtual-dataset-dir data/generated/virtual_char_sps32_seed105 `
126
+ --save-dir checkpoints/dmhy-char-virtual-sps32-10epoch-lr1e5 `
127
  --init-model-dir . `
128
  --epochs 10 `
129
  --batch-size 1792 `
130
+ --learning-rate 0.00001 `
131
+ --warmup-steps 2000 `
132
  --max-seq-length 128 `
133
  --train-split 0.98 `
134
+ --num-workers 4 `
135
+ --prefetch-factor 4 `
136
+ --persistent-workers `
137
+ --checkpoint-steps 5000 `
138
  --save-total-limit 3 `
139
  --parse-eval-limit 2048 `
140
  --case-eval-file data/parser_regression_cases.json `
 
 
 
141
  --bf16 `
142
  --no-periodic-eval `
143
+ --perf-log-steps 1000 `
144
+ --perf-sample-interval 0.5 `
145
  --seed 105 `
146
+ --experiment-name dmhy-char-virtual-sps32-10epoch-lr1e5
147
  ```
148
 
149
+ The Rust generator samples BIO entity block subsets/permutations, separator
150
+ variants, bracket styles, incomplete filename fragments, and standalone special
151
+ fixtures into compact pre-encoded `.npy` shards. The current release generated
152
+ `20,439,848` training rows from `619,361` train-split source rows plus `935`
153
+ special fixtures, then trained for 10 epochs / `114,070` optimizer steps.
154
 
155
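+ The Rust generator pre-encodes BIO entity block subsets/permutations, separator
156
+ variants, bracket styles, incomplete filename fragments, and standalone special
157
+ fixtures into compact `.npy` shards. The current release generated `20,439,848`
158
+ training rows from `619,361` train-split source rows plus `935` special fixtures, then trained for the full 10 epochs / `114,070` optimizer steps.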
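
The row and step counts quoted above can be cross-checked with a few lines of arithmetic. This is only a sanity-check sketch, not the generator's or trainer's actual logic. It assumes each source row is kept once alongside its 32 sampled variants, that the final partial batch of each epoch counts as one optimizer step, and that no gradient accumulation is used. Those assumptions are inferred from the published numbers and are not confirmed by the repository.

```python
import math

# Figures quoted in the README text and the training command.
SOURCE_ROWS = 619_361        # train-split source rows
SPECIAL_FIXTURES = 935       # standalone special fixtures
SAMPLES_PER_SOURCE = 32      # --samples-per-source
BATCH_SIZE = 1792            # --batch-size
EPOCHS = 10                  # --epochs

# Assumption: every source row appears once as-is plus 32 virtual variants.
total_rows = SOURCE_ROWS * (SAMPLES_PER_SOURCE + 1) + SPECIAL_FIXTURES

# Assumption: the last, smaller batch of an epoch still takes one step.
steps_per_epoch = math.ceil(total_rows / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS

print(f"rows={total_rows:,} steps/epoch={steps_per_epoch:,} total={total_steps:,}")
```

Under these assumptions the computed totals agree with the `20,439,848` rows and `114,070` optimizer steps reported for this release.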
159
 
160
  Training outputs:
161
 
 
182
  ```powershell
183
  uv run python -m tools.build_repair_focus_dataset `
184
  --input datasets/AnimeName/dmhy_weak_char.jsonl `
185
+ --output data/generated/focus_after_virtual_sps32_char.jsonl `
186
+ --context-samples 50000 `
187
  --repeat-focus 3 `
188
+ --repeat-manual 96 `
189
+ --seed 205
190
 
191
  .\.venv\Scripts\python.exe -m anifilebert.train --tokenizer char `
192
+ --data-file data/generated/focus_after_virtual_sps32_char.jsonl `
193
+ --vocab-file datasets/AnimeName/vocab.char.json `
194
+ --save-dir checkpoints/dmhy-char-virtual-sps32-10epoch-lightfocus `
195
+ --init-model-dir checkpoints/dmhy-char-virtual-sps32-10epoch-lr1e5/final `
196
+ --epochs 1 `
197
  --batch-size 1792 `
198
+ --learning-rate 0.000002 `
199
+ --warmup-steps 20 `
200
  --max-seq-length 128 `
201
  --train-split 0.95 `
202
  --num-workers 0 `
 
204
  --save-total-limit 2 `
205
  --parse-eval-limit 2048 `
206
  --case-eval-file data/parser_regression_cases.json `
207
  --bf16 `
208
  --no-periodic-eval `
209
  --perf-log-steps 50 `
210
+ --perf-sample-interval 0.5 `
211
+ --seed 208 `
212
+ --experiment-name dmhy-char-virtual-sps32-10epoch-lightfocus
213
  ```
214
 
215
  The default quality gate is model-led parsing:
 
233
  The repository root is the Hugging Face checkpoint release surface.
234
 
235
  ```powershell
236
+ $final = "checkpoints/dmhy-char-virtual-sps32-10epoch-lightfocus/final"
237
  Copy-Item "$final/config.json" . -Force
238
  Copy-Item "$final/model.safetensors" . -Force
239
  Copy-Item "$final/tokenizer_config.json" . -Force
 
263
  Run before committing:
264
 
265
  ```powershell
266
+ uv run python -m compileall -q anifilebert tools
267
  uv run python -m tools.evaluate_parser_cases --model-dir . --case-file data/parser_regression_cases.json --output reports/case_metrics.json
268
  uv run python -m anifilebert.inference --model-dir . "[GM-Team][国漫][神印王座][Throne of Seal][2022][200][AVC][GB][1080P].mp4"
269
  uv run python -m tools.onnx_inference "[YYDM&VCB-Studio] Shinsekai Yori [NCED02][Ma10p_1080p][x265_flac].mkv"
exports/anime_filename_parser.metadata.json CHANGED
@@ -8,5 +8,5 @@
8
  128,
9
  15
10
  ],
11
- "max_abs_diff": 1.9073486328125e-05
12
  }
 
8
  128,
9
  15
10
  ],
11
+ "max_abs_diff": 4.00543212890625e-05
12
  }
exports/anime_filename_parser.onnx CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:5a09d5387e94373cccd22cd821edf0654d537a7897cf8abb04900f48a5ffaccf
3
  size 19647024
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:476e47f347eba767760bb5945f76ff2978a66fa31ea0a02f4c64f5264984aca5
3
  size 19647024
model.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:401c62d2359e1030930892fd6be3d8a25abf758f3cb43b4d445562890fe1f2c6
3
  size 19142604
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:c421b6e37b49c6fe94cb3ac5cd2f8347749e6d21b0f7b5088b38ae30e6eb74f9
3
  size 19142604
reports/benchmark_results.json CHANGED
@@ -11,27 +11,27 @@
11
  "results": [
12
  {
13
  "name": "pytorch",
14
- "load_ms": 44.84080011025071,
15
  "runs": 520,
16
- "avg_ms": 16.417674036347307,
17
- "p50_ms": 14.76569997612387,
18
- "p95_ms": 26.30644003511406,
19
- "p99_ms": 32.615189072675996,
20
- "min_ms": 11.30899996496737,
21
- "max_ms": 41.87910002656281,
22
- "throughput_fps": 60.909967988527896
23
  },
24
  {
25
  "name": "onnxruntime",
26
- "load_ms": 40.69980001077056,
27
  "runs": 520,
28
- "avg_ms": 11.606730768779435,
29
- "p50_ms": 11.42695004818961,
30
- "p95_ms": 13.516889995662494,
31
- "p99_ms": 15.196251904126225,
32
- "min_ms": 9.510300005786121,
33
- "max_ms": 19.771000021137297,
34
- "throughput_fps": 86.15690498222524
35
  }
36
  ]
37
  }
 
11
  "results": [
12
  {
13
  "name": "pytorch",
14
+ "load_ms": 46.3533999864012,
15
  "runs": 520,
16
+ "avg_ms": 15.362302694120444,
17
+ "p50_ms": 14.245550031773746,
18
+ "p95_ms": 22.27204497321509,
19
+ "p99_ms": 29.752646028064174,
20
+ "min_ms": 10.793900000862777,
21
+ "max_ms": 42.94239997398108,
22
+ "throughput_fps": 65.09440803967013
23
  },
24
  {
25
  "name": "onnxruntime",
26
+ "load_ms": 50.916100037284195,
27
  "runs": 520,
28
+ "avg_ms": 12.039251922844695,
29
+ "p50_ms": 11.899999983143061,
30
+ "p95_ms": 13.811619929037988,
31
+ "p99_ms": 15.376427990850043,
32
+ "min_ms": 9.72980004735291,
33
+ "max_ms": 19.285599933937192,
34
+ "throughput_fps": 83.06163924541542
35
  }
36
  ]
37
  }
reports/case_metrics.json CHANGED
@@ -8,11 +8,11 @@
8
  "max_length": 128,
9
  "constrain_bio": false,
10
  "case_count": 26,
11
- "full_correct": 25,
12
- "full_accuracy": 0.9615384615384616,
13
  "field_correct": {
14
  "group": 22,
15
- "title": 26,
16
  "episode": 26,
17
  "resolution": 25,
18
  "source": 19,
@@ -35,7 +35,7 @@
35
  "season": 1.0,
36
  "source": 1.0,
37
  "special": 1.0,
38
- "title": 1.0
39
  },
40
  "failures": [
41
  {
@@ -64,6 +64,31 @@
64
  "source": "WebRip",
65
  "title": "Otonari no Tenshi-sama ni Itsunomanika Dame Ningen ni Sareteita Ken"
66
  }
67
  }
68
  ],
69
  "results": [
@@ -473,8 +498,13 @@
473
  {
474
  "id": "ai_raws_fire_force_cjk_season_hash_episode",
475
  "filename": "[AI-Raws] 炎炎の消防隊 弐ノ章 #13 (BD HEVC 1920x1080 yuv444p10le FLAC)[FC74A2D5].mkv",
476
- "ok": true,
477
- "errors": {},
 
 
 
 
 
478
  "expected": {
479
  "group": "AI-Raws",
480
  "title": "炎炎の消防隊",
@@ -487,7 +517,7 @@
487
  "group": "AI-Raws",
488
  "resolution": "1920x1080",
489
  "season": 2,
490
- "title": "炎炎の消防隊"
491
  }
492
  },
493
  {
 
8
  "max_length": 128,
9
  "constrain_bio": false,
10
  "case_count": 26,
11
+ "full_correct": 24,
12
+ "full_accuracy": 0.9230769230769231,
13
  "field_correct": {
14
  "group": 22,
15
+ "title": 25,
16
  "episode": 26,
17
  "resolution": 25,
18
  "source": 19,
 
35
  "season": 1.0,
36
  "source": 1.0,
37
  "special": 1.0,
38
+ "title": 0.9615384615384616
39
  },
40
  "failures": [
41
  {
 
64
  "source": "WebRip",
65
  "title": "Otonari no Tenshi-sama ni Itsunomanika Dame Ningen ni Sareteita Ken"
66
  }
67
+ },
68
+ {
69
+ "id": "ai_raws_fire_force_cjk_season_hash_episode",
70
+ "filename": "[AI-Raws] 炎炎の消防隊 弐ノ章 #13 (BD HEVC 1920x1080 yuv444p10le FLAC)[FC74A2D5].mkv",
71
+ "ok": false,
72
+ "errors": {
73
+ "title": {
74
+ "expected": "炎炎の消防隊",
75
+ "pred": "炎炎の消防隊 ノ章"
76
+ }
77
+ },
78
+ "expected": {
79
+ "group": "AI-Raws",
80
+ "title": "炎炎の消防隊",
81
+ "season": 2,
82
+ "episode": 13,
83
+ "resolution": "1920x1080"
84
+ },
85
+ "pred": {
86
+ "episode": 13,
87
+ "group": "AI-Raws",
88
+ "resolution": "1920x1080",
89
+ "season": 2,
90
+ "title": "炎炎の消防隊 ノ章"
91
+ }
92
  }
93
  ],
94
  "results": [
 
498
  {
499
  "id": "ai_raws_fire_force_cjk_season_hash_episode",
500
  "filename": "[AI-Raws] 炎炎の消防隊 弐ノ章 #13 (BD HEVC 1920x1080 yuv444p10le FLAC)[FC74A2D5].mkv",
501
+ "ok": false,
502
+ "errors": {
503
+ "title": {
504
+ "expected": "炎炎の消防隊",
505
+ "pred": "炎炎の消防隊 ノ章"
506
+ }
507
+ },
508
  "expected": {
509
  "group": "AI-Raws",
510
  "title": "炎炎の消防隊",
 
517
  "group": "AI-Raws",
518
  "resolution": "1920x1080",
519
  "season": 2,
520
+ "title": "炎炎の消防隊 ノ章"
521
  }
522
  },
523
  {
reports/parse_eval_metrics.json CHANGED
@@ -5,22 +5,22 @@
5
  "constrain_bio": false,
6
  "sample_count": 2048,
7
  "field_accuracy": {
8
- "group": 0.98583984375,
9
- "title": 0.97119140625,
10
- "season": 0.99609375,
11
- "episode": 0.990234375,
12
  "resolution": 1.0,
13
- "source": 0.990234375,
14
- "special": 0.9775390625
15
  },
16
  "field_correct": {
17
- "group": 2019,
18
- "title": 1989,
19
- "season": 2040,
20
- "episode": 2028,
21
  "resolution": 2048,
22
- "source": 2028,
23
- "special": 2002
24
  },
25
  "field_total": {
26
  "group": 2048,
@@ -31,184 +31,245 @@
31
  "source": 2048,
32
  "special": 2048
33
  },
34
- "full_match_accuracy": 0.95068359375,
35
- "full_match_correct": 1947,
36
  "full_match_total": 2048,
37
  "failures": [
38
  {
39
- "filename": "01; - flac",
40
  "errors": {
41
- "title": {
42
- "gold": "01;",
43
- "pred": ";"
44
- },
45
- "episode": {
46
- "gold": null,
47
- "pred": "1"
48
  }
49
  },
50
  "gold": {
51
- "group": null,
52
- "title": "01;",
53
  "season": null,
54
  "episode": null,
55
  "resolution": null,
56
- "source": "flac",
57
- "special": null
58
  },
59
  "pred": {
60
- "group": null,
61
- "title": ";",
62
  "season": null,
63
- "episode": 1,
64
  "resolution": null,
65
- "source": "flac",
66
- "special": null
67
  }
68
  },
69
  {
70
- "filename": "[VCB-Studio] Durarara!!×2 Shou [IV][Ma10p_1080p][x265_aac]",
71
  "errors": {
72
- "title": {
73
- "gold": "durarara!!×2 shou iv",
74
- "pred": "durarara!!×2 shou"
75
- },
76
  "special": {
77
- "gold": null,
78
- "pred": "iv"
79
  }
80
  },
81
  "gold": {
82
- "group": "VCB-Studio",
83
- "title": "Durarara!!×2 Shou IV",
84
  "season": null,
85
  "episode": null,
86
- "resolution": "1080p",
87
- "source": "x265-aac",
88
- "special": null
89
  },
90
  "pred": {
91
- "group": "VCB-Studio",
92
- "title": "Durarara!!×2 Shou",
93
  "season": null,
94
  "episode": null,
95
- "resolution": "1080p",
96
- "source": "x265-aac",
97
- "special": "IV"
98
  }
99
  },
100
  {
101
- "filename": "AC3 Chap - BD Menu16",
102
  "errors": {
103
- "source": {
104
- "gold": "bd",
105
- "pred": null
106
- },
107
  "special": {
108
- "gold": "menu16",
109
- "pred": "bd menu16"
110
  }
111
  },
112
  "gold": {
113
  "group": null,
114
- "title": "AC3 Chap",
115
  "season": null,
116
  "episode": null,
117
  "resolution": null,
118
- "source": "BD",
119
- "special": "Menu16"
120
  },
121
  "pred": {
122
  "group": null,
123
- "title": "AC3 Chap",
124
  "season": null,
125
  "episode": null,
126
  "resolution": null,
127
  "source": null,
128
- "special": "BD Menu16"
129
  }
130
  },
131
  {
132
- "filename": "Puella Magi Madoka Magica - BD Menu14",
133
  "errors": {
134
- "source": {
135
- "gold": "bd",
136
- "pred": null
137
  },
138
- "special": {
139
- "gold": "menu14",
140
- "pred": "bd menu14"
141
  }
142
  },
143
  "gold": {
144
  "group": null,
145
- "title": "Puella Magi Madoka Magica",
146
- "season": null,
147
  "episode": null,
148
  "resolution": null,
149
- "source": "BD",
150
- "special": "Menu14"
151
  },
152
  "pred": {
153
  "group": null,
154
- "title": "Puella Magi Madoka Magica",
155
- "season": null,
156
- "episode": null,
157
  "resolution": null,
158
  "source": null,
159
- "special": "BD Menu14"
160
  }
161
  },
162
  {
163
- "filename": "Kirion Hikaru no Go 72 [960x720]",
164
  "errors": {
165
- "group": {
166
- "gold": "kirion",
167
- "pred": null
168
- },
169
  "title": {
170
- "gold": "hikaru no go",
171
- "pred": "kirion hikaru no go 72"
172
  },
173
  "episode": {
174
- "gold": "72",
175
- "pred": null
176
  }
177
  },
178
  "gold": {
179
- "group": "Kirion",
180
- "title": "Hikaru no Go",
181
  "season": null,
182
- "episode": 72,
183
- "resolution": "960x720",
184
- "source": null,
185
  "special": null
186
  },
187
  "pred": {
188
  "group": null,
189
- "title": "Kirion Hikaru no Go 72",
190
  "season": null,
191
- "episode": null,
192
- "resolution": "960x720",
193
- "source": null,
194
  "special": null
195
  }
196
  },
197
  {
198
- "filename": "葬送的芙莉莲 - 喵萌奶茶屋",
199
  "errors": {
200
- "group": {
201
- "gold": "喵萌奶茶屋",
202
- "pred": null
203
- },
204
  "title": {
205
- "gold": "葬送的芙莉莲",
206
- "pred": "葬送的芙莉莲 - 喵萌奶茶屋"
207
  }
208
  },
209
  "gold": {
210
- "group": "喵萌奶茶屋",
211
- "title": "葬送的芙莉莲",
212
  "season": null,
213
  "episode": null,
214
  "resolution": null,
@@ -217,7 +278,7 @@
217
  },
218
  "pred": {
219
  "group": null,
220
- "title": "葬送的芙莉莲 - 喵萌奶茶屋",
221
  "season": null,
222
  "episode": null,
223
  "resolution": null,
@@ -226,100 +287,96 @@
226
  }
227
  },
228
  {
229
- "filename": "[VCB-Studio] Taimadou Gakuen 35 Shiken Shoutai [IV][Ma10p_1080p][x265_aac]",
230
  "errors": {
231
  "title": {
232
- "gold": "taimadou gakuen 35 shiken shoutai iv",
233
- "pred": "taimadou gakuen 35 shiken shoutai v"
234
  }
235
  },
236
  "gold": {
237
- "group": "VCB-Studio",
238
- "title": "Taimadou Gakuen 35 Shiken Shoutai IV",
239
  "season": null,
240
  "episode": null,
241
- "resolution": "1080p",
242
- "source": "x265-aac",
243
- "special": null
244
  },
245
  "pred": {
246
- "group": "VCB-Studio",
247
- "title": "Taimadou Gakuen 35 Shiken Shoutai V",
248
  "season": null,
249
  "episode": null,
250
- "resolution": "1080p",
251
- "source": "x265-aac",
252
- "special": null
253
  }
254
  },
255
  {
256
- "filename": "Dr.Slump.Arale-chan.097 - BD Menu14",
257
  "errors": {
258
- "source": {
259
- "gold": "bd",
260
- "pred": null
261
- },
262
  "special": {
263
- "gold": "menu14",
264
- "pred": "bd menu14"
265
  }
266
  },
267
  "gold": {
268
- "group": null,
269
- "title": "Dr.Slump.Arale-chan.097",
270
  "season": null,
271
- "episode": null,
272
  "resolution": null,
273
- "source": "BD",
274
- "special": "Menu14"
275
  },
276
  "pred": {
277
- "group": null,
278
- "title": "Dr.Slump.Arale-chan.097",
279
  "season": null,
280
- "episode": null,
281
  "resolution": null,
282
  "source": null,
283
- "special": "BD Menu14"
284
  }
285
  },
286
  {
287
- "filename": "Shiroi Suna no Aquatope - BD Menu05",
288
  "errors": {
289
- "source": {
290
- "gold": "bd",
291
- "pred": null
292
  },
293
- "special": {
294
- "gold": "menu05",
295
- "pred": "bd menu05"
296
  }
297
  },
298
  "gold": {
299
  "group": null,
300
- "title": "Shiroi Suna no Aquatope",
301
  "season": null,
302
  "episode": null,
303
- "resolution": null,
304
- "source": "BD",
305
- "special": "Menu05"
306
  },
307
  "pred": {
308
  "group": null,
309
- "title": "Shiroi Suna no Aquatope",
310
  "season": null,
311
  "episode": null,
312
- "resolution": null,
313
- "source": null,
314
- "special": "BD Menu05"
315
  }
316
  },
317
  {
318
- "filename": "[VCB-Studio] Mob Psycho 100 II [NCOP_OVA][Ma10p_1080p][x265_flac]",
319
  "errors": {
320
  "season": {
321
  "gold": null,
322
- "pred": "2"
323
  }
324
  },
325
  "gold": {
@@ -329,326 +386,213 @@
329
  "episode": 100,
330
  "resolution": "1080p",
331
  "source": "x265-flac",
332
- "special": "OVA"
333
  },
334
  "pred": {
335
  "group": "VCB-Studio",
336
  "title": "Mob Psycho",
337
- "season": 2,
338
  "episode": 100,
339
  "resolution": "1080p",
340
  "source": "x265-flac",
341
- "special": "OVA"
342
  }
343
  },
344
  {
345
- "filename": "アニメCM宣伝1",
346
  "errors": {
347
- "group": {
348
- "gold": null,
349
- "pred": "ア"
350
- },
351
- "title": {
352
- "gold": "アニメ",
353
- "pred": "ニ 宣伝"
354
- },
355
  "special": {
356
- "gold": "cm",
357
- "pred": "1"
358
  }
359
  },
360
  "gold": {
361
- "group": null,
362
- "title": "アニメ",
363
  "season": null,
364
  "episode": null,
365
- "resolution": null,
366
- "source": null,
367
- "special": "CM"
368
  },
369
  "pred": {
370
- "group": "",
371
- "title": "ニ 宣伝",
372
  "season": null,
373
  "episode": null,
374
- "resolution": null,
375
- "source": null,
376
- "special": "1"
377
  }
378
  },
379
  {
380
- "filename": "960x720 SLAM DUNK LoliHouse",
381
  "errors": {
382
- "group": {
383
- "gold": "lolihouse",
384
- "pred": "ihouse"
385
- },
386
  "title": {
387
- "gold": "slam dunk",
388
- "pred": "slam dunk lol"
389
  }
390
  },
391
  "gold": {
392
- "group": "LoliHouse",
393
- "title": "SLAM DUNK",
394
  "season": null,
395
  "episode": null,
396
- "resolution": "960x720",
397
- "source": null,
398
  "special": null
399
  },
400
  "pred": {
401
- "group": "iHouse",
402
- "title": "SLAM DUNK Lol",
403
  "season": null,
404
  "episode": null,
405
- "resolution": "960x720",
406
- "source": null,
407
  "special": null
408
  }
409
  },
410
  {
411
- "filename": "GM-Team [GB] 4K 逆天邪神 04",
412
  "errors": {
413
  "title": {
414
- "gold": "逆天邪神",
415
- "pred": "逆天邪神 04"
416
- },
417
- "episode": {
418
- "gold": "4",
419
- "pred": null
420
- }
421
- },
422
- "gold": {
423
- "group": "GM-Team",
424
- "title": "逆天邪神",
425
- "season": null,
426
- "episode": 4,
427
- "resolution": "4K",
428
- "source": "GB",
429
- "special": null
430
- },
431
- "pred": {
432
- "group": "GM-Team",
433
- "title": "逆天邪神 04",
434
- "season": null,
435
- "episode": null,
436
- "resolution": "4K",
437
- "source": "GB",
438
- "special": null
439
- }
440
- },
441
- {
442
- "filename": "CM 15 - BD - CITY HUNTER TV 3rd & '91 Series",
443
- "errors": {
444
- "source": {
445
- "gold": "bd",
446
- "pred": "b"
447
  },
448
  "special": {
449
- "gold": "cm 15",
450
- "pred": "d"
451
  }
452
  },
453
  "gold": {
454
- "group": null,
455
- "title": "CITY HUNTER TV 3rd & '91 Series",
456
  "season": null,
457
  "episode": null,
458
- "resolution": null,
459
- "source": "BD",
460
- "special": "CM 15"
461
  },
462
  "pred": {
463
- "group": null,
464
- "title": "CITY HUNTER TV 3rd & '91 Series",
465
  "season": null,
466
  "episode": null,
467
- "resolution": null,
468
- "source": "B",
469
- "special": "D"
470
  }
471
  },
472
  {
473
- "filename": "12 - HEVC",
474
  "errors": {
475
- "episode": {
476
- "gold": null,
477
- "pred": "12"
478
- },
479
- "special": {
480
- "gold": "12",
481
- "pred": null
482
  }
483
  },
484
  "gold": {
485
  "group": null,
486
- "title": null,
487
  "season": null,
488
- "episode": null,
489
- "resolution": null,
490
- "source": "HEVC",
491
- "special": "12"
492
  },
493
  "pred": {
494
  "group": null,
495
- "title": null,
496
  "season": null,
497
- "episode": 12,
498
- "resolution": null,
499
- "source": "HEVC",
500
- "special": null
501
  }
502
  },
503
  {
504
- "filename": "07][檢索:魔法姊妹露露特莉莉][CHT&JPN",
505
  "errors": {
506
  "title": {
507
- "gold": null,
508
- "pred": "檢索:魔法姊妹露露特莉莉"
509
- },
510
- "special": {
511
- "gold": "檢索:魔法姊妹露露特莉莉",
512
- "pred": null
513
  }
514
  },
515
  "gold": {
516
- "group": null,
517
- "title": null,
518
- "season": null,
519
- "episode": 7,
520
- "resolution": null,
521
- "source": "CHT&JPN",
522
- "special": "檢索:魔法姊妹露露特莉莉"
523
  },
524
  "pred": {
525
- "group": null,
526
- "title": "檢索:魔法姊妹露露特莉莉",
527
- "season": null,
528
- "episode": 7,
529
- "resolution": null,
530
- "source": "CHT&JPN",
531
  "special": null
532
  }
533
  },
534
  {
535
- "filename": "[12]_1080P_Baha_[ANi]_29 歲單身中堅冒險家的日常",
536
  "errors": {
537
  "title": {
538
- "gold": "29 歲單身中堅冒險家的日常",
539
- "pred": "9 歲單身中堅冒險家的日常"
 
 
 
 
540
  }
541
  },
542
  "gold": {
543
- "group": "ANi",
544
- "title": "29 歲單身中堅冒險家的日常",
545
  "season": null,
546
- "episode": 12,
547
  "resolution": "1080P",
548
- "source": "Baha",
549
  "special": null
550
  },
551
  "pred": {
552
- "group": "ANi",
553
- "title": "9 歲單身中堅冒險家的日常",
554
- "season": null,
555
- "episode": 12,
556
  "resolution": "1080P",
557
- "source": "Baha",
558
  "special": null
559
  }
560
  },
561
  {
562
- "filename": "[CM_03] Onimonogatari SFEO-Raws",
563
  "errors": {
564
- "group": {
565
- "gold": "sfeo-raws",
566
- "pred": "feo-raws"
567
- },
568
  "title": {
569
- "gold": "onimonogatari",
570
- "pred": "onimonogatari s"
571
  }
572
  },
573
  "gold": {
574
- "group": "SFEO-Raws",
575
- "title": "Onimonogatari",
576
  "season": null,
577
  "episode": null,
578
  "resolution": null,
579
- "source": null,
580
- "special": "CM_03"
581
  },
582
  "pred": {
583
- "group": "FEO-Raws",
584
- "title": "Onimonogatari S",
585
  "season": null,
586
  "episode": null,
587
  "resolution": null,
588
- "source": null,
589
- "special": "CM_03"
590
- }
591
- },
592
- {
593
- "filename": "16][S2][[DBD-Raws]",
594
- "errors": {
595
- "episode": {
596
- "gold": null,
597
- "pred": "16"
598
- },
599
- "special": {
600
- "gold": "16",
601
- "pred": null
602
- }
603
- },
604
- "gold": {
605
- "group": "DBD-Raws",
606
- "title": null,
607
- "season": 2,
608
- "episode": null,
609
- "resolution": null,
610
- "source": null,
611
- "special": "16"
612
- },
613
- "pred": {
614
- "group": "DBD-Raws",
615
- "title": null,
616
- "season": 2,
617
- "episode": 16,
618
- "resolution": null,
619
- "source": null,
620
- "special": null
621
- }
622
- },
623
- {
624
- "filename": "Captain.Tsubasa.Road.to.2002.Intro1.BGSubs.TVRip.XviD-TBO",
625
- "errors": {
626
- "group": {
627
- "gold": null,
628
- "pred": "gsu"
629
- },
630
- "title": {
631
- "gold": "captain.tsubasa.road.to intro1.bgsubs",
632
- "pred": "captain.tsubasa.road.to intro1.b b id o"
633
- }
634
- },
635
- "gold": {
636
- "group": null,
637
- "title": "Captain.Tsubasa.Road.to Intro1.BGSubs",
638
- "season": null,
639
- "episode": null,
640
- "resolution": null,
641
- "source": "TVRip",
642
- "special": null
643
- },
644
- "pred": {
645
- "group": "GSu",
646
- "title": "Captain.Tsubasa.Road.to Intro1.B b iD O",
647
- "season": null,
648
- "episode": null,
649
- "resolution": null,
650
- "source": "TVRip",
651
- "special": null
652
  }
653
  }
654
  ]
@@ -657,22 +601,22 @@
657
  "constrain_bio": true,
658
  "sample_count": 2048,
659
  "field_accuracy": {
660
- "group": 0.990234375,
661
- "title": 0.978515625,
662
- "season": 0.99755859375,
663
- "episode": 0.9912109375,
664
  "resolution": 1.0,
665
- "source": 0.990234375,
666
- "special": 0.9794921875
667
  },
668
  "field_correct": {
669
- "group": 2028,
670
- "title": 2004,
671
- "season": 2043,
672
- "episode": 2030,
673
  "resolution": 2048,
674
- "source": 2028,
675
- "special": 2006
676
  },
677
  "field_total": {
678
  "group": 2048,
@@ -683,620 +627,588 @@
683
  "source": 2048,
684
  "special": 2048
685
  },
686
- "full_match_accuracy": 0.9599609375,
687
- "full_match_correct": 1966,
688
  "full_match_total": 2048,
689
  "failures": [
690
  {
691
- "filename": "01; - flac",
692
  "errors": {
693
  "title": {
694
- "gold": "01;",
695
- "pred": null
696
  },
697
  "episode": {
698
  "gold": null,
699
- "pred": "1"
700
  }
701
  },
702
  "gold": {
703
  "group": null,
704
- "title": "01;",
705
- "season": null,
706
  "episode": null,
707
  "resolution": null,
708
- "source": "flac",
709
- "special": null
710
  },
711
  "pred": {
712
  "group": null,
713
- "title": null,
714
- "season": null,
715
- "episode": 1,
716
  "resolution": null,
717
- "source": "flac",
718
- "special": null
719
  }
720
  },
721
  {
722
- "filename": "[VCB-Studio] Durarara!!×2 Shou [IV][Ma10p_1080p][x265_aac]",
723
  "errors": {
724
  "title": {
725
- "gold": "durarara!!×2 shou iv",
726
- "pred": "durarara!!×2 shou"
727
- },
728
- "special": {
729
- "gold": null,
730
- "pred": "iv"
731
  }
732
  },
733
  "gold": {
734
- "group": "VCB-Studio",
735
- "title": "Durarara!!×2 Shou IV",
736
  "season": null,
737
  "episode": null,
738
- "resolution": "1080p",
739
- "source": "x265-aac",
740
- "special": null
741
  },
742
  "pred": {
743
- "group": "VCB-Studio",
744
- "title": "Durarara!!×2 Shou",
745
  "season": null,
746
  "episode": null,
747
- "resolution": "1080p",
748
- "source": "x265-aac",
749
- "special": "IV"
750
  }
751
  },
752
  {
753
- "filename": "AC3 Chap - BD Menu16",
754
  "errors": {
755
- "source": {
756
- "gold": "bd",
757
- "pred": null
758
- },
759
- "special": {
760
- "gold": "menu16",
761
- "pred": "bd menu16"
762
  }
763
  },
764
  "gold": {
765
  "group": null,
766
- "title": "AC3 Chap",
767
  "season": null,
768
  "episode": null,
769
  "resolution": null,
770
- "source": "BD",
771
- "special": "Menu16"
772
  },
773
  "pred": {
774
  "group": null,
775
- "title": "AC3 Chap",
776
  "season": null,
777
  "episode": null,
778
  "resolution": null,
779
  "source": null,
780
- "special": "BD Menu16"
781
  }
782
  },
783
  {
784
- "filename": "Puella Magi Madoka Magica - BD Menu14",
785
  "errors": {
786
- "source": {
787
- "gold": "bd",
788
- "pred": null
789
- },
790
  "special": {
791
- "gold": "menu14",
792
- "pred": "bd menu14"
793
  }
794
  },
795
  "gold": {
796
- "group": null,
797
- "title": "Puella Magi Madoka Magica",
798
  "season": null,
799
- "episode": null,
800
  "resolution": null,
801
- "source": "BD",
802
- "special": "Menu14"
803
  },
804
  "pred": {
805
- "group": null,
806
- "title": "Puella Magi Madoka Magica",
807
  "season": null,
808
- "episode": null,
809
  "resolution": null,
810
  "source": null,
811
- "special": "BD Menu14"
812
  }
813
  },
814
  {
815
- "filename": "Kirion Hikaru no Go 72 [960x720]",
816
  "errors": {
817
- "group": {
818
- "gold": "kirion",
819
- "pred": null
820
- },
821
  "title": {
822
- "gold": "hikaru no go",
823
- "pred": "kirion hikaru no go 72"
824
  },
825
- "episode": {
826
- "gold": "72",
827
- "pred": null
828
  }
829
  },
830
  "gold": {
831
- "group": "Kirion",
832
- "title": "Hikaru no Go",
833
  "season": null,
834
- "episode": 72,
835
- "resolution": "960x720",
836
- "source": null,
837
- "special": null
838
  },
839
  "pred": {
840
  "group": null,
841
- "title": "Kirion Hikaru no Go 72",
842
  "season": null,
843
  "episode": null,
844
- "resolution": "960x720",
845
- "source": null,
846
- "special": null
847
  }
848
  },
849
  {
850
- "filename": "葬送的芙莉莲 - 喵萌奶茶屋",
851
  "errors": {
852
- "group": {
853
- "gold": "喵萌奶茶屋",
854
- "pred": null
855
- },
856
- "title": {
857
- "gold": "葬送的芙莉莲",
858
- "pred": "葬送的芙莉莲 - 喵萌奶茶屋"
859
  }
860
  },
861
  "gold": {
862
- "group": "喵萌奶茶屋",
863
- "title": "葬送的芙莉莲",
864
  "season": null,
865
- "episode": null,
866
- "resolution": null,
867
- "source": null,
868
- "special": null
869
  },
870
  "pred": {
871
- "group": null,
872
- "title": "葬送的芙莉莲 - 喵萌奶茶屋",
873
- "season": null,
874
- "episode": null,
875
- "resolution": null,
876
- "source": null,
877
- "special": null
878
  }
879
  },
880
  {
881
- "filename": "[VCB-Studio] Taimadou Gakuen 35 Shiken Shoutai [IV][Ma10p_1080p][x265_aac]",
882
  "errors": {
883
- "title": {
884
- "gold": "taimadou gakuen 35 shiken shoutai iv",
885
- "pred": "taimadou gakuen 35 shiken shoutai"
886
- },
887
  "special": {
888
- "gold": null,
889
- "pred": "iv"
890
  }
891
  },
892
  "gold": {
893
- "group": "VCB-Studio",
894
- "title": "Taimadou Gakuen 35 Shiken Shoutai IV",
895
  "season": null,
896
  "episode": null,
897
- "resolution": "1080p",
898
- "source": "x265-aac",
899
- "special": null
900
  },
901
  "pred": {
902
- "group": "VCB-Studio",
903
- "title": "Taimadou Gakuen 35 Shiken Shoutai",
904
  "season": null,
905
  "episode": null,
906
- "resolution": "1080p",
907
- "source": "x265-aac",
908
- "special": "IV"
909
  }
910
  },
911
  {
912
- "filename": "Dr.Slump.Arale-chan.097 - BD Menu14",
913
  "errors": {
914
- "source": {
915
- "gold": "bd",
916
  "pred": null
917
- },
918
- "special": {
919
- "gold": "menu14",
920
- "pred": "bd menu14"
921
  }
922
  },
923
  "gold": {
924
- "group": null,
925
- "title": "Dr.Slump.Arale-chan.097",
926
  "season": null,
927
  "episode": null,
928
  "resolution": null,
929
  "source": "BD",
930
- "special": "Menu14"
931
  },
932
  "pred": {
933
- "group": null,
934
- "title": "Dr.Slump.Arale-chan.097",
935
  "season": null,
936
  "episode": null,
937
  "resolution": null,
938
- "source": null,
939
- "special": "BD Menu14"
940
  }
941
  },
942
  {
943
- "filename": "Shiroi Suna no Aquatope - BD Menu05",
944
  "errors": {
945
- "source": {
946
- "gold": "bd",
947
- "pred": null
948
  },
949
  "special": {
950
- "gold": "menu05",
951
- "pred": "bd menu05"
952
  }
953
  },
954
  "gold": {
955
- "group": null,
956
- "title": "Shiroi Suna no Aquatope",
957
  "season": null,
958
  "episode": null,
959
- "resolution": null,
960
- "source": "BD",
961
- "special": "Menu05"
962
  },
963
  "pred": {
964
- "group": null,
965
- "title": "Shiroi Suna no Aquatope",
966
  "season": null,
967
  "episode": null,
968
- "resolution": null,
969
- "source": null,
970
- "special": "BD Menu05"
971
  }
972
  },
973
  {
974
- "filename": "[VCB-Studio] Mob Psycho 100 II [NCOP_OVA][Ma10p_1080p][x265_flac]",
975
  "errors": {
 
 
 
 
976
  "season": {
977
  "gold": null,
978
  "pred": "2"
979
  }
980
  },
981
  "gold": {
982
- "group": "VCB-Studio",
983
- "title": "Mob Psycho",
984
  "season": null,
985
- "episode": 100,
986
- "resolution": "1080p",
987
- "source": "x265-flac",
988
- "special": "OVA"
989
  },
990
  "pred": {
991
- "group": "VCB-Studio",
992
- "title": "Mob Psycho",
993
  "season": 2,
994
- "episode": 100,
995
- "resolution": "1080p",
996
- "source": "x265-flac",
997
- "special": "OVA"
998
  }
999
  },
1000
  {
1001
- "filename": "アニメCM宣伝1",
1002
  "errors": {
1003
  "title": {
1004
- "gold": "アニメ",
1005
- "pred": "アニメcm宣伝1"
1006
- },
1007
- "special": {
1008
- "gold": "cm",
1009
- "pred": null
1010
  }
1011
  },
1012
  "gold": {
1013
- "group": null,
1014
- "title": "アニメ",
1015
  "season": null,
1016
  "episode": null,
1017
  "resolution": null,
1018
- "source": null,
1019
- "special": "CM"
1020
  },
1021
  "pred": {
1022
- "group": null,
1023
- "title": "アニメCM宣伝1",
1024
  "season": null,
1025
  "episode": null,
1026
  "resolution": null,
1027
- "source": null,
1028
  "special": null
1029
  }
1030
  },
1031
  {
1032
- "filename": "GM-Team [GB] 4K 逆天邪神 04",
1033
  "errors": {
1034
- "title": {
1035
- "gold": "逆天邪神",
1036
- "pred": "逆天邪神 04"
1037
- },
1038
- "episode": {
1039
- "gold": "4",
1040
- "pred": null
1041
  }
1042
  },
1043
  "gold": {
1044
- "group": "GM-Team",
1045
- "title": "逆天邪神",
1046
  "season": null,
1047
- "episode": 4,
1048
- "resolution": "4K",
1049
- "source": "GB",
1050
  "special": null
1051
  },
1052
  "pred": {
1053
- "group": "GM-Team",
1054
- "title": "逆天邪神 04",
1055
- "season": null,
1056
- "episode": null,
1057
- "resolution": "4K",
1058
- "source": "GB",
1059
  "special": null
1060
  }
1061
  },
1062
  {
1063
- "filename": "CM 15 - BD - CITY HUNTER TV 3rd & '91 Series",
1064
  "errors": {
1065
- "source": {
1066
- "gold": "bd",
1067
- "pred": null
1068
- },
1069
- "special": {
1070
- "gold": "cm 15",
1071
- "pred": "bd"
1072
  }
1073
  },
1074
  "gold": {
1075
- "group": null,
1076
- "title": "CITY HUNTER TV 3rd & '91 Series",
1077
  "season": null,
1078
  "episode": null,
1079
- "resolution": null,
1080
- "source": "BD",
1081
- "special": "CM 15"
1082
  },
1083
  "pred": {
1084
- "group": null,
1085
- "title": "CITY HUNTER TV 3rd & '91 Series",
1086
  "season": null,
1087
  "episode": null,
1088
- "resolution": null,
1089
- "source": null,
1090
- "special": "BD"
1091
  }
1092
  },
1093
  {
1094
- "filename": "12 - HEVC",
1095
  "errors": {
1096
- "episode": {
1097
- "gold": null,
1098
- "pred": "12"
1099
- },
1100
- "special": {
1101
- "gold": "12",
1102
- "pred": null
1103
  }
1104
  },
1105
  "gold": {
1106
- "group": null,
1107
- "title": null,
1108
  "season": null,
1109
  "episode": null,
1110
- "resolution": null,
1111
- "source": "HEVC",
1112
- "special": "12"
1113
  },
1114
  "pred": {
1115
- "group": null,
1116
- "title": null,
1117
  "season": null,
1118
- "episode": 12,
1119
- "resolution": null,
1120
- "source": "HEVC",
1121
- "special": null
1122
  }
1123
  },
1124
  {
1125
- "filename": "07][檢索:魔法姊妹露露特莉莉][CHT&JPN",
1126
  "errors": {
1127
  "title": {
1128
- "gold": null,
1129
- "pred": "檢索:魔法姊妹露露特莉莉"
1130
  },
1131
- "special": {
1132
- "gold": "檢索:魔法姊妹露露特莉莉",
1133
  "pred": null
 
 
 
 
1134
  }
1135
  },
1136
  "gold": {
1137
- "group": null,
1138
- "title": null,
1139
  "season": null,
1140
- "episode": 7,
1141
- "resolution": null,
1142
- "source": "CHT&JPN",
1143
- "special": "檢索:魔法姊妹露露特莉莉"
1144
  },
1145
  "pred": {
1146
- "group": null,
1147
- "title": "檢索:魔法姊妹露露特莉莉",
1148
  "season": null,
1149
- "episode": 7,
1150
- "resolution": null,
1151
- "source": "CHT&JPN",
1152
- "special": null
1153
  }
1154
  },
1155
  {
1156
- "filename": "[12]_1080P_Baha_[ANi]_29 歲單身中堅冒險家的日常",
1157
  "errors": {
1158
  "title": {
1159
- "gold": "29 歲單身中堅冒險家的日常",
1160
- "pred": "歲單身中堅冒險家的日常"
1161
  }
1162
  },
1163
  "gold": {
1164
- "group": "ANi",
1165
- "title": "29 歲單身中堅冒險家的日常",
1166
  "season": null,
1167
- "episode": 12,
1168
- "resolution": "1080P",
1169
- "source": "Baha",
1170
  "special": null
1171
  },
1172
  "pred": {
1173
- "group": "ANi",
1174
- "title": "歲單身中堅冒險家的日常",
1175
- "season": null,
1176
- "episode": 12,
1177
- "resolution": "1080P",
1178
- "source": "Baha",
1179
  "special": null
1180
  }
1181
  },
1182
  {
1183
- "filename": "16][S2][[DBD-Raws]",
1184
  "errors": {
1185
  "episode": {
1186
- "gold": null,
1187
- "pred": "16"
1188
  },
1189
  "special": {
1190
- "gold": "16",
1191
- "pred": null
1192
  }
1193
  },
1194
  "gold": {
1195
- "group": "DBD-Raws",
1196
- "title": null,
1197
  "season": 2,
1198
- "episode": null,
1199
- "resolution": null,
1200
- "source": null,
1201
- "special": "16"
1202
  },
1203
  "pred": {
1204
- "group": "DBD-Raws",
1205
- "title": null,
1206
- "season": 2,
1207
- "episode": 16,
1208
- "resolution": null,
1209
- "source": null,
1210
- "special": null
1211
  }
1212
  },
1213
  {
1214
- "filename": "Captain.Tsubasa.Road.to.2002.Intro1.BGSubs.TVRip.XviD-TBO",
1215
  "errors": {
1216
- "group": {
1217
- "gold": null,
1218
- "pred": "bgsubs"
1219
- },
1220
- "title": {
1221
- "gold": "captain.tsubasa.road.to intro1.bgsubs",
1222
- "pred": "captain.tsubasa.road.to intro1"
1223
  }
1224
  },
1225
  "gold": {
1226
- "group": null,
1227
- "title": "Captain.Tsubasa.Road.to Intro1.BGSubs",
1228
- "season": null,
1229
- "episode": null,
1230
- "resolution": null,
1231
- "source": "TVRip",
1232
- "special": null
1233
  },
1234
  "pred": {
1235
- "group": "BGSubs",
1236
- "title": "Captain.Tsubasa.Road.to Intro1",
1237
  "season": null,
1238
- "episode": null,
1239
- "resolution": null,
1240
- "source": "TVRip",
1241
- "special": null
1242
  }
1243
  },
1244
  {
1245
- "filename": "HNK 006 ITA audio + jap audio + sub ita",
1246
  "errors": {
1247
  "title": {
1248
- "gold": "hnk 006 ita audio + audio + sub ita",
1249
- "pred": "hnk 006 ita audio + jap audio + sub ita"
1250
  }
1251
  },
1252
  "gold": {
1253
  "group": null,
1254
- "title": "HNK 006 ITA audio + audio + sub ita",
1255
- "season": null,
1256
- "episode": null,
1257
  "resolution": null,
1258
  "source": null,
1259
- "special": null
1260
  },
1261
  "pred": {
1262
  "group": null,
1263
- "title": "HNK 006 ITA audio + jap audio + sub ita",
1264
- "season": null,
1265
  "episode": null,
1266
  "resolution": null,
1267
  "source": null,
1268
- "special": null
1269
- }
1270
- },
1271
- {
1272
- "filename": "Babylon][subbers][10][WebRip",
1273
- "errors": {
1274
- "group": {
1275
- "gold": "subbers",
1276
- "pred": "babylon"
1277
- },
1278
- "title": {
1279
- "gold": "babylon",
1280
- "pred": "subbers"
1281
- }
1282
- },
1283
- "gold": {
1284
- "group": "subbers",
1285
- "title": "Babylon",
1286
- "season": null,
1287
- "episode": 10,
1288
- "resolution": null,
1289
- "source": "WebRip",
1290
- "special": null
1291
- },
1292
- "pred": {
1293
- "group": "Babylon",
1294
- "title": "subbers",
1295
- "season": null,
1296
- "episode": 10,
1297
- "resolution": null,
1298
- "source": "WebRip",
1299
- "special": null
1300
  }
1301
  }
1302
  ]
 
5
  "constrain_bio": false,
6
  "sample_count": 2048,
7
  "field_accuracy": {
8
+ "group": 0.99951171875,
9
+ "title": 0.9716796875,
10
+ "season": 0.994140625,
11
+ "episode": 0.99609375,
12
  "resolution": 1.0,
13
+ "source": 0.99658203125,
14
+ "special": 0.98681640625
15
  },
16
  "field_correct": {
17
+ "group": 2047,
18
+ "title": 1990,
19
+ "season": 2036,
20
+ "episode": 2040,
21
  "resolution": 2048,
22
+ "source": 2041,
23
+ "special": 2021
24
  },
25
  "field_total": {
26
  "group": 2048,
 
31
  "source": 2048,
32
  "special": 2048
33
  },
34
+ "full_match_accuracy": 0.9580078125,
35
+ "full_match_correct": 1962,
36
  "full_match_total": 2048,
37
  "failures": [
38
  {
39
+ "filename": "[U3-Project] Shoujo Kageki Revue Starlight - NCED1 [BDRemux AVC 1920x1080p FLAC]",
40
  "errors": {
41
+ "source": {
42
+ "gold": "avc",
43
+ "pred": "mu"
44
  }
45
  },
46
  "gold": {
47
+ "group": "U3-Project",
48
+ "title": "Shoujo Kageki Revue Starlight",
49
  "season": null,
50
  "episode": null,
51
  "resolution": null,
52
+ "source": "AVC",
53
+ "special": "NCED1"
54
  },
55
  "pred": {
56
+ "group": "U3-Project",
57
+ "title": "Shoujo Kageki Revue Starlight",
58
  "season": null,
59
+ "episode": null,
60
  "resolution": null,
61
+ "source": "mu",
62
+ "special": "NCED1"
63
  }
64
  },
65
  {
66
+ "filename": "[SFEO-Raws] Kamisama Hajimemashita - NCED_11 (BD 720P x264 10bit AAC)[CED90B97]",
67
  "errors": {
68
  "special": {
69
+ "gold": "nced_11",
70
+ "pred": "d"
71
  }
72
  },
73
  "gold": {
74
+ "group": "SFEO-Raws",
75
+ "title": "Kamisama Hajimemashita",
76
  "season": null,
77
  "episode": null,
78
+ "resolution": "720P",
79
+ "source": "BD",
80
+ "special": "NCED_11"
81
  },
82
  "pred": {
83
+ "group": "SFEO-Raws",
84
+ "title": "Kamisama Hajimemashita",
85
  "season": null,
86
  "episode": null,
87
+ "resolution": "720P",
88
+ "source": "BD",
89
+ "special": "D"
90
  }
91
  },
92
  {
93
+ "filename": "[QTS] Yoroiden Samurai Troopers Blu-ray BOX Eizou Tokuten - TV NCOP2 (BD Hi10P 960x720 AAC)",
94
+ "errors": {
95
+ "title": {
96
+ "gold": "yoroiden samurai troopers blu-ray box eizou tokuten - tv",
97
+ "pred": "yoroiden samurai troopers blu-ray box eizou tokuten tv"
98
+ }
99
+ },
100
+ "gold": {
101
+ "group": "QTS",
102
+ "title": "Yoroiden Samurai Troopers Blu-ray BOX Eizou Tokuten - TV",
103
+ "season": null,
104
+ "episode": null,
105
+ "resolution": "960x720",
106
+ "source": "BD",
107
+ "special": "NCOP2"
108
+ },
109
+ "pred": {
110
+ "group": "QTS",
111
+ "title": "Yoroiden Samurai Troopers Blu-ray BOX Eizou Tokuten TV",
112
+ "season": null,
113
+ "episode": null,
114
+ "resolution": "960x720",
115
+ "source": "BD",
116
+ "special": "NCOP2"
117
+ }
118
+ },
119
+ {
120
+ "filename": "[sergey_krs] Shomin Sample - Blu-ray&DVD CM2 [BDRip 1920x1080 x264 FLAC]",
121
+ "errors": {
122
+ "title": {
123
+ "gold": "shomin sample - blu-ray&",
124
+ "pred": "shomin sample - blu-ray"
125
+ }
126
+ },
127
+ "gold": {
128
+ "group": "sergey_krs",
129
+ "title": "Shomin Sample - Blu-ray&",
130
+ "season": null,
131
+ "episode": null,
132
+ "resolution": "1920x1080",
133
+ "source": "DVD",
134
+ "special": "CM2"
135
+ },
136
+ "pred": {
137
+ "group": "sergey_krs",
138
+ "title": "Shomin Sample - Blu-ray",
139
+ "season": null,
140
+ "episode": null,
141
+ "resolution": "1920x1080",
142
+ "source": "DVD",
143
+ "special": "CM2"
144
+ }
145
+ },
146
+ {
147
+ "filename": "NCOP Ver.1",
148
  "errors": {
149
  "special": {
150
+ "gold": "ncop",
151
+ "pred": "1"
152
  }
153
  },
154
  "gold": {
155
  "group": null,
156
+ "title": "Ver",
157
  "season": null,
158
  "episode": null,
159
  "resolution": null,
160
+ "source": null,
161
+ "special": "NCOP"
162
  },
163
  "pred": {
164
  "group": null,
165
+ "title": "Ver",
166
  "season": null,
167
  "episode": null,
168
  "resolution": null,
169
  "source": null,
170
+ "special": "1"
171
  }
172
  },
173
  {
174
+ "filename": "38-ガンダムビルドダイバーズ ReRISE S2_ED1",
175
  "errors": {
176
+ "title": {
177
+ "gold": "38-ガンダムビルドダイバーズ rerise",
178
+ "pred": "ガンダムビルドダイバーズ rerise"
179
  },
180
+ "episode": {
181
+ "gold": null,
182
+ "pred": "38"
183
  }
184
  },
185
  "gold": {
186
  "group": null,
187
+ "title": "38-ガンダムビルドダイバーズ ReRISE",
188
+ "season": 2,
189
  "episode": null,
190
  "resolution": null,
191
+ "source": null,
192
+ "special": "ED1"
193
  },
194
  "pred": {
195
  "group": null,
196
+ "title": "ガンダムビルドダイバーズ ReRISE",
197
+ "season": 2,
198
+ "episode": 38,
199
  "resolution": null,
200
  "source": null,
201
+ "special": "ED1"
202
  }
203
  },
204
  {
205
+ "filename": "[DBD-Raws][Blood-C][剧场版][The Last Dark][PV][01][1080P][BDRip][HEVC-10bit][FLAC]",
206
+ "errors": {
207
+ "title": {
208
+ "gold": "blood-c",
209
+ "pred": "blood-c t"
210
+ }
211
+ },
212
+ "gold": {
213
+ "group": "DBD-Raws",
214
+ "title": "Blood-C",
215
+ "season": null,
216
+ "episode": null,
217
+ "resolution": "1080P",
218
+ "source": "BDRip",
219
+ "special": "01"
220
+ },
221
+ "pred": {
222
+ "group": "DBD-Raws",
223
+ "title": "Blood-C T",
224
+ "season": null,
225
+ "episode": null,
226
+ "resolution": "1080P",
227
+ "source": "BDRip",
228
+ "special": "01"
229
+ }
230
+ },
231
+ {
232
+ "filename": "[Moozzi2] Zegapain [SP02] NCED - 01 (Ver.1 #02~05) (BD 1920x1080 x.264 Flac)",
233
  "errors": {
234
  "title": {
235
+ "gold": "moozzi2 zegapain sp02 nced 01 ver 1 #02",
236
+ "pred": "moozzi2 zegapain sp02 nced 01 ver 1 #"
237
  },
238
  "episode": {
239
+ "gold": "5",
240
+ "pred": "2"
241
  }
242
  },
243
  "gold": {
244
+ "group": null,
245
+ "title": "Moozzi2 Zegapain SP02 NCED 01 Ver 1 #02",
246
  "season": null,
247
+ "episode": 5,
248
+ "resolution": "1920x1080",
249
+ "source": "BD",
250
  "special": null
251
  },
252
  "pred": {
253
  "group": null,
254
+ "title": "Moozzi2 Zegapain SP02 NCED 01 Ver 1 #",
255
  "season": null,
256
+ "episode": 2,
257
+ "resolution": "1920x1080",
258
+ "source": "BD",
259
  "special": null
260
  }
261
  },
262
  {
263
+ "filename": "Kyou_kara_Ma_Ou(TXXZ)_ZJ02",
264
  "errors": {
265
  "title": {
266
+ "gold": "kyou_kara_ma_ou txxz zj02",
267
+ "pred": "kyou_kara_ma_ou(txxz zj02"
268
  }
269
  },
270
  "gold": {
271
+ "group": null,
272
+ "title": "Kyou_kara_Ma_Ou TXXZ ZJ02",
273
  "season": null,
274
  "episode": null,
275
  "resolution": null,
 
278
  },
279
  "pred": {
280
  "group": null,
281
+ "title": "Kyou_kara_Ma_Ou(TXXZ ZJ02",
282
  "season": null,
283
  "episode": null,
284
  "resolution": null,
 
287
  }
288
  },
289
  {
290
+ "filename": "Yuusha ni Narenakatta Ore wa Shibushibu Shuushoku o Ketsui Shimashita. - Vol.01 CM_02 (BD 1280x720 AVC AAC)",
291
  "errors": {
292
  "title": {
293
+ "gold": "yuusha ni narenakatta ore wa shibushibu shuushoku o ketsui shimashita. - vol.01",
294
+ "pred": "yuusha ni narenakatta ore wa shibushibu shuushoku o ketsui shimashita. - vol 1"
295
  }
296
  },
297
  "gold": {
298
+ "group": null,
299
+ "title": "Yuusha ni Narenakatta Ore wa Shibushibu Shuushoku o Ketsui Shimashita. - Vol.01",
300
  "season": null,
301
  "episode": null,
302
+ "resolution": "1280x720",
303
+ "source": "BD",
304
+ "special": "CM_02"
305
  },
306
  "pred": {
307
+ "group": null,
308
+ "title": "Yuusha ni Narenakatta Ore wa Shibushibu Shuushoku o Ketsui Shimashita. - Vol 1",
309
  "season": null,
310
  "episode": null,
311
+ "resolution": "1280x720",
312
+ "source": "BD",
313
+ "special": "CM_02"
314
  }
315
  },
316
  {
317
+ "filename": "[Airota][Carole and Tuesday][ep09][song 04 Move Mountains-Angela]",
318
  "errors": {
319
  "special": {
320
+ "gold": null,
321
+ "pred": "move"
322
  }
323
  },
324
  "gold": {
325
+ "group": "Airota",
326
+ "title": "Carole and Tuesday",
327
  "season": null,
328
+ "episode": 9,
329
  "resolution": null,
330
+ "source": null,
331
+ "special": null
332
  },
333
  "pred": {
334
+ "group": "Airota",
335
+ "title": "Carole and Tuesday",
336
  "season": null,
337
+ "episode": 9,
338
  "resolution": null,
339
  "source": null,
340
+ "special": "Move"
341
  }
342
  },
343
  {
344
+ "filename": "モンキーターンV Vol 07 映像特典 「バリエーション映像 (1280x960 H 264/AVC AAC) [アニメ DVD][東京国際アニメフェア2005用PV][e4c3a095]",
345
  "errors": {
346
+ "title": {
347
+ "gold": "モンキーターンv vol 07 映像特典 「バリエーション映像 」",
348
+ "pred": "モンキーターンv vol 07 映像特典 「バリエーション映像 」 ニメ d 東京国際アニメフェア2005用"
349
  },
350
+ "source": {
351
+ "gold": "dvd",
352
+ "pred": "vd"
353
  }
354
  },
355
  "gold": {
356
  "group": null,
357
+ "title": "モンキーターンV Vol 07 映像特典 「バリエーション映像 」",
358
  "season": null,
359
  "episode": null,
360
+ "resolution": "1280x960",
361
+ "source": "DVD",
362
+ "special": "PV"
363
  },
364
  "pred": {
365
  "group": null,
366
+ "title": "モンキーターンV Vol 07 映像特典 「バリエーション映像 」 ニメ D 東京国際アニメフェア2005用",
367
  "season": null,
368
  "episode": null,
369
+ "resolution": "1280x960",
370
+ "source": "VD",
371
+ "special": "PV"
372
  }
373
  },
374
  {
375
+ "filename": "[VCB-Studio] Mob Psycho 100 III [CM02][Ma10p_1080p][x265_flac]",
376
  "errors": {
377
  "season": {
378
  "gold": null,
379
+ "pred": "3"
380
  }
381
  },
382
  "gold": {
 
386
  "episode": 100,
387
  "resolution": "1080p",
388
  "source": "x265-flac",
389
+ "special": "CM02"
390
  },
391
  "pred": {
392
  "group": "VCB-Studio",
393
  "title": "Mob Psycho",
394
+ "season": 3,
395
  "episode": 100,
396
  "resolution": "1080p",
397
  "source": "x265-flac",
398
+ "special": "CM02"
399
  }
400
  },
401
  {
402
+ "filename": "[ReinForce] Kiss×sis - ED ep.12 (BDRip 1920x1080 x264 FLAC)",
403
  "errors": {
404
  "special": {
405
+ "gold": "ed",
406
+ "pred": "e"
407
  }
408
  },
409
  "gold": {
410
+ "group": "ReinForce",
411
+ "title": "Kiss×sis",
412
  "season": null,
413
  "episode": null,
414
+ "resolution": "1920x1080",
415
+ "source": "BDRip",
416
+ "special": "ED"
417
  },
418
  "pred": {
419
+ "group": "ReinForce",
420
+ "title": "Kiss×sis",
421
  "season": null,
422
  "episode": null,
423
+ "resolution": "1920x1080",
424
+ "source": "BDRip",
425
+ "special": "e"
426
  }
427
  },
428
  {
429
+ "filename": "[NAOKI-Raws] BD-BOX3 Disc.4 Menu (BDRip x264 DTS-HDMA)",
430
  "errors": {
431
  "title": {
432
+ "gold": "box",
433
+ "pred": null
434
  }
435
  },
436
  "gold": {
437
+ "group": "NAOKI-Raws",
438
+ "title": "BOX",
439
  "season": null,
440
  "episode": null,
441
+ "resolution": null,
442
+ "source": "BD",
443
  "special": null
444
  },
445
  "pred": {
446
+ "group": "NAOKI-Raws",
447
+ "title": null,
448
  "season": null,
449
  "episode": null,
450
+ "resolution": null,
451
+ "source": "BD",
452
  "special": null
453
  }
454
  },
455
  {
456
+ "filename": "[SZW] Strike the Blood IV OVA 02 [BDRpi][1080P][x.264 AAC][CHS-IN]",
457
  "errors": {
458
  "title": {
459
+ "gold": "strike the blood iv",
460
+ "pred": "strike the blood"
461
  },
462
  "special": {
463
+ "gold": "ova",
464
+ "pred": "02"
465
  }
466
  },
467
  "gold": {
468
+ "group": "SZW",
469
+ "title": "Strike the Blood IV",
470
  "season": null,
471
  "episode": null,
472
+ "resolution": "1080P",
473
+ "source": "CHS",
474
+ "special": "OVA"
475
  },
476
  "pred": {
477
+ "group": "SZW",
478
+ "title": "Strike the Blood",
479
  "season": null,
480
  "episode": null,
481
+ "resolution": "1080P",
482
+ "source": "CHS",
483
+ "special": "02"
484
  }
485
  },
486
  {
487
+ "filename": "Toradora! CM28 [BD 720p 23.976fps AVC-yuv420p10 FLAC] - VCB-Studio & mawen1250",
488
  "errors": {
489
+ "title": {
490
+ "gold": "toradora! vcb-studio & mawen",
491
+ "pred": "toradora! 2 fp vcb-studio & mawen"
492
  }
493
  },
494
  "gold": {
495
  "group": null,
496
+ "title": "Toradora! VCB-Studio & mawen",
497
  "season": null,
498
+ "episode": 1250,
499
+ "resolution": "720p",
500
+ "source": "BD",
501
+ "special": "CM28"
502
  },
503
  "pred": {
504
  "group": null,
505
+ "title": "Toradora! 2 fp VCB-Studio & mawen",
506
  "season": null,
507
+ "episode": 1250,
508
+ "resolution": "720p",
509
+ "source": "BD",
510
+ "special": "CM28"
511
  }
512
  },
513
  {
514
+ "filename": "[AI-Raws] 炎炎の消防隊 弐ノ章 #01 (BD HEVC 1920x1080 FLAC).mkv",
515
  "errors": {
516
  "title": {
517
+ "gold": "炎炎の消防隊",
518
+ "pred": "炎炎の消防隊 ノ章"
519
  }
520
  },
521
  "gold": {
522
+ "group": "AI-Raws",
523
+ "title": "炎炎の消防隊",
524
+ "season": 2,
525
+ "episode": 1,
526
+ "resolution": "1920x1080",
527
+ "source": "BD",
528
+ "special": null
529
  },
530
  "pred": {
531
+ "group": "AI-Raws",
532
+ "title": "炎炎の消防隊 ノ章",
533
+ "season": 2,
534
+ "episode": 1,
535
+ "resolution": "1920x1080",
536
+ "source": "BD",
537
  "special": null
538
  }
539
  },
540
  {
541
+ "filename": "[DBD-Raws][Sword Art Online II][menu][D4][02][1080P][BDRip][HEVC-10bit][FLAC]",
542
  "errors": {
543
  "title": {
544
+ "gold": "sword art online ii menu d4",
545
+ "pred": "sword art online menu d4"
546
+ },
547
+ "season": {
548
+ "gold": null,
549
+ "pred": "2"
550
  }
551
  },
552
  "gold": {
553
+ "group": "DBD-Raws",
554
+ "title": "Sword Art Online II menu D4",
555
  "season": null,
556
+ "episode": 2,
557
  "resolution": "1080P",
558
+ "source": "BDRip",
559
  "special": null
560
  },
561
  "pred": {
562
+ "group": "DBD-Raws",
563
+ "title": "Sword Art Online menu D4",
564
+ "season": 2,
565
+ "episode": 2,
566
  "resolution": "1080P",
567
+ "source": "BDRip",
568
  "special": null
569
  }
570
  },
571
  {
572
+ "filename": "[ZA].Saint.Seiya.Episode.038.[X264.Aac(Jpn-Fre).Sub(Fre-Misc).Chap]",
573
  "errors": {
574
  "title": {
575
+ "gold": "saint.seiya.episode.038 sub(fre-misc).chap",
576
+ "pred": "saint.seiya.episode.038 sub fre-misc chap"
577
  }
578
  },
579
  "gold": {
580
+ "group": "ZA",
581
+ "title": "Saint.Seiya.Episode.038 Sub(Fre-Misc).Chap",
582
  "season": null,
583
  "episode": null,
584
  "resolution": null,
585
+ "source": "Aac(Jpn-Fre)",
586
+ "special": null
587
  },
588
  "pred": {
589
+ "group": "ZA",
590
+ "title": "Saint.Seiya.Episode.038 Sub Fre-Misc Chap",
591
  "season": null,
592
  "episode": null,
593
  "resolution": null,
594
+ "source": "Aac(Jpn-Fre)",
595
+ "special": null
596
  }
597
  }
598
  ]
 
601
  "constrain_bio": true,
602
  "sample_count": 2048,
603
  "field_accuracy": {
604
+ "group": 0.99951171875,
605
+ "title": 0.97998046875,
606
+ "season": 0.994140625,
607
+ "episode": 0.99609375,
608
  "resolution": 1.0,
609
+ "source": 0.9970703125,
610
+ "special": 0.99365234375
611
  },
612
  "field_correct": {
613
+ "group": 2047,
614
+ "title": 2007,
615
+ "season": 2036,
616
+ "episode": 2040,
617
  "resolution": 2048,
618
+ "source": 2042,
619
+ "special": 2035
620
  },
621
  "field_total": {
622
  "group": 2048,
 
627
  "source": 2048,
628
  "special": 2048
629
  },
630
+ "full_match_accuracy": 0.970703125,
631
+ "full_match_correct": 1988,
632
  "full_match_total": 2048,
633
  "failures": [
634
  {
635
+ "filename": "[sergey_krs] Shomin Sample - Blu-ray&DVD CM2 [BDRip 1920x1080 x264 FLAC]",
636
  "errors": {
637
  "title": {
638
+ "gold": "shomin sample - blu-ray&",
639
+ "pred": "shomin sample - blu-ray"
640
+ }
641
+ },
642
+ "gold": {
643
+ "group": "sergey_krs",
644
+ "title": "Shomin Sample - Blu-ray&",
645
+ "season": null,
646
+ "episode": null,
647
+ "resolution": "1920x1080",
648
+ "source": "DVD",
649
+ "special": "CM2"
650
+ },
651
+ "pred": {
652
+ "group": "sergey_krs",
653
+ "title": "Shomin Sample - Blu-ray",
654
+ "season": null,
655
+ "episode": null,
656
+ "resolution": "1920x1080",
657
+ "source": "DVD",
658
+ "special": "CM2"
659
+ }
660
+ },
661
+ {
662
+ "filename": "38-ガンダムビルドダイバーズ ReRISE S2_ED1",
663
+ "errors": {
664
+ "title": {
665
+ "gold": "38-ガンダムビルドダイバーズ rerise",
666
+ "pred": "ガンダムビルドダイバーズ rerise"
667
  },
668
  "episode": {
669
  "gold": null,
670
+ "pred": "38"
671
  }
672
  },
673
  "gold": {
674
  "group": null,
675
+ "title": "38-ガンダムビルドダイバーズ ReRISE",
676
+ "season": 2,
677
  "episode": null,
678
  "resolution": null,
679
+ "source": null,
680
+ "special": "ED1"
681
  },
682
  "pred": {
683
  "group": null,
684
+ "title": "ガンダムビルドダイバーズ ReRISE",
685
+ "season": 2,
686
+ "episode": 38,
687
  "resolution": null,
688
+ "source": null,
689
+ "special": "ED1"
690
  }
691
  },
692
  {
693
+ "filename": "[DBD-Raws][Blood-C][剧场版][The Last Dark][PV][01][1080P][BDRip][HEVC-10bit][FLAC]",
694
  "errors": {
695
  "title": {
696
+ "gold": "blood-c",
697
+ "pred": "blood-c t"
698
  }
699
  },
700
  "gold": {
701
+ "group": "DBD-Raws",
702
+ "title": "Blood-C",
703
  "season": null,
704
  "episode": null,
705
+ "resolution": "1080P",
706
+ "source": "BDRip",
707
+ "special": "01"
708
  },
709
  "pred": {
710
+ "group": "DBD-Raws",
711
+ "title": "Blood-C T",
712
  "season": null,
713
  "episode": null,
714
+ "resolution": "1080P",
715
+ "source": "BDRip",
716
+ "special": "01"
717
  }
718
  },
719
  {
720
+ "filename": "Kyou_kara_Ma_Ou(TXXZ)_ZJ02",
721
  "errors": {
722
+ "title": {
723
+ "gold": "kyou_kara_ma_ou txxz zj02",
724
+ "pred": "kyou_kara_ma_ou(txxz zj02"
725
  }
726
  },
727
  "gold": {
728
  "group": null,
729
+ "title": "Kyou_kara_Ma_Ou TXXZ ZJ02",
730
  "season": null,
731
  "episode": null,
732
  "resolution": null,
733
+ "source": null,
734
+ "special": null
735
  },
736
  "pred": {
737
  "group": null,
738
+ "title": "Kyou_kara_Ma_Ou(TXXZ ZJ02",
739
  "season": null,
740
  "episode": null,
741
  "resolution": null,
742
  "source": null,
743
+ "special": null
744
  }
745
  },
746
  {
747
+ "filename": "[Airota][Carole and Tuesday][ep09][song 04 Move Mountains-Angela]",
748
  "errors": {
749
  "special": {
750
+ "gold": null,
751
+ "pred": "move"
752
  }
753
  },
754
  "gold": {
755
+ "group": "Airota",
756
+ "title": "Carole and Tuesday",
757
  "season": null,
758
+ "episode": 9,
759
  "resolution": null,
760
+ "source": null,
761
+ "special": null
762
  },
763
  "pred": {
764
+ "group": "Airota",
765
+ "title": "Carole and Tuesday",
766
  "season": null,
767
+ "episode": 9,
768
  "resolution": null,
769
  "source": null,
770
+ "special": "Move"
771
  }
772
  },
773
  {
774
+ "filename": "モンキーターンV Vol 07 映像特典 「バリエーション映像 」 (1280x960 H 264/AVC AAC) [アニメ DVD][東京国際アニメフェア2005用PV][e4c3a095]",
775
  "errors": {
776
  "title": {
777
+ "gold": "モンキーターンv vol 07 映像特典 「バリエーション映像 」",
778
+ "pred": "モンキーターンv vol 07 映像特典 「バリエーション映像 」 アニメ dvd 東京国際アニメフェア2005用"
779
  },
780
+ "source": {
781
+ "gold": "dvd",
782
+ "pred": "avc"
783
  }
784
  },
785
  "gold": {
786
+ "group": null,
787
+ "title": "モンキーターンV Vol 07 映像特典 「バリエーション映像 」",
788
  "season": null,
789
+ "episode": null,
790
+ "resolution": "1280x960",
791
+ "source": "DVD",
792
+ "special": "PV"
793
  },
794
  "pred": {
795
  "group": null,
796
+ "title": "モンキーターンV Vol 07 映像特典 「バリエーション映像 」 アニメ DVD 東京国際アニメフェア2005用",
797
  "season": null,
798
  "episode": null,
799
+ "resolution": "1280x960",
800
+ "source": "AVC",
801
+ "special": "PV"
802
  }
803
  },
804
  {
805
+ "filename": "[VCB-Studio] Mob Psycho 100 III [CM02][Ma10p_1080p][x265_flac]",
806
  "errors": {
807
+ "season": {
808
+ "gold": null,
809
+ "pred": "3"
810
  }
811
  },
812
  "gold": {
813
+ "group": "VCB-Studio",
814
+ "title": "Mob Psycho",
815
  "season": null,
816
+ "episode": 100,
817
+ "resolution": "1080p",
818
+ "source": "x265-flac",
819
+ "special": "CM02"
820
  },
821
  "pred": {
822
+ "group": "VCB-Studio",
823
+ "title": "Mob Psycho",
824
+ "season": 3,
825
+ "episode": 100,
826
+ "resolution": "1080p",
827
+ "source": "x265-flac",
828
+ "special": "CM02"
829
  }
830
  },
831
  {
832
+ "filename": "[ReinForce] Kiss×sis - ED ep.12 (BDRip 1920x1080 x264 FLAC)",
833
  "errors": {
834
  "special": {
835
+ "gold": "ed",
836
+ "pred": "e"
837
  }
838
  },
839
  "gold": {
840
+ "group": "ReinForce",
841
+ "title": "Kiss×sis",
842
  "season": null,
843
  "episode": null,
844
+ "resolution": "1920x1080",
845
+ "source": "BDRip",
846
+ "special": "ED"
847
  },
848
  "pred": {
849
+ "group": "ReinForce",
850
+ "title": "Kiss×sis",
851
  "season": null,
852
  "episode": null,
853
+ "resolution": "1920x1080",
854
+ "source": "BDRip",
855
+ "special": "e"
856
  }
857
  },
858
  {
859
+ "filename": "[NAOKI-Raws] BD-BOX3 Disc.4 Menu (BDRip x264 DTS-HDMA)",
860
  "errors": {
861
+ "title": {
862
+ "gold": "box",
863
  "pred": null
864
  }
865
  },
866
  "gold": {
867
+ "group": "NAOKI-Raws",
868
+ "title": "BOX",
869
  "season": null,
870
  "episode": null,
871
  "resolution": null,
872
  "source": "BD",
873
+ "special": null
874
  },
875
  "pred": {
876
+ "group": "NAOKI-Raws",
877
+ "title": null,
878
  "season": null,
879
  "episode": null,
880
  "resolution": null,
881
+ "source": "BD",
882
+ "special": null
883
  }
884
  },
885
  {
886
+ "filename": "[SZW] Strike the Blood IV OVA 02 [BDRpi][1080P][x.264 AAC][CHS-IN]",
887
  "errors": {
888
+ "title": {
889
+ "gold": "strike the blood iv",
890
+ "pred": "strike the blood"
891
  },
892
  "special": {
893
+ "gold": "ova",
894
+ "pred": "ova 02"
895
  }
896
  },
897
  "gold": {
898
+ "group": "SZW",
899
+ "title": "Strike the Blood IV",
900
  "season": null,
901
  "episode": null,
902
+ "resolution": "1080P",
903
+ "source": "CHS",
904
+ "special": "OVA"
905
  },
906
  "pred": {
907
+ "group": "SZW",
908
+ "title": "Strike the Blood",
909
  "season": null,
910
  "episode": null,
911
+ "resolution": "1080P",
912
+ "source": "CHS",
913
+ "special": "OVA 02"
914
  }
915
  },
916
  {
917
+ "filename": "[DBD-Raws][Sword Art Online II][menu][D4][02][1080P][BDRip][HEVC-10bit][FLAC]",
918
  "errors": {
919
+ "title": {
920
+ "gold": "sword art online ii menu d4",
921
+ "pred": "sword art online menu d4"
922
+ },
923
  "season": {
924
  "gold": null,
925
  "pred": "2"
926
  }
927
  },
928
  "gold": {
929
+ "group": "DBD-Raws",
930
+ "title": "Sword Art Online II menu D4",
931
  "season": null,
932
+ "episode": 2,
933
+ "resolution": "1080P",
934
+ "source": "BDRip",
935
+ "special": null
936
  },
937
  "pred": {
938
+ "group": "DBD-Raws",
939
+ "title": "Sword Art Online menu D4",
940
  "season": 2,
941
+ "episode": 2,
942
+ "resolution": "1080P",
943
+ "source": "BDRip",
944
+ "special": null
945
  }
946
  },
947
  {
948
+ "filename": "[ZA].Saint.Seiya.Episode.038.[X264.Aac(Jpn-Fre).Sub(Fre-Misc).Chap]",
949
  "errors": {
950
  "title": {
951
+ "gold": "saint.seiya.episode.038 sub(fre-misc).chap",
952
+ "pred": "saint.seiya.episode.038 sub fre-misc chap"
953
  }
954
  },
955
  "gold": {
956
+ "group": "ZA",
957
+ "title": "Saint.Seiya.Episode.038 Sub(Fre-Misc).Chap",
958
  "season": null,
959
  "episode": null,
960
  "resolution": null,
961
+ "source": "Aac(Jpn-Fre)",
962
+ "special": null
963
  },
964
  "pred": {
965
+ "group": "ZA",
966
+ "title": "Saint.Seiya.Episode.038 Sub Fre-Misc Chap",
967
  "season": null,
968
  "episode": null,
969
  "resolution": null,
970
+ "source": "Aac(Jpn-Fre)",
971
  "special": null
972
  }
973
  },
974
  {
975
+ "filename": "[DVD] ギャラクシーエンジェルAA 第3期 第43-44話 「リトライライス」「食べてはいけないお供え物」(640x480 WMV9)",
976
  "errors": {
977
+ "season": {
978
+ "gold": null,
979
+ "pred": "43"
980
  }
981
  },
982
  "gold": {
983
+ "group": null,
984
+ "title": "ギャラクシーエンジェルAA 第3期 第",
985
  "season": null,
986
+ "episode": 44,
987
+ "resolution": "640x480",
988
+ "source": "DVD",
989
  "special": null
990
  },
991
  "pred": {
992
+ "group": null,
993
+ "title": "ギャラクシーエンジェルAA 第3期 第",
994
+ "season": 43,
995
+ "episode": 44,
996
+ "resolution": "640x480",
997
+ "source": "DVD",
998
  "special": null
999
  }
1000
  },
1001
  {
1002
+ "filename": "[DMHY_RAINS] The Irregular At Magic High School Blu-ray&DVD CM02 15sec ver (BDrip 1920x1080 AVC-YUV420P10 FLAC)",
1003
  "errors": {
1004
+ "title": {
1005
+ "gold": "the irregular at magic high school blu-ray&",
1006
+ "pred": "the irregular at magic high school blu-ray"
1007
  }
1008
  },
1009
  "gold": {
1010
+ "group": "DMHY_RAINS",
1011
+ "title": "The Irregular At Magic High School Blu-ray&",
1012
  "season": null,
1013
  "episode": null,
1014
+ "resolution": "1920x1080",
1015
+ "source": "DVD",
1016
+ "special": "CM02"
1017
  },
1018
  "pred": {
1019
+ "group": "DMHY_RAINS",
1020
+ "title": "The Irregular At Magic High School Blu-ray",
1021
  "season": null,
1022
  "episode": null,
1023
+ "resolution": "1920x1080",
1024
+ "source": "DVD",
1025
+ "special": "CM02"
1026
  }
1027
  },
1028
  {
1029
+ "filename": "[sergey_krs] Shomin Sample - Blu-ray&DVD CM1 [BDRip 1920x1080 x264 FLAC]",
1030
  "errors": {
1031
+ "title": {
1032
+ "gold": "shomin sample - blu-ray&",
1033
+ "pred": "shomin sample - blu-ray"
1034
  }
1035
  },
1036
  "gold": {
1037
+ "group": "sergey_krs",
1038
+ "title": "Shomin Sample - Blu-ray&",
1039
  "season": null,
1040
  "episode": null,
1041
+ "resolution": "1920x1080",
1042
+ "source": "DVD",
1043
+ "special": "CM1"
1044
  },
1045
  "pred": {
1046
+ "group": "sergey_krs",
1047
+ "title": "Shomin Sample - Blu-ray",
1048
  "season": null,
1049
+ "episode": null,
1050
+ "resolution": "1920x1080",
1051
+ "source": "DVD",
1052
+ "special": "CM1"
1053
  }
1054
  },
1055
  {
1056
+ "filename": "[Nekomoe kissaten] Oozora no Takeoff Girls! [Memorial PV][05][BDRip 1080p HEVC-10bit FLAC]",
1057
  "errors": {
1058
  "title": {
1059
+ "gold": "oozora no takeoff girls! memorial pv",
1060
+ "pred": "oozora no takeoff girls! memorial"
1061
  },
1062
+ "episode": {
1063
+ "gold": "5",
1064
  "pred": null
1065
+ },
1066
+ "special": {
1067
+ "gold": null,
1068
+ "pred": "05"
1069
  }
1070
  },
1071
  "gold": {
1072
+ "group": "Nekomoe kissaten",
1073
+ "title": "Oozora no Takeoff Girls! Memorial PV",
1074
  "season": null,
1075
+ "episode": 5,
1076
+ "resolution": "1080p",
1077
+ "source": "BDRip",
1078
+ "special": null
1079
  },
1080
  "pred": {
1081
+ "group": "Nekomoe kissaten",
1082
+ "title": "Oozora no Takeoff Girls! Memorial",
1083
  "season": null,
1084
+ "episode": null,
1085
+ "resolution": "1080p",
1086
+ "source": "BDRip",
1087
+ "special": "05"
1088
  }
1089
  },
1090
  {
1091
+ "filename": "[Koten_Gars] Ys II, Castle In The Sky - 02 [DVD][Hi10][480p][AC3] [D2496CD3]",
1092
  "errors": {
1093
  "title": {
1094
+ "gold": "koten_gars ys ii castle in the sky 02 dvd hi",
1095
+ "pred": "koten_gars ys castle in the sky 02 dvd hi"
1096
+ },
1097
+ "season": {
1098
+ "gold": null,
1099
+ "pred": "2"
1100
  }
1101
  },
1102
  "gold": {
1103
+ "group": null,
1104
+ "title": "Koten_Gars Ys II Castle In The Sky 02 DVD Hi",
1105
  "season": null,
1106
+ "episode": 10,
1107
+ "resolution": "480p",
1108
+ "source": null,
1109
  "special": null
1110
  },
1111
  "pred": {
1112
+ "group": null,
1113
+ "title": "Koten_Gars Ys Castle In The Sky 02 DVD Hi",
1114
+ "season": 2,
1115
+ "episode": 10,
1116
+ "resolution": "480p",
1117
+ "source": null,
1118
  "special": null
1119
  }
1120
  },
1121
  {
1122
+ "filename": "[SFEO-Raws] GINTAMA - NCED2_55 (BD 720P x264 10bit AAC)[C9868FD8]",
1123
  "errors": {
1124
+ "season": {
1125
+ "gold": "2",
1126
+ "pred": null
1127
+ },
1128
  "episode": {
1129
+ "gold": "55",
1130
+ "pred": null
1131
  },
1132
  "special": {
1133
+ "gold": "nced",
1134
+ "pred": "nced2_55"
1135
  }
1136
  },
1137
  "gold": {
1138
+ "group": "SFEO-Raws",
1139
+ "title": "GINTAMA",
1140
  "season": 2,
1141
+ "episode": 55,
1142
+ "resolution": "720P",
1143
+ "source": "BD",
1144
+ "special": "NCED"
1145
  },
1146
  "pred": {
1147
+ "group": "SFEO-Raws",
1148
+ "title": "GINTAMA",
1149
+ "season": null,
1150
+ "episode": null,
1151
+ "resolution": "720P",
1152
+ "source": "BD",
1153
+ "special": "NCED2_55"
1154
  }
1155
  },
1156
  {
1157
+ "filename": "[SFEO-Raws] GINTAMA - NCED2_42 (BD 720P x264 10bit AAC)[ED2AB5E0]",
1158
  "errors": {
1159
+ "season": {
1160
+ "gold": "2",
1161
+ "pred": null
1162
  }
1163
  },
1164
  "gold": {
1165
+ "group": "SFEO-Raws",
1166
+ "title": "GINTAMA",
1167
+ "season": 2,
1168
+ "episode": 42,
1169
+ "resolution": "720P",
1170
+ "source": "BD",
1171
+ "special": "ED"
1172
  },
1173
  "pred": {
1174
+ "group": "SFEO-Raws",
1175
+ "title": "GINTAMA",
1176
  "season": null,
1177
+ "episode": 42,
1178
+ "resolution": "720P",
1179
+ "source": "BD",
1180
+ "special": "ED"
1181
  }
1182
  },
1183
  {
1184
+ "filename": "Slime Taoshite 300-nen S2 - OP",
1185
  "errors": {
1186
  "title": {
1187
+ "gold": "slime taoshite",
1188
+ "pred": "slime taoshite 300-nen"
1189
+ },
1190
+ "episode": {
1191
+ "gold": "300",
1192
+ "pred": null
1193
  }
1194
  },
1195
  "gold": {
1196
  "group": null,
1197
+ "title": "Slime Taoshite",
1198
+ "season": 2,
1199
+ "episode": 300,
1200
  "resolution": null,
1201
  "source": null,
1202
+ "special": "OP"
1203
  },
1204
  "pred": {
1205
  "group": null,
1206
+ "title": "Slime Taoshite 300-nen",
1207
+ "season": 2,
1208
  "episode": null,
1209
  "resolution": null,
1210
  "source": null,
1211
+ "special": "OP"
1212
  }
1213
  }
1214
  ]
reports/perf_metrics.json CHANGED
@@ -1,2070 +1,533 @@
1
  {
2
- "sample_count": 6,
3
  "samples": [
4
  {
5
  "step": 50.0,
6
- "elapsed_seconds": 11.341162500000792,
7
- "window_seconds": 11.341162500000792,
8
- "steps_per_second": 4.408719123810854,
9
- "samples_per_second": 7900.4246698690495,
10
- "tokens_per_second": 1011254.3577432383,
11
- "process_rss_mb": 3596.62890625,
12
- "cuda_allocated_mb": 122.5029296875,
13
- "cuda_reserved_mb": 12124.0,
14
- "cuda_max_allocated_mb": 11120.271484375,
15
- "cuda_max_reserved_mb": 12124.0,
16
- "gpu_util_percent": 65.0,
17
- "gpu_memory_util_percent": 49.0,
18
- "gpu_memory_used_mb": 13350.20703125,
19
- "gpu_memory_total_mb": 16303.0,
20
- "gpu_temperature_c": 57.0,
21
- "gpu_power_w": 214.297
22
- },
23
- {
24
- "step": 100.0,
25
- "elapsed_seconds": 22.133552699990105,
26
- "window_seconds": 10.792390199989313,
27
- "steps_per_second": 4.632894018236064,
28
- "samples_per_second": 8302.146080679026,
29
- "tokens_per_second": 1062674.6983269153,
30
- "process_rss_mb": 3596.62890625,
31
- "cuda_allocated_mb": 122.5029296875,
32
- "cuda_reserved_mb": 12124.0,
33
- "cuda_max_allocated_mb": 11120.271484375,
34
- "cuda_max_reserved_mb": 12124.0,
35
- "gpu_util_percent": 65.0,
36
- "gpu_memory_util_percent": 48.0,
37
- "gpu_memory_used_mb": 13350.20703125,
38
- "gpu_memory_total_mb": 16303.0,
39
- "gpu_temperature_c": 60.0,
40
- "gpu_power_w": 216.032
41
- },
42
- {
43
- "step": 150.0,
44
- "elapsed_seconds": 32.92555089999223,
45
- "window_seconds": 10.791998200002126,
46
- "steps_per_second": 4.633062299805624,
47
- "samples_per_second": 8302.447641251678,
48
- "tokens_per_second": 1062713.2980802148,
49
- "process_rss_mb": 3612.890625,
50
  "cuda_allocated_mb": 122.5029296875,
51
  "cuda_reserved_mb": 12124.0,
52
  "cuda_max_allocated_mb": 11120.271484375,
53
  "cuda_max_reserved_mb": 12124.0,
54
  "gpu_util_percent": 100.0,
55
- "gpu_memory_util_percent": 77.0,
56
- "gpu_memory_used_mb": 13350.20703125,
57
- "gpu_memory_total_mb": 16303.0,
58
- "gpu_temperature_c": 68.0,
59
- "gpu_power_w": 224.4
60
- },
61
- {
62
- "step": 200.0,
63
- "elapsed_seconds": 43.340594499983126,
64
- "window_seconds": 10.415043599990895,
65
- "steps_per_second": 4.800748025677368,
66
- "samples_per_second": 8602.940462013845,
67
- "tokens_per_second": 1101176.3791377721,
68
- "process_rss_mb": 3651.80078125,
69
- "cuda_allocated_mb": 122.5029296875,
70
- "cuda_reserved_mb": 12124.0,
71
- "cuda_max_allocated_mb": 11120.271484375,
72
- "cuda_max_reserved_mb": 12124.0,
73
- "gpu_util_percent": 64.0,
74
- "gpu_memory_util_percent": 48.0,
75
- "gpu_memory_used_mb": 13350.20703125,
76
- "gpu_memory_total_mb": 16303.0,
77
- "gpu_temperature_c": 64.0,
78
- "gpu_power_w": 219.672
79
- },
80
- {
81
- "step": 250.0,
82
- "elapsed_seconds": 53.830062900000485,
83
- "window_seconds": 10.489468400017358,
84
- "steps_per_second": 4.766685793144413,
85
- "samples_per_second": 8541.900941314789,
86
- "tokens_per_second": 1093363.320488293,
87
- "process_rss_mb": 3651.80078125,
88
- "cuda_allocated_mb": 122.5029296875,
89
- "cuda_reserved_mb": 12124.0,
90
- "cuda_max_allocated_mb": 11120.271484375,
91
- "cuda_max_reserved_mb": 12124.0,
92
- "gpu_util_percent": 63.0,
93
- "gpu_memory_util_percent": 49.0,
94
- "gpu_memory_used_mb": 13350.20703125,
95
- "gpu_memory_total_mb": 16303.0,
96
- "gpu_temperature_c": 66.0,
97
- "gpu_power_w": 221.349
98
- },
99
- {
100
- "step": 300.0,
101
- "elapsed_seconds": 64.34275290000369,
102
- "window_seconds": 10.512690000003204,
103
- "steps_per_second": 4.756156606918378,
104
- "samples_per_second": 8523.032639597734,
105
- "tokens_per_second": 1090948.17786851,
106
- "process_rss_mb": 3651.80078125,
107
- "cuda_allocated_mb": 122.5029296875,
108
- "cuda_reserved_mb": 12124.0,
109
- "cuda_max_allocated_mb": 11120.271484375,
110
- "cuda_max_reserved_mb": 12124.0,
111
- "gpu_util_percent": 66.0,
112
- "gpu_memory_util_percent": 50.0,
113
- "gpu_memory_used_mb": 13350.20703125,
114
  "gpu_memory_total_mb": 16303.0,
115
- "gpu_temperature_c": 66.0,
116
- "gpu_power_w": 224.853
117
  }
118
  ],
119
- "background_sample_count": 135,
120
  "background_samples": [
121
  {
122
- "process_rss_mb": 3310.83203125,
123
- "cuda_allocated_mb": 10644.87158203125,
124
  "cuda_reserved_mb": 11200.0,
125
  "cuda_max_allocated_mb": 11051.73876953125,
126
  "cuda_max_reserved_mb": 11200.0,
127
- "gpu_util_percent": 92.0,
128
- "gpu_memory_util_percent": 49.0,
129
- "gpu_memory_used_mb": 12360.20703125,
130
  "gpu_memory_total_mb": 16303.0,
131
- "gpu_temperature_c": 52.0,
132
- "gpu_power_w": 57.852,
133
- "elapsed_seconds": 0.5343180999916513
134
  },
135
  {
136
- "process_rss_mb": 3596.44921875,
137
  "cuda_allocated_mb": 126.2216796875,
138
  "cuda_reserved_mb": 12124.0,
139
  "cuda_max_allocated_mb": 11120.271484375,
140
  "cuda_max_reserved_mb": 12124.0,
141
- "gpu_util_percent": 100.0,
142
- "gpu_memory_util_percent": 76.0,
143
- "gpu_memory_used_mb": 13350.20703125,
144
  "gpu_memory_total_mb": 16303.0,
145
- "gpu_temperature_c": 57.0,
146
- "gpu_power_w": 103.938,
147
- "elapsed_seconds": 1.0462713000015356
148
  },
149
  {
150
- "process_rss_mb": 3596.62890625,
151
  "cuda_allocated_mb": 141.64453125,
152
  "cuda_reserved_mb": 12124.0,
153
  "cuda_max_allocated_mb": 11120.271484375,
154
  "cuda_max_reserved_mb": 12124.0,
155
- "gpu_util_percent": 65.0,
156
- "gpu_memory_util_percent": 49.0,
157
- "gpu_memory_used_mb": 13350.20703125,
158
  "gpu_memory_total_mb": 16303.0,
159
- "gpu_temperature_c": 52.0,
160
- "gpu_power_w": 176.565,
161
- "elapsed_seconds": 1.5571539000084158
162
  },
163
  {
164
- "process_rss_mb": 3596.62890625,
165
  "cuda_allocated_mb": 126.2216796875,
166
  "cuda_reserved_mb": 12124.0,
167
  "cuda_max_allocated_mb": 11120.271484375,
168
  "cuda_max_reserved_mb": 12124.0,
169
- "gpu_util_percent": 100.0,
170
- "gpu_memory_util_percent": 74.0,
171
- "gpu_memory_used_mb": 13350.20703125,
172
  "gpu_memory_total_mb": 16303.0,
173
  "gpu_temperature_c": 58.0,
174
- "gpu_power_w": 213.648,
175
- "elapsed_seconds": 2.068373599991901
176
  },
177
  {
178
- "process_rss_mb": 3596.62890625,
179
  "cuda_allocated_mb": 141.64453125,
180
  "cuda_reserved_mb": 12124.0,
181
  "cuda_max_allocated_mb": 11120.271484375,
182
  "cuda_max_reserved_mb": 12124.0,
183
- "gpu_util_percent": 66.0,
184
- "gpu_memory_util_percent": 52.0,
185
- "gpu_memory_used_mb": 13350.20703125,
186
  "gpu_memory_total_mb": 16303.0,
187
- "gpu_temperature_c": 56.0,
188
- "gpu_power_w": 215.538,
189
- "elapsed_seconds": 2.5774495000077877
190
  },
191
  {
192
- "process_rss_mb": 3596.62890625,
193
  "cuda_allocated_mb": 126.2216796875,
194
  "cuda_reserved_mb": 12124.0,
195
  "cuda_max_allocated_mb": 11120.271484375,
196
  "cuda_max_reserved_mb": 12124.0,
197
- "gpu_util_percent": 90.0,
198
- "gpu_memory_util_percent": 67.0,
199
- "gpu_memory_used_mb": 13350.20703125,
200
  "gpu_memory_total_mb": 16303.0,
201
- "gpu_temperature_c": 60.0,
202
- "gpu_power_w": 218.962,
203
- "elapsed_seconds": 3.090203700005077
204
  },
205
  {
206
- "process_rss_mb": 3596.62890625,
207
  "cuda_allocated_mb": 141.64453125,
208
  "cuda_reserved_mb": 12124.0,
209
  "cuda_max_allocated_mb": 11120.271484375,
210
  "cuda_max_reserved_mb": 12124.0,
211
- "gpu_util_percent": 90.0,
212
- "gpu_memory_util_percent": 71.0,
213
- "gpu_memory_used_mb": 13350.20703125,
214
  "gpu_memory_total_mb": 16303.0,
215
- "gpu_temperature_c": 60.0,
216
- "gpu_power_w": 216.43,
217
- "elapsed_seconds": 3.596891900000628
218
  },
219
  {
220
- "process_rss_mb": 3596.62890625,
221
  "cuda_allocated_mb": 126.2216796875,
222
  "cuda_reserved_mb": 12124.0,
223
  "cuda_max_allocated_mb": 11120.271484375,
224
  "cuda_max_reserved_mb": 12124.0,
225
- "gpu_util_percent": 66.0,
226
- "gpu_memory_util_percent": 47.0,
227
- "gpu_memory_used_mb": 13350.20703125,
228
  "gpu_memory_total_mb": 16303.0,
229
- "gpu_temperature_c": 60.0,
230
- "gpu_power_w": 214.673,
231
- "elapsed_seconds": 4.114909399999306
232
  },
233
  {
234
- "process_rss_mb": 3596.62890625,
235
  "cuda_allocated_mb": 141.64453125,
236
  "cuda_reserved_mb": 12124.0,
237
  "cuda_max_allocated_mb": 11120.271484375,
238
  "cuda_max_reserved_mb": 12124.0,
239
- "gpu_util_percent": 0.0,
240
- "gpu_memory_util_percent": 2.0,
241
- "gpu_memory_used_mb": 13350.20703125,
242
  "gpu_memory_total_mb": 16303.0,
243
- "gpu_temperature_c": 51.0,
244
- "gpu_power_w": 187.155,
245
- "elapsed_seconds": 4.619414599990705
246
  },
247
  {
248
- "process_rss_mb": 3596.62890625,
249
  "cuda_allocated_mb": 141.64453125,
250
  "cuda_reserved_mb": 12124.0,
251
  "cuda_max_allocated_mb": 11120.271484375,
252
  "cuda_max_reserved_mb": 12124.0,
253
- "gpu_util_percent": 65.0,
254
  "gpu_memory_util_percent": 51.0,
255
- "gpu_memory_used_mb": 13350.20703125,
256
  "gpu_memory_total_mb": 16303.0,
257
- "gpu_temperature_c": 56.0,
258
- "gpu_power_w": 175.609,
259
- "elapsed_seconds": 5.127158199989935
260
  },
261
  {
262
- "process_rss_mb": 3596.62890625,
263
  "cuda_allocated_mb": 141.64453125,
264
  "cuda_reserved_mb": 12124.0,
265
  "cuda_max_allocated_mb": 11120.271484375,
266
  "cuda_max_reserved_mb": 12124.0,
267
- "gpu_util_percent": 92.0,
268
- "gpu_memory_util_percent": 69.0,
269
- "gpu_memory_used_mb": 13350.20703125,
270
  "gpu_memory_total_mb": 16303.0,
271
  "gpu_temperature_c": 61.0,
272
- "gpu_power_w": 191.053,
273
- "elapsed_seconds": 5.6363462999870535
274
- },
275
- {
276
- "process_rss_mb": 3596.62890625,
277
- "cuda_allocated_mb": 141.64453125,
278
- "cuda_reserved_mb": 12124.0,
279
- "cuda_max_allocated_mb": 11120.271484375,
280
- "cuda_max_reserved_mb": 12124.0,
281
- "gpu_util_percent": 84.0,
282
- "gpu_memory_util_percent": 65.0,
283
- "gpu_memory_used_mb": 13350.20703125,
284
- "gpu_memory_total_mb": 16303.0,
285
- "gpu_temperature_c": 62.0,
286
- "gpu_power_w": 220.276,
287
- "elapsed_seconds": 6.146641400002409
288
- },
289
- {
290
- "process_rss_mb": 3596.62890625,
291
- "cuda_allocated_mb": 141.64453125,
292
- "cuda_reserved_mb": 12124.0,
293
- "cuda_max_allocated_mb": 11120.271484375,
294
- "cuda_max_reserved_mb": 12124.0,
295
- "gpu_util_percent": 70.0,
296
- "gpu_memory_util_percent": 52.0,
297
- "gpu_memory_used_mb": 13350.20703125,
298
- "gpu_memory_total_mb": 16303.0,
299
- "gpu_temperature_c": 62.0,
300
- "gpu_power_w": 216.665,
301
- "elapsed_seconds": 6.656054299994139
302
  },
303
  {
304
- "process_rss_mb": 3596.62890625,
305
  "cuda_allocated_mb": 141.64453125,
306
  "cuda_reserved_mb": 12124.0,
307
  "cuda_max_allocated_mb": 11120.271484375,
308
  "cuda_max_reserved_mb": 12124.0,
309
  "gpu_util_percent": 100.0,
310
  "gpu_memory_util_percent": 77.0,
311
- "gpu_memory_used_mb": 13350.20703125,
312
  "gpu_memory_total_mb": 16303.0,
313
- "gpu_temperature_c": 61.0,
314
- "gpu_power_w": 219.437,
315
- "elapsed_seconds": 7.16469489998417
316
  },
317
  {
318
- "process_rss_mb": 3596.62890625,
319
  "cuda_allocated_mb": 141.64453125,
320
  "cuda_reserved_mb": 12124.0,
321
  "cuda_max_allocated_mb": 11120.271484375,
322
  "cuda_max_reserved_mb": 12124.0,
323
- "gpu_util_percent": 66.0,
324
- "gpu_memory_util_percent": 48.0,
325
- "gpu_memory_used_mb": 13350.20703125,
326
  "gpu_memory_total_mb": 16303.0,
327
- "gpu_temperature_c": 61.0,
328
- "gpu_power_w": 216.638,
329
- "elapsed_seconds": 7.6759840000013355
330
  },
331
  {
332
- "process_rss_mb": 3596.62890625,
333
  "cuda_allocated_mb": 141.64453125,
334
  "cuda_reserved_mb": 12124.0,
335
  "cuda_max_allocated_mb": 11120.271484375,
336
  "cuda_max_reserved_mb": 12124.0,
337
  "gpu_util_percent": 100.0,
338
- "gpu_memory_util_percent": 77.0,
339
- "gpu_memory_used_mb": 13350.20703125,
340
  "gpu_memory_total_mb": 16303.0,
341
  "gpu_temperature_c": 62.0,
342
- "gpu_power_w": 216.693,
343
- "elapsed_seconds": 8.185786499991082
344
  },
345
  {
346
- "process_rss_mb": 3596.62890625,
347
- "cuda_allocated_mb": 141.64453125,
348
  "cuda_reserved_mb": 12124.0,
349
  "cuda_max_allocated_mb": 11120.271484375,
350
  "cuda_max_reserved_mb": 12124.0,
351
- "gpu_util_percent": 66.0,
352
- "gpu_memory_util_percent": 50.0,
353
- "gpu_memory_used_mb": 13350.20703125,
354
  "gpu_memory_total_mb": 16303.0,
355
  "gpu_temperature_c": 56.0,
356
- "gpu_power_w": 216.372,
357
- "elapsed_seconds": 8.695240300003206
358
  },
359
  {
360
- "process_rss_mb": 3596.62890625,
361
  "cuda_allocated_mb": 141.64453125,
362
  "cuda_reserved_mb": 12124.0,
363
  "cuda_max_allocated_mb": 11120.271484375,
364
  "cuda_max_reserved_mb": 12124.0,
365
- "gpu_util_percent": 1.0,
366
- "gpu_memory_util_percent": 2.0,
367
- "gpu_memory_used_mb": 13350.20703125,
368
- "gpu_memory_total_mb": 16303.0,
369
- "gpu_temperature_c": 53.0,
370
- "gpu_power_w": 199.724,
371
- "elapsed_seconds": 9.203403499996057
372
- },
373
- {
374
- "process_rss_mb": 3596.62890625,
375
- "cuda_allocated_mb": 126.2216796875,
376
- "cuda_reserved_mb": 12124.0,
377
- "cuda_max_allocated_mb": 11120.271484375,
378
- "cuda_max_reserved_mb": 12124.0,
379
- "gpu_util_percent": 100.0,
380
- "gpu_memory_util_percent": 77.0,
381
- "gpu_memory_used_mb": 13350.20703125,
382
- "gpu_memory_total_mb": 16303.0,
383
- "gpu_temperature_c": 63.0,
384
- "gpu_power_w": 181.048,
385
- "elapsed_seconds": 9.724752899986925
386
- },
387
- {
388
- "process_rss_mb": 3596.62890625,
389
- "cuda_allocated_mb": 141.64453125,
390
- "cuda_reserved_mb": 12124.0,
391
- "cuda_max_allocated_mb": 11120.271484375,
392
- "cuda_max_reserved_mb": 12124.0,
393
- "gpu_util_percent": 67.0,
394
- "gpu_memory_util_percent": 49.0,
395
- "gpu_memory_used_mb": 13350.20703125,
396
  "gpu_memory_total_mb": 16303.0,
397
  "gpu_temperature_c": 61.0,
398
- "gpu_power_w": 196.523,
399
- "elapsed_seconds": 10.2389542000019
400
  },
401
  {
402
- "process_rss_mb": 3596.62890625,
403
- "cuda_allocated_mb": 126.2216796875,
404
  "cuda_reserved_mb": 12124.0,
405
  "cuda_max_allocated_mb": 11120.271484375,
406
  "cuda_max_reserved_mb": 12124.0,
407
- "gpu_util_percent": 100.0,
408
- "gpu_memory_util_percent": 77.0,
409
- "gpu_memory_used_mb": 13350.20703125,
410
  "gpu_memory_total_mb": 16303.0,
411
- "gpu_temperature_c": 63.0,
412
- "gpu_power_w": 220.472,
413
- "elapsed_seconds": 10.747453499992844
414
  },
415
  {
416
- "process_rss_mb": 3596.62890625,
417
  "cuda_allocated_mb": 141.64453125,
418
  "cuda_reserved_mb": 12124.0,
419
  "cuda_max_allocated_mb": 11120.271484375,
420
  "cuda_max_reserved_mb": 12124.0,
421
- "gpu_util_percent": 65.0,
422
  "gpu_memory_util_percent": 49.0,
423
- "gpu_memory_used_mb": 13350.20703125,
424
- "gpu_memory_total_mb": 16303.0,
425
- "gpu_temperature_c": 57.0,
426
- "gpu_power_w": 214.297,
427
- "elapsed_seconds": 11.256059200008167
428
- },
429
- {
430
- "process_rss_mb": 3596.62890625,
431
- "cuda_allocated_mb": 126.2216796875,
432
- "cuda_reserved_mb": 12124.0,
433
- "cuda_max_allocated_mb": 11120.271484375,
434
- "cuda_max_reserved_mb": 12124.0,
435
- "gpu_util_percent": 100.0,
436
- "gpu_memory_util_percent": 75.0,
437
- "gpu_memory_used_mb": 13350.20703125,
438
  "gpu_memory_total_mb": 16303.0,
439
- "gpu_temperature_c": 63.0,
440
- "gpu_power_w": 220.449,
441
- "elapsed_seconds": 11.772158899984788
442
  },
443
  {
444
- "process_rss_mb": 3596.62890625,
445
  "cuda_allocated_mb": 141.64453125,
446
  "cuda_reserved_mb": 12124.0,
447
  "cuda_max_allocated_mb": 11120.271484375,
448
  "cuda_max_reserved_mb": 12124.0,
449
- "gpu_util_percent": 67.0,
450
- "gpu_memory_util_percent": 52.0,
451
- "gpu_memory_used_mb": 13350.20703125,
452
  "gpu_memory_total_mb": 16303.0,
453
- "gpu_temperature_c": 60.0,
454
- "gpu_power_w": 215.206,
455
- "elapsed_seconds": 12.287043399992399
456
  },
457
  {
458
- "process_rss_mb": 3596.62890625,
459
- "cuda_allocated_mb": 126.2216796875,
460
  "cuda_reserved_mb": 12124.0,
461
  "cuda_max_allocated_mb": 11120.271484375,
462
  "cuda_max_reserved_mb": 12124.0,
463
- "gpu_util_percent": 87.0,
464
- "gpu_memory_util_percent": 65.0,
465
- "gpu_memory_used_mb": 13350.20703125,
466
  "gpu_memory_total_mb": 16303.0,
467
- "gpu_temperature_c": 63.0,
468
- "gpu_power_w": 218.221,
469
- "elapsed_seconds": 12.796272500010673
470
  },
471
  {
472
- "process_rss_mb": 3596.62890625,
473
  "cuda_allocated_mb": 141.64453125,
474
  "cuda_reserved_mb": 12124.0,
475
  "cuda_max_allocated_mb": 11120.271484375,
476
  "cuda_max_reserved_mb": 12124.0,
477
- "gpu_util_percent": 91.0,
478
- "gpu_memory_util_percent": 71.0,
479
- "gpu_memory_used_mb": 13350.20703125,
480
  "gpu_memory_total_mb": 16303.0,
481
- "gpu_temperature_c": 63.0,
482
- "gpu_power_w": 222.484,
483
- "elapsed_seconds": 13.301871300005587
484
  },
485
  {
486
- "process_rss_mb": 3596.62890625,
487
- "cuda_allocated_mb": 126.2216796875,
488
  "cuda_reserved_mb": 12124.0,
489
  "cuda_max_allocated_mb": 11120.271484375,
490
  "cuda_max_reserved_mb": 12124.0,
491
- "gpu_util_percent": 67.0,
492
- "gpu_memory_util_percent": 49.0,
493
- "gpu_memory_used_mb": 13350.20703125,
494
  "gpu_memory_total_mb": 16303.0,
495
- "gpu_temperature_c": 64.0,
496
- "gpu_power_w": 218.572,
497
- "elapsed_seconds": 13.88012049999088
498
  },
499
  {
500
- "process_rss_mb": 3596.62890625,
501
  "cuda_allocated_mb": 141.64453125,
502
  "cuda_reserved_mb": 12124.0,
503
  "cuda_max_allocated_mb": 11120.271484375,
504
  "cuda_max_reserved_mb": 12124.0,
505
- "gpu_util_percent": 85.0,
506
- "gpu_memory_util_percent": 63.0,
507
- "gpu_memory_used_mb": 13350.20703125,
508
  "gpu_memory_total_mb": 16303.0,
509
- "gpu_temperature_c": 61.0,
510
- "gpu_power_w": 179.005,
511
- "elapsed_seconds": 14.386946899991017
512
  },
513
  {
514
- "process_rss_mb": 3596.62890625,
515
  "cuda_allocated_mb": 126.2216796875,
516
  "cuda_reserved_mb": 12124.0,
517
  "cuda_max_allocated_mb": 11120.271484375,
518
  "cuda_max_reserved_mb": 12124.0,
519
- "gpu_util_percent": 93.0,
520
  "gpu_memory_util_percent": 73.0,
521
- "gpu_memory_used_mb": 13350.20703125,
522
  "gpu_memory_total_mb": 16303.0,
523
  "gpu_temperature_c": 63.0,
524
- "gpu_power_w": 178.29,
525
- "elapsed_seconds": 14.909591800009366
526
- },
527
- {
528
- "process_rss_mb": 3596.62890625,
529
- "cuda_allocated_mb": 141.64453125,
530
- "cuda_reserved_mb": 12124.0,
531
- "cuda_max_allocated_mb": 11120.271484375,
532
- "cuda_max_reserved_mb": 12124.0,
533
- "gpu_util_percent": 66.0,
534
- "gpu_memory_util_percent": 47.0,
535
- "gpu_memory_used_mb": 13350.20703125,
536
- "gpu_memory_total_mb": 16303.0,
537
- "gpu_temperature_c": 65.0,
538
- "gpu_power_w": 218.862,
539
- "elapsed_seconds": 15.418887899984838
540
  },
541
  {
542
- "process_rss_mb": 3596.62890625,
543
  "cuda_allocated_mb": 126.2216796875,
544
  "cuda_reserved_mb": 12124.0,
545
  "cuda_max_allocated_mb": 11120.271484375,
546
  "cuda_max_reserved_mb": 12124.0,
547
- "gpu_util_percent": 100.0,
548
- "gpu_memory_util_percent": 79.0,
549
- "gpu_memory_used_mb": 13350.20703125,
550
  "gpu_memory_total_mb": 16303.0,
551
  "gpu_temperature_c": 64.0,
552
- "gpu_power_w": 221.606,
553
- "elapsed_seconds": 15.933654699998442
554
  },
555
  {
556
- "process_rss_mb": 3596.62890625,
557
  "cuda_allocated_mb": 141.64453125,
558
  "cuda_reserved_mb": 12124.0,
559
  "cuda_max_allocated_mb": 11120.271484375,
560
  "cuda_max_reserved_mb": 12124.0,
561
- "gpu_util_percent": 67.0,
562
- "gpu_memory_util_percent": 49.0,
563
- "gpu_memory_used_mb": 13350.20703125,
564
  "gpu_memory_total_mb": 16303.0,
565
- "gpu_temperature_c": 63.0,
566
- "gpu_power_w": 218.428,
567
- "elapsed_seconds": 16.43813369999407
568
  },
569
  {
570
- "process_rss_mb": 3596.62890625,
571
  "cuda_allocated_mb": 126.2216796875,
572
  "cuda_reserved_mb": 12124.0,
573
  "cuda_max_allocated_mb": 11120.271484375,
574
  "cuda_max_reserved_mb": 12124.0,
575
  "gpu_util_percent": 100.0,
576
- "gpu_memory_util_percent": 76.0,
577
- "gpu_memory_used_mb": 13350.20703125,
578
  "gpu_memory_total_mb": 16303.0,
579
- "gpu_temperature_c": 65.0,
580
- "gpu_power_w": 221.864,
581
- "elapsed_seconds": 16.955908100004308
582
  },
583
  {
584
- "process_rss_mb": 3596.62890625,
585
  "cuda_allocated_mb": 141.64453125,
586
  "cuda_reserved_mb": 12124.0,
587
  "cuda_max_allocated_mb": 11120.271484375,
588
  "cuda_max_reserved_mb": 12124.0,
589
- "gpu_util_percent": 65.0,
590
  "gpu_memory_util_percent": 50.0,
591
- "gpu_memory_used_mb": 13350.20703125,
592
  "gpu_memory_total_mb": 16303.0,
593
  "gpu_temperature_c": 59.0,
594
- "gpu_power_w": 218.845,
595
- "elapsed_seconds": 17.458802300010575
596
  },
597
  {
598
- "process_rss_mb": 3596.62890625,
599
- "cuda_allocated_mb": 126.2216796875,
600
  "cuda_reserved_mb": 12124.0,
601
  "cuda_max_allocated_mb": 11120.271484375,
602
  "cuda_max_reserved_mb": 12124.0,
603
- "gpu_util_percent": 100.0,
604
- "gpu_memory_util_percent": 74.0,
605
- "gpu_memory_used_mb": 13350.20703125,
606
  "gpu_memory_total_mb": 16303.0,
607
- "gpu_temperature_c": 65.0,
608
- "gpu_power_w": 219.259,
609
- "elapsed_seconds": 17.98001659999136
610
  },
611
  {
612
- "process_rss_mb": 3596.62890625,
613
  "cuda_allocated_mb": 141.64453125,
614
  "cuda_reserved_mb": 12124.0,
615
  "cuda_max_allocated_mb": 11120.271484375,
616
  "cuda_max_reserved_mb": 12124.0,
617
- "gpu_util_percent": 71.0,
618
- "gpu_memory_util_percent": 55.0,
619
- "gpu_memory_used_mb": 13350.20703125,
620
  "gpu_memory_total_mb": 16303.0,
621
- "gpu_temperature_c": 62.0,
622
- "gpu_power_w": 217.625,
623
- "elapsed_seconds": 18.493281700008083
624
  },
625
  {
626
- "process_rss_mb": 3596.62890625,
627
  "cuda_allocated_mb": 141.64453125,
628
  "cuda_reserved_mb": 12124.0,
629
  "cuda_max_allocated_mb": 11120.271484375,
630
  "cuda_max_reserved_mb": 12124.0,
631
- "gpu_util_percent": 82.0,
632
- "gpu_memory_util_percent": 61.0,
633
- "gpu_memory_used_mb": 13350.20703125,
634
  "gpu_memory_total_mb": 16303.0,
635
  "gpu_temperature_c": 65.0,
636
- "gpu_power_w": 223.113,
637
- "elapsed_seconds": 19.00455859999056
638
  },
639
  {
640
- "process_rss_mb": 3596.62890625,
641
- "cuda_allocated_mb": 6600.91796875,
642
  "cuda_reserved_mb": 12124.0,
643
  "cuda_max_allocated_mb": 11120.271484375,
644
  "cuda_max_reserved_mb": 12124.0,
645
- "gpu_util_percent": 94.0,
646
  "gpu_memory_util_percent": 70.0,
647
- "gpu_memory_used_mb": 13350.20703125,
648
- "gpu_memory_total_mb": 16303.0,
649
- "gpu_temperature_c": 63.0,
650
- "gpu_power_w": 176.144,
651
- "elapsed_seconds": 19.51709879998816
652
- },
653
- {
654
- "process_rss_mb": 3596.62890625,
655
- "cuda_allocated_mb": 141.64453125,
656
- "cuda_reserved_mb": 12124.0,
657
- "cuda_max_allocated_mb": 11120.271484375,
658
- "cuda_max_reserved_mb": 12124.0,
659
- "gpu_util_percent": 74.0,
660
- "gpu_memory_util_percent": 55.0,
661
- "gpu_memory_used_mb": 13350.20703125,
662
- "gpu_memory_total_mb": 16303.0,
663
- "gpu_temperature_c": 66.0,
664
- "gpu_power_w": 221.989,
665
- "elapsed_seconds": 20.026390199986054
666
- },
667
- {
668
- "process_rss_mb": 3596.62890625,
669
- "cuda_allocated_mb": 8451.15673828125,
670
- "cuda_reserved_mb": 12124.0,
671
- "cuda_max_allocated_mb": 11120.271484375,
672
- "cuda_max_reserved_mb": 12124.0,
673
- "gpu_util_percent": 100.0,
674
- "gpu_memory_util_percent": 78.0,
675
- "gpu_memory_used_mb": 13350.20703125,
676
- "gpu_memory_total_mb": 16303.0,
677
- "gpu_temperature_c": 65.0,
678
- "gpu_power_w": 222.741,
679
- "elapsed_seconds": 20.537088000011863
680
- },
681
- {
682
- "process_rss_mb": 3596.62890625,
683
- "cuda_allocated_mb": 141.64453125,
684
- "cuda_reserved_mb": 12124.0,
685
- "cuda_max_allocated_mb": 11120.271484375,
686
- "cuda_max_reserved_mb": 12124.0,
687
- "gpu_util_percent": 64.0,
688
- "gpu_memory_util_percent": 46.0,
689
- "gpu_memory_used_mb": 13350.20703125,
690
- "gpu_memory_total_mb": 16303.0,
691
- "gpu_temperature_c": 65.0,
692
- "gpu_power_w": 218.985,
693
- "elapsed_seconds": 21.044752499990864
694
- },
695
- {
696
- "process_rss_mb": 3596.62890625,
697
- "cuda_allocated_mb": 5214.01220703125,
698
- "cuda_reserved_mb": 12124.0,
699
- "cuda_max_allocated_mb": 11120.271484375,
700
- "cuda_max_reserved_mb": 12124.0,
701
- "gpu_util_percent": 100.0,
702
- "gpu_memory_util_percent": 77.0,
703
- "gpu_memory_used_mb": 13350.20703125,
704
- "gpu_memory_total_mb": 16303.0,
705
- "gpu_temperature_c": 65.0,
706
- "gpu_power_w": 222.352,
707
- "elapsed_seconds": 21.555139400006738
708
- },
709
- {
710
- "process_rss_mb": 3596.62890625,
711
- "cuda_allocated_mb": 141.64453125,
712
- "cuda_reserved_mb": 12124.0,
713
- "cuda_max_allocated_mb": 11120.271484375,
714
- "cuda_max_reserved_mb": 12124.0,
715
- "gpu_util_percent": 65.0,
716
- "gpu_memory_util_percent": 48.0,
717
- "gpu_memory_used_mb": 13350.20703125,
718
- "gpu_memory_total_mb": 16303.0,
719
- "gpu_temperature_c": 60.0,
720
- "gpu_power_w": 216.032,
721
- "elapsed_seconds": 22.065228299994487
722
- },
723
- {
724
- "process_rss_mb": 3612.890625,
725
- "cuda_allocated_mb": 126.2216796875,
726
- "cuda_reserved_mb": 12124.0,
727
- "cuda_max_allocated_mb": 11120.271484375,
728
- "cuda_max_reserved_mb": 12124.0,
729
- "gpu_util_percent": 100.0,
730
- "gpu_memory_util_percent": 74.0,
731
- "gpu_memory_used_mb": 13350.20703125,
732
- "gpu_memory_total_mb": 16303.0,
733
- "gpu_temperature_c": 67.0,
734
- "gpu_power_w": 222.12,
735
- "elapsed_seconds": 22.576357499987353
736
- },
737
- {
738
- "process_rss_mb": 3612.890625,
739
- "cuda_allocated_mb": 141.64453125,
740
- "cuda_reserved_mb": 12124.0,
741
- "cuda_max_allocated_mb": 11120.271484375,
742
- "cuda_max_reserved_mb": 12124.0,
743
- "gpu_util_percent": 74.0,
744
- "gpu_memory_util_percent": 58.0,
745
- "gpu_memory_used_mb": 13350.20703125,
746
- "gpu_memory_total_mb": 16303.0,
747
- "gpu_temperature_c": 64.0,
748
- "gpu_power_w": 221.54,
749
- "elapsed_seconds": 23.087020799983293
750
- },
751
- {
752
- "process_rss_mb": 3612.890625,
753
- "cuda_allocated_mb": 126.2216796875,
754
- "cuda_reserved_mb": 12124.0,
755
- "cuda_max_allocated_mb": 11120.271484375,
756
- "cuda_max_reserved_mb": 12124.0,
757
- "gpu_util_percent": 85.0,
758
- "gpu_memory_util_percent": 64.0,
759
- "gpu_memory_used_mb": 13350.20703125,
760
- "gpu_memory_total_mb": 16303.0,
761
- "gpu_temperature_c": 66.0,
762
- "gpu_power_w": 221.585,
763
- "elapsed_seconds": 23.59532359999139
764
- },
765
- {
766
- "process_rss_mb": 3612.890625,
767
- "cuda_allocated_mb": 141.64453125,
768
- "cuda_reserved_mb": 12124.0,
769
- "cuda_max_allocated_mb": 11120.271484375,
770
- "cuda_max_reserved_mb": 12124.0,
771
- "gpu_util_percent": 1.0,
772
- "gpu_memory_util_percent": 2.0,
773
- "gpu_memory_used_mb": 13350.20703125,
774
- "gpu_memory_total_mb": 16303.0,
775
- "gpu_temperature_c": 56.0,
776
- "gpu_power_w": 193.905,
777
- "elapsed_seconds": 24.106666300009238
778
- },
779
- {
780
- "process_rss_mb": 3612.890625,
781
- "cuda_allocated_mb": 141.64453125,
782
- "cuda_reserved_mb": 12124.0,
783
- "cuda_max_allocated_mb": 11120.271484375,
784
- "cuda_max_reserved_mb": 12124.0,
785
- "gpu_util_percent": 67.0,
786
- "gpu_memory_util_percent": 53.0,
787
- "gpu_memory_used_mb": 13350.20703125,
788
- "gpu_memory_total_mb": 16303.0,
789
- "gpu_temperature_c": 63.0,
790
- "gpu_power_w": 180.115,
791
- "elapsed_seconds": 24.61678059998667
792
- },
793
- {
794
- "process_rss_mb": 3612.890625,
795
- "cuda_allocated_mb": 399.6669921875,
796
- "cuda_reserved_mb": 12124.0,
797
- "cuda_max_allocated_mb": 11120.271484375,
798
- "cuda_max_reserved_mb": 12124.0,
799
- "gpu_util_percent": 90.0,
800
- "gpu_memory_util_percent": 67.0,
801
- "gpu_memory_used_mb": 13350.20703125,
802
- "gpu_memory_total_mb": 16303.0,
803
- "gpu_temperature_c": 67.0,
804
- "gpu_power_w": 208.806,
805
- "elapsed_seconds": 25.127415800001472
806
- },
807
- {
808
- "process_rss_mb": 3612.890625,
809
- "cuda_allocated_mb": 141.64453125,
810
- "cuda_reserved_mb": 12124.0,
811
- "cuda_max_allocated_mb": 11120.271484375,
812
- "cuda_max_reserved_mb": 12124.0,
813
- "gpu_util_percent": 92.0,
814
- "gpu_memory_util_percent": 72.0,
815
- "gpu_memory_used_mb": 13350.20703125,
816
- "gpu_memory_total_mb": 16303.0,
817
- "gpu_temperature_c": 66.0,
818
- "gpu_power_w": 225.106,
819
- "elapsed_seconds": 25.63698320000549
820
- },
821
- {
822
- "process_rss_mb": 3612.890625,
823
- "cuda_allocated_mb": 2104.0634765625,
824
- "cuda_reserved_mb": 12124.0,
825
- "cuda_max_allocated_mb": 11120.271484375,
826
- "cuda_max_reserved_mb": 12124.0,
827
- "gpu_util_percent": 66.0,
828
- "gpu_memory_util_percent": 48.0,
829
- "gpu_memory_used_mb": 13350.20703125,
830
- "gpu_memory_total_mb": 16303.0,
831
- "gpu_temperature_c": 67.0,
832
- "gpu_power_w": 221.467,
833
- "elapsed_seconds": 26.145683699985966
834
- },
835
- {
836
- "process_rss_mb": 3612.890625,
837
- "cuda_allocated_mb": 141.64453125,
838
- "cuda_reserved_mb": 12124.0,
839
- "cuda_max_allocated_mb": 11120.271484375,
840
- "cuda_max_reserved_mb": 12124.0,
841
- "gpu_util_percent": 100.0,
842
- "gpu_memory_util_percent": 79.0,
843
- "gpu_memory_used_mb": 13350.20703125,
844
- "gpu_memory_total_mb": 16303.0,
845
- "gpu_temperature_c": 67.0,
846
- "gpu_power_w": 223.909,
847
- "elapsed_seconds": 26.656147300003795
848
- },
849
- {
850
- "process_rss_mb": 3612.890625,
851
- "cuda_allocated_mb": 5922.0439453125,
852
- "cuda_reserved_mb": 12124.0,
853
- "cuda_max_allocated_mb": 11120.271484375,
854
- "cuda_max_reserved_mb": 12124.0,
855
- "gpu_util_percent": 65.0,
856
- "gpu_memory_util_percent": 47.0,
857
- "gpu_memory_used_mb": 13350.20703125,
858
- "gpu_memory_total_mb": 16303.0,
859
- "gpu_temperature_c": 65.0,
860
- "gpu_power_w": 220.797,
861
- "elapsed_seconds": 27.16638400001102
862
- },
863
- {
864
- "process_rss_mb": 3612.890625,
865
- "cuda_allocated_mb": 141.64453125,
866
- "cuda_reserved_mb": 12124.0,
867
- "cuda_max_allocated_mb": 11120.271484375,
868
- "cuda_max_reserved_mb": 12124.0,
869
- "gpu_util_percent": 100.0,
870
- "gpu_memory_util_percent": 76.0,
871
- "gpu_memory_used_mb": 13350.20703125,
872
- "gpu_memory_total_mb": 16303.0,
873
- "gpu_temperature_c": 68.0,
874
- "gpu_power_w": 224.129,
875
- "elapsed_seconds": 27.677494300005492
876
- },
877
- {
878
- "process_rss_mb": 3612.890625,
879
- "cuda_allocated_mb": 5752.5478515625,
880
- "cuda_reserved_mb": 12124.0,
881
- "cuda_max_allocated_mb": 11120.271484375,
882
- "cuda_max_reserved_mb": 12124.0,
883
- "gpu_util_percent": 65.0,
884
- "gpu_memory_util_percent": 50.0,
885
- "gpu_memory_used_mb": 13350.20703125,
886
- "gpu_memory_total_mb": 16303.0,
887
- "gpu_temperature_c": 61.0,
888
- "gpu_power_w": 220.716,
889
- "elapsed_seconds": 28.18874740001047
890
- },
891
- {
892
- "process_rss_mb": 3612.890625,
893
- "cuda_allocated_mb": 141.64453125,
894
- "cuda_reserved_mb": 12124.0,
895
- "cuda_max_allocated_mb": 11120.271484375,
896
- "cuda_max_reserved_mb": 12124.0,
897
- "gpu_util_percent": 100.0,
898
- "gpu_memory_util_percent": 75.0,
899
- "gpu_memory_used_mb": 13350.20703125,
900
- "gpu_memory_total_mb": 16303.0,
901
- "gpu_temperature_c": 67.0,
902
- "gpu_power_w": 221.391,
903
- "elapsed_seconds": 28.699434799986193
904
- },
905
- {
906
- "process_rss_mb": 3612.890625,
907
- "cuda_allocated_mb": 10308.271484375,
908
- "cuda_reserved_mb": 12124.0,
909
- "cuda_max_allocated_mb": 11120.271484375,
910
- "cuda_max_reserved_mb": 12124.0,
911
- "gpu_util_percent": 70.0,
912
- "gpu_memory_util_percent": 54.0,
913
- "gpu_memory_used_mb": 13350.20703125,
914
- "gpu_memory_total_mb": 16303.0,
915
- "gpu_temperature_c": 65.0,
916
- "gpu_power_w": 219.822,
917
- "elapsed_seconds": 29.21012229999178
918
- },
919
- {
920
- "process_rss_mb": 3612.890625,
921
- "cuda_allocated_mb": 141.64453125,
922
- "cuda_reserved_mb": 12124.0,
923
- "cuda_max_allocated_mb": 11120.271484375,
924
- "cuda_max_reserved_mb": 12124.0,
925
- "gpu_util_percent": 0.0,
926
- "gpu_memory_util_percent": 2.0,
927
- "gpu_memory_used_mb": 13350.20703125,
928
- "gpu_memory_total_mb": 16303.0,
929
- "gpu_temperature_c": 59.0,
930
- "gpu_power_w": 210.546,
931
- "elapsed_seconds": 29.718780900002457
932
- },
933
- {
934
- "process_rss_mb": 3612.890625,
935
- "cuda_allocated_mb": 141.64453125,
936
- "cuda_reserved_mb": 12124.0,
937
- "cuda_max_allocated_mb": 11120.271484375,
938
- "cuda_max_reserved_mb": 12124.0,
939
- "gpu_util_percent": 97.0,
940
- "gpu_memory_util_percent": 72.0,
941
- "gpu_memory_used_mb": 13350.20703125,
942
- "gpu_memory_total_mb": 16303.0,
943
- "gpu_temperature_c": 66.0,
944
- "gpu_power_w": 178.409,
945
- "elapsed_seconds": 30.229401099990355
946
- },
947
- {
948
- "process_rss_mb": 3612.890625,
949
- "cuda_allocated_mb": 141.64453125,
950
- "cuda_reserved_mb": 12124.0,
951
- "cuda_max_allocated_mb": 11120.271484375,
952
- "cuda_max_reserved_mb": 12124.0,
953
- "gpu_util_percent": 82.0,
954
- "gpu_memory_util_percent": 64.0,
955
- "gpu_memory_used_mb": 13350.20703125,
956
- "gpu_memory_total_mb": 16303.0,
957
- "gpu_temperature_c": 68.0,
958
- "gpu_power_w": 192.887,
959
- "elapsed_seconds": 30.739007500000298
960
- },
961
- {
962
- "process_rss_mb": 3612.890625,
963
- "cuda_allocated_mb": 141.64453125,
964
- "cuda_reserved_mb": 12124.0,
965
- "cuda_max_allocated_mb": 11120.271484375,
966
- "cuda_max_reserved_mb": 12124.0,
967
- "gpu_util_percent": 73.0,
968
- "gpu_memory_util_percent": 54.0,
969
- "gpu_memory_used_mb": 13350.20703125,
970
- "gpu_memory_total_mb": 16303.0,
971
- "gpu_temperature_c": 68.0,
972
- "gpu_power_w": 223.767,
973
- "elapsed_seconds": 31.248938000004273
974
- },
975
- {
976
- "process_rss_mb": 3612.890625,
977
- "cuda_allocated_mb": 141.64453125,
978
- "cuda_reserved_mb": 12124.0,
979
- "cuda_max_allocated_mb": 11120.271484375,
980
- "cuda_max_reserved_mb": 12124.0,
981
- "gpu_util_percent": 100.0,
982
- "gpu_memory_util_percent": 78.0,
983
- "gpu_memory_used_mb": 13350.20703125,
984
- "gpu_memory_total_mb": 16303.0,
985
- "gpu_temperature_c": 68.0,
986
- "gpu_power_w": 224.212,
987
- "elapsed_seconds": 31.757667500001844
988
- },
989
- {
990
- "process_rss_mb": 3612.890625,
991
- "cuda_allocated_mb": 141.64453125,
992
- "cuda_reserved_mb": 12124.0,
993
- "cuda_max_allocated_mb": 11120.271484375,
994
- "cuda_max_reserved_mb": 12124.0,
995
- "gpu_util_percent": 64.0,
996
- "gpu_memory_util_percent": 46.0,
997
- "gpu_memory_used_mb": 13350.20703125,
998
- "gpu_memory_total_mb": 16303.0,
999
- "gpu_temperature_c": 67.0,
1000
- "gpu_power_w": 221.526,
1001
- "elapsed_seconds": 32.26673880001181
1002
- },
1003
- {
1004
- "process_rss_mb": 3612.890625,
1005
- "cuda_allocated_mb": 640.3212890625,
1006
- "cuda_reserved_mb": 12124.0,
1007
- "cuda_max_allocated_mb": 11120.271484375,
1008
- "cuda_max_reserved_mb": 12124.0,
1009
- "gpu_util_percent": 100.0,
1010
- "gpu_memory_util_percent": 77.0,
1011
- "gpu_memory_used_mb": 13350.20703125,
1012
- "gpu_memory_total_mb": 16303.0,
1013
- "gpu_temperature_c": 68.0,
1014
- "gpu_power_w": 224.4,
1015
- "elapsed_seconds": 32.7766130999953
1016
- },
1017
- {
1018
- "process_rss_mb": 3612.890625,
1019
- "cuda_allocated_mb": 141.64453125,
1020
- "cuda_reserved_mb": 12124.0,
1021
- "cuda_max_allocated_mb": 11120.271484375,
1022
- "cuda_max_reserved_mb": 12124.0,
1023
- "gpu_util_percent": 62.0,
1024
- "gpu_memory_util_percent": 47.0,
1025
- "gpu_memory_used_mb": 13350.20703125,
1026
  "gpu_memory_total_mb": 16303.0,
1027
  "gpu_temperature_c": 62.0,
1028
- "gpu_power_w": 217.718,
1029
- "elapsed_seconds": 33.28791949999868
1030
- },
1031
- {
1032
- "process_rss_mb": 3612.890625,
1033
- "cuda_allocated_mb": 3393.1826171875,
1034
- "cuda_reserved_mb": 12124.0,
1035
- "cuda_max_allocated_mb": 11120.271484375,
1036
- "cuda_max_reserved_mb": 12124.0,
1037
- "gpu_util_percent": 100.0,
1038
- "gpu_memory_util_percent": 75.0,
1039
- "gpu_memory_used_mb": 13350.20703125,
1040
- "gpu_memory_total_mb": 16303.0,
1041
- "gpu_temperature_c": 67.0,
1042
- "gpu_power_w": 221.852,
1043
- "elapsed_seconds": 33.79941800000961
1044
- },
1045
- {
1046
- "process_rss_mb": 3612.93359375,
1047
- "cuda_allocated_mb": 141.64453125,
1048
- "cuda_reserved_mb": 12124.0,
1049
- "cuda_max_allocated_mb": 11120.271484375,
1050
- "cuda_max_reserved_mb": 12124.0,
1051
- "gpu_util_percent": 67.0,
1052
- "gpu_memory_util_percent": 51.0,
1053
- "gpu_memory_used_mb": 13350.20703125,
1054
- "gpu_memory_total_mb": 16303.0,
1055
- "gpu_temperature_c": 65.0,
1056
- "gpu_power_w": 223.379,
1057
- "elapsed_seconds": 34.30993029999081
1058
- },
1059
- {
1060
- "process_rss_mb": 3612.94140625,
1061
- "cuda_allocated_mb": 3445.80859375,
1062
- "cuda_reserved_mb": 12124.0,
1063
- "cuda_max_allocated_mb": 11120.271484375,
1064
- "cuda_max_reserved_mb": 12124.0,
1065
- "gpu_util_percent": 91.0,
1066
- "gpu_memory_util_percent": 68.0,
1067
- "gpu_memory_used_mb": 13350.20703125,
1068
- "gpu_memory_total_mb": 16303.0,
1069
- "gpu_temperature_c": 69.0,
1070
- "gpu_power_w": 222.715,
1071
- "elapsed_seconds": 34.820057700009784
1072
- },
1073
- {
1074
- "process_rss_mb": 3651.80078125,
1075
- "cuda_allocated_mb": 8961.90380859375,
1076
- "cuda_reserved_mb": 12124.0,
1077
- "cuda_max_allocated_mb": 11120.271484375,
1078
- "cuda_max_reserved_mb": 12124.0,
1079
- "gpu_util_percent": 87.0,
1080
- "gpu_memory_util_percent": 68.0,
1081
- "gpu_memory_used_mb": 13350.20703125,
1082
- "gpu_memory_total_mb": 16303.0,
1083
- "gpu_temperature_c": 69.0,
1084
- "gpu_power_w": 226.682,
1085
- "elapsed_seconds": 35.32873710000422
1086
- },
1087
- {
1088
- "process_rss_mb": 3651.80078125,
1089
- "cuda_allocated_mb": 126.2216796875,
1090
- "cuda_reserved_mb": 12124.0,
1091
- "cuda_max_allocated_mb": 11120.271484375,
1092
- "cuda_max_reserved_mb": 12124.0,
1093
- "gpu_util_percent": 100.0,
1094
- "gpu_memory_util_percent": 78.0,
1095
- "gpu_memory_used_mb": 13350.20703125,
1096
- "gpu_memory_total_mb": 16303.0,
1097
- "gpu_temperature_c": 67.0,
1098
- "gpu_power_w": 217.093,
1099
- "elapsed_seconds": 35.98900939998566
1100
- },
1101
- {
1102
- "process_rss_mb": 3651.80078125,
1103
- "cuda_allocated_mb": 140.751953125,
1104
- "cuda_reserved_mb": 12124.0,
1105
- "cuda_max_allocated_mb": 11120.271484375,
1106
- "cuda_max_reserved_mb": 12124.0,
1107
- "gpu_util_percent": 1.0,
1108
- "gpu_memory_util_percent": 2.0,
1109
- "gpu_memory_used_mb": 13350.20703125,
1110
- "gpu_memory_total_mb": 16303.0,
1111
- "gpu_temperature_c": 58.0,
1112
- "gpu_power_w": 173.996,
1113
- "elapsed_seconds": 36.49037069999031
1114
- },
1115
- {
1116
- "process_rss_mb": 3651.80078125,
1117
- "cuda_allocated_mb": 126.2216796875,
1118
- "cuda_reserved_mb": 12124.0,
1119
- "cuda_max_allocated_mb": 11120.271484375,
1120
- "cuda_max_reserved_mb": 12124.0,
1121
- "gpu_util_percent": 75.0,
1122
- "gpu_memory_util_percent": 56.0,
1123
- "gpu_memory_used_mb": 13350.20703125,
1124
- "gpu_memory_total_mb": 16303.0,
1125
- "gpu_temperature_c": 69.0,
1126
- "gpu_power_w": 176.62,
1127
- "elapsed_seconds": 37.00997529999586
1128
- },
1129
- {
1130
- "process_rss_mb": 3651.80078125,
1131
- "cuda_allocated_mb": 140.751953125,
1132
- "cuda_reserved_mb": 12124.0,
1133
- "cuda_max_allocated_mb": 11120.271484375,
1134
- "cuda_max_reserved_mb": 12124.0,
1135
- "gpu_util_percent": 100.0,
1136
- "gpu_memory_util_percent": 77.0,
1137
- "gpu_memory_used_mb": 13350.20703125,
1138
- "gpu_memory_total_mb": 16303.0,
1139
- "gpu_temperature_c": 70.0,
1140
- "gpu_power_w": 225.539,
1141
- "elapsed_seconds": 37.51092579998658
1142
- },
1143
- {
1144
- "process_rss_mb": 3651.80078125,
1145
- "cuda_allocated_mb": 126.2216796875,
1146
- "cuda_reserved_mb": 12124.0,
1147
- "cuda_max_allocated_mb": 11120.271484375,
1148
- "cuda_max_reserved_mb": 12124.0,
1149
- "gpu_util_percent": 100.0,
1150
- "gpu_memory_util_percent": 77.0,
1151
- "gpu_memory_used_mb": 13350.20703125,
1152
- "gpu_memory_total_mb": 16303.0,
1153
- "gpu_temperature_c": 70.0,
1154
- "gpu_power_w": 224.74,
1155
- "elapsed_seconds": 38.03694089999772
1156
- },
1157
- {
1158
- "process_rss_mb": 3651.80078125,
1159
- "cuda_allocated_mb": 140.751953125,
1160
- "cuda_reserved_mb": 12124.0,
1161
- "cuda_max_allocated_mb": 11120.271484375,
1162
- "cuda_max_reserved_mb": 12124.0,
1163
- "gpu_util_percent": 66.0,
1164
- "gpu_memory_util_percent": 50.0,
1165
- "gpu_memory_used_mb": 13350.20703125,
1166
- "gpu_memory_total_mb": 16303.0,
1167
- "gpu_temperature_c": 63.0,
1168
- "gpu_power_w": 218.941,
1169
- "elapsed_seconds": 38.545965199999046
1170
- },
1171
- {
1172
- "process_rss_mb": 3651.80078125,
1173
- "cuda_allocated_mb": 126.2216796875,
1174
- "cuda_reserved_mb": 12124.0,
1175
- "cuda_max_allocated_mb": 11120.271484375,
1176
- "cuda_max_reserved_mb": 12124.0,
1177
- "gpu_util_percent": 100.0,
1178
- "gpu_memory_util_percent": 75.0,
1179
- "gpu_memory_used_mb": 13350.20703125,
1180
- "gpu_memory_total_mb": 16303.0,
1181
- "gpu_temperature_c": 69.0,
1182
- "gpu_power_w": 225.898,
1183
- "elapsed_seconds": 39.06172140000854
1184
- },
1185
- {
1186
- "process_rss_mb": 3651.80078125,
1187
- "cuda_allocated_mb": 140.751953125,
1188
- "cuda_reserved_mb": 12124.0,
1189
- "cuda_max_allocated_mb": 11120.271484375,
1190
- "cuda_max_reserved_mb": 12124.0,
1191
- "gpu_util_percent": 64.0,
1192
- "gpu_memory_util_percent": 51.0,
1193
- "gpu_memory_used_mb": 13350.20703125,
1194
- "gpu_memory_total_mb": 16303.0,
1195
- "gpu_temperature_c": 65.0,
1196
- "gpu_power_w": 223.082,
1197
- "elapsed_seconds": 39.56581679999363
1198
- },
1199
- {
1200
- "process_rss_mb": 3651.80078125,
1201
- "cuda_allocated_mb": 126.2216796875,
1202
- "cuda_reserved_mb": 12124.0,
1203
- "cuda_max_allocated_mb": 11120.271484375,
1204
- "cuda_max_reserved_mb": 12124.0,
1205
- "gpu_util_percent": 93.0,
1206
- "gpu_memory_util_percent": 69.0,
1207
- "gpu_memory_used_mb": 13350.20703125,
1208
- "gpu_memory_total_mb": 16303.0,
1209
- "gpu_temperature_c": 70.0,
1210
- "gpu_power_w": 222.575,
1211
- "elapsed_seconds": 40.08639479998965
1212
- },
1213
- {
1214
- "process_rss_mb": 3651.80078125,
1215
- "cuda_allocated_mb": 140.751953125,
1216
- "cuda_reserved_mb": 12124.0,
1217
- "cuda_max_allocated_mb": 11120.271484375,
1218
- "cuda_max_reserved_mb": 12124.0,
1219
- "gpu_util_percent": 87.0,
1220
- "gpu_memory_util_percent": 68.0,
1221
- "gpu_memory_used_mb": 13350.20703125,
1222
- "gpu_memory_total_mb": 16303.0,
1223
- "gpu_temperature_c": 70.0,
1224
- "gpu_power_w": 223.698,
1225
- "elapsed_seconds": 40.6014307999867
1226
- },
1227
- {
1228
- "process_rss_mb": 3651.80078125,
1229
- "cuda_allocated_mb": 126.2216796875,
1230
- "cuda_reserved_mb": 12124.0,
1231
- "cuda_max_allocated_mb": 11120.271484375,
1232
- "cuda_max_reserved_mb": 12124.0,
1233
- "gpu_util_percent": 69.0,
1234
- "gpu_memory_util_percent": 51.0,
1235
- "gpu_memory_used_mb": 13350.20703125,
1236
- "gpu_memory_total_mb": 16303.0,
1237
- "gpu_temperature_c": 70.0,
1238
- "gpu_power_w": 224.483,
1239
- "elapsed_seconds": 41.112792400002945
1240
- },
1241
- {
1242
- "process_rss_mb": 3651.80078125,
1243
- "cuda_allocated_mb": 140.751953125,
1244
- "cuda_reserved_mb": 12124.0,
1245
- "cuda_max_allocated_mb": 11120.271484375,
1246
- "cuda_max_reserved_mb": 12124.0,
1247
- "gpu_util_percent": 100.0,
1248
- "gpu_memory_util_percent": 77.0,
1249
- "gpu_memory_used_mb": 13350.20703125,
1250
- "gpu_memory_total_mb": 16303.0,
1251
- "gpu_temperature_c": 71.0,
1252
- "gpu_power_w": 224.626,
1253
- "elapsed_seconds": 41.62216349999653
1254
- },
1255
- {
1256
- "process_rss_mb": 3651.80078125,
1257
- "cuda_allocated_mb": 126.2216796875,
1258
- "cuda_reserved_mb": 12124.0,
1259
- "cuda_max_allocated_mb": 11120.271484375,
1260
- "cuda_max_reserved_mb": 12124.0,
1261
- "gpu_util_percent": 64.0,
1262
- "gpu_memory_util_percent": 48.0,
1263
- "gpu_memory_used_mb": 13350.20703125,
1264
- "gpu_memory_total_mb": 16303.0,
1265
- "gpu_temperature_c": 68.0,
1266
- "gpu_power_w": 222.202,
1267
- "elapsed_seconds": 42.135246999998344
1268
- },
1269
- {
1270
- "process_rss_mb": 3651.80078125,
1271
- "cuda_allocated_mb": 140.751953125,
1272
- "cuda_reserved_mb": 12124.0,
1273
- "cuda_max_allocated_mb": 11120.271484375,
1274
- "cuda_max_reserved_mb": 12124.0,
1275
- "gpu_util_percent": 100.0,
1276
- "gpu_memory_util_percent": 76.0,
1277
- "gpu_memory_used_mb": 13350.20703125,
1278
- "gpu_memory_total_mb": 16303.0,
1279
- "gpu_temperature_c": 70.0,
1280
- "gpu_power_w": 222.999,
1281
- "elapsed_seconds": 42.63937769999029
1282
- },
1283
- {
1284
- "process_rss_mb": 3651.80078125,
1285
- "cuda_allocated_mb": 126.2216796875,
1286
- "cuda_reserved_mb": 12124.0,
1287
- "cuda_max_allocated_mb": 11120.271484375,
1288
- "cuda_max_reserved_mb": 12124.0,
1289
- "gpu_util_percent": 64.0,
1290
- "gpu_memory_util_percent": 48.0,
1291
- "gpu_memory_used_mb": 13350.20703125,
1292
- "gpu_memory_total_mb": 16303.0,
1293
- "gpu_temperature_c": 64.0,
1294
- "gpu_power_w": 219.672,
1295
- "elapsed_seconds": 43.1591141000099
1296
- },
1297
- {
1298
- "process_rss_mb": 3651.80078125,
1299
- "cuda_allocated_mb": 140.751953125,
1300
- "cuda_reserved_mb": 12124.0,
1301
- "cuda_max_allocated_mb": 11120.271484375,
1302
- "cuda_max_reserved_mb": 12124.0,
1303
- "gpu_util_percent": 100.0,
1304
- "gpu_memory_util_percent": 74.0,
1305
- "gpu_memory_used_mb": 13350.20703125,
1306
- "gpu_memory_total_mb": 16303.0,
1307
- "gpu_temperature_c": 70.0,
1308
- "gpu_power_w": 226.506,
1309
- "elapsed_seconds": 43.67472969999653
1310
- },
1311
- {
1312
- "process_rss_mb": 3651.80078125,
1313
- "cuda_allocated_mb": 126.2216796875,
1314
- "cuda_reserved_mb": 12124.0,
1315
- "cuda_max_allocated_mb": 11120.271484375,
1316
- "cuda_max_reserved_mb": 12124.0,
1317
- "gpu_util_percent": 68.0,
1318
- "gpu_memory_util_percent": 53.0,
1319
- "gpu_memory_used_mb": 13350.20703125,
1320
- "gpu_memory_total_mb": 16303.0,
1321
- "gpu_temperature_c": 67.0,
1322
- "gpu_power_w": 224.814,
1323
- "elapsed_seconds": 44.185869899985846
1324
- },
1325
- {
1326
- "process_rss_mb": 3651.80078125,
1327
- "cuda_allocated_mb": 140.751953125,
1328
- "cuda_reserved_mb": 12124.0,
1329
- "cuda_max_allocated_mb": 11120.271484375,
1330
- "cuda_max_reserved_mb": 12124.0,
1331
- "gpu_util_percent": 86.0,
1332
- "gpu_memory_util_percent": 64.0,
1333
- "gpu_memory_used_mb": 13350.20703125,
1334
- "gpu_memory_total_mb": 16303.0,
1335
- "gpu_temperature_c": 69.0,
1336
- "gpu_power_w": 223.741,
1337
- "elapsed_seconds": 44.69514809999964
1338
- },
1339
- {
1340
- "process_rss_mb": 3651.80078125,
1341
- "cuda_allocated_mb": 126.2216796875,
1342
- "cuda_reserved_mb": 12124.0,
1343
- "cuda_max_allocated_mb": 11120.271484375,
1344
- "cuda_max_reserved_mb": 12124.0,
1345
- "gpu_util_percent": 91.0,
1346
- "gpu_memory_util_percent": 71.0,
1347
- "gpu_memory_used_mb": 13350.20703125,
1348
- "gpu_memory_total_mb": 16303.0,
1349
- "gpu_temperature_c": 70.0,
1350
- "gpu_power_w": 225.243,
1351
- "elapsed_seconds": 45.20475219999207
1352
- },
1353
- {
1354
- "process_rss_mb": 3651.80078125,
1355
- "cuda_allocated_mb": 140.751953125,
1356
- "cuda_reserved_mb": 12124.0,
1357
- "cuda_max_allocated_mb": 11120.271484375,
1358
- "cuda_max_reserved_mb": 12124.0,
1359
- "gpu_util_percent": 65.0,
1360
- "gpu_memory_util_percent": 48.0,
1361
- "gpu_memory_used_mb": 13350.20703125,
1362
- "gpu_memory_total_mb": 16303.0,
1363
- "gpu_temperature_c": 70.0,
1364
- "gpu_power_w": 224.265,
1365
- "elapsed_seconds": 45.714895399985835
1366
- },
1367
- {
1368
- "process_rss_mb": 3651.80078125,
1369
- "cuda_allocated_mb": 140.751953125,
1370
- "cuda_reserved_mb": 12124.0,
1371
- "cuda_max_allocated_mb": 11120.271484375,
1372
- "cuda_max_reserved_mb": 12124.0,
1373
- "gpu_util_percent": 1.0,
1374
- "gpu_memory_util_percent": 2.0,
1375
- "gpu_memory_used_mb": 13350.20703125,
1376
- "gpu_memory_total_mb": 16303.0,
1377
- "gpu_temperature_c": 60.0,
1378
- "gpu_power_w": 199.361,
1379
- "elapsed_seconds": 46.226491599984
1380
- },
1381
- {
1382
- "process_rss_mb": 3651.80078125,
1383
- "cuda_allocated_mb": 140.751953125,
1384
- "cuda_reserved_mb": 12124.0,
1385
- "cuda_max_allocated_mb": 11120.271484375,
1386
- "cuda_max_reserved_mb": 12124.0,
1387
- "gpu_util_percent": 82.0,
1388
- "gpu_memory_util_percent": 64.0,
1389
- "gpu_memory_used_mb": 13350.20703125,
1390
- "gpu_memory_total_mb": 16303.0,
1391
- "gpu_temperature_c": 70.0,
1392
- "gpu_power_w": 183.775,
1393
- "elapsed_seconds": 46.73763419999159
1394
- },
1395
- {
1396
- "process_rss_mb": 3651.80078125,
1397
- "cuda_allocated_mb": 140.751953125,
1398
- "cuda_reserved_mb": 12124.0,
1399
- "cuda_max_allocated_mb": 11120.271484375,
1400
- "cuda_max_reserved_mb": 12124.0,
1401
- "gpu_util_percent": 75.0,
1402
- "gpu_memory_util_percent": 55.0,
1403
- "gpu_memory_used_mb": 13350.20703125,
1404
- "gpu_memory_total_mb": 16303.0,
1405
- "gpu_temperature_c": 71.0,
1406
- "gpu_power_w": 209.171,
1407
- "elapsed_seconds": 47.24767229999998
1408
- },
1409
- {
1410
- "process_rss_mb": 3651.80078125,
1411
- "cuda_allocated_mb": 140.751953125,
1412
- "cuda_reserved_mb": 12124.0,
1413
- "cuda_max_allocated_mb": 11120.271484375,
1414
- "cuda_max_reserved_mb": 12124.0,
1415
- "gpu_util_percent": 100.0,
1416
- "gpu_memory_util_percent": 78.0,
1417
- "gpu_memory_used_mb": 13350.20703125,
1418
- "gpu_memory_total_mb": 16303.0,
1419
- "gpu_temperature_c": 69.0,
1420
- "gpu_power_w": 226.806,
1421
- "elapsed_seconds": 47.7600390999869
1422
- },
1423
- {
1424
- "process_rss_mb": 3651.80078125,
1425
- "cuda_allocated_mb": 140.751953125,
1426
- "cuda_reserved_mb": 12124.0,
1427
- "cuda_max_allocated_mb": 11120.271484375,
1428
- "cuda_max_reserved_mb": 12124.0,
1429
- "gpu_util_percent": 66.0,
1430
- "gpu_memory_util_percent": 49.0,
1431
- "gpu_memory_used_mb": 13350.20703125,
1432
- "gpu_memory_total_mb": 16303.0,
1433
- "gpu_temperature_c": 71.0,
1434
- "gpu_power_w": 223.771,
1435
- "elapsed_seconds": 48.268198999983724
1436
- },
1437
- {
1438
- "process_rss_mb": 3651.80078125,
1439
- "cuda_allocated_mb": 140.751953125,
1440
- "cuda_reserved_mb": 12124.0,
1441
- "cuda_max_allocated_mb": 11120.271484375,
1442
- "cuda_max_reserved_mb": 12124.0,
1443
- "gpu_util_percent": 100.0,
1444
- "gpu_memory_util_percent": 77.0,
1445
- "gpu_memory_used_mb": 13350.20703125,
1446
- "gpu_memory_total_mb": 16303.0,
1447
- "gpu_temperature_c": 71.0,
1448
- "gpu_power_w": 223.924,
1449
- "elapsed_seconds": 48.77904500000295
1450
- },
1451
- {
1452
- "process_rss_mb": 3651.80078125,
1453
- "cuda_allocated_mb": 140.751953125,
1454
- "cuda_reserved_mb": 12124.0,
1455
- "cuda_max_allocated_mb": 11120.271484375,
1456
- "cuda_max_reserved_mb": 12124.0,
1457
- "gpu_util_percent": 66.0,
1458
- "gpu_memory_util_percent": 50.0,
1459
- "gpu_memory_used_mb": 13350.20703125,
1460
- "gpu_memory_total_mb": 16303.0,
1461
- "gpu_temperature_c": 66.0,
1462
- "gpu_power_w": 223.157,
1463
- "elapsed_seconds": 49.28964350000024
1464
- },
1465
- {
1466
- "process_rss_mb": 3651.80078125,
1467
- "cuda_allocated_mb": 140.751953125,
1468
- "cuda_reserved_mb": 12124.0,
1469
- "cuda_max_allocated_mb": 11120.271484375,
1470
- "cuda_max_reserved_mb": 12124.0,
1471
- "gpu_util_percent": 100.0,
1472
- "gpu_memory_util_percent": 76.0,
1473
- "gpu_memory_used_mb": 13350.20703125,
1474
- "gpu_memory_total_mb": 16303.0,
1475
- "gpu_temperature_c": 72.0,
1476
- "gpu_power_w": 227.113,
1477
- "elapsed_seconds": 49.799869399983436
1478
- },
1479
- {
1480
- "process_rss_mb": 3651.80078125,
1481
- "cuda_allocated_mb": 140.751953125,
1482
- "cuda_reserved_mb": 12124.0,
1483
- "cuda_max_allocated_mb": 11120.271484375,
1484
- "cuda_max_reserved_mb": 12124.0,
1485
- "gpu_util_percent": 65.0,
1486
- "gpu_memory_util_percent": 52.0,
1487
- "gpu_memory_used_mb": 13350.20703125,
1488
- "gpu_memory_total_mb": 16303.0,
1489
- "gpu_temperature_c": 66.0,
1490
- "gpu_power_w": 223.839,
1491
- "elapsed_seconds": 50.30973179999273
1492
- },
1493
- {
1494
- "process_rss_mb": 3651.80078125,
1495
- "cuda_allocated_mb": 584.3232421875,
1496
- "cuda_reserved_mb": 12124.0,
1497
- "cuda_max_allocated_mb": 11120.271484375,
1498
- "cuda_max_reserved_mb": 12124.0,
1499
- "gpu_util_percent": 96.0,
1500
- "gpu_memory_util_percent": 72.0,
1501
- "gpu_memory_used_mb": 13350.20703125,
1502
- "gpu_memory_total_mb": 16303.0,
1503
- "gpu_temperature_c": 71.0,
1504
- "gpu_power_w": 227.682,
1505
- "elapsed_seconds": 50.81797879998339
1506
- },
1507
- {
1508
- "process_rss_mb": 3651.80078125,
1509
- "cuda_allocated_mb": 140.751953125,
1510
- "cuda_reserved_mb": 12124.0,
1511
- "cuda_max_allocated_mb": 11120.271484375,
1512
- "cuda_max_reserved_mb": 12124.0,
1513
- "gpu_util_percent": 83.0,
1514
- "gpu_memory_util_percent": 64.0,
1515
- "gpu_memory_used_mb": 13350.20703125,
1516
- "gpu_memory_total_mb": 16303.0,
1517
- "gpu_temperature_c": 71.0,
1518
- "gpu_power_w": 224.619,
1519
- "elapsed_seconds": 51.32866830000421
1520
- },
1521
- {
1522
- "process_rss_mb": 3651.80078125,
1523
- "cuda_allocated_mb": 367.50244140625,
1524
- "cuda_reserved_mb": 12124.0,
1525
- "cuda_max_allocated_mb": 11120.271484375,
1526
- "cuda_max_reserved_mb": 12124.0,
1527
- "gpu_util_percent": 75.0,
1528
- "gpu_memory_util_percent": 56.0,
1529
- "gpu_memory_used_mb": 13350.20703125,
1530
- "gpu_memory_total_mb": 16303.0,
1531
- "gpu_temperature_c": 72.0,
1532
- "gpu_power_w": 225.161,
1533
- "elapsed_seconds": 51.840583999990486
1534
- },
1535
- {
1536
- "process_rss_mb": 3651.80078125,
1537
- "cuda_allocated_mb": 140.751953125,
1538
- "cuda_reserved_mb": 12124.0,
1539
- "cuda_max_allocated_mb": 11120.271484375,
1540
- "cuda_max_reserved_mb": 12124.0,
1541
- "gpu_util_percent": 100.0,
1542
- "gpu_memory_util_percent": 78.0,
1543
- "gpu_memory_used_mb": 13350.20703125,
1544
- "gpu_memory_total_mb": 16303.0,
1545
- "gpu_temperature_c": 70.0,
1546
- "gpu_power_w": 227.579,
1547
- "elapsed_seconds": 52.35056530000293
1548
- },
1549
- {
1550
- "process_rss_mb": 3651.80078125,
1551
- "cuda_allocated_mb": 1088.0693359375,
1552
- "cuda_reserved_mb": 12124.0,
1553
- "cuda_max_allocated_mb": 11120.271484375,
1554
- "cuda_max_reserved_mb": 12124.0,
1555
- "gpu_util_percent": 64.0,
1556
- "gpu_memory_util_percent": 47.0,
1557
- "gpu_memory_used_mb": 13350.20703125,
1558
- "gpu_memory_total_mb": 16303.0,
1559
- "gpu_temperature_c": 71.0,
1560
- "gpu_power_w": 222.789,
1561
- "elapsed_seconds": 52.860525099997176
1562
- },
1563
- {
1564
- "process_rss_mb": 3651.80078125,
1565
- "cuda_allocated_mb": 140.751953125,
1566
- "cuda_reserved_mb": 12124.0,
1567
- "cuda_max_allocated_mb": 11120.271484375,
1568
- "cuda_max_reserved_mb": 12124.0,
1569
- "gpu_util_percent": 100.0,
1570
- "gpu_memory_util_percent": 76.0,
1571
- "gpu_memory_used_mb": 13350.20703125,
1572
- "gpu_memory_total_mb": 16303.0,
1573
- "gpu_temperature_c": 71.0,
1574
- "gpu_power_w": 227.236,
1575
- "elapsed_seconds": 53.37119559998973
1576
- },
1577
- {
1578
- "process_rss_mb": 3651.80078125,
1579
- "cuda_allocated_mb": 3840.9306640625,
1580
- "cuda_reserved_mb": 12124.0,
1581
- "cuda_max_allocated_mb": 11120.271484375,
1582
- "cuda_max_reserved_mb": 12124.0,
1583
- "gpu_util_percent": 63.0,
1584
- "gpu_memory_util_percent": 49.0,
1585
- "gpu_memory_used_mb": 13350.20703125,
1586
- "gpu_memory_total_mb": 16303.0,
1587
- "gpu_temperature_c": 66.0,
1588
- "gpu_power_w": 221.349,
1589
- "elapsed_seconds": 53.881874099985
1590
- },
1591
- {
1592
- "process_rss_mb": 3651.80078125,
1593
- "cuda_allocated_mb": 140.751953125,
1594
- "cuda_reserved_mb": 12124.0,
1595
- "cuda_max_allocated_mb": 11120.271484375,
1596
- "cuda_max_reserved_mb": 12124.0,
1597
- "gpu_util_percent": 100.0,
1598
- "gpu_memory_util_percent": 75.0,
1599
- "gpu_memory_used_mb": 13350.20703125,
1600
- "gpu_memory_total_mb": 16303.0,
1601
- "gpu_temperature_c": 71.0,
1602
- "gpu_power_w": 227.584,
1603
- "elapsed_seconds": 54.3920272999967
1604
- },
1605
- {
1606
- "process_rss_mb": 3651.80078125,
1607
- "cuda_allocated_mb": 4297.05078125,
1608
- "cuda_reserved_mb": 12124.0,
1609
- "cuda_max_allocated_mb": 11120.271484375,
1610
- "cuda_max_reserved_mb": 12124.0,
1611
- "gpu_util_percent": 65.0,
1612
- "gpu_memory_util_percent": 51.0,
1613
- "gpu_memory_used_mb": 13350.20703125,
1614
- "gpu_memory_total_mb": 16303.0,
1615
- "gpu_temperature_c": 67.0,
1616
- "gpu_power_w": 224.702,
1617
- "elapsed_seconds": 54.902947499998845
1618
- },
1619
- {
1620
- "process_rss_mb": 3651.80078125,
1621
- "cuda_allocated_mb": 140.751953125,
1622
- "cuda_reserved_mb": 12124.0,
1623
- "cuda_max_allocated_mb": 11120.271484375,
1624
- "cuda_max_reserved_mb": 12124.0,
1625
- "gpu_util_percent": 95.0,
1626
- "gpu_memory_util_percent": 70.0,
1627
- "gpu_memory_used_mb": 13350.20703125,
1628
- "gpu_memory_total_mb": 16303.0,
1629
- "gpu_temperature_c": 71.0,
1630
- "gpu_power_w": 224.96,
1631
- "elapsed_seconds": 55.4142581000051
1632
- },
1633
- {
1634
- "process_rss_mb": 3651.80078125,
1635
- "cuda_allocated_mb": 7386.53857421875,
1636
- "cuda_reserved_mb": 12124.0,
1637
- "cuda_max_allocated_mb": 11120.271484375,
1638
- "cuda_max_reserved_mb": 12124.0,
1639
- "gpu_util_percent": 83.0,
1640
- "gpu_memory_util_percent": 65.0,
1641
- "gpu_memory_used_mb": 13350.20703125,
1642
- "gpu_memory_total_mb": 16303.0,
1643
- "gpu_temperature_c": 72.0,
1644
- "gpu_power_w": 229.283,
1645
- "elapsed_seconds": 55.92322180001065
1646
- },
1647
- {
1648
- "process_rss_mb": 3651.80078125,
1649
- "cuda_allocated_mb": 140.751953125,
1650
- "cuda_reserved_mb": 12124.0,
1651
- "cuda_max_allocated_mb": 11120.271484375,
1652
- "cuda_max_reserved_mb": 12124.0,
1653
- "gpu_util_percent": 73.0,
1654
- "gpu_memory_util_percent": 54.0,
1655
- "gpu_memory_used_mb": 13350.20703125,
1656
- "gpu_memory_total_mb": 16303.0,
1657
- "gpu_temperature_c": 73.0,
1658
- "gpu_power_w": 225.416,
1659
- "elapsed_seconds": 56.434711399982916
1660
- },
1661
- {
1662
- "process_rss_mb": 3651.80078125,
1663
- "cuda_allocated_mb": 8115.15771484375,
1664
- "cuda_reserved_mb": 12124.0,
1665
- "cuda_max_allocated_mb": 11120.271484375,
1666
- "cuda_max_reserved_mb": 12124.0,
1667
- "gpu_util_percent": 100.0,
1668
- "gpu_memory_util_percent": 78.0,
1669
- "gpu_memory_used_mb": 13350.20703125,
1670
- "gpu_memory_total_mb": 16303.0,
1671
- "gpu_temperature_c": 71.0,
1672
- "gpu_power_w": 227.937,
1673
- "elapsed_seconds": 56.94552159999148
1674
- },
1675
- {
1676
- "process_rss_mb": 3651.80078125,
1677
- "cuda_allocated_mb": 140.751953125,
1678
- "cuda_reserved_mb": 12124.0,
1679
- "cuda_max_allocated_mb": 11120.271484375,
1680
- "cuda_max_reserved_mb": 12124.0,
1681
- "gpu_util_percent": 67.0,
1682
- "gpu_memory_util_percent": 48.0,
1683
- "gpu_memory_used_mb": 13350.20703125,
1684
- "gpu_memory_total_mb": 16303.0,
1685
- "gpu_temperature_c": 72.0,
1686
- "gpu_power_w": 223.304,
1687
- "elapsed_seconds": 57.45550720000756
1688
- },
1689
- {
1690
- "process_rss_mb": 3651.80078125,
1691
- "cuda_allocated_mb": 10308.271484375,
1692
- "cuda_reserved_mb": 12124.0,
1693
- "cuda_max_allocated_mb": 11120.271484375,
1694
- "cuda_max_reserved_mb": 12124.0,
1695
- "gpu_util_percent": 100.0,
1696
- "gpu_memory_util_percent": 77.0,
1697
- "gpu_memory_used_mb": 13350.20703125,
1698
- "gpu_memory_total_mb": 16303.0,
1699
- "gpu_temperature_c": 72.0,
1700
- "gpu_power_w": 225.256,
1701
- "elapsed_seconds": 57.96506930000032
1702
- },
1703
- {
1704
- "process_rss_mb": 3651.80078125,
1705
- "cuda_allocated_mb": 140.751953125,
1706
- "cuda_reserved_mb": 12124.0,
1707
- "cuda_max_allocated_mb": 11120.271484375,
1708
- "cuda_max_reserved_mb": 12124.0,
1709
- "gpu_util_percent": 67.0,
1710
- "gpu_memory_util_percent": 50.0,
1711
- "gpu_memory_used_mb": 13350.20703125,
1712
- "gpu_memory_total_mb": 16303.0,
1713
- "gpu_temperature_c": 66.0,
1714
- "gpu_power_w": 224.621,
1715
- "elapsed_seconds": 58.47955930000171
1716
- },
1717
- {
1718
- "process_rss_mb": 3651.80078125,
1719
- "cuda_allocated_mb": 140.751953125,
1720
- "cuda_reserved_mb": 12124.0,
1721
- "cuda_max_allocated_mb": 11120.271484375,
1722
- "cuda_max_reserved_mb": 12124.0,
1723
- "gpu_util_percent": 37.0,
1724
- "gpu_memory_util_percent": 28.0,
1725
- "gpu_memory_used_mb": 13350.20703125,
1726
- "gpu_memory_total_mb": 16303.0,
1727
- "gpu_temperature_c": 66.0,
1728
- "gpu_power_w": 176.663,
1729
- "elapsed_seconds": 58.9897634999943
1730
- },
1731
- {
1732
- "process_rss_mb": 3651.80078125,
1733
- "cuda_allocated_mb": 140.751953125,
1734
- "cuda_reserved_mb": 12124.0,
1735
- "cuda_max_allocated_mb": 11120.271484375,
1736
- "cuda_max_reserved_mb": 12124.0,
1737
- "gpu_util_percent": 100.0,
1738
- "gpu_memory_util_percent": 76.0,
1739
- "gpu_memory_used_mb": 13350.20703125,
1740
- "gpu_memory_total_mb": 16303.0,
1741
- "gpu_temperature_c": 72.0,
1742
- "gpu_power_w": 184.69,
1743
- "elapsed_seconds": 59.49904029999743
1744
- },
1745
- {
1746
- "process_rss_mb": 3651.80078125,
1747
- "cuda_allocated_mb": 140.751953125,
1748
- "cuda_reserved_mb": 12124.0,
1749
- "cuda_max_allocated_mb": 11120.271484375,
1750
- "cuda_max_reserved_mb": 12124.0,
1751
- "gpu_util_percent": 64.0,
1752
- "gpu_memory_util_percent": 50.0,
1753
- "gpu_memory_used_mb": 13350.20703125,
1754
- "gpu_memory_total_mb": 16303.0,
1755
- "gpu_temperature_c": 66.0,
1756
- "gpu_power_w": 216.346,
1757
- "elapsed_seconds": 60.01004590000957
1758
- },
1759
- {
1760
- "process_rss_mb": 3651.80078125,
1761
- "cuda_allocated_mb": 140.751953125,
1762
- "cuda_reserved_mb": 12124.0,
1763
- "cuda_max_allocated_mb": 11120.271484375,
1764
- "cuda_max_reserved_mb": 12124.0,
1765
- "gpu_util_percent": 100.0,
1766
- "gpu_memory_util_percent": 74.0,
1767
- "gpu_memory_used_mb": 13350.20703125,
1768
- "gpu_memory_total_mb": 16303.0,
1769
- "gpu_temperature_c": 72.0,
1770
- "gpu_power_w": 229.064,
1771
- "elapsed_seconds": 60.51880980000715
1772
- },
1773
- {
1774
- "process_rss_mb": 3651.80078125,
1775
- "cuda_allocated_mb": 140.751953125,
1776
- "cuda_reserved_mb": 12124.0,
1777
- "cuda_max_allocated_mb": 11120.271484375,
1778
- "cuda_max_reserved_mb": 12124.0,
1779
- "gpu_util_percent": 71.0,
1780
- "gpu_memory_util_percent": 55.0,
1781
- "gpu_memory_used_mb": 13350.20703125,
1782
- "gpu_memory_total_mb": 16303.0,
1783
- "gpu_temperature_c": 69.0,
1784
- "gpu_power_w": 228.047,
1785
- "elapsed_seconds": 61.02798049998819
1786
- },
1787
- {
1788
- "process_rss_mb": 3651.80078125,
1789
- "cuda_allocated_mb": 140.751953125,
1790
- "cuda_reserved_mb": 12124.0,
1791
- "cuda_max_allocated_mb": 11120.271484375,
1792
- "cuda_max_reserved_mb": 12124.0,
1793
- "gpu_util_percent": 93.0,
1794
- "gpu_memory_util_percent": 73.0,
1795
- "gpu_memory_used_mb": 13350.20703125,
1796
- "gpu_memory_total_mb": 16303.0,
1797
- "gpu_temperature_c": 71.0,
1798
- "gpu_power_w": 229.588,
1799
- "elapsed_seconds": 61.54037430000608
1800
- },
1801
- {
1802
- "process_rss_mb": 3651.80078125,
1803
- "cuda_allocated_mb": 140.751953125,
1804
- "cuda_reserved_mb": 12124.0,
1805
- "cuda_max_allocated_mb": 11120.271484375,
1806
- "cuda_max_reserved_mb": 12124.0,
1807
- "gpu_util_percent": 66.0,
1808
- "gpu_memory_util_percent": 47.0,
1809
- "gpu_memory_used_mb": 13350.20703125,
1810
- "gpu_memory_total_mb": 16303.0,
1811
- "gpu_temperature_c": 73.0,
1812
- "gpu_power_w": 225.498,
1813
- "elapsed_seconds": 62.049058300006436
1814
- },
1815
- {
1816
- "process_rss_mb": 3651.80078125,
1817
- "cuda_allocated_mb": 140.751953125,
1818
- "cuda_reserved_mb": 12124.0,
1819
- "cuda_max_allocated_mb": 11120.271484375,
1820
- "cuda_max_reserved_mb": 12124.0,
1821
- "gpu_util_percent": 100.0,
1822
- "gpu_memory_util_percent": 78.0,
1823
- "gpu_memory_used_mb": 13350.20703125,
1824
- "gpu_memory_total_mb": 16303.0,
1825
- "gpu_temperature_c": 72.0,
1826
- "gpu_power_w": 228.11,
1827
- "elapsed_seconds": 62.55962849999196
1828
- },
1829
- {
1830
- "process_rss_mb": 3651.80078125,
1831
- "cuda_allocated_mb": 140.751953125,
1832
- "cuda_reserved_mb": 12124.0,
1833
- "cuda_max_allocated_mb": 11120.271484375,
1834
- "cuda_max_reserved_mb": 12124.0,
1835
- "gpu_util_percent": 66.0,
1836
- "gpu_memory_util_percent": 49.0,
1837
- "gpu_memory_used_mb": 13350.20703125,
1838
- "gpu_memory_total_mb": 16303.0,
1839
- "gpu_temperature_c": 70.0,
1840
- "gpu_power_w": 222.692,
1841
- "elapsed_seconds": 63.07071030000225
1842
- },
1843
- {
1844
- "process_rss_mb": 3651.80078125,
1845
- "cuda_allocated_mb": 824.71728515625,
1846
- "cuda_reserved_mb": 12124.0,
1847
- "cuda_max_allocated_mb": 11120.271484375,
1848
- "cuda_max_reserved_mb": 12124.0,
1849
- "gpu_util_percent": 100.0,
1850
- "gpu_memory_util_percent": 77.0,
1851
- "gpu_memory_used_mb": 13350.20703125,
1852
- "gpu_memory_total_mb": 16303.0,
1853
- "gpu_temperature_c": 73.0,
1854
- "gpu_power_w": 228.831,
1855
- "elapsed_seconds": 63.5797109999985
1856
- },
1857
- {
1858
- "process_rss_mb": 3651.80078125,
1859
- "cuda_allocated_mb": 140.751953125,
1860
- "cuda_reserved_mb": 12124.0,
1861
- "cuda_max_allocated_mb": 11120.271484375,
1862
- "cuda_max_reserved_mb": 12124.0,
1863
- "gpu_util_percent": 66.0,
1864
- "gpu_memory_util_percent": 50.0,
1865
- "gpu_memory_used_mb": 13350.20703125,
1866
- "gpu_memory_total_mb": 16303.0,
1867
- "gpu_temperature_c": 66.0,
1868
- "gpu_power_w": 224.853,
1869
- "elapsed_seconds": 64.08942380000371
1870
- },
1871
- {
1872
- "process_rss_mb": 3659.08203125,
1873
- "cuda_allocated_mb": 140.751953125,
1874
- "cuda_reserved_mb": 12124.0,
1875
- "cuda_max_allocated_mb": 11120.271484375,
1876
- "cuda_max_reserved_mb": 12124.0,
1877
- "gpu_util_percent": 2.0,
1878
- "gpu_memory_util_percent": 2.0,
1879
- "gpu_memory_used_mb": 13350.20703125,
1880
- "gpu_memory_total_mb": 16303.0,
1881
- "gpu_temperature_c": 63.0,
1882
- "gpu_power_w": 205.274,
1883
- "elapsed_seconds": 64.60033210000256
1884
- },
1885
- {
1886
- "process_rss_mb": 3659.08203125,
1887
- "cuda_allocated_mb": 8194.51708984375,
1888
- "cuda_reserved_mb": 12124.0,
1889
- "cuda_max_allocated_mb": 11120.271484375,
1890
- "cuda_max_reserved_mb": 12124.0,
1891
- "gpu_util_percent": 84.0,
1892
- "gpu_memory_util_percent": 62.0,
1893
- "gpu_memory_used_mb": 13350.20703125,
1894
- "gpu_memory_total_mb": 16303.0,
1895
- "gpu_temperature_c": 72.0,
1896
- "gpu_power_w": 208.652,
1897
- "elapsed_seconds": 65.11013029998867
1898
- },
1899
- {
1900
- "process_rss_mb": 3659.08203125,
1901
- "cuda_allocated_mb": 140.751953125,
1902
- "cuda_reserved_mb": 12124.0,
1903
- "cuda_max_allocated_mb": 11120.271484375,
1904
- "cuda_max_reserved_mb": 12124.0,
1905
- "gpu_util_percent": 95.0,
1906
- "gpu_memory_util_percent": 74.0,
1907
- "gpu_memory_used_mb": 13350.20703125,
1908
- "gpu_memory_total_mb": 16303.0,
1909
- "gpu_temperature_c": 73.0,
1910
- "gpu_power_w": 215.071,
1911
- "elapsed_seconds": 65.6213666999829
1912
- },
1913
- {
1914
- "process_rss_mb": 3659.08203125,
1915
- "cuda_allocated_mb": 4765.51025390625,
1916
- "cuda_reserved_mb": 12124.0,
1917
- "cuda_max_allocated_mb": 11120.271484375,
1918
- "cuda_max_reserved_mb": 12124.0,
1919
- "gpu_util_percent": 65.0,
1920
- "gpu_memory_util_percent": 47.0,
1921
- "gpu_memory_used_mb": 13350.20703125,
1922
- "gpu_memory_total_mb": 16303.0,
1923
- "gpu_temperature_c": 72.0,
1924
- "gpu_power_w": 225.985,
1925
- "elapsed_seconds": 66.13057750000735
1926
- },
1927
- {
1928
- "process_rss_mb": 3659.08203125,
1929
- "cuda_allocated_mb": 140.751953125,
1930
- "cuda_reserved_mb": 12124.0,
1931
- "cuda_max_allocated_mb": 11120.271484375,
1932
- "cuda_max_reserved_mb": 12124.0,
1933
- "gpu_util_percent": 100.0,
1934
- "gpu_memory_util_percent": 78.0,
1935
- "gpu_memory_used_mb": 13350.20703125,
1936
- "gpu_memory_total_mb": 16303.0,
1937
- "gpu_temperature_c": 72.0,
1938
- "gpu_power_w": 227.795,
1939
- "elapsed_seconds": 66.6385944999929
1940
- },
1941
- {
1942
- "process_rss_mb": 3659.06640625,
1943
- "cuda_allocated_mb": 1300.37890625,
1944
- "cuda_reserved_mb": 12124.0,
1945
- "cuda_max_allocated_mb": 11120.271484375,
1946
- "cuda_max_reserved_mb": 12124.0,
1947
- "gpu_util_percent": 65.0,
1948
- "gpu_memory_util_percent": 49.0,
1949
- "gpu_memory_used_mb": 13350.20703125,
1950
- "gpu_memory_total_mb": 16303.0,
1951
- "gpu_temperature_c": 71.0,
1952
- "gpu_power_w": 224.744,
1953
- "elapsed_seconds": 67.14924009999959
1954
- },
1955
- {
1956
- "process_rss_mb": 3659.06640625,
1957
- "cuda_allocated_mb": 140.751953125,
1958
- "cuda_reserved_mb": 12124.0,
1959
- "cuda_max_allocated_mb": 11120.271484375,
1960
- "cuda_max_reserved_mb": 12124.0,
1961
- "gpu_util_percent": 100.0,
1962
- "gpu_memory_util_percent": 76.0,
1963
- "gpu_memory_used_mb": 13350.20703125,
1964
- "gpu_memory_total_mb": 16303.0,
1965
- "gpu_temperature_c": 73.0,
1966
- "gpu_power_w": 228.754,
1967
- "elapsed_seconds": 67.65655809998862
1968
- },
1969
- {
1970
- "process_rss_mb": 3659.06640625,
1971
- "cuda_allocated_mb": 126.2216796875,
1972
- "cuda_reserved_mb": 12124.0,
1973
- "cuda_max_allocated_mb": 11120.271484375,
1974
- "cuda_max_reserved_mb": 12124.0,
1975
- "gpu_util_percent": 66.0,
1976
- "gpu_memory_util_percent": 51.0,
1977
- "gpu_memory_used_mb": 13350.20703125,
1978
- "gpu_memory_total_mb": 16303.0,
1979
- "gpu_temperature_c": 67.0,
1980
- "gpu_power_w": 224.511,
1981
- "elapsed_seconds": 68.16678259999026
1982
- },
1983
- {
1984
- "process_rss_mb": 3659.06640625,
1985
- "cuda_allocated_mb": 140.751953125,
1986
- "cuda_reserved_mb": 12124.0,
1987
- "cuda_max_allocated_mb": 11120.271484375,
1988
- "cuda_max_reserved_mb": 12124.0,
1989
- "gpu_util_percent": 100.0,
1990
- "gpu_memory_util_percent": 74.0,
1991
- "gpu_memory_used_mb": 13350.20703125,
1992
- "gpu_memory_total_mb": 16303.0,
1993
- "gpu_temperature_c": 73.0,
1994
- "gpu_power_w": 225.465,
1995
- "elapsed_seconds": 68.67621050000889
1996
- },
1997
- {
1998
- "process_rss_mb": 3651.48046875,
1999
- "cuda_allocated_mb": 140.751953125,
2000
- "cuda_reserved_mb": 12124.0,
2001
- "cuda_max_allocated_mb": 11120.271484375,
2002
- "cuda_max_reserved_mb": 12124.0,
2003
- "gpu_util_percent": 74.0,
2004
- "gpu_memory_util_percent": 57.0,
2005
- "gpu_memory_used_mb": 13350.20703125,
2006
- "gpu_memory_total_mb": 16303.0,
2007
- "gpu_temperature_c": 71.0,
2008
- "gpu_power_w": 228.605,
2009
- "elapsed_seconds": 69.18651979998685
2010
  }
2011
  ],
2012
- "step_samples_per_second_avg": 8362.14873912102,
2013
- "step_samples_per_second_max": 8602.940462013845,
2014
- "step_samples_per_second_min": 7900.4246698690495,
2015
- "step_tokens_per_second_avg": 1070355.0386074905,
2016
- "step_tokens_per_second_max": 1101176.3791377721,
2017
- "step_tokens_per_second_min": 1011254.3577432383,
2018
- "step_process_rss_mb_avg": 3626.9251302083335,
2019
- "step_process_rss_mb_max": 3651.80078125,
2020
- "step_process_rss_mb_min": 3596.62890625,
2021
  "step_cuda_max_allocated_mb_avg": 11120.271484375,
2022
  "step_cuda_max_allocated_mb_max": 11120.271484375,
2023
  "step_cuda_max_allocated_mb_min": 11120.271484375,
2024
- "step_gpu_util_percent_avg": 70.5,
2025
  "step_gpu_util_percent_max": 100.0,
2026
- "step_gpu_util_percent_min": 63.0,
2027
- "step_gpu_memory_util_percent_avg": 53.5,
2028
- "step_gpu_memory_util_percent_max": 77.0,
2029
- "step_gpu_memory_util_percent_min": 48.0,
2030
- "step_gpu_power_w_avg": 220.1005,
2031
- "step_gpu_power_w_max": 224.853,
2032
- "step_gpu_power_w_min": 214.297,
2033
- "step_gpu_temperature_c_avg": 63.5,
2034
- "step_gpu_temperature_c_max": 68.0,
2035
- "step_gpu_temperature_c_min": 57.0,
2036
- "background_process_rss_mb_avg": 3625.386863425926,
2037
- "background_process_rss_mb_max": 3659.08203125,
2038
- "background_process_rss_mb_min": 3310.83203125,
2039
- "background_cuda_max_allocated_mb_avg": 11119.763834635416,
2040
  "background_cuda_max_allocated_mb_max": 11120.271484375,
2041
  "background_cuda_max_allocated_mb_min": 11051.73876953125,
2042
- "background_gpu_util_percent_avg": 78.32592592592593,
2043
  "background_gpu_util_percent_max": 100.0,
2044
- "background_gpu_util_percent_min": 0.0,
2045
- "background_gpu_memory_util_percent_avg": 59.385185185185186,
2046
- "background_gpu_memory_util_percent_max": 79.0,
2047
- "background_gpu_memory_util_percent_min": 2.0,
2048
- "background_gpu_power_w_avg": 214.2482,
2049
- "background_gpu_power_w_max": 229.588,
2050
- "background_gpu_power_w_min": 57.852,
2051
- "background_gpu_temperature_c_avg": 65.91111111111111,
2052
- "background_gpu_temperature_c_max": 73.0,
2053
- "background_gpu_temperature_c_min": 51.0,
2054
- "samples_per_second_avg": 8362.14873912102,
2055
- "samples_per_second_max": 8602.940462013845,
2056
- "tokens_per_second_avg": 1070355.0386074905,
2057
- "tokens_per_second_max": 1101176.3791377721,
2058
- "process_rss_mb_avg": 3626.9251302083335,
2059
- "process_rss_mb_max": 3651.80078125,
2060
  "cuda_max_allocated_mb_avg": 11120.271484375,
2061
  "cuda_max_allocated_mb_max": 11120.271484375,
2062
- "gpu_util_percent_avg": 70.5,
2063
  "gpu_util_percent_max": 100.0,
2064
- "gpu_memory_util_percent_avg": 53.5,
2065
- "gpu_memory_util_percent_max": 77.0,
2066
- "gpu_power_w_avg": 220.1005,
2067
- "gpu_power_w_max": 224.853,
2068
- "gpu_temperature_c_avg": 63.5,
2069
- "gpu_temperature_c_max": 68.0
2070
  }
 
1
  {
2
+ "sample_count": 1,
3
  "samples": [
4
  {
5
  "step": 50.0,
6
+ "elapsed_seconds": 11.495871799997985,
7
+ "window_seconds": 11.495871799997985,
8
+ "steps_per_second": 4.349387403572886,
9
+ "samples_per_second": 7794.102227202612,
10
+ "tokens_per_second": 997645.0850819343,
11
+ "process_rss_mb": 2683.17578125,
12
  "cuda_allocated_mb": 122.5029296875,
13
  "cuda_reserved_mb": 12124.0,
14
  "cuda_max_allocated_mb": 11120.271484375,
15
  "cuda_max_reserved_mb": 12124.0,
16
  "gpu_util_percent": 100.0,
17
+ "gpu_memory_util_percent": 75.0,
18
+ "gpu_memory_used_mb": 13744.27734375,
19
  "gpu_memory_total_mb": 16303.0,
20
+ "gpu_temperature_c": 62.0,
21
+ "gpu_power_w": 217.742
22
  }
23
  ],
24
+ "background_sample_count": 32,
25
  "background_samples": [
26
  {
27
+ "process_rss_mb": 2403.41796875,
28
+ "cuda_allocated_mb": 10657.99755859375,
29
  "cuda_reserved_mb": 11200.0,
30
  "cuda_max_allocated_mb": 11051.73876953125,
31
  "cuda_max_reserved_mb": 11200.0,
32
+ "gpu_util_percent": 14.0,
33
+ "gpu_memory_util_percent": 5.0,
34
+ "gpu_memory_used_mb": 12818.02734375,
35
  "gpu_memory_total_mb": 16303.0,
36
+ "gpu_temperature_c": 45.0,
37
+ "gpu_power_w": 42.684,
38
+ "elapsed_seconds": 0.541983800008893
39
  },
40
  {
41
+ "process_rss_mb": 2682.99609375,
42
  "cuda_allocated_mb": 126.2216796875,
43
  "cuda_reserved_mb": 12124.0,
44
  "cuda_max_allocated_mb": 11120.271484375,
45
  "cuda_max_reserved_mb": 12124.0,
46
+ "gpu_util_percent": 23.0,
47
+ "gpu_memory_util_percent": 17.0,
48
+ "gpu_memory_used_mb": 13744.02734375,
49
  "gpu_memory_total_mb": 16303.0,
50
+ "gpu_temperature_c": 49.0,
51
+ "gpu_power_w": 80.492,
52
+ "elapsed_seconds": 1.0560631999978796
53
  },
54
  {
55
+ "process_rss_mb": 2683.15625,
56
  "cuda_allocated_mb": 141.64453125,
57
  "cuda_reserved_mb": 12124.0,
58
  "cuda_max_allocated_mb": 11120.271484375,
59
  "cuda_max_reserved_mb": 12124.0,
60
+ "gpu_util_percent": 84.0,
61
+ "gpu_memory_util_percent": 65.0,
62
+ "gpu_memory_used_mb": 13744.02734375,
63
  "gpu_memory_total_mb": 16303.0,
64
+ "gpu_temperature_c": 59.0,
65
+ "gpu_power_w": 151.182,
66
+ "elapsed_seconds": 1.5636136999819428
67
  },
68
  {
69
+ "process_rss_mb": 2683.171875,
70
  "cuda_allocated_mb": 126.2216796875,
71
  "cuda_reserved_mb": 12124.0,
72
  "cuda_max_allocated_mb": 11120.271484375,
73
  "cuda_max_reserved_mb": 12124.0,
74
+ "gpu_util_percent": 67.0,
75
+ "gpu_memory_util_percent": 48.0,
76
+ "gpu_memory_used_mb": 13744.02734375,
77
  "gpu_memory_total_mb": 16303.0,
78
  "gpu_temperature_c": 58.0,
79
+ "gpu_power_w": 198.581,
80
+ "elapsed_seconds": 2.090296700014733
81
  },
82
  {
83
+ "process_rss_mb": 2683.171875,
84
  "cuda_allocated_mb": 141.64453125,
85
  "cuda_reserved_mb": 12124.0,
86
  "cuda_max_allocated_mb": 11120.271484375,
87
  "cuda_max_reserved_mb": 12124.0,
88
+ "gpu_util_percent": 100.0,
89
+ "gpu_memory_util_percent": 77.0,
90
+ "gpu_memory_used_mb": 13744.02734375,
91
  "gpu_memory_total_mb": 16303.0,
92
+ "gpu_temperature_c": 58.0,
93
+ "gpu_power_w": 215.4,
94
+ "elapsed_seconds": 2.5990715000079945
95
  },
96
  {
97
+ "process_rss_mb": 2683.171875,
98
  "cuda_allocated_mb": 126.2216796875,
99
  "cuda_reserved_mb": 12124.0,
100
  "cuda_max_allocated_mb": 11120.271484375,
101
  "cuda_max_reserved_mb": 12124.0,
102
+ "gpu_util_percent": 63.0,
103
+ "gpu_memory_util_percent": 47.0,
104
+ "gpu_memory_used_mb": 13744.02734375,
105
  "gpu_memory_total_mb": 16303.0,
106
+ "gpu_temperature_c": 54.0,
107
+ "gpu_power_w": 209.578,
108
+ "elapsed_seconds": 3.1307488000020385
109
  },
110
  {
111
+ "process_rss_mb": 2683.171875,
112
  "cuda_allocated_mb": 141.64453125,
113
  "cuda_reserved_mb": 12124.0,
114
  "cuda_max_allocated_mb": 11120.271484375,
115
  "cuda_max_reserved_mb": 12124.0,
116
+ "gpu_util_percent": 100.0,
117
+ "gpu_memory_util_percent": 74.0,
118
+ "gpu_memory_used_mb": 13744.02734375,
119
  "gpu_memory_total_mb": 16303.0,
120
+ "gpu_temperature_c": 59.0,
121
+ "gpu_power_w": 215.475,
122
+ "elapsed_seconds": 3.644748599966988
123
  },
124
  {
125
+ "process_rss_mb": 2683.171875,
126
  "cuda_allocated_mb": 126.2216796875,
127
  "cuda_reserved_mb": 12124.0,
128
  "cuda_max_allocated_mb": 11120.271484375,
129
  "cuda_max_reserved_mb": 12124.0,
130
+ "gpu_util_percent": 1.0,
131
+ "gpu_memory_util_percent": 0.0,
132
+ "gpu_memory_used_mb": 13744.02734375,
133
  "gpu_memory_total_mb": 16303.0,
134
+ "gpu_temperature_c": 51.0,
135
+ "gpu_power_w": 199.707,
136
+ "elapsed_seconds": 4.402358899998944
137
  },
138
  {
139
+ "process_rss_mb": 2683.171875,
140
  "cuda_allocated_mb": 141.64453125,
141
  "cuda_reserved_mb": 12124.0,
142
  "cuda_max_allocated_mb": 11120.271484375,
143
  "cuda_max_reserved_mb": 12124.0,
144
+ "gpu_util_percent": 100.0,
145
+ "gpu_memory_util_percent": 76.0,
146
+ "gpu_memory_used_mb": 13744.02734375,
147
  "gpu_memory_total_mb": 16303.0,
148
+ "gpu_temperature_c": 60.0,
149
+ "gpu_power_w": 178.762,
150
+ "elapsed_seconds": 4.912791399983689
151
  },
152
  {
153
+ "process_rss_mb": 2683.17578125,
154
  "cuda_allocated_mb": 141.64453125,
155
  "cuda_reserved_mb": 12124.0,
156
  "cuda_max_allocated_mb": 11120.271484375,
157
  "cuda_max_reserved_mb": 12124.0,
158
+ "gpu_util_percent": 67.0,
159
  "gpu_memory_util_percent": 51.0,
160
+ "gpu_memory_used_mb": 13744.02734375,
161
  "gpu_memory_total_mb": 16303.0,
162
+ "gpu_temperature_c": 57.0,
163
+ "gpu_power_w": 187.267,
164
+ "elapsed_seconds": 5.420690300001297
165
  },
166
  {
167
+ "process_rss_mb": 2683.17578125,
168
  "cuda_allocated_mb": 141.64453125,
169
  "cuda_reserved_mb": 12124.0,
170
  "cuda_max_allocated_mb": 11120.271484375,
171
  "cuda_max_reserved_mb": 12124.0,
172
+ "gpu_util_percent": 82.0,
173
+ "gpu_memory_util_percent": 59.0,
174
+ "gpu_memory_used_mb": 13749.71484375,
175
  "gpu_memory_total_mb": 16303.0,
176
  "gpu_temperature_c": 61.0,
177
+ "gpu_power_w": 211.903,
178
+ "elapsed_seconds": 5.9282142000156455
179
  },
180
  {
181
+ "process_rss_mb": 2683.17578125,
182
  "cuda_allocated_mb": 141.64453125,
183
  "cuda_reserved_mb": 12124.0,
184
  "cuda_max_allocated_mb": 11120.271484375,
185
  "cuda_max_reserved_mb": 12124.0,
186
  "gpu_util_percent": 100.0,
187
  "gpu_memory_util_percent": 77.0,
188
+ "gpu_memory_used_mb": 13749.71484375,
189
  "gpu_memory_total_mb": 16303.0,
190
+ "gpu_temperature_c": 60.0,
191
+ "gpu_power_w": 216.831,
192
+ "elapsed_seconds": 6.439060800010338
193
  },
194
  {
195
+ "process_rss_mb": 2683.17578125,
196
  "cuda_allocated_mb": 141.64453125,
197
  "cuda_reserved_mb": 12124.0,
198
  "cuda_max_allocated_mb": 11120.271484375,
199
  "cuda_max_reserved_mb": 12124.0,
200
+ "gpu_util_percent": 65.0,
201
+ "gpu_memory_util_percent": 47.0,
202
+ "gpu_memory_used_mb": 13749.71484375,
203
  "gpu_memory_total_mb": 16303.0,
204
+ "gpu_temperature_c": 60.0,
205
+ "gpu_power_w": 214.227,
206
+ "elapsed_seconds": 6.946143499983009
207
  },
208
  {
209
+ "process_rss_mb": 2683.17578125,
210
  "cuda_allocated_mb": 141.64453125,
211
  "cuda_reserved_mb": 12124.0,
212
  "cuda_max_allocated_mb": 11120.271484375,
213
  "cuda_max_reserved_mb": 12124.0,
214
  "gpu_util_percent": 100.0,
215
+ "gpu_memory_util_percent": 75.0,
216
+ "gpu_memory_used_mb": 13749.71484375,
217
  "gpu_memory_total_mb": 16303.0,
218
  "gpu_temperature_c": 62.0,
219
+ "gpu_power_w": 214.596,
220
+ "elapsed_seconds": 7.4573811999871396
221
  },
222
  {
223
+ "process_rss_mb": 2683.17578125,
224
+ "cuda_allocated_mb": 5474.2958984375,
225
  "cuda_reserved_mb": 12124.0,
226
  "cuda_max_allocated_mb": 11120.271484375,
227
  "cuda_max_reserved_mb": 12124.0,
228
+ "gpu_util_percent": 64.0,
229
+ "gpu_memory_util_percent": 48.0,
230
+ "gpu_memory_used_mb": 13749.71484375,
231
  "gpu_memory_total_mb": 16303.0,
232
  "gpu_temperature_c": 56.0,
233
+ "gpu_power_w": 213.192,
234
+ "elapsed_seconds": 7.967849700013176
235
  },
236
  {
237
+ "process_rss_mb": 2683.17578125,
238
  "cuda_allocated_mb": 141.64453125,
239
  "cuda_reserved_mb": 12124.0,
240
  "cuda_max_allocated_mb": 11120.271484375,
241
  "cuda_max_reserved_mb": 12124.0,
242
+ "gpu_util_percent": 92.0,
243
+ "gpu_memory_util_percent": 67.0,
244
+ "gpu_memory_used_mb": 13745.71484375,
245
  "gpu_memory_total_mb": 16303.0,
246
  "gpu_temperature_c": 61.0,
247
+ "gpu_power_w": 217.827,
248
+ "elapsed_seconds": 8.474799200019334
249
  },
250
  {
251
+ "process_rss_mb": 2683.17578125,
252
+ "cuda_allocated_mb": 141.64453125,
253
  "cuda_reserved_mb": 12124.0,
254
  "cuda_max_allocated_mb": 11120.271484375,
255
  "cuda_max_reserved_mb": 12124.0,
256
+ "gpu_util_percent": 1.0,
257
+ "gpu_memory_util_percent": 0.0,
258
+ "gpu_memory_used_mb": 13745.71484375,
259
  "gpu_memory_total_mb": 16303.0,
260
+ "gpu_temperature_c": 51.0,
261
+ "gpu_power_w": 188.037,
262
+ "elapsed_seconds": 8.985988000000361
263
  },
264
  {
265
+ "process_rss_mb": 2683.17578125,
266
  "cuda_allocated_mb": 141.64453125,
267
  "cuda_reserved_mb": 12124.0,
268
  "cuda_max_allocated_mb": 11120.271484375,
269
  "cuda_max_reserved_mb": 12124.0,
270
+ "gpu_util_percent": 66.0,
271
  "gpu_memory_util_percent": 49.0,
272
+ "gpu_memory_used_mb": 13744.52734375,
273
  "gpu_memory_total_mb": 16303.0,
274
+ "gpu_temperature_c": 56.0,
275
+ "gpu_power_w": 171.267,
276
+ "elapsed_seconds": 9.494696600013413
277
  },
278
  {
279
+ "process_rss_mb": 2683.17578125,
280
  "cuda_allocated_mb": 141.64453125,
281
  "cuda_reserved_mb": 12124.0,
282
  "cuda_max_allocated_mb": 11120.271484375,
283
  "cuda_max_reserved_mb": 12124.0,
284
+ "gpu_util_percent": 93.0,
285
+ "gpu_memory_util_percent": 69.0,
286
+ "gpu_memory_used_mb": 13744.52734375,
287
  "gpu_memory_total_mb": 16303.0,
288
+ "gpu_temperature_c": 62.0,
289
+ "gpu_power_w": 192.325,
290
+ "elapsed_seconds": 10.006947299989406
291
  },
292
  {
293
+ "process_rss_mb": 2683.17578125,
294
+ "cuda_allocated_mb": 141.64453125,
295
  "cuda_reserved_mb": 12124.0,
296
  "cuda_max_allocated_mb": 11120.271484375,
297
  "cuda_max_reserved_mb": 12124.0,
298
+ "gpu_util_percent": 93.0,
299
+ "gpu_memory_util_percent": 71.0,
300
+ "gpu_memory_used_mb": 13744.52734375,
301
  "gpu_memory_total_mb": 16303.0,
302
+ "gpu_temperature_c": 62.0,
303
+ "gpu_power_w": 221.52,
304
+ "elapsed_seconds": 10.517244499991648
305
  },
306
  {
307
+ "process_rss_mb": 2683.17578125,
308
  "cuda_allocated_mb": 141.64453125,
309
  "cuda_reserved_mb": 12124.0,
310
  "cuda_max_allocated_mb": 11120.271484375,
311
  "cuda_max_reserved_mb": 12124.0,
312
+ "gpu_util_percent": 65.0,
313
+ "gpu_memory_util_percent": 47.0,
314
+ "gpu_memory_used_mb": 13744.27734375,
315
  "gpu_memory_total_mb": 16303.0,
316
+ "gpu_temperature_c": 61.0,
317
+ "gpu_power_w": 214.796,
318
+ "elapsed_seconds": 11.027270299964584
319
  },
320
  {
321
+ "process_rss_mb": 2683.17578125,
322
+ "cuda_allocated_mb": 6567.63916015625,
323
  "cuda_reserved_mb": 12124.0,
324
  "cuda_max_allocated_mb": 11120.271484375,
325
  "cuda_max_reserved_mb": 12124.0,
326
+ "gpu_util_percent": 100.0,
327
+ "gpu_memory_util_percent": 75.0,
328
+ "gpu_memory_used_mb": 13744.27734375,
329
  "gpu_memory_total_mb": 16303.0,
330
+ "gpu_temperature_c": 62.0,
331
+ "gpu_power_w": 217.742,
332
+ "elapsed_seconds": 11.53551159997005
333
  },
334
  {
335
+ "process_rss_mb": 2683.17578125,
336
  "cuda_allocated_mb": 141.64453125,
337
  "cuda_reserved_mb": 12124.0,
338
  "cuda_max_allocated_mb": 11120.271484375,
339
  "cuda_max_reserved_mb": 12124.0,
340
+ "gpu_util_percent": 64.0,
341
+ "gpu_memory_util_percent": 48.0,
342
+ "gpu_memory_used_mb": 13742.83984375,
343
  "gpu_memory_total_mb": 16303.0,
344
+ "gpu_temperature_c": 57.0,
345
+ "gpu_power_w": 211.385,
346
+ "elapsed_seconds": 12.04551550000906
347
  },
348
  {
349
+ "process_rss_mb": 2683.17578125,
350
  "cuda_allocated_mb": 126.2216796875,
351
  "cuda_reserved_mb": 12124.0,
352
  "cuda_max_allocated_mb": 11120.271484375,
353
  "cuda_max_reserved_mb": 12124.0,
354
+ "gpu_util_percent": 99.0,
355
  "gpu_memory_util_percent": 73.0,
356
+ "gpu_memory_used_mb": 13742.83984375,
357
  "gpu_memory_total_mb": 16303.0,
358
  "gpu_temperature_c": 63.0,
359
+ "gpu_power_w": 215.446,
360
+ "elapsed_seconds": 12.555750900006387
361
  },
362
  {
363
+ "process_rss_mb": 2683.17578125,
364
  "cuda_allocated_mb": 126.2216796875,
365
  "cuda_reserved_mb": 12124.0,
366
  "cuda_max_allocated_mb": 11120.271484375,
367
  "cuda_max_reserved_mb": 12124.0,
368
+ "gpu_util_percent": 86.0,
369
+ "gpu_memory_util_percent": 65.0,
370
+ "gpu_memory_used_mb": 13742.83984375,
371
  "gpu_memory_total_mb": 16303.0,
372
  "gpu_temperature_c": 64.0,
373
+ "gpu_power_w": 216.554,
374
+ "elapsed_seconds": 13.202394699968863
375
  },
376
  {
377
+ "process_rss_mb": 2683.17578125,
378
  "cuda_allocated_mb": 141.64453125,
379
  "cuda_reserved_mb": 12124.0,
380
  "cuda_max_allocated_mb": 11120.271484375,
381
  "cuda_max_reserved_mb": 12124.0,
382
+ "gpu_util_percent": 38.0,
383
+ "gpu_memory_util_percent": 27.0,
384
+ "gpu_memory_used_mb": 13742.83984375,
385
  "gpu_memory_total_mb": 16303.0,
386
+ "gpu_temperature_c": 57.0,
387
+ "gpu_power_w": 182.772,
388
+ "elapsed_seconds": 13.7173714999808
389
  },
390
  {
391
+ "process_rss_mb": 2683.17578125,
392
  "cuda_allocated_mb": 126.2216796875,
393
  "cuda_reserved_mb": 12124.0,
394
  "cuda_max_allocated_mb": 11120.271484375,
395
  "cuda_max_reserved_mb": 12124.0,
396
  "gpu_util_percent": 100.0,
397
+ "gpu_memory_util_percent": 75.0,
398
+ "gpu_memory_used_mb": 13742.83984375,
399
  "gpu_memory_total_mb": 16303.0,
400
+ "gpu_temperature_c": 64.0,
401
+ "gpu_power_w": 178.689,
402
+ "elapsed_seconds": 14.244046800013166
403
  },
404
  {
405
+ "process_rss_mb": 2683.17578125,
406
  "cuda_allocated_mb": 141.64453125,
407
  "cuda_reserved_mb": 12124.0,
408
  "cuda_max_allocated_mb": 11120.271484375,
409
  "cuda_max_reserved_mb": 12124.0,
410
+ "gpu_util_percent": 66.0,
411
  "gpu_memory_util_percent": 50.0,
412
+ "gpu_memory_used_mb": 13742.83984375,
413
  "gpu_memory_total_mb": 16303.0,
414
  "gpu_temperature_c": 59.0,
415
+ "gpu_power_w": 207.294,
416
+ "elapsed_seconds": 14.752780599985272
417
  },
418
  {
419
+ "process_rss_mb": 2683.17578125,
420
+ "cuda_allocated_mb": 141.64453125,
421
  "cuda_reserved_mb": 12124.0,
422
  "cuda_max_allocated_mb": 11120.271484375,
423
  "cuda_max_reserved_mb": 12124.0,
424
+ "gpu_util_percent": 80.0,
425
+ "gpu_memory_util_percent": 58.0,
426
+ "gpu_memory_used_mb": 13742.83984375,
427
  "gpu_memory_total_mb": 16303.0,
428
+ "gpu_temperature_c": 64.0,
429
+ "gpu_power_w": 214.991,
430
+ "elapsed_seconds": 15.258768500003498
431
  },
432
  {
433
+ "process_rss_mb": 2683.17578125,
434
  "cuda_allocated_mb": 141.64453125,
435
  "cuda_reserved_mb": 12124.0,
436
  "cuda_max_allocated_mb": 11120.271484375,
437
  "cuda_max_reserved_mb": 12124.0,
438
+ "gpu_util_percent": 66.0,
439
+ "gpu_memory_util_percent": 48.0,
440
+ "gpu_memory_used_mb": 13742.83984375,
441
  "gpu_memory_total_mb": 16303.0,
442
+ "gpu_temperature_c": 61.0,
443
+ "gpu_power_w": 214.067,
444
+ "elapsed_seconds": 15.76770419999957
445
  },
446
  {
447
+ "process_rss_mb": 2683.17578125,
448
  "cuda_allocated_mb": 141.64453125,
449
  "cuda_reserved_mb": 12124.0,
450
  "cuda_max_allocated_mb": 11120.271484375,
451
  "cuda_max_reserved_mb": 12124.0,
452
+ "gpu_util_percent": 100.0,
453
+ "gpu_memory_util_percent": 75.0,
454
+ "gpu_memory_used_mb": 13742.83984375,
455
  "gpu_memory_total_mb": 16303.0,
456
  "gpu_temperature_c": 65.0,
457
+ "gpu_power_w": 218.24,
458
+ "elapsed_seconds": 16.273408600012772
459
  },
460
  {
461
+ "process_rss_mb": 2720.89453125,
462
+ "cuda_allocated_mb": 139.14794921875,
463
  "cuda_reserved_mb": 12124.0,
464
  "cuda_max_allocated_mb": 11120.271484375,
465
  "cuda_max_reserved_mb": 12124.0,
466
+ "gpu_util_percent": 93.0,
467
  "gpu_memory_util_percent": 70.0,
468
+ "gpu_memory_used_mb": 13742.96484375,
469
  "gpu_memory_total_mb": 16303.0,
470
  "gpu_temperature_c": 62.0,
471
+ "gpu_power_w": 221.991,
472
+ "elapsed_seconds": 16.781973899982404
473
  }
474
  ],
475
+ "step_samples_per_second_avg": 7794.102227202612,
476
+ "step_samples_per_second_max": 7794.102227202612,
477
+ "step_samples_per_second_min": 7794.102227202612,
478
+ "step_tokens_per_second_avg": 997645.0850819343,
479
+ "step_tokens_per_second_max": 997645.0850819343,
480
+ "step_tokens_per_second_min": 997645.0850819343,
481
+ "step_process_rss_mb_avg": 2683.17578125,
482
+ "step_process_rss_mb_max": 2683.17578125,
483
+ "step_process_rss_mb_min": 2683.17578125,
484
  "step_cuda_max_allocated_mb_avg": 11120.271484375,
485
  "step_cuda_max_allocated_mb_max": 11120.271484375,
486
  "step_cuda_max_allocated_mb_min": 11120.271484375,
487
+ "step_gpu_util_percent_avg": 100.0,
488
  "step_gpu_util_percent_max": 100.0,
489
+ "step_gpu_util_percent_min": 100.0,
490
+ "step_gpu_memory_util_percent_avg": 75.0,
491
+ "step_gpu_memory_util_percent_max": 75.0,
492
+ "step_gpu_memory_util_percent_min": 75.0,
493
+ "step_gpu_power_w_avg": 217.742,
494
+ "step_gpu_power_w_max": 217.742,
495
+ "step_gpu_power_w_min": 217.742,
496
+ "step_gpu_temperature_c_avg": 62.0,
497
+ "step_gpu_temperature_c_max": 62.0,
498
+ "step_gpu_temperature_c_min": 62.0,
499
+ "background_process_rss_mb_avg": 2675.6051025390625,
500
+ "background_process_rss_mb_max": 2720.89453125,
501
+ "background_process_rss_mb_min": 2403.41796875,
502
+ "background_cuda_max_allocated_mb_avg": 11118.129837036133,
503
  "background_cuda_max_allocated_mb_max": 11120.271484375,
504
  "background_cuda_max_allocated_mb_min": 11051.73876953125,
505
+ "background_gpu_util_percent_avg": 72.875,
506
  "background_gpu_util_percent_max": 100.0,
507
+ "background_gpu_util_percent_min": 1.0,
508
+ "background_gpu_memory_util_percent_avg": 54.15625,
509
+ "background_gpu_memory_util_percent_max": 77.0,
510
+ "background_gpu_memory_util_percent_min": 0.0,
511
+ "background_gpu_power_w_avg": 195.463125,
512
+ "background_gpu_power_w_max": 221.991,
513
+ "background_gpu_power_w_min": 42.684,
514
+ "background_gpu_temperature_c_avg": 58.75,
515
+ "background_gpu_temperature_c_max": 65.0,
516
+ "background_gpu_temperature_c_min": 45.0,
517
+ "samples_per_second_avg": 7794.102227202612,
518
+ "samples_per_second_max": 7794.102227202612,
519
+ "tokens_per_second_avg": 997645.0850819343,
520
+ "tokens_per_second_max": 997645.0850819343,
521
+ "process_rss_mb_avg": 2683.17578125,
522
+ "process_rss_mb_max": 2683.17578125,
523
  "cuda_max_allocated_mb_avg": 11120.271484375,
524
  "cuda_max_allocated_mb_max": 11120.271484375,
525
+ "gpu_util_percent_avg": 100.0,
526
  "gpu_util_percent_max": 100.0,
527
+ "gpu_memory_util_percent_avg": 75.0,
528
+ "gpu_memory_util_percent_max": 75.0,
529
+ "gpu_power_w_avg": 217.742,
530
+ "gpu_power_w_max": 217.742,
531
+ "gpu_temperature_c_avg": 62.0,
532
+ "gpu_temperature_c_max": 62.0
533
  }
reports/run_metadata.json CHANGED
@@ -1,45 +1,46 @@
  {
-   "experiment_name": "dmhy-char-aug-fragments-10epoch-hardfocus",
-   "data_file": "data/generated/focus_after_10epoch_char.jsonl",
+   "experiment_name": "dmhy-char-virtual-sps32-10epoch-lightfocus",
+   "data_file": "data/generated/focus_after_virtual_sps32_char.jsonl",
    "data_sources": [
      {
        "role": "primary",
-       "path": "data/generated/focus_after_10epoch_char.jsonl",
-       "samples": 196132,
+       "path": "data/generated/focus_after_virtual_sps32_char.jsonl",
+       "samples": 140660,
        "repeat": 1,
-       "effective_samples": 196132
+       "effective_samples": 140660
      }
    ],
    "augmentation": {
-     "partial_requested": 30000,
-     "partial_written": 27959,
-     "permutation_requested": 60000,
-     "permutation_written": 60000,
-     "special_requested": 20000,
-     "special_written": 20000,
+     "partial_requested": 0,
+     "partial_written": 0,
+     "permutation_requested": 0,
+     "permutation_written": 0,
+     "special_requested": 0,
+     "special_written": 0,
      "max_chars": 160
    },
    "dataset_mode": "encoded",
+   "virtual_dataset_dir": null,
    "apply_label_repairs": false,
    "keep_raw_dataset": false,
    "tokenizer_variant": "char",
-   "vocab_file": "vocab.json",
+   "vocab_file": "datasets/AnimeName/vocab.char.json",
    "vocab_size": 6199,
    "max_seq_length": 128,
    "hidden_size": 256,
    "num_hidden_layers": 4,
    "num_attention_heads": 8,
    "intermediate_size": 1024,
-   "train_samples": 288886,
-   "eval_samples": 15205,
-   "load_seconds": 28.198466799978632,
-   "encode_seconds": 21.198125199996866,
-   "epochs": 2.0,
+   "train_samples": 133627,
+   "eval_samples": 7033,
+   "load_seconds": 3.860345099994447,
+   "encode_seconds": 11.22450440004468,
+   "epochs": 1.0,
    "max_steps": -1,
    "batch_size": 1792,
-   "learning_rate": 8e-06,
-   "warmup_steps": 50,
-   "seed": 107,
+   "learning_rate": 2e-06,
+   "warmup_steps": 20,
+   "seed": 208,
    "device": "cuda",
    "fp16": false,
    "gradient_accumulation_steps": 1,
reports/trainer_eval_metrics.json CHANGED
@@ -1,12 +1,12 @@
  {
-   "eval_loss": 0.014898430556058884,
-   "eval_precision": 0.9817063636837442,
-   "eval_recall": 0.9877553160806524,
-   "eval_f1": 0.9847215505861748,
-   "eval_accuracy": 0.9961717168124339,
-   "eval_runtime": 7.2816,
-   "eval_samples_per_second": 2088.154,
-   "eval_steps_per_second": 1.236,
-   "epoch": 2.0,
-   "num_input_tokens_seen": 73954816.0
+   "eval_loss": 0.015715675428509712,
+   "eval_precision": 0.9807795554601095,
+   "eval_recall": 0.9879507647046099,
+   "eval_f1": 0.9843520993189067,
+   "eval_accuracy": 0.9961191832100342,
+   "eval_runtime": 4.3687,
+   "eval_samples_per_second": 1609.858,
+   "eval_steps_per_second": 0.916,
+   "epoch": 1.0,
+   "num_input_tokens_seen": 17104256.0
  }
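The reported `eval_f1` can be cross-checked against `eval_precision` and `eval_recall`, since F1 is the harmonic mean of the two. The following is a minimal sketch under that standard definition. The function name `f1_score` is illustrative and is not taken from the training code.

```rust
/// F1 as the harmonic mean of precision and recall.
/// Returns 0.0 when both inputs are zero to avoid dividing by zero.
fn f1_score(precision: f64, recall: f64) -> f64 {
    let sum = precision + recall;
    if sum == 0.0 {
        return 0.0;
    }
    2.0 * precision * recall / sum
}

fn main() {
    // Values copied from the new side of reports/trainer_eval_metrics.json.
    let precision = 0.9807795554601095;
    let recall = 0.9879507647046099;
    let reported_f1 = 0.9843520993189067;

    let recomputed = f1_score(precision, recall);
    println!("recomputed f1 = {recomputed:.10}");
    println!("abs diff vs reported = {:.3e}", (recomputed - reported_f1).abs());
}
```

The same check applies to the removed values, so it can serve as a cheap sanity test whenever the report is regenerated.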
reports/training_lineage.json CHANGED
@@ -1,81 +1,84 @@
  {
    "published_checkpoint": "repository_root",
-   "summary": "The published checkpoint was produced in two stages: a full-dataset CUDA fine-tune on dmhy_weak_char.jsonl, followed by a thin-runtime hard-case focus fine-tune.",
-   "summary_zh": "当前发布 checkpoint 是两阶段产物:先 dmhy_weak_char.jsonl 全量 CUDA 微调,再做薄层运行时困难样本微调。",
+   "summary": "The published checkpoint was produced in two stages: a full 10-epoch CUDA fine-tune over Rust-generated virtual BIO shards, followed by a light thin-runtime hard-case focus fine-tune.",
+   "summary_zh": "当前发布 checkpoint 是两阶段产物:先 Rust 生成的虚拟 BIO shard 完整 10 epoch CUDA 微调,再做轻量薄层运行时困难样本微调。",
    "stages": [
      {
-       "name": "dmhy-char-aug-fragments-optimized-10epoch",
-       "type": "full_dataset_finetune_with_dynamic_augmentation",
+       "name": "dmhy-char-virtual-sps32-10epoch-lr1e5",
+       "type": "full_dataset_finetune_with_rust_virtual_shards",
        "machine": "adqew@192.168.63.157",
        "data_file": "datasets/AnimeName/dmhy_weak_char.jsonl",
+       "virtual_source_file": "data/generated/virtual_source_train_seed105.jsonl",
+       "virtual_dataset_dir": "data/generated/virtual_char_sps32_seed105",
        "tokenizer_variant": "char",
        "vocab_file": "datasets/AnimeName/vocab.char.json",
        "vocab_size": 6199,
        "max_seq_length": 128,
-       "train_samples": 1168468,
-       "eval_samples": 23847,
+       "source_rows": 619361,
+       "special_fixture_rows": 935,
+       "virtual_train_samples": 20439848,
+       "eval_samples": 12641,
        "epochs": 10.0,
+       "optimizer_steps": 114070,
        "batch_size": 1792,
-       "learning_rate": 0.00002,
-       "warmup_steps": 500,
+       "learning_rate": 0.00001,
+       "warmup_steps": 2000,
        "seed": 105,
        "device": "cuda",
-       "bf16": true,
-       "augmentation": {
-         "partial_requested": 200000,
-         "partial_written": 80313,
-         "permutation_requested": 400000,
-         "permutation_written": 400000,
-         "special_requested": 80000,
-         "special_written": 80000,
-         "max_chars": 160
+       "mixed_precision": "bf16",
+       "tf32": true,
+       "dataloader_num_workers": 4,
+       "virtual_generation": {
+         "samples_per_source": 32,
+         "separator_mode": "per-gap",
+         "bracket_mode": "per-part",
+         "include_original": true,
+         "include_special_fixtures": true,
+         "shard_size": 25000,
+         "shards": 881,
+         "elapsed_seconds": 31.55
        },
-       "eval_f1": 0.9758312981401465,
-       "eval_accuracy": 0.9934884485890851,
-       "fixed_regression_model_only": "21/26",
-       "fixed_regression_normalized_only": "24/26",
-       "heldout_model_only": "1952/2048",
-       "heldout_normalized_only": "1976/2048",
-       "role": "Base checkpoint for the final hard-case focus stage."
+       "eval_f1": 0.9902097153862615,
+       "eval_accuracy": 0.9978861640315251,
+       "fixed_regression_model_only": "22/26",
+       "fixed_regression_normalized_only": "23/26",
+       "heldout_model_only": "1994/2048",
+       "heldout_normalized_only": "2008/2048",
+       "train_runtime_seconds": 21181.32,
+       "train_tokens_per_second": 1236288.9470061918,
+       "perf_gpu_util_avg": 96.14912280701755,
+       "perf_gpu_util_max": 100.0,
+       "role": "Base checkpoint for the final light hard-case focus stage. This is the full >100k-step virtual-shard training run."
      },
      {
-       "name": "dmhy-char-aug-fragments-10epoch-hardfocus",
-       "type": "hard_case_focus_finetune",
+       "name": "dmhy-char-virtual-sps32-10epoch-lightfocus",
+       "type": "light_hard_case_focus_finetune",
        "machine": "adqew@192.168.63.157",
-       "data_file": "data/generated/focus_after_10epoch_char.jsonl",
+       "data_file": "data/generated/focus_after_virtual_sps32_char.jsonl",
        "tokenizer_variant": "char",
        "vocab_file": "datasets/AnimeName/vocab.char.json",
        "vocab_size": 6199,
        "max_seq_length": 128,
-       "train_samples": 288886,
-       "eval_samples": 15205,
-       "epochs": 2.0,
+       "focus_source_rows": 140660,
+       "train_samples": 133627,
+       "eval_samples": 7033,
+       "epochs": 1.0,
        "batch_size": 1792,
-       "learning_rate": 0.000008,
-       "warmup_steps": 50,
-       "seed": 107,
+       "learning_rate": 0.000002,
+       "warmup_steps": 20,
+       "seed": 208,
        "device": "cuda",
-       "bf16": true,
-       "augmentation": {
-         "partial_requested": 30000,
-         "partial_written": 27959,
-         "permutation_requested": 60000,
-         "permutation_written": 60000,
-         "special_requested": 20000,
-         "special_written": 20000,
-         "max_chars": 160
-       },
-       "eval_f1": 0.9847215505861748,
-       "eval_accuracy": 0.9961717168124339,
-       "fixed_regression_model_only": "25/26",
+       "mixed_precision": "bf16",
+       "tf32": true,
+       "eval_f1": 0.9843520993189067,
+       "eval_accuracy": 0.9961191832100342,
+       "fixed_regression_model_only": "24/26",
        "fixed_regression_normalized_only": "26/26",
-       "heldout_model_only": "1947/2048",
-       "heldout_normalized_only": "1966/2048",
-       "perf_tokens_per_second_avg": 1070355.0386074905,
-       "perf_tokens_per_second_max": 1101176.3791377721,
-       "perf_gpu_util_avg": 78.32592592592593,
-       "perf_gpu_util_max": 100.0,
-       "role": "Published repository-root checkpoint."
+       "heldout_model_only": "1962/2048",
+       "heldout_normalized_only": "1988/2048",
+       "perf_tokens_per_second_avg": 997645.0850819343,
+       "perf_gpu_util_avg": 100.0,
+       "role": "Published repository-root checkpoint. The default thin runtime also includes narrow postprocessing for bracketed search notes and release-promo title prefixes."
      }
    ]
  }
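The stage-one counts in the new lineage appear to be mutually consistent, and a short calculation makes that visible. The sketch below rests on two assumptions that are not confirmed by the report itself: each source row contributes `samples_per_source` generated variants plus the original row (because `include_original` is true), with the special fixture rows added once; and training uses no gradient accumulation and keeps the final partial batch. Both function names are hypothetical.

```rust
/// Expected virtual sample count, ASSUMING every source row yields
/// `samples_per_source` variants plus (optionally) the original row,
/// and that special fixture rows are appended once.
fn expected_virtual_samples(
    source_rows: u64,
    samples_per_source: u64,
    include_original: bool,
    special_fixture_rows: u64,
) -> u64 {
    let per_row = samples_per_source + u64::from(include_original);
    source_rows * per_row + special_fixture_rows
}

/// Optimizer steps, ASSUMING no gradient accumulation and that the last
/// partial batch of each epoch still counts as one step.
fn optimizer_steps(samples: u64, batch_size: u64, epochs: u64) -> u64 {
    samples.div_ceil(batch_size) * epochs
}

fn main() {
    // Figures copied from the first stage in reports/training_lineage.json.
    let samples = expected_virtual_samples(619_361, 32, true, 935);
    let steps = optimizer_steps(samples, 1_792, 10);
    println!("virtual samples: {samples}");
    println!("optimizer steps: {steps}");
}
```

Under these assumptions the results can be compared directly with the reported `virtual_train_samples` and `optimizer_steps`; a mismatch after regenerating shards would point at a changed generator setting rather than a training bug.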
tools/virtual_dataset_generator/Cargo.lock ADDED
@@ -0,0 +1,397 @@
1
+ # This file is automatically @generated by Cargo.
2
+ # It is not intended for manual editing.
3
+ version = 4
4
+
5
+ [[package]]
6
+ name = "anifilebert-virtual-dataset-generator"
7
+ version = "0.1.0"
8
+ dependencies = [
9
+ "anyhow",
10
+ "clap",
11
+ "rand",
12
+ "rayon",
13
+ "serde",
14
+ "serde_json",
15
+ ]
16
+
17
+ [[package]]
18
+ name = "anstream"
19
+ version = "1.0.0"
20
+ source = "registry+https://github.com/rust-lang/crates.io-index"
21
+ checksum = "824a212faf96e9acacdbd09febd34438f8f711fb84e09a8916013cd7815ca28d"
22
+ dependencies = [
23
+ "anstyle",
24
+ "anstyle-parse",
25
+ "anstyle-query",
26
+ "anstyle-wincon",
27
+ "colorchoice",
28
+ "is_terminal_polyfill",
29
+ "utf8parse",
30
+ ]
31
+
32
+ [[package]]
33
+ name = "anstyle"
34
+ version = "1.0.14"
35
+ source = "registry+https://github.com/rust-lang/crates.io-index"
36
+ checksum = "940b3a0ca603d1eade50a4846a2afffd5ef57a9feac2c0e2ec2e14f9ead76000"
37
+
38
+ [[package]]
39
+ name = "anstyle-parse"
40
+ version = "1.0.0"
41
+ source = "registry+https://github.com/rust-lang/crates.io-index"
42
+ checksum = "52ce7f38b242319f7cabaa6813055467063ecdc9d355bbb4ce0c68908cd8130e"
43
+ dependencies = [
44
+ "utf8parse",
45
+ ]
46
+
47
+ [[package]]
48
+ name = "anstyle-query"
49
+ version = "1.1.5"
50
+ source = "registry+https://github.com/rust-lang/crates.io-index"
51
+ checksum = "40c48f72fd53cd289104fc64099abca73db4166ad86ea0b4341abe65af83dadc"
52
+ dependencies = [
53
+ "windows-sys",
54
+ ]
55
+
56
+ [[package]]
57
+ name = "anstyle-wincon"
58
+ version = "3.0.11"
59
+ source = "registry+https://github.com/rust-lang/crates.io-index"
60
+ checksum = "291e6a250ff86cd4a820112fb8898808a366d8f9f58ce16d1f538353ad55747d"
61
+ dependencies = [
62
+ "anstyle",
63
+ "once_cell_polyfill",
64
+ "windows-sys",
65
+ ]
66
+
67
+ [[package]]
68
+ name = "anyhow"
69
+ version = "1.0.102"
70
+ source = "registry+https://github.com/rust-lang/crates.io-index"
71
+ checksum = "7f202df86484c868dbad7eaa557ef785d5c66295e41b460ef922eca0723b842c"
72
+
73
+ [[package]]
74
+ name = "cfg-if"
75
+ version = "1.0.4"
76
+ source = "registry+https://github.com/rust-lang/crates.io-index"
77
+ checksum = "9330f8b2ff13f34540b44e946ef35111825727b38d33286ef986142615121801"
78
+
79
+ [[package]]
80
+ name = "clap"
81
+ version = "4.6.1"
82
+ source = "registry+https://github.com/rust-lang/crates.io-index"
83
+ checksum = "1ddb117e43bbf7dacf0a4190fef4d345b9bad68dfc649cb349e7d17d28428e51"
84
+ dependencies = [
85
+ "clap_builder",
86
+ "clap_derive",
87
+ ]
88
+
89
+ [[package]]
90
+ name = "clap_builder"
91
+ version = "4.6.0"
92
+ source = "registry+https://github.com/rust-lang/crates.io-index"
93
+ checksum = "714a53001bf66416adb0e2ef5ac857140e7dc3a0c48fb28b2f10762fc4b5069f"
94
+ dependencies = [
95
+ "anstream",
96
+ "anstyle",
97
+ "clap_lex",
98
+ "strsim",
99
+ ]
100
+
101
+ [[package]]
102
+ name = "clap_derive"
103
+ version = "4.6.1"
104
+ source = "registry+https://github.com/rust-lang/crates.io-index"
105
+ checksum = "f2ce8604710f6733aa641a2b3731eaa1e8b3d9973d5e3565da11800813f997a9"
106
+ dependencies = [
107
+ "heck",
108
+ "proc-macro2",
109
+ "quote",
110
+ "syn",
111
+ ]
112
+
113
+ [[package]]
114
+ name = "clap_lex"
115
+ version = "1.1.0"
116
+ source = "registry+https://github.com/rust-lang/crates.io-index"
117
+ checksum = "c8d4a3bb8b1e0c1050499d1815f5ab16d04f0959b233085fb31653fbfc9d98f9"
118
+
119
+ [[package]]
120
+ name = "colorchoice"
121
+ version = "1.0.5"
122
+ source = "registry+https://github.com/rust-lang/crates.io-index"
123
+ checksum = "1d07550c9036bf2ae0c684c4297d503f838287c83c53686d05370d0e139ae570"
124
+
125
+ [[package]]
126
+ name = "crossbeam-deque"
127
+ version = "0.8.6"
128
+ source = "registry+https://github.com/rust-lang/crates.io-index"
129
+ checksum = "9dd111b7b7f7d55b72c0a6ae361660ee5853c9af73f70c3c2ef6858b950e2e51"
130
+ dependencies = [
131
+ "crossbeam-epoch",
132
+ "crossbeam-utils",
133
+ ]
134
+
135
+ [[package]]
136
+ name = "crossbeam-epoch"
137
+ version = "0.9.18"
138
+ source = "registry+https://github.com/rust-lang/crates.io-index"
139
+ checksum = "5b82ac4a3c2ca9c3460964f020e1402edd5753411d7737aa39c3714ad1b5420e"
140
+ dependencies = [
141
+ "crossbeam-utils",
142
+ ]
143
+
144
+ [[package]]
145
+ name = "crossbeam-utils"
146
+ version = "0.8.21"
147
+ source = "registry+https://github.com/rust-lang/crates.io-index"
148
+ checksum = "d0a5c400df2834b80a4c3327b3aad3a4c4cd4de0629063962b03235697506a28"
149
+
150
+ [[package]]
151
+ name = "either"
152
+ version = "1.16.0"
153
+ source = "registry+https://github.com/rust-lang/crates.io-index"
154
+ checksum = "91622ff5e7162018101f2fea40d6ebf4a78bbe5a49736a2020649edf9693679e"
155
+
156
+ [[package]]
157
+ name = "getrandom"
158
+ version = "0.2.17"
159
+ source = "registry+https://github.com/rust-lang/crates.io-index"
160
+ checksum = "ff2abc00be7fca6ebc474524697ae276ad847ad0a6b3faa4bcb027e9a4614ad0"
161
+ dependencies = [
162
+ "cfg-if",
163
+ "libc",
164
+ "wasi",
165
+ ]
166
+
167
+ [[package]]
168
+ name = "heck"
169
+ version = "0.5.0"
170
+ source = "registry+https://github.com/rust-lang/crates.io-index"
171
+ checksum = "2304e00983f87ffb38b55b444b5e3b60a884b5d30c0fca7d82fe33449bbe55ea"
172
+
173
+ [[package]]
174
+ name = "is_terminal_polyfill"
175
+ version = "1.70.2"
176
+ source = "registry+https://github.com/rust-lang/crates.io-index"
177
+ checksum = "a6cb138bb79a146c1bd460005623e142ef0181e3d0219cb493e02f7d08a35695"
178
+
179
+ [[package]]
180
+ name = "itoa"
181
+ version = "1.0.18"
182
+ source = "registry+https://github.com/rust-lang/crates.io-index"
183
+ checksum = "8f42a60cbdf9a97f5d2305f08a87dc4e09308d1276d28c869c684d7777685682"
184
+
185
+ [[package]]
186
+ name = "libc"
187
+ version = "0.2.186"
188
+ source = "registry+https://github.com/rust-lang/crates.io-index"
189
+ checksum = "68ab91017fe16c622486840e4c83c9a37afeff978bd239b5293d61ece587de66"
190
+
191
+ [[package]]
192
+ name = "memchr"
193
+ version = "2.8.0"
194
+ source = "registry+https://github.com/rust-lang/crates.io-index"
195
+ checksum = "f8ca58f447f06ed17d5fc4043ce1b10dd205e060fb3ce5b979b8ed8e59ff3f79"
196
+
197
+ [[package]]
198
+ name = "once_cell_polyfill"
199
+ version = "1.70.2"
200
+ source = "registry+https://github.com/rust-lang/crates.io-index"
201
+ checksum = "384b8ab6d37215f3c5301a95a4accb5d64aa607f1fcb26a11b5303878451b4fe"
202
+
203
+ [[package]]
204
+ name = "ppv-lite86"
205
+ version = "0.2.21"
206
+ source = "registry+https://github.com/rust-lang/crates.io-index"
207
+ checksum = "85eae3c4ed2f50dcfe72643da4befc30deadb458a9b590d720cde2f2b1e97da9"
208
+ dependencies = [
209
+ "zerocopy",
210
+ ]
211
+
212
+ [[package]]
213
+ name = "proc-macro2"
214
+ version = "1.0.106"
215
+ source = "registry+https://github.com/rust-lang/crates.io-index"
216
+ checksum = "8fd00f0bb2e90d81d1044c2b32617f68fcb9fa3bb7640c23e9c748e53fb30934"
217
+ dependencies = [
218
+ "unicode-ident",
219
+ ]
220
+
221
+ [[package]]
222
+ name = "quote"
223
+ version = "1.0.45"
224
+ source = "registry+https://github.com/rust-lang/crates.io-index"
225
+ checksum = "41f2619966050689382d2b44f664f4bc593e129785a36d6ee376ddf37259b924"
226
+ dependencies = [
227
+ "proc-macro2",
228
+ ]
229
+
230
+ [[package]]
231
+ name = "rand"
232
+ version = "0.8.6"
233
+ source = "registry+https://github.com/rust-lang/crates.io-index"
234
+ checksum = "5ca0ecfa931c29007047d1bc58e623ab12e5590e8c7cc53200d5202b69266d8a"
235
+ dependencies = [
236
+ "libc",
237
+ "rand_chacha",
238
+ "rand_core",
239
+ ]
240
+
241
+ [[package]]
242
+ name = "rand_chacha"
243
+ version = "0.3.1"
244
+ source = "registry+https://github.com/rust-lang/crates.io-index"
245
+ checksum = "e6c10a63a0fa32252be49d21e7709d4d4baf8d231c2dbce1eaa8141b9b127d88"
246
+ dependencies = [
247
+ "ppv-lite86",
248
+ "rand_core",
249
+ ]
250
+
251
+ [[package]]
252
+ name = "rand_core"
253
+ version = "0.6.4"
254
+ source = "registry+https://github.com/rust-lang/crates.io-index"
255
+ checksum = "ec0be4795e2f6a28069bec0b5ff3e2ac9bafc99e6a9a7dc3547996c5c816922c"
256
+ dependencies = [
257
+ "getrandom",
258
+ ]
259
+
260
+ [[package]]
261
+ name = "rayon"
262
+ version = "1.12.0"
263
+ source = "registry+https://github.com/rust-lang/crates.io-index"
264
+ checksum = "fb39b166781f92d482534ef4b4b1b2568f42613b53e5b6c160e24cfbfa30926d"
265
+ dependencies = [
266
+ "either",
267
+ "rayon-core",
268
+ ]
269
+
270
+ [[package]]
271
+ name = "rayon-core"
272
+ version = "1.13.0"
273
+ source = "registry+https://github.com/rust-lang/crates.io-index"
274
+ checksum = "22e18b0f0062d30d4230b2e85ff77fdfe4326feb054b9783a3460d8435c8ab91"
275
+ dependencies = [
276
+ "crossbeam-deque",
277
+ "crossbeam-utils",
278
+ ]
279
+
280
+ [[package]]
281
+ name = "serde"
282
+ version = "1.0.228"
283
+ source = "registry+https://github.com/rust-lang/crates.io-index"
284
+ checksum = "9a8e94ea7f378bd32cbbd37198a4a91436180c5bb472411e48b5ec2e2124ae9e"
285
+ dependencies = [
286
+ "serde_core",
287
+ "serde_derive",
288
+ ]
289
+
290
+ [[package]]
291
+ name = "serde_core"
292
+ version = "1.0.228"
293
+ source = "registry+https://github.com/rust-lang/crates.io-index"
294
+ checksum = "41d385c7d4ca58e59fc732af25c3983b67ac852c1a25000afe1175de458b67ad"
295
+ dependencies = [
296
+ "serde_derive",
297
+ ]
298
+
299
+ [[package]]
300
+ name = "serde_derive"
301
+ version = "1.0.228"
302
+ source = "registry+https://github.com/rust-lang/crates.io-index"
303
+ checksum = "d540f220d3187173da220f885ab66608367b6574e925011a9353e4badda91d79"
304
+ dependencies = [
305
+ "proc-macro2",
306
+ "quote",
307
+ "syn",
308
+ ]
309
+
310
+ [[package]]
311
+ name = "serde_json"
312
+ version = "1.0.150"
313
+ source = "registry+https://github.com/rust-lang/crates.io-index"
314
+ checksum = "e8014e44b4736ed0538adeecded0fce2a272f22dc9578a7eb6b2d9993c74cfb9"
315
+ dependencies = [
316
+ "itoa",
317
+ "memchr",
318
+ "serde",
319
+ "serde_core",
320
+ "zmij",
321
+ ]
322
+
323
+ [[package]]
324
+ name = "strsim"
325
+ version = "0.11.1"
326
+ source = "registry+https://github.com/rust-lang/crates.io-index"
327
+ checksum = "7da8b5736845d9f2fcb837ea5d9e2628564b3b043a70948a3f0b778838c5fb4f"
328
+
329
+ [[package]]
330
+ name = "syn"
331
+ version = "2.0.117"
332
+ source = "registry+https://github.com/rust-lang/crates.io-index"
333
+ checksum = "e665b8803e7b1d2a727f4023456bbbbe74da67099c585258af0ad9c5013b9b99"
334
+ dependencies = [
335
+ "proc-macro2",
336
+ "quote",
337
+ "unicode-ident",
338
+ ]
339
+
340
+ [[package]]
341
+ name = "unicode-ident"
342
+ version = "1.0.24"
343
+ source = "registry+https://github.com/rust-lang/crates.io-index"
344
+ checksum = "e6e4313cd5fcd3dad5cafa179702e2b244f760991f45397d14d4ebf38247da75"
345
+
346
+ [[package]]
347
+ name = "utf8parse"
348
+ version = "0.2.2"
349
+ source = "registry+https://github.com/rust-lang/crates.io-index"
350
+ checksum = "06abde3611657adf66d383f00b093d7faecc7fa57071cce2578660c9f1010821"
351
+
352
+ [[package]]
353
+ name = "wasi"
354
+ version = "0.11.1+wasi-snapshot-preview1"
355
+ source = "registry+https://github.com/rust-lang/crates.io-index"
356
+ checksum = "ccf3ec651a847eb01de73ccad15eb7d99f80485de043efb2f370cd654f4ea44b"
357
+
358
+ [[package]]
359
+ name = "windows-link"
360
+ version = "0.2.1"
361
+ source = "registry+https://github.com/rust-lang/crates.io-index"
362
+ checksum = "f0805222e57f7521d6a62e36fa9163bc891acd422f971defe97d64e70d0a4fe5"
363
+
364
+ [[package]]
365
+ name = "windows-sys"
366
+ version = "0.61.2"
367
+ source = "registry+https://github.com/rust-lang/crates.io-index"
368
+ checksum = "ae137229bcbd6cdf0f7b80a31df61766145077ddf49416a728b02cb3921ff3fc"
369
+ dependencies = [
370
+ "windows-link",
371
+ ]
372
+
373
+ [[package]]
374
+ name = "zerocopy"
375
+ version = "0.8.48"
376
+ source = "registry+https://github.com/rust-lang/crates.io-index"
377
+ checksum = "eed437bf9d6692032087e337407a86f04cd8d6a16a37199ed57949d415bd68e9"
378
+ dependencies = [
379
+ "zerocopy-derive",
380
+ ]
381
+
382
+ [[package]]
383
+ name = "zerocopy-derive"
384
+ version = "0.8.48"
385
+ source = "registry+https://github.com/rust-lang/crates.io-index"
386
+ checksum = "70e3cd084b1788766f53af483dd21f93881ff30d7320490ec3ef7526d203bad4"
387
+ dependencies = [
388
+ "proc-macro2",
389
+ "quote",
390
+ "syn",
391
+ ]
392
+
393
+ [[package]]
394
+ name = "zmij"
395
+ version = "1.0.21"
396
+ source = "registry+https://github.com/rust-lang/crates.io-index"
397
+ checksum = "b8848ee67ecc8aedbaf3e4122217aff892639231befc6a1b58d29fff4c2cabaa"
tools/virtual_dataset_generator/Cargo.toml ADDED
@@ -0,0 +1,12 @@
+ [package]
+ name = "anifilebert-virtual-dataset-generator"
+ version = "0.1.0"
+ edition = "2021"
+
+ [dependencies]
+ anyhow = "1.0"
+ clap = { version = "4.5", features = ["derive"] }
+ rand = "0.8"
+ rayon = "1.10"
+ serde = { version = "1.0", features = ["derive"] }
+ serde_json = "1.0"
tools/virtual_dataset_generator/src/main.rs ADDED
@@ -0,0 +1,1390 @@
+ use anyhow::{bail, Context, Result};
+ use clap::{Parser, ValueEnum};
+ use rand::rngs::StdRng;
+ use rand::seq::SliceRandom;
+ use rand::Rng;
+ use rand::SeedableRng;
+ use rayon::prelude::*;
+ use serde::{Deserialize, Serialize};
+ use serde_json::json;
+ use std::collections::{HashMap, HashSet};
+ use std::fs::{self, File};
+ use std::io::{BufRead, BufReader, BufWriter, Write};
+ use std::path::{Path, PathBuf};
+ use std::time::Instant;
+
+ const ENTITIES: [Entity; 7] = [
+     Entity::Group,
+     Entity::Title,
+     Entity::Season,
+     Entity::Episode,
+     Entity::Special,
+     Entity::Resolution,
+     Entity::Source,
+ ];
+
+ #[derive(Parser, Debug)]
+ #[command(
+     about = "Generate pre-encoded AniFileBERT virtual BIO permutation shards",
+     version
+ )]
+ struct Args {
+     #[arg(long)]
+     input: PathBuf,
+
+     #[arg(long)]
+     vocab_file: PathBuf,
+
+     #[arg(long)]
+     output_dir: PathBuf,
+
+     #[arg(long, default_value_t = 128)]
+     max_length: usize,
+
+     #[arg(long, default_value_t = 25_000)]
+     shard_size: usize,
+
+     #[arg(long, default_value_t = 0)]
+     limit_rows: usize,
+
+     #[arg(long, default_value_t = 0)]
+     samples_per_source: usize,
+
+     #[arg(long, default_value_t = 42)]
+     seed: u64,
+
+     #[arg(long, default_value_t = 0)]
+     threads: usize,
+
+     #[arg(long, default_value = "global")]
+     separator_mode: SeparatorMode,
+
+     #[arg(long, default_value = "global")]
+     bracket_mode: BracketMode,
+
+     #[arg(long, value_delimiter = ',', default_value = " , - ,.,_,-,~,~")]
+     separators: Vec<String>,
+
+     #[arg(
+         long,
+         value_delimiter = ',',
+         default_value = "none,square,round,corner,angle"
+     )]
+     bracket_styles: Vec<String>,
+
+     #[arg(long, default_value_t = true)]
+     include_original: bool,
+
+     #[arg(long, default_value_t = true)]
+     include_special_fixtures: bool,
+
+     #[arg(long, help = "Only count rows; do not write shard files")]
+     dry_run: bool,
+ }
+
+ #[derive(Clone, Copy, Debug, Serialize, ValueEnum)]
+ enum SeparatorMode {
+     Global,
+     PerGap,
+ }
+
+ #[derive(Clone, Copy, Debug, Serialize, ValueEnum)]
+ enum BracketMode {
+     Global,
+     PerPart,
+ }
+
+ #[derive(Clone, Copy, Debug, Eq, PartialEq, Hash, Ord, PartialOrd, Serialize)]
+ enum Entity {
+     Group,
+     Title,
+     Season,
+     Episode,
+     Special,
+     Resolution,
+     Source,
+ }
+
+ impl Entity {
+     fn index(self) -> usize {
+         match self {
+             Entity::Group => 0,
+             Entity::Title => 1,
+             Entity::Season => 2,
+             Entity::Episode => 3,
+             Entity::Special => 4,
+             Entity::Resolution => 5,
+             Entity::Source => 6,
+         }
+     }
+
+     fn from_name(name: &str) -> Option<Self> {
+         match name {
+             "GROUP" => Some(Entity::Group),
+             "TITLE" => Some(Entity::Title),
+             "SEASON" => Some(Entity::Season),
+             "EPISODE" => Some(Entity::Episode),
+             "SPECIAL" => Some(Entity::Special),
+             "RESOLUTION" => Some(Entity::Resolution),
+             "SOURCE" => Some(Entity::Source),
+             _ => None,
+         }
+     }
+
+     fn b_label(self) -> &'static str {
+         match self {
+             Entity::Group => "B-GROUP",
+             Entity::Title => "B-TITLE",
+             Entity::Season => "B-SEASON",
+             Entity::Episode => "B-EPISODE",
+             Entity::Special => "B-SPECIAL",
+             Entity::Resolution => "B-RESOLUTION",
+             Entity::Source => "B-SOURCE",
+         }
+     }
+
+     fn i_label(self) -> &'static str {
+         match self {
+             Entity::Group => "I-GROUP",
+             Entity::Title => "I-TITLE",
+             Entity::Season => "I-SEASON",
+             Entity::Episode => "I-EPISODE",
+             Entity::Special => "I-SPECIAL",
+             Entity::Resolution => "I-RESOLUTION",
+             Entity::Source => "I-SOURCE",
+         }
+     }
+ }
+
+ #[derive(Clone, Debug)]
+ struct Bracket {
+     name: String,
+     open: String,
+     close: String,
+ }
+
+ impl Bracket {
+     fn from_name(name: &str) -> Result<Self> {
+         let trimmed = name.trim();
+         let pair = match trimmed {
+             "none" => ("", ""),
+             "square" => ("[", "]"),
+             "round" => ("(", ")"),
+             "corner" => ("【", "】"),
+             "angle" => ("《", "》"),
+             custom if custom.contains('|') => {
+                 let mut parts = custom.splitn(2, '|');
+                 let open = parts.next().unwrap_or_default();
+                 let close = parts.next().unwrap_or_default();
+                 return Ok(Self {
+                     name: custom.to_string(),
+                     open: open.to_string(),
+                     close: close.to_string(),
+                 });
+             }
+             other => bail!("unknown bracket style '{other}'"),
+         };
+         Ok(Self {
+             name: trimmed.to_string(),
+             open: pair.0.to_string(),
+             close: pair.1.to_string(),
+         })
+     }
+ }
+
+ #[derive(Deserialize)]
+ struct InputRow {
+     filename: Option<String>,
+     tokens: Vec<String>,
+     labels: Vec<String>,
+     tokenizer_variant: Option<String>,
+ }
+
+ #[derive(Clone)]
+ struct SourceSample {
+     row_index: usize,
+     filename: String,
+     tokens: Vec<String>,
+     labels: Vec<String>,
+     fields: Vec<Vec<String>>,
+ }
+
+ #[derive(Clone)]
+ struct GenConfig {
+     max_length: usize,
+     shard_size: usize,
+     separator_mode: SeparatorMode,
+     bracket_mode: BracketMode,
+     separators: Vec<String>,
+     brackets: Vec<Bracket>,
+     include_original: bool,
+     samples_per_source: usize,
+     seed: u64,
+ }
+
+ #[derive(Clone)]
+ struct Vocab {
+     ids: HashMap<String, u16>,
+     pad_id: u16,
+     unk_id: u16,
+     cls_id: u16,
+     sep_id: u16,
+ }
+
+ #[derive(Serialize)]
+ struct ShardManifest {
+     rows: usize,
+     input_ids: String,
+     attention_mask: String,
+     labels: String,
+ }
+
+ struct ShardWriter {
+     output_dir: PathBuf,
+     worker_id: usize,
+     shard_seq: usize,
+     shard_size: usize,
+     max_length: usize,
+     input_ids: Vec<u16>,
+     attention_mask: Vec<u8>,
+     labels: Vec<i16>,
+     rows: usize,
+     total_rows: u64,
+     shards: Vec<ShardManifest>,
+ }
+
+ impl ShardWriter {
+     fn new(output_dir: &Path, worker_id: usize, shard_size: usize, max_length: usize) -> Self {
+         let capacity = shard_size.saturating_mul(max_length);
+         Self {
+             output_dir: output_dir.to_path_buf(),
+             worker_id,
+             shard_seq: 0,
+             shard_size,
+             max_length,
+             input_ids: Vec::with_capacity(capacity),
+             attention_mask: Vec::with_capacity(capacity),
+             labels: Vec::with_capacity(capacity),
+             rows: 0,
+             total_rows: 0,
+             shards: Vec::new(),
+         }
+     }
+
+     fn add(&mut self, input_ids: &[u16], attention_mask: &[u8], labels: &[i16]) -> Result<()> {
+         if input_ids.len() != self.max_length
+             || attention_mask.len() != self.max_length
+             || labels.len() != self.max_length
+         {
+             bail!("encoded sample has wrong shape");
+         }
+         self.input_ids.extend_from_slice(input_ids);
+         self.attention_mask.extend_from_slice(attention_mask);
+         self.labels.extend_from_slice(labels);
+         self.rows += 1;
+         self.total_rows += 1;
+         if self.rows >= self.shard_size {
+             self.flush()?;
+         }
+         Ok(())
+     }
+
+     fn flush(&mut self) -> Result<()> {
+         if self.rows == 0 {
+             return Ok(());
+         }
+
+         let base = format!("part-w{:03}-s{:06}", self.worker_id, self.shard_seq);
+         let input_name = format!("{base}.input_ids.npy");
+         let mask_name = format!("{base}.attention_mask.npy");
+         let label_name = format!("{base}.labels.npy");
+         write_npy_u16(
+             &self.output_dir.join(&input_name),
+             &self.input_ids,
+             self.rows,
+             self.max_length,
+         )?;
+         write_npy_u8(
+             &self.output_dir.join(&mask_name),
+             &self.attention_mask,
+             self.rows,
+             self.max_length,
+         )?;
+         write_npy_i16(
+             &self.output_dir.join(&label_name),
+             &self.labels,
+             self.rows,
+             self.max_length,
+         )?;
+         self.shards.push(ShardManifest {
+             rows: self.rows,
+             input_ids: input_name,
+             attention_mask: mask_name,
+             labels: label_name,
+         });
+         self.input_ids.clear();
+         self.attention_mask.clear();
327
+ self.labels.clear();
328
+ self.rows = 0;
329
+ self.shard_seq += 1;
330
+ Ok(())
331
+ }
332
+ }
333
+
334
+ fn main() -> Result<()> {
335
+ let args = Args::parse();
336
+ if args.max_length < 4 {
337
+ bail!("--max-length must be at least 4");
338
+ }
339
+ if args.shard_size == 0 {
340
+ bail!("--shard-size must be positive");
341
+ }
342
+ if args.threads > 0 {
343
+ rayon::ThreadPoolBuilder::new()
344
+ .num_threads(args.threads)
345
+ .build_global()
346
+ .context("failed to configure rayon thread pool")?;
347
+ }
348
+
349
+ let started = Instant::now();
350
+ let vocab = load_vocab(&args.vocab_file)?;
351
+ let brackets = args
352
+ .bracket_styles
353
+ .iter()
354
+ .map(|style| Bracket::from_name(style))
355
+ .collect::<Result<Vec<_>>>()?;
356
+ let separators = args
357
+ .separators
358
+ .iter()
359
+ .map(|sep| normalize_separator_arg(sep))
360
+ .collect::<Vec<_>>();
361
+ let cfg = GenConfig {
362
+ max_length: args.max_length,
363
+ shard_size: args.shard_size,
364
+ separator_mode: args.separator_mode,
365
+ bracket_mode: args.bracket_mode,
366
+ separators,
367
+ brackets,
368
+ include_original: args.include_original,
369
+ samples_per_source: args.samples_per_source,
370
+ seed: args.seed,
371
+ };
372
+
373
+ let mut samples = load_samples(&args.input, args.limit_rows)?;
374
+ let source_rows = samples.len();
375
+ let mut rng = StdRng::seed_from_u64(args.seed);
376
+ samples.shuffle(&mut rng);
377
+
378
+ if args.dry_run {
379
+ let generated: u128 = samples
380
+ .par_iter()
381
+ .map(|sample| count_variants(sample, &cfg))
382
+ .sum();
383
+ let special_fixtures = if args.include_special_fixtures {
384
+ count_special_fixtures(&cfg) as u128
385
+ } else {
386
+ 0
387
+ };
388
+ let manifest = json!({
389
+ "format": "anifilebert.virtual_dataset.preview.v1",
390
+ "input": args.input,
391
+ "vocab_file": args.vocab_file,
392
+ "source_rows": source_rows,
393
+ "estimated_rows": generated + special_fixtures,
394
+ "source_variant_rows": generated,
395
+ "special_fixture_rows": special_fixtures,
396
+ "max_length": cfg.max_length,
397
+ "separator_mode": cfg.separator_mode,
398
+ "bracket_mode": cfg.bracket_mode,
399
+ "separators": cfg.separators,
400
+ "brackets": cfg.brackets.iter().map(|b| &b.name).collect::<Vec<_>>(),
401
+ "include_original": cfg.include_original,
402
+ "samples_per_source": cfg.samples_per_source,
403
+ "include_special_fixtures": args.include_special_fixtures,
404
+ "seed": args.seed,
405
+ "elapsed_seconds": started.elapsed().as_secs_f64(),
406
+ });
407
+ println!("{}", serde_json::to_string_pretty(&manifest)?);
408
+ return Ok(());
409
+ }
410
+
411
+ fs::create_dir_all(&args.output_dir).with_context(|| {
412
+ format!(
413
+ "failed to create output directory {}",
414
+ args.output_dir.display()
415
+ )
416
+ })?;
417
+
418
+ let chunk_count = rayon::current_num_threads().max(1) * 4;
419
+ let chunk_size = samples.len().div_ceil(chunk_count).max(1);
420
+ let chunks = samples
421
+ .chunks(chunk_size)
422
+ .enumerate()
423
+ .collect::<Vec<(usize, &[SourceSample])>>();
424
+
425
+ let mut worker_results = chunks
426
+ .par_iter()
427
+ .map(|(chunk_idx, chunk)| {
428
+ let mut writer =
429
+ ShardWriter::new(&args.output_dir, *chunk_idx, cfg.shard_size, cfg.max_length);
430
+ for sample in *chunk {
431
+ generate_for_sample(sample, &cfg, &vocab, &mut writer)?;
432
+ }
433
+ writer.flush()?;
434
+ Ok::<_, anyhow::Error>((writer.total_rows, writer.shards))
435
+ })
436
+ .collect::<Result<Vec<_>>>()?;
437
+
438
+ let mut total_rows: u64 = 0;
439
+ let mut shards: Vec<ShardManifest> = Vec::new();
440
+ for (rows, mut worker_shards) in worker_results.drain(..) {
441
+ total_rows += rows;
442
+ shards.append(&mut worker_shards);
443
+ }
444
+
445
+ let special_rows = if args.include_special_fixtures {
446
+ let mut writer = ShardWriter::new(
447
+ &args.output_dir,
448
+ chunk_count + 1,
449
+ cfg.shard_size,
450
+ cfg.max_length,
451
+ );
452
+ for special in built_in_specials() {
453
+ let parts = vec![PartChoice {
454
+ entity: Entity::Special,
455
+ value: special,
456
+ }];
457
+ emit_syntax_variants(&parts, &cfg, &vocab, &mut writer)?;
458
+ }
459
+ writer.flush()?;
460
+ total_rows += writer.total_rows;
461
+ shards.append(&mut writer.shards);
462
+ writer.total_rows
463
+ } else {
464
+ 0
465
+ };
466
+
467
+ shards.sort_by(|a, b| a.input_ids.cmp(&b.input_ids));
468
+ let manifest = json!({
469
+ "format": "anifilebert.virtual_dataset.shards.v1",
470
+ "input": args.input,
471
+ "vocab_file": args.vocab_file,
472
+ "source_rows": source_rows,
473
+ "total_rows": total_rows,
474
+ "special_fixture_rows": special_rows,
475
+ "max_length": cfg.max_length,
476
+ "shard_size": cfg.shard_size,
477
+ "tokenizer_variant": "char",
478
+ "encoding": {
479
+ "input_ids_dtype": "uint16",
480
+ "attention_mask_dtype": "uint8",
481
+ "labels_dtype": "int16",
482
+ "layout": "row_major_npy"
483
+ },
484
+ "special_tokens": {
485
+ "pad_id": vocab.pad_id,
486
+ "unk_id": vocab.unk_id,
487
+ "cls_id": vocab.cls_id,
488
+ "sep_id": vocab.sep_id
489
+ },
490
+ "generation": {
491
+ "separator_mode": cfg.separator_mode,
492
+ "bracket_mode": cfg.bracket_mode,
493
+ "separators": cfg.separators,
494
+ "brackets": cfg.brackets.iter().map(|b| &b.name).collect::<Vec<_>>(),
495
+ "include_original": cfg.include_original,
496
+ "samples_per_source": cfg.samples_per_source,
497
+ "include_special_fixtures": args.include_special_fixtures,
498
+ "seed": args.seed,
499
+ "threads": rayon::current_num_threads()
500
+ },
501
+ "shards": shards,
502
+ "elapsed_seconds": started.elapsed().as_secs_f64(),
503
+ });
504
+ let manifest_path = args.output_dir.join("manifest.json");
505
+ fs::write(&manifest_path, serde_json::to_string_pretty(&manifest)?)
506
+ .with_context(|| format!("failed to write {}", manifest_path.display()))?;
507
+ println!("{}", serde_json::to_string_pretty(&manifest)?);
508
+ Ok(())
509
+ }
510
+
511
+ fn normalize_separator_arg(value: &str) -> String {
512
+ match value {
513
+ "\\t" => "\t".to_string(),
514
+ "\\s" => " ".to_string(),
515
+ other => other.to_string(),
516
+ }
517
+ }
518
+
519
+ fn load_vocab(path: &Path) -> Result<Vocab> {
520
+ let text = fs::read_to_string(path)
521
+ .with_context(|| format!("failed to read vocab file {}", path.display()))?;
522
+ let raw: HashMap<String, u64> =
523
+ serde_json::from_str(&text).context("failed to parse vocab JSON")?;
524
+ let mut ids = HashMap::with_capacity(raw.len());
525
+ for (token, id) in raw {
526
+ if id > u16::MAX as u64 {
527
+ bail!("vocab id {id} for token '{token}' exceeds uint16 storage");
528
+ }
529
+ ids.insert(token, id as u16);
530
+ }
531
+ let pad_id = *ids.get("[PAD]").context("vocab is missing [PAD]")?;
532
+ let unk_id = *ids.get("[UNK]").context("vocab is missing [UNK]")?;
533
+ let cls_id = *ids.get("[CLS]").context("vocab is missing [CLS]")?;
534
+ let sep_id = *ids.get("[SEP]").context("vocab is missing [SEP]")?;
535
+ Ok(Vocab {
536
+ ids,
537
+ pad_id,
538
+ unk_id,
539
+ cls_id,
540
+ sep_id,
541
+ })
542
+ }
543
+
544
+ fn load_samples(path: &Path, limit_rows: usize) -> Result<Vec<SourceSample>> {
545
+ let file = File::open(path).with_context(|| format!("failed to open {}", path.display()))?;
546
+ let reader = BufReader::new(file);
547
+ let mut samples = Vec::new();
548
+ for (idx, line) in reader.lines().enumerate() {
549
+ if limit_rows > 0 && samples.len() >= limit_rows {
550
+ break;
551
+ }
552
+ let line = line.with_context(|| format!("failed reading line {}", idx + 1))?;
553
+ if line.trim().is_empty() {
554
+ continue;
555
+ }
556
+ let row: InputRow = serde_json::from_str(&line)
557
+ .with_context(|| format!("failed to parse JSONL line {}", idx + 1))?;
558
+ if let Some(variant) = row.tokenizer_variant.as_deref() {
559
+ if variant != "char" {
560
+ bail!(
561
+ "line {} has tokenizer_variant={variant}; virtual shard generation currently requires char data",
562
+ idx + 1
563
+ );
564
+ }
565
+ }
566
+ if row.tokens.len() != row.labels.len() {
567
+ bail!(
568
+ "line {} has mismatched token/label lengths: {} vs {}",
569
+ idx + 1,
570
+ row.tokens.len(),
571
+ row.labels.len()
572
+ );
573
+ }
574
+ let filename = row.filename.clone().unwrap_or_else(|| row.tokens.join(""));
575
+ let fields = extract_fields(&row.tokens, &row.labels);
576
+ samples.push(SourceSample {
577
+ row_index: idx,
578
+ filename,
579
+ tokens: row.tokens,
580
+ labels: row.labels,
581
+ fields,
582
+ });
583
+ }
584
+ Ok(samples)
585
+ }
586
+
587
+ fn extract_fields(tokens: &[String], labels: &[String]) -> Vec<Vec<String>> {
588
+ let mut fields: Vec<Vec<String>> = (0..ENTITIES.len()).map(|_| Vec::new()).collect();
589
+ let mut seen: Vec<HashSet<String>> = (0..ENTITIES.len()).map(|_| HashSet::new()).collect();
590
+ let mut active_entity: Option<Entity> = None;
591
+ let mut active_text = String::new();
592
+
593
+ let flush = |entity: Option<Entity>,
594
+ text: &mut String,
595
+ fields: &mut Vec<Vec<String>>,
596
+ seen: &mut Vec<HashSet<String>>| {
597
+ if let Some(entity) = entity {
598
+ let value = text.trim().to_string();
599
+ if !value.is_empty() && seen[entity.index()].insert(value.clone()) {
600
+ fields[entity.index()].push(value);
601
+ }
602
+ }
603
+ text.clear();
604
+ };
605
+
606
+ for (token, label) in tokens.iter().zip(labels.iter()) {
607
+ if let Some(entity) = label.strip_prefix("B-").and_then(Entity::from_name) {
608
+ flush(active_entity, &mut active_text, &mut fields, &mut seen);
609
+ active_entity = Some(entity);
610
+ active_text.push_str(token);
611
+ } else if let Some(entity) = label.strip_prefix("I-").and_then(Entity::from_name) {
612
+ if active_entity == Some(entity) {
613
+ active_text.push_str(token);
614
+ } else {
615
+ flush(active_entity, &mut active_text, &mut fields, &mut seen);
616
+ active_entity = Some(entity);
617
+ active_text.push_str(token);
618
+ }
619
+ } else {
620
+ flush(active_entity, &mut active_text, &mut fields, &mut seen);
621
+ active_entity = None;
622
+ }
623
+ }
624
+ flush(active_entity, &mut active_text, &mut fields, &mut seen);
625
+ fields
626
+ }
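Review note on `extract_fields`: the span-collection rule it applies can be sketched in isolation. This is a minimal illustration, not the commit's code. The helper name `bio_spans` is hypothetical, entities are plain strings instead of the `Entity` enum, and the per-entity grouping, trimming, and de-duplication done in the diff are deliberately left out.

```rust
/// Collects (entity, text) spans from character tokens and their BIO labels.
/// Hypothetical helper for illustration; it mirrors only the span rule.
fn bio_spans(tokens: &[&str], labels: &[&str]) -> Vec<(String, String)> {
    let mut spans = Vec::new();
    let mut active: Option<(String, String)> = None;
    for (token, label) in tokens.iter().zip(labels.iter()) {
        let (prefix, name) = match label.split_once('-') {
            Some((p, n)) if p == "B" || p == "I" => (p, n),
            _ => {
                // "O" (or any unrecognized label) closes the open span.
                if let Some(span) = active.take() {
                    spans.push(span);
                }
                continue;
            }
        };
        // An I- tag only extends a span of the same entity.
        let extends = prefix == "I"
            && active
                .as_ref()
                .map_or(false, |(entity, _)| entity.as_str() == name);
        if extends {
            if let Some((_, text)) = active.as_mut() {
                text.push_str(token);
            }
        } else {
            // A B- tag, or an I- tag of a different entity, starts a new span.
            if let Some(span) = active.take() {
                spans.push(span);
            }
            active = Some((name.to_string(), token.to_string()));
        }
    }
    if let Some(span) = active.take() {
        spans.push(span);
    }
    spans
}
```

As in the diff, an orphan `I-` tag is treated as the start of a new span rather than an error, which keeps weakly labeled rows usable.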
627
+
628
+ fn count_variants(sample: &SourceSample, cfg: &GenConfig) -> u128 {
629
+ let mut count = if cfg.include_original { 1 } else { 0 };
630
+ let available = ENTITIES
631
+ .iter()
632
+ .copied()
633
+ .filter(|entity| !sample.fields[entity.index()].is_empty())
634
+ .collect::<Vec<_>>();
635
+ let n = available.len();
636
+ if n == 0 {
637
+ return count;
638
+ }
639
+ if cfg.samples_per_source > 0 {
640
+ return count + cfg.samples_per_source as u128;
641
+ }
642
+ for mask in 1usize..(1usize << n) {
643
+ let selected = available
644
+ .iter()
645
+ .enumerate()
646
+ .filter_map(|(idx, entity)| ((mask & (1usize << idx)) != 0).then_some(*entity))
647
+ .collect::<Vec<_>>();
648
+ let m = selected.len();
649
+ let value_product: u128 = selected
650
+ .iter()
651
+ .map(|entity| sample.fields[entity.index()].len() as u128)
652
+ .product();
653
+ let perm_count = factorial(m as u32);
654
+ let sep_factor = if m <= 1 {
655
+ 1
656
+ } else {
657
+ match cfg.separator_mode {
658
+ SeparatorMode::Global => cfg.separators.len() as u128,
659
+ SeparatorMode::PerGap => (cfg.separators.len() as u128).pow((m - 1) as u32),
660
+ }
661
+ };
662
+ let bracket_factor = match cfg.bracket_mode {
663
+ BracketMode::Global => cfg.brackets.len() as u128,
664
+ BracketMode::PerPart => (cfg.brackets.len() as u128).pow(m as u32),
665
+ };
666
+ count += value_product * perm_count * sep_factor * bracket_factor;
667
+ }
668
+ count
669
+ }
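Review note on `count_variants`: for one selected subset of `m` entities, the loop body multiplies four factors: the product of per-entity value counts, `m!` orderings, the separator choices, and the bracket choices. The sketch below restates that arithmetic for a single subset so the growth rate is easy to check by hand. The function name and the boolean flags standing in for `SeparatorMode` and `BracketMode` are assumptions made for illustration.

```rust
/// Variant count for ONE subset of entities, following the same factors as
/// the exhaustive branch of `count_variants`. Hypothetical helper.
fn subset_variant_count(
    value_counts: &[u128], // distinct values available per selected entity
    separators: u128,      // number of configured separators
    brackets: u128,        // number of configured bracket styles
    per_gap: bool,         // true ~ SeparatorMode::PerGap, false ~ Global
    per_part: bool,        // true ~ BracketMode::PerPart, false ~ Global
) -> u128 {
    let m = value_counts.len() as u32;
    let value_product: u128 = value_counts.iter().product();
    let permutations: u128 = (1..=m as u128).product();
    let sep_factor = if m <= 1 {
        1 // a single part has no gaps, so separators do not matter
    } else if per_gap {
        separators.pow(m - 1)
    } else {
        separators
    };
    let bracket_factor = if per_part { brackets.pow(m) } else { brackets };
    value_product * permutations * sep_factor * bracket_factor
}
```

With two entities holding 1 and 2 values, 3 separators, and 2 bracket styles, the global modes give 2 x 2 x 3 x 2 = 24 rows for that subset, while per-gap plus per-part gives 2 x 2 x 3 x 4 = 48. The total for a source row is the sum over all non-empty subsets, which is why `--samples-per-source` exists as a cap.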
670
+
671
+ fn count_special_fixtures(cfg: &GenConfig) -> usize {
672
+ let bracket_factor = match cfg.bracket_mode {
673
+ BracketMode::Global => cfg.brackets.len(),
674
+ BracketMode::PerPart => cfg.brackets.len(),
675
+ };
676
+ built_in_specials().len() * bracket_factor
677
+ }
678
+
679
+ fn factorial(n: u32) -> u128 {
680
+ (1..=n as u128).product::<u128>().max(1)
681
+ }
682
+
683
+ fn generate_for_sample(
684
+ sample: &SourceSample,
685
+ cfg: &GenConfig,
686
+ vocab: &Vocab,
687
+ writer: &mut ShardWriter,
688
+ ) -> Result<()> {
689
+ if cfg.include_original {
690
+ let (input_ids, attention_mask, labels) =
691
+ encode_original_sample(sample, vocab, cfg.max_length)?;
692
+ writer.add(&input_ids, &attention_mask, &labels)?;
693
+ }
694
+
695
+ if cfg.samples_per_source > 0 {
696
+ generate_sampled_variants(sample, cfg, vocab, writer)?;
697
+ return Ok(());
698
+ }
699
+
700
+ let available = ENTITIES
701
+ .iter()
702
+ .copied()
703
+ .filter(|entity| !sample.fields[entity.index()].is_empty())
704
+ .collect::<Vec<_>>();
705
+ let n = available.len();
706
+ for mask in 1usize..(1usize << n) {
707
+ let mut selected = available
708
+ .iter()
709
+ .enumerate()
710
+ .filter_map(|(idx, entity)| ((mask & (1usize << idx)) != 0).then_some(*entity))
711
+ .collect::<Vec<_>>();
712
+ permute_entities(&mut selected, 0, &mut |order| {
713
+ let mut parts: Vec<PartChoice> = Vec::with_capacity(order.len());
714
+ for_each_value_combo(order, &sample.fields, 0, &mut parts, &mut |combo| {
715
+ emit_syntax_variants(combo, cfg, vocab, writer)
716
+ })
717
+ })?;
718
+ }
719
+ Ok(())
720
+ }
721
+
722
+ fn generate_sampled_variants(
723
+ sample: &SourceSample,
724
+ cfg: &GenConfig,
725
+ vocab: &Vocab,
726
+ writer: &mut ShardWriter,
727
+ ) -> Result<()> {
728
+ let mut rng = StdRng::seed_from_u64(
729
+ cfg.seed ^ ((sample.row_index as u64).wrapping_mul(0x9E37_79B9_7F4A_7C15)),
730
+ );
731
+ let available = ENTITIES
732
+ .iter()
733
+ .copied()
734
+ .filter(|entity| !sample.fields[entity.index()].is_empty())
735
+ .collect::<Vec<_>>();
736
+ if available.is_empty() {
737
+ return Ok(());
738
+ }
739
+
740
+ let mut seen = HashSet::new();
741
+ let mut emitted = 0usize;
742
+ let budget = cfg.samples_per_source;
743
+ let max_unique_attempts = budget.saturating_mul(32).max(64);
744
+ let mut attempts = 0usize;
745
+
746
+ let mut templates: Vec<Vec<PartChoice>> = Vec::new();
747
+ if let Some(title) = sample.fields[Entity::Title.index()].first() {
748
+ templates.push(vec![PartChoice {
749
+ entity: Entity::Title,
750
+ value: title.clone(),
751
+ }]);
752
+ if let Some(season) = sample.fields[Entity::Season.index()].first() {
753
+ templates.push(vec![
754
+ PartChoice {
755
+ entity: Entity::Title,
756
+ value: title.clone(),
757
+ },
758
+ PartChoice {
759
+ entity: Entity::Season,
760
+ value: season.clone(),
761
+ },
762
+ ]);
763
+ }
764
+ }
765
+ if let Some(episode) = sample.fields[Entity::Episode.index()].first() {
766
+ templates.push(vec![PartChoice {
767
+ entity: Entity::Episode,
768
+ value: episode.clone(),
769
+ }]);
770
+ }
771
+ if let Some(special) = sample.fields[Entity::Special.index()].first() {
772
+ templates.push(vec![PartChoice {
773
+ entity: Entity::Special,
774
+ value: special.clone(),
775
+ }]);
776
+ }
777
+ if let (Some(title), Some(special)) = (
778
+ sample.fields[Entity::Title.index()].first(),
779
+ sample.fields[Entity::Special.index()].first(),
780
+ ) {
781
+ templates.push(vec![
782
+ PartChoice {
783
+ entity: Entity::Title,
784
+ value: title.clone(),
785
+ },
786
+ PartChoice {
787
+ entity: Entity::Special,
788
+ value: special.clone(),
789
+ },
790
+ ]);
791
+ }
792
+
793
+ for parts in templates {
794
+ if emitted >= budget {
795
+ break;
796
+ }
797
+ emit_sample_variant(
798
+ parts,
799
+ cfg,
800
+ vocab,
801
+ writer,
802
+ &mut seen,
803
+ &mut emitted,
804
+ &mut rng,
805
+ false,
806
+ )?;
807
+ }
808
+
809
+ while emitted < budget && attempts < max_unique_attempts {
810
+ attempts += 1;
811
+ let subset_size = match rng.gen_range(0..100) {
812
+ 0..=29 => 1,
813
+ 30..=54 => 2,
814
+ 55..=74 => 3,
815
+ 75..=89 => 4.min(available.len()),
816
+ _ => available.len().min(5),
817
+ }
818
+ .max(1)
819
+ .min(available.len());
820
+
821
+ let mut chosen = available
822
+ .choose_multiple(&mut rng, subset_size)
823
+ .copied()
824
+ .collect::<Vec<_>>();
825
+ chosen.shuffle(&mut rng);
826
+ if !chosen
827
+ .iter()
828
+ .any(|entity| matches!(entity, Entity::Title | Entity::Episode | Entity::Special))
829
+ {
830
+ if let Some(fallback) = available
831
+ .iter()
832
+ .copied()
833
+ .find(|entity| matches!(entity, Entity::Title | Entity::Episode | Entity::Special))
834
+ {
835
+ if !chosen.contains(&fallback) {
836
+ chosen.push(fallback);
837
+ }
838
+ }
839
+ }
840
+
841
+ let mut parts = Vec::with_capacity(chosen.len());
842
+ for entity in chosen {
843
+ let values = &sample.fields[entity.index()];
844
+ let value = values.choose(&mut rng).cloned().unwrap_or_default();
845
+ parts.push(PartChoice { entity, value });
846
+ }
847
+ parts.shuffle(&mut rng);
848
+ emit_sample_variant(
849
+ parts,
850
+ cfg,
851
+ vocab,
852
+ writer,
853
+ &mut seen,
854
+ &mut emitted,
855
+ &mut rng,
856
+ false,
857
+ )?;
858
+ }
859
+
860
+ while emitted < budget {
861
+ let subset_size = match rng.gen_range(0..100) {
862
+ 0..=29 => 1,
863
+ 30..=54 => 2,
864
+ 55..=74 => 3,
865
+ 75..=89 => 4.min(available.len()),
866
+ _ => available.len().min(5),
867
+ }
868
+ .max(1)
869
+ .min(available.len());
870
+
871
+ let mut chosen = available
872
+ .choose_multiple(&mut rng, subset_size)
873
+ .copied()
874
+ .collect::<Vec<_>>();
875
+ chosen.shuffle(&mut rng);
876
+ if !chosen
877
+ .iter()
878
+ .any(|entity| matches!(entity, Entity::Title | Entity::Episode | Entity::Special))
879
+ {
880
+ if let Some(fallback) = available
881
+ .iter()
882
+ .copied()
883
+ .find(|entity| matches!(entity, Entity::Title | Entity::Episode | Entity::Special))
884
+ {
885
+ if !chosen.contains(&fallback) {
886
+ chosen.push(fallback);
887
+ }
888
+ }
889
+ }
890
+
891
+ let mut parts = Vec::with_capacity(chosen.len());
892
+ for entity in chosen {
893
+ let values = &sample.fields[entity.index()];
894
+ let value = values.choose(&mut rng).cloned().unwrap_or_default();
895
+ parts.push(PartChoice { entity, value });
896
+ }
897
+ parts.shuffle(&mut rng);
898
+ emit_sample_variant(
899
+ parts,
900
+ cfg,
901
+ vocab,
902
+ writer,
903
+ &mut seen,
904
+ &mut emitted,
905
+ &mut rng,
906
+ true,
907
+ )?;
908
+ }
909
+ Ok(())
910
+ }
911
+
912
+ fn emit_sample_variant(
913
+ parts: Vec<PartChoice>,
914
+ cfg: &GenConfig,
915
+ vocab: &Vocab,
916
+ writer: &mut ShardWriter,
917
+ seen: &mut HashSet<String>,
918
+ emitted: &mut usize,
919
+ rng: &mut StdRng,
920
+ allow_duplicate: bool,
921
+ ) -> Result<()> {
922
+ if *emitted >= cfg.samples_per_source {
923
+ return Ok(());
924
+ }
925
+ if parts.is_empty() {
926
+ return Ok(());
927
+ }
928
+ let separators = match cfg.separator_mode {
929
+ SeparatorMode::Global => {
930
+ let sep = cfg
931
+ .separators
932
+ .choose(rng)
933
+ .cloned()
934
+ .unwrap_or_else(|| " ".to_string());
935
+ if parts.len() > 1 {
936
+ vec![sep; parts.len() - 1]
937
+ } else {
938
+ Vec::new()
939
+ }
940
+ }
941
+ SeparatorMode::PerGap => {
942
+ let mut values = Vec::with_capacity(parts.len().saturating_sub(1));
943
+ for _ in 0..parts.len().saturating_sub(1) {
944
+ values.push(
945
+ cfg.separators
946
+ .choose(rng)
947
+ .cloned()
948
+ .unwrap_or_else(|| " ".to_string()),
949
+ );
950
+ }
951
+ values
952
+ }
953
+ };
954
+ let brackets = match cfg.bracket_mode {
955
+ BracketMode::Global => {
956
+ let bracket = cfg
957
+ .brackets
958
+ .choose(rng)
959
+ .cloned()
960
+ .unwrap_or_else(|| Bracket {
961
+ name: "none".to_string(),
962
+ open: String::new(),
963
+ close: String::new(),
964
+ });
965
+ vec![bracket; parts.len()]
966
+ }
967
+ BracketMode::PerPart => {
968
+ let mut values = Vec::with_capacity(parts.len());
969
+ for _ in 0..parts.len() {
970
+ values.push(
971
+ cfg.brackets
972
+ .choose(rng)
973
+ .cloned()
974
+ .unwrap_or_else(|| Bracket {
975
+ name: "none".to_string(),
976
+ open: String::new(),
977
+ close: String::new(),
978
+ }),
979
+ );
980
+ }
981
+ values
982
+ }
983
+ };
984
+ let text = render_variant_text(&parts, &separators, &brackets);
985
+ if !allow_duplicate && !seen.insert(text) {
986
+ return Ok(());
987
+ }
988
+ let (input_ids, attention_mask, labels) =
989
+ encode_generated_sample(&parts, &separators, &brackets, vocab, cfg.max_length)?;
990
+ writer.add(&input_ids, &attention_mask, &labels)?;
991
+ *emitted += 1;
992
+ Ok(())
993
+ }
994
+
995
+ fn permute_entities<F>(values: &mut [Entity], start: usize, callback: &mut F) -> Result<()>
996
+ where
997
+ F: FnMut(&[Entity]) -> Result<()>,
998
+ {
999
+ if start >= values.len() {
1000
+ return callback(values);
1001
+ }
1002
+ for idx in start..values.len() {
1003
+ values.swap(start, idx);
1004
+ permute_entities(values, start + 1, callback)?;
1005
+ values.swap(start, idx);
1006
+ }
1007
+ Ok(())
1008
+ }
1009
+
1010
+ #[derive(Clone)]
1011
+ struct PartChoice {
1012
+ entity: Entity,
1013
+ value: String,
1014
+ }
1015
+
1016
+ fn for_each_value_combo<F>(
1017
+ order: &[Entity],
1018
+ fields: &[Vec<String>],
1019
+ idx: usize,
1020
+ current: &mut Vec<PartChoice>,
1021
+ callback: &mut F,
1022
+ ) -> Result<()>
1023
+ where
1024
+ F: FnMut(&[PartChoice]) -> Result<()>,
1025
+ {
1026
+ if idx >= order.len() {
1027
+ return callback(current);
1028
+ }
1029
+ let entity = order[idx];
1030
+ for value in &fields[entity.index()] {
1031
+ current.push(PartChoice {
1032
+ entity,
1033
+ value: value.clone(),
1034
+ });
1035
+ for_each_value_combo(order, fields, idx + 1, current, callback)?;
1036
+ current.pop();
1037
+ }
1038
+ Ok(())
1039
+ }
1040
+
1041
+ fn emit_syntax_variants(
1042
+ parts: &[PartChoice],
1043
+ cfg: &GenConfig,
1044
+ vocab: &Vocab,
1045
+ writer: &mut ShardWriter,
1046
+ ) -> Result<()> {
1047
+ let gaps = parts.len().saturating_sub(1);
1048
+ let mut separators = Vec::with_capacity(gaps);
1049
+ for_each_separator_combo(gaps, cfg, 0, &mut separators, &mut |sep_combo| {
1050
+ let mut brackets = Vec::with_capacity(parts.len());
1051
+ for_each_bracket_combo(parts.len(), cfg, 0, &mut brackets, &mut |bracket_combo| {
1052
+ let (input_ids, attention_mask, labels) =
1053
+ encode_generated_sample(parts, sep_combo, bracket_combo, vocab, cfg.max_length)?;
1054
+ writer.add(&input_ids, &attention_mask, &labels)
1055
+ })
1056
+ })
1057
+ }
1058
+
1059
+ fn for_each_separator_combo<F>(
1060
+ gaps: usize,
1061
+ cfg: &GenConfig,
1062
+ idx: usize,
1063
+ current: &mut Vec<String>,
1064
+ callback: &mut F,
1065
+ ) -> Result<()>
1066
+ where
1067
+ F: FnMut(&[String]) -> Result<()>,
1068
+ {
1069
+ if gaps == 0 {
1070
+ return callback(current);
1071
+ }
1072
+ match cfg.separator_mode {
1073
+ SeparatorMode::Global => {
1074
+ if idx == 0 {
1075
+ for sep in &cfg.separators {
1076
+ current.clear();
1077
+ current.resize(gaps, sep.clone());
1078
+ callback(current)?;
1079
+ }
1080
+ }
1081
+ Ok(())
1082
+ }
1083
+ SeparatorMode::PerGap => {
1084
+ if idx >= gaps {
1085
+ return callback(current);
1086
+ }
1087
+ for sep in &cfg.separators {
1088
+ current.push(sep.clone());
1089
+ for_each_separator_combo(gaps, cfg, idx + 1, current, callback)?;
1090
+ current.pop();
1091
+ }
1092
+ Ok(())
1093
+ }
1094
+ }
1095
+ }
1096
+
1097
+ fn for_each_bracket_combo<F>(
1098
+ parts: usize,
1099
+ cfg: &GenConfig,
1100
+ idx: usize,
1101
+ current: &mut Vec<Bracket>,
1102
+ callback: &mut F,
1103
+ ) -> Result<()>
1104
+ where
1105
+ F: FnMut(&[Bracket]) -> Result<()>,
1106
+ {
1107
+ match cfg.bracket_mode {
1108
+ BracketMode::Global => {
1109
+ if idx == 0 {
1110
+ for bracket in &cfg.brackets {
1111
+ current.clear();
1112
+ current.resize(parts, bracket.clone());
1113
+ callback(current)?;
1114
+ }
1115
+ }
1116
+ Ok(())
1117
+ }
1118
+ BracketMode::PerPart => {
1119
+ if idx >= parts {
1120
+ return callback(current);
1121
+ }
1122
+ for bracket in &cfg.brackets {
1123
+ current.push(bracket.clone());
1124
+ for_each_bracket_combo(parts, cfg, idx + 1, current, callback)?;
1125
+ current.pop();
1126
+ }
1127
+ Ok(())
1128
+ }
1129
+ }
1130
+ }
1131
+
1132
+ fn render_variant_text(
1133
+ parts: &[PartChoice],
1134
+ separators: &[String],
1135
+ brackets: &[Bracket],
1136
+ ) -> String {
1137
+ let mut text = String::new();
1138
+ for (idx, part) in parts.iter().enumerate() {
1139
+ text.push_str(&brackets[idx].open);
1140
+ text.push_str(&part.value);
1141
+ text.push_str(&brackets[idx].close);
1142
+ if idx < separators.len() {
1143
+ text.push_str(&separators[idx]);
1144
+ }
1145
+ }
1146
+ text
1147
+ }
1148
+
1149
+ fn encode_original_sample(
1150
+ sample: &SourceSample,
1151
+ vocab: &Vocab,
1152
+ max_length: usize,
1153
+ ) -> Result<(Vec<u16>, Vec<u8>, Vec<i16>)> {
1154
+ let mut input_ids = vec![vocab.pad_id; max_length];
1155
+ let mut attention_mask = vec![0u8; max_length];
1156
+ let mut labels = vec![-100i16; max_length];
1157
+
1158
+ input_ids[0] = vocab.cls_id;
1159
+ attention_mask[0] = 1;
1160
+ let available = max_length.saturating_sub(2);
1161
+ let token_count = sample.tokens.len().min(available);
1162
+ for idx in 0..token_count {
1163
+ input_ids[idx + 1] = token_id(vocab, &sample.tokens[idx]);
1164
+ attention_mask[idx + 1] = 1;
1165
+ labels[idx + 1] = label_id(&sample.labels[idx]).with_context(|| {
1166
+ format!(
1167
+ "unknown label '{}' on source row {} ({})",
1168
+ sample.labels[idx],
1169
+ sample.row_index + 1,
1170
+ sample.filename
1171
+ )
1172
+ })?;
1173
+ }
1174
+ let sep_pos = token_count + 1;
1175
+ input_ids[sep_pos] = vocab.sep_id;
1176
+ attention_mask[sep_pos] = 1;
1177
+ Ok((input_ids, attention_mask, labels))
1178
+ }
1179
+
1180
+ fn encode_generated_sample(
1181
+ parts: &[PartChoice],
1182
+ separators: &[String],
1183
+ brackets: &[Bracket],
1184
+ vocab: &Vocab,
1185
+ max_length: usize,
1186
+ ) -> Result<(Vec<u16>, Vec<u8>, Vec<i16>)> {
1187
+ let mut input_ids = vec![vocab.pad_id; max_length];
1188
+ let mut attention_mask = vec![0u8; max_length];
1189
+ let mut labels = vec![-100i16; max_length];
1190
+ input_ids[0] = vocab.cls_id;
1191
+ attention_mask[0] = 1;
1192
+
1193
+ let available = max_length.saturating_sub(2);
1194
+ let mut pos = 1usize;
1195
+ for (idx, part) in parts.iter().enumerate() {
1196
+ let bracket = &brackets[idx];
1197
+ append_o_text(
1198
+ &bracket.open,
1199
+ vocab,
1200
+ available,
1201
+ &mut pos,
1202
+ &mut input_ids,
1203
+ &mut attention_mask,
1204
+ &mut labels,
1205
+ );
1206
+ append_entity_text(
1207
+ &part.value,
1208
+ part.entity,
1209
+ vocab,
1210
+ available,
1211
+ &mut pos,
1212
+ &mut input_ids,
1213
+ &mut attention_mask,
1214
+ &mut labels,
1215
+ )?;
1216
+ append_o_text(
1217
+ &bracket.close,
1218
+ vocab,
1219
+ available,
1220
+ &mut pos,
1221
+ &mut input_ids,
1222
+ &mut attention_mask,
1223
+ &mut labels,
1224
+ );
1225
+ if idx < separators.len() {
1226
+ append_o_text(
1227
+ &separators[idx],
1228
+ vocab,
1229
+ available,
1230
+ &mut pos,
1231
+ &mut input_ids,
1232
+ &mut attention_mask,
1233
+ &mut labels,
1234
+ );
1235
+ }
1236
+ }
1237
+
1238
+ let sep_pos = pos.min(max_length - 1);
1239
+ input_ids[sep_pos] = vocab.sep_id;
1240
+ attention_mask[sep_pos] = 1;
1241
+ labels[sep_pos] = -100;
1242
+ Ok((input_ids, attention_mask, labels))
1243
+ }
1244
+
1245
+ fn append_o_text(
1246
+ text: &str,
1247
+ vocab: &Vocab,
1248
+ available: usize,
1249
+ pos: &mut usize,
1250
+ input_ids: &mut [u16],
1251
+ attention_mask: &mut [u8],
1252
+ labels: &mut [i16],
1253
+ ) {
1254
+ for ch in text.chars() {
1255
+ if *pos > available {
1256
+ return;
1257
+ }
1258
+ let token = ch.to_string();
1259
+ input_ids[*pos] = token_id(vocab, &token);
1260
+ attention_mask[*pos] = 1;
1261
+ labels[*pos] = 0;
1262
+ *pos += 1;
1263
+ }
1264
+ }
1265
+
1266
+ fn append_entity_text(
1267
+ text: &str,
1268
+ entity: Entity,
1269
+ vocab: &Vocab,
1270
+ available: usize,
1271
+ pos: &mut usize,
1272
+ input_ids: &mut [u16],
1273
+ attention_mask: &mut [u8],
1274
+ labels: &mut [i16],
1275
+ ) -> Result<()> {
1276
+ let b = label_id(entity.b_label()).context("missing B label")?;
1277
+ let i = label_id(entity.i_label()).context("missing I label")?;
1278
+ let mut first = true;
1279
+ for ch in text.chars() {
1280
+ if *pos > available {
1281
+ return Ok(());
1282
+ }
1283
+ let token = ch.to_string();
1284
+ input_ids[*pos] = token_id(vocab, &token);
1285
+ attention_mask[*pos] = 1;
1286
+ labels[*pos] = if first { b } else { i };
1287
+ first = false;
1288
+ *pos += 1;
1289
+ }
1290
+ Ok(())
1291
+ }
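Review note on `append_o_text` and `append_entity_text`: together they produce character-level BIO labels in which brackets and separators are `O` (id 0), the first character of a value gets the `B-` id, and the rest get the `I-` id. The sketch below shows that labeling on a rendered string. It is a simplified stand-in: `label_row` is a hypothetical name, one bracket style and one separator are applied to every part, and token ids are omitted.

```rust
const IGNORE: i16 = -100; // label used for [CLS], [SEP], and padding

/// Renders parts as `open value close sep ...` and returns the (possibly
/// truncated) text plus a fixed-length label row. Hypothetical helper.
fn label_row(
    parts: &[(&str, i16, i16)], // (value, B- label id, I- label id)
    open: &str,
    close: &str,
    sep: &str,
    max_length: usize,
) -> (String, Vec<i16>) {
    assert!(max_length >= 2, "need room for [CLS] and [SEP]");
    let mut tagged: Vec<(char, i16)> = Vec::new();
    for (idx, (value, b_id, i_id)) in parts.iter().enumerate() {
        tagged.extend(open.chars().map(|c| (c, 0)));
        tagged.extend(
            value
                .chars()
                .enumerate()
                .map(|(k, c)| (c, if k == 0 { *b_id } else { *i_id })),
        );
        tagged.extend(close.chars().map(|c| (c, 0)));
        if idx + 1 < parts.len() {
            tagged.extend(sep.chars().map(|c| (c, 0)));
        }
    }
    // Keep one slot for [CLS] at the front and one for [SEP] at the end.
    tagged.truncate(max_length - 2);
    let text: String = tagged.iter().map(|(c, _)| *c).collect();
    let mut labels = vec![IGNORE; max_length];
    for (k, (_, label)) in tagged.iter().enumerate() {
        labels[k + 1] = *label; // position 0 stays IGNORE for [CLS]
    }
    (text, labels)
}
```

Using the ids from `label_id` (1/2 for TITLE, 5/6 for EPISODE), the parts `("AB", 1, 2)` and `("01", 5, 6)` with `[`, `]`, and a space render as `[AB] [01]`; every position not covered by a character keeps the ignore label.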
1292
+
1293
+ fn token_id(vocab: &Vocab, token: &str) -> u16 {
1294
+ *vocab.ids.get(token).unwrap_or(&vocab.unk_id)
1295
+ }
1296
+
1297
+ fn label_id(label: &str) -> Option<i16> {
1298
+ Some(match label {
1299
+ "O" => 0,
1300
+ "B-TITLE" => 1,
1301
+ "I-TITLE" => 2,
1302
+ "B-SEASON" => 3,
1303
+ "I-SEASON" => 4,
1304
+ "B-EPISODE" => 5,
1305
+ "I-EPISODE" => 6,
1306
+ "B-SPECIAL" => 7,
1307
+ "I-SPECIAL" => 8,
1308
+ "B-GROUP" => 9,
1309
+ "I-GROUP" => 10,
1310
+ "B-RESOLUTION" => 11,
1311
+ "I-RESOLUTION" => 12,
1312
+ "B-SOURCE" => 13,
1313
+ "I-SOURCE" => 14,
1314
+ _ => return None,
1315
+ })
1316
+ }
1317
+
1318
+ fn built_in_specials() -> Vec<String> {
1319
+ let mut values = Vec::new();
1320
+ values.push("Menu".to_string());
1321
+ for idx in 1..=24 {
1322
+ values.push(format!("Menu{idx:02}"));
1323
+ values.push(format!("Menu {idx:02}"));
1324
+ values.push(format!("BDMenu{idx:02}"));
1325
+ values.push(format!("BD Menu{idx:02}"));
1326
+ values.push(format!("Menu{idx:02}-01"));
1327
+ values.push(format!("ED E{idx:02}"));
1328
+ }
1329
+ for idx in 1..=6 {
1330
+ values.push(format!("OP{idx:02}"));
1331
+ values.push(format!("NCOP{idx:02}"));
1332
+ values.push(format!("NCED{idx:02}"));
1333
+ }
1334
+ for idx in 1..=12 {
1335
+ values.push(format!("CM{idx:02}"));
1336
+ values.push(format!("PV{idx:02}"));
1337
+ }
1338
+ values
1339
+ }
1340
+
1341
+ fn write_npy_u16(path: &Path, data: &[u16], rows: usize, cols: usize) -> Result<()> {
+     let mut writer = BufWriter::new(
+         File::create(path).with_context(|| format!("failed to create {}", path.display()))?,
+     );
+     write_npy_header(&mut writer, "<u2", rows, cols)?;
+     for value in data {
+         writer.write_all(&value.to_le_bytes())?;
+     }
+     // Flush explicitly: dropping a BufWriter discards any write error.
+     writer.flush()?;
+     Ok(())
+ }
+
+ fn write_npy_u8(path: &Path, data: &[u8], rows: usize, cols: usize) -> Result<()> {
+     let mut writer = BufWriter::new(
+         File::create(path).with_context(|| format!("failed to create {}", path.display()))?,
+     );
+     write_npy_header(&mut writer, "|u1", rows, cols)?;
+     writer.write_all(data)?;
+     // Flush explicitly: dropping a BufWriter discards any write error.
+     writer.flush()?;
+     Ok(())
+ }
+
+ fn write_npy_i16(path: &Path, data: &[i16], rows: usize, cols: usize) -> Result<()> {
+     let mut writer = BufWriter::new(
+         File::create(path).with_context(|| format!("failed to create {}", path.display()))?,
+     );
+     write_npy_header(&mut writer, "<i2", rows, cols)?;
+     for value in data {
+         writer.write_all(&value.to_le_bytes())?;
+     }
+     // Flush explicitly: dropping a BufWriter discards any write error.
+     writer.flush()?;
+     Ok(())
+ }
+
+ fn write_npy_header<W: Write>(writer: &mut W, descr: &str, rows: usize, cols: usize) -> Result<()> {
+     let mut header = format!(
+         "{{'descr': '{}', 'fortran_order': False, 'shape': ({}, {}), }}",
+         descr, rows, cols
+     )
+     .into_bytes();
+     // Magic string (6 bytes) + version (2 bytes) + header length field (2 bytes).
+     let preamble_len = 10usize;
+     // Pad with spaces so preamble + header + trailing newline is a multiple of 16 bytes.
+     let pad_len = (16 - ((preamble_len + header.len() + 1) % 16)) % 16;
+     header.extend(std::iter::repeat(b' ').take(pad_len));
+     header.push(b'\n');
+     // The v1.0 format stores the header length in a little-endian u16.
+     if header.len() > u16::MAX as usize {
+         bail!("npy header too large");
+     }
+     writer.write_all(b"\x93NUMPY")?;
+     writer.write_all(&[1, 0])?;
+     writer.write_all(&(header.len() as u16).to_le_bytes())?;
+     writer.write_all(&header)?;
+     Ok(())
+ }
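
The header layout produced by `write_npy_header` can be checked in isolation. The following is a minimal sketch, not part of the commit: `npy_v1_prefix` is a hypothetical helper name, and it assumes the same 2-D, C-order, v1.0 layout as the function above, building the bytes in memory instead of writing to a file.

```rust
/// Build the .npy v1.0 preamble and header for a 2-D, C-order array.
/// `npy_v1_prefix` is a hypothetical name used only for this illustration.
fn npy_v1_prefix(descr: &str, rows: usize, cols: usize) -> Vec<u8> {
    let mut header = format!(
        "{{'descr': '{}', 'fortran_order': False, 'shape': ({}, {}), }}",
        descr, rows, cols
    )
    .into_bytes();
    // 6 magic bytes + 2 version bytes + 2 length bytes precede the header text.
    let preamble_len = 10usize;
    // Pad with spaces so preamble + header + trailing '\n' is a multiple of 16.
    let pad_len = (16 - ((preamble_len + header.len() + 1) % 16)) % 16;
    header.extend(std::iter::repeat(b' ').take(pad_len));
    header.push(b'\n');

    let mut out = Vec::with_capacity(preamble_len + header.len());
    out.extend_from_slice(b"\x93NUMPY");
    out.extend_from_slice(&[1, 0]);
    // Header length as a little-endian u16 (assumes the header fits in u16).
    out.extend_from_slice(&(header.len() as u16).to_le_bytes());
    out.extend_from_slice(&header);
    out
}
```

Calling `npy_v1_prefix("<u2", 3, 4)` should yield a byte vector whose total length is a multiple of 16 and whose bytes 8..10 encode the length of everything after the preamble, which is the property the padding arithmetic is meant to guarantee.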
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:7d63537ca05eba3f844cc92336013aea5363ab8ba9576f63d50bceda924b5075
+ oid sha256:127d963464cb7b39ecd0da42045aacf77e1daa49906419c443d058c4282adf61
  size 5329