13:4: not a valid test operator: (
13:4: not a valid test operator: 535.86.10
2026-04-28 04:33:01.866542: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
WARNING:tensorflow:From /workspace/finetune/main_chars_lstm.py:36: The name tf.logging.set_verbosity is deprecated. Please use tf.compat.v1.logging.set_verbosity instead.
[train] params: {"batch_size": 128, "buffer": 2000, "char_lstm_size": 25, "chars": "/workspace/finetune/data_kvkk_set2_v3/vocab.chars.txt", "dim": 50, "dim_chars": 100, "dropout": 0.5, "early_stop_max_steps": 600, "epochs": 20, "learning_rate": 0.001, "log_step_count_steps": 200, "lstm_size": 100, "min_steps": 600000, "num_oov_buckets": 1, "save_checkpoints_secs": 500, "save_summary_steps": 1000, "tags": "/workspace/finetune/data_kvkk_set2_v3/vocab.tags.txt", "trainable_embeddings": true, "vectors": "/workspace/finetune/data_kvkk_set2_v3/vectors.npz", "words": "/workspace/finetune/data_kvkk_set2_v3/vocab.words.txt"}
Using config: {'_model_dir': '/workspace/finetune/results/model', '_tf_random_seed': None, '_save_summary_steps': 1000, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 500, '_session_config': allow_soft_placement: true graph_options { rewrite_options { meta_optimizer_iterations: ONE } } , '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 200, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': , '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
Not using Distribute Coordinator.
Running training and evaluation locally (non-distributed).
Start train and evaluate loop. The evaluate will happen after every checkpoint. Checkpoint frequency is determined based on RunConfig arguments: save_checkpoints_steps None or save_checkpoints_secs 500.
Calling model_fn.
The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
  * https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.
TensorFlow will not use sklearn by default. This improves performance in some cases. To enable sklearn export the environment variable TF_ALLOW_IOLIBS=1.
TensorFlow will not use Dask by default. This improves performance in some cases. To enable Dask export the environment variable TF_ALLOW_IOLIBS=1.
TensorFlow will not use Pandas by default. This improves performance in some cases. To enable Pandas export the environment variable TF_ALLOW_IOLIBS=1.
From /workspace/finetune/main_chars_lstm.py:104: The name tf.get_variable is deprecated. Please use tf.compat.v1.get_variable instead.
From /workspace/finetune/main_chars_lstm.py:169: The name tf.metrics.accuracy is deprecated. Please use tf.compat.v1.metrics.accuracy instead.
From /py_packages/tf_metrics/__init__.py:152: The name tf.diag_part is deprecated. Please use tf.linalg.tensor_diag_part instead.
From /workspace/finetune/main_chars_lstm.py:175: The name tf.summary.scalar is deprecated. Please use tf.compat.v1.summary.scalar instead.
From /workspace/finetune/main_chars_lstm.py:180: The name tf.train.AdamOptimizer is deprecated. Please use tf.compat.v1.train.AdamOptimizer instead.
From /workspace/finetune/main_chars_lstm.py:181: The name tf.train.get_or_create_global_step is deprecated. Please use tf.compat.v1.train.get_or_create_global_step instead.
Done calling model_fn.
Create CheckpointSaverHook.
Graph was finalized.
2026-04-28 04:33:05.296490: I tensorflow/core/platform/profile_utils/cpu_utils.cc:109] CPU Frequency: 2000000000 Hz
2026-04-28 04:33:05.328827: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x64d0070 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2026-04-28 04:33:05.328873: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2026-04-28 04:33:05.333954: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2026-04-28 04:33:05.523685: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x660a6c0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2026-04-28 04:33:05.523735: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): NVIDIA H100, Compute Capability 9.0
2026-04-28 04:33:05.524596: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1669] Found device 0 with properties: name: NVIDIA H100 major: 9 minor: 0 memoryClockRate(GHz): 1.98 pciBusID: 0000:ad:00.0
2026-04-28 04:33:05.524637: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2026-04-28 04:33:05.859509: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2026-04-28 04:33:05.891835: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2026-04-28 04:33:05.899990: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2026-04-28 04:33:05.908493: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.11
2026-04-28 04:33:05.920724: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.11
2026-04-28 04:33:05.922071: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2026-04-28 04:33:05.922505: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1797] Adding visible gpu devices: 0
2026-04-28 04:33:05.923841: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2026-04-28 04:33:05.929938: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1209] Device interconnect StreamExecutor with strength 1 edge matrix:
2026-04-28 04:33:05.929961: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1215] 0
2026-04-28 04:33:05.929969: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1228] 0: N
2026-04-28 04:33:05.930412: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1354] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 60325 MB memory) -> physical GPU (device: 0, name: NVIDIA H100, pci bus id: 0000:ad:00.0, compute capability: 9.0)
Running local_init_op.
Done running local_init_op.
Saving checkpoints for 0 into /workspace/finetune/results/model/model.ckpt.
2026-04-28 04:33:09.576112: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
loss = 117.38817, step = 0
global_step/sec: 15.9262
loss = 4.773464, step = 200 (12.558 sec)
global_step/sec: 16.4469
loss = 2.8829772, step = 400 (12.160 sec)
global_step/sec: 16.4739
loss = 1.6619699, step = 600 (12.140 sec)
global_step/sec: 17.5305
loss = 1.4286156, step = 800 (11.409 sec)
global_step/sec: 16.591
loss = 1.1506453, step = 1000 (12.055 sec)
global_step/sec: 16.3338
loss = 1.6229303, step = 1200 (12.245 sec)
global_step/sec: 16.319
loss = 1.1030784, step = 1400 (12.256 sec)
global_step/sec: 16.0311
loss = 0.48492754, step = 1600 (12.476 sec)
global_step/sec: 15.9238
loss = 0.7973819, step = 1800 (12.560 sec)
global_step/sec: 15.9882
loss = 0.5444262, step = 2000 (12.510 sec)
global_step/sec: 16.0443
loss = 0.34822357, step = 2200 (12.465 sec)
global_step/sec: 16.0029
loss = 0.46935606, step = 2400 (12.498 sec)
global_step/sec: 15.9425
loss = 0.5678671, step = 2600 (12.548 sec)
global_step/sec: 16.3265
loss = 0.5309495, step = 2800 (12.247 sec)
global_step/sec: 16.3088
loss = 0.7498473, step = 3000 (12.264 sec)
global_step/sec: 15.5706
loss = 0.6110069, step = 3200 (12.844 sec)
global_step/sec: 16.7051
loss = 0.24717766, step = 3400 (11.973 sec)
global_step/sec: 16.8543
loss = 1.3274518, step = 3600 (11.866 sec)
global_step/sec: 16.4027
loss = 0.33537441, step = 3800 (12.193 sec)
global_step/sec: 16.2972
loss = 0.29465103, step = 4000 (12.272 sec)
global_step/sec: 16.6392
loss = 0.2784903, step = 4200 (12.020 sec)
global_step/sec: 16.6742
loss = 0.2184174, step = 4400 (11.994 sec)
global_step/sec: 16.755
loss = 0.44008517, step = 4600 (11.937 sec)
global_step/sec: 17.096
loss = 0.25767207, step = 4800 (11.698 sec)
global_step/sec: 16.899
loss = 0.17028493, step = 5000 (11.835 sec)
global_step/sec: 16.7529
loss = 0.17947096, step = 5200 (11.938 sec)
global_step/sec: 17.0098
loss = 0.28755808, step = 5400 (11.758 sec)
global_step/sec: 17.141
loss = 0.25743997, step = 5600 (11.668 sec)
global_step/sec: 17.1619
loss = 0.44591618, step = 5800 (11.654 sec)
global_step/sec: 17.2099
loss = 0.44130975, step = 6000 (11.621 sec)
global_step/sec: 16.7787
loss = 0.9595022, step = 6200 (11.920 sec)
global_step/sec: 16.5132
loss = 0.17380977, step = 6400 (12.112 sec)
global_step/sec: 17.1766
loss = 0.62790394, step = 6600 (11.644 sec)
global_step/sec: 16.6513
loss = 0.115854025, step = 6800 (12.011 sec)
global_step/sec: 16.8948
loss = 0.36853623, step = 7000 (11.839 sec)
global_step/sec: 16.8772
loss = 0.15148377, step = 7200 (11.850 sec)
global_step/sec: 17.1629
loss = 0.23943508, step = 7400 (11.653 sec)
global_step/sec: 16.7432
loss = 0.13623583, step = 7600 (11.945 sec)
global_step/sec: 17.1363
loss = 0.13282633, step = 7800 (11.671 sec)
global_step/sec: 17.5276
loss = 0.0774439, step = 8000 (11.413 sec)
global_step/sec: 17.2208
loss = 0.08795112, step = 8200 (11.612 sec)
Saving checkpoints for 8284 into /workspace/finetune/results/model/model.ckpt.
Calling model_fn.
Done calling model_fn.
Starting evaluation at 2026-04-28T04:41:29Z
Graph was finalized.
2026-04-28 04:41:29.885621: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1669] Found device 0 with properties: name: NVIDIA H100 major: 9 minor: 0 memoryClockRate(GHz): 1.98 pciBusID: 0000:ad:00.0
2026-04-28 04:41:29.885671: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2026-04-28 04:41:29.885708: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2026-04-28 04:41:29.885715: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2026-04-28 04:41:29.885723: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2026-04-28 04:41:29.885730: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.11
2026-04-28 04:41:29.885736: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.11
2026-04-28 04:41:29.885743: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2026-04-28 04:41:29.886053: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1797] Adding visible gpu devices: 0
2026-04-28 04:41:29.886087: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1209] Device interconnect StreamExecutor with strength 1 edge matrix:
2026-04-28 04:41:29.886092: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1215] 0
2026-04-28 04:41:29.886097: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1228] 0: N
2026-04-28 04:41:29.886396: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1354] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 60325 MB memory) -> physical GPU (device: 0, name: NVIDIA H100, pci bus id: 0000:ad:00.0, compute capability: 9.0)
Restoring parameters from /workspace/finetune/results/model/model.ckpt-8284
Running local_init_op.
Done running local_init_op.
Evaluation [10/100]
Evaluation [20/100]
Evaluation [30/100]
Evaluation [40/100]
Evaluation [50/100]
Evaluation [60/100]
Evaluation [70/100]
Evaluation [80/100]
Evaluation [90/100]
Evaluation [100/100]
Finished evaluation at 2026-04-28-04:41:34
Saving dict for global step 8284: acc = 0.99759144, f1 = 0.9933167, global_step = 8284, loss = 0.17885803, precision = 0.99258435, recall = 0.99405015
Saving 'checkpoint_path' summary for global step 8284: /workspace/finetune/results/model/model.ckpt-8284
global_step/sec: 11.4752
loss = 0.24231434, step = 8400 (17.429 sec)
global_step/sec: 17.3567
loss = 0.11804509, step = 8600 (11.523 sec)
global_step/sec: 17.0883
loss = 0.23675942, step = 8800 (11.704 sec)
global_step/sec: 17.4953
loss = 0.14836943, step = 9000 (11.432 sec)
global_step/sec: 17.495
loss = 0.1359005, step = 9200 (11.432 sec)
global_step/sec: 17.082
loss = 0.21677959, step = 9400 (11.709 sec)
global_step/sec: 17.2371
loss = 0.14764589, step = 9600 (11.603 sec)
global_step/sec: 17.2015
loss = 0.09986603, step = 9800 (11.627 sec)
global_step/sec: 17.0483
loss = 0.23195946, step = 10000 (11.732 sec)
global_step/sec: 17.4384
loss = 0.11996138, step = 10200 (11.469 sec)
global_step/sec: 17.4722
loss = 0.077587605, step = 10400 (11.447 sec)
global_step/sec: 17.2922
loss = 0.13140821, step = 10600 (11.566 sec)
global_step/sec: 17.3569
loss = 0.1550746, step = 10800 (11.523 sec)
global_step/sec: 17.3439
loss = 0.16950673, step = 11000 (11.531 sec)
global_step/sec: 17.4338
loss = 0.1085785, step = 11200 (11.472 sec)
global_step/sec: 17.5506
loss = 0.114634454, step = 11400 (11.396 sec)
global_step/sec: 17.7854
loss = 0.09087747, step = 11600 (11.245 sec)
global_step/sec: 17.3965
loss = 0.0742746, step = 11800 (11.497 sec)
global_step/sec: 17.3043
loss = 0.11870754, step = 12000 (11.558 sec)
global_step/sec: 17.6434
loss = 0.15823495, step = 12200 (11.335 sec)
global_step/sec: 17.4929
loss = 0.058683336, step = 12400 (11.433 sec)
global_step/sec: 17.6003
loss = 0.16609168, step = 12600 (11.364 sec)
global_step/sec: 17.6682
loss = 0.2326163, step = 12800 (11.320 sec)
global_step/sec: 17.6474
loss = 0.2841699, step = 13000 (11.333 sec)
global_step/sec: 17.306
loss = 0.17810643, step = 13200 (11.557 sec)
global_step/sec: 17.3409
loss = 0.049505234, step = 13400 (11.533 sec)
global_step/sec: 17.453
loss = 0.04317367, step = 13600 (11.459 sec)
global_step/sec: 17.4567
loss = 0.107658744, step = 13800 (11.457 sec)
global_step/sec: 17.7515
loss = 0.040947437, step = 14000 (11.267 sec)
global_step/sec: 17.5825
loss = 0.07045764, step = 14200 (11.375 sec)
global_step/sec: 17.4036
loss = 0.07488018, step = 14400 (11.492 sec)
global_step/sec: 17.6369
loss = 0.25159824, step = 14600 (11.340 sec)
global_step/sec: 17.7241
loss = 0.08559203, step = 14800 (11.284 sec)
global_step/sec: 17.4513
loss = 0.03275448, step = 15000 (11.461 sec)
global_step/sec: 17.5323
loss = 0.13171089, step = 15200 (11.407 sec)
global_step/sec: 17.6592
loss = 0.047302723, step = 15400 (11.326 sec)
global_step/sec: 17.6405
loss = 0.13909113, step = 15600 (11.338 sec)
global_step/sec: 17.6345
loss = 0.08797115, step = 15800 (11.342 sec)
global_step/sec: 17.8771
loss = 0.08592886, step = 16000 (11.187 sec)
global_step/sec: 17.5025
loss = 0.042692125, step = 16200 (11.427 sec)
global_step/sec: 17.5749
loss = 0.050394714, step = 16400 (11.380 sec)
global_step/sec: 17.6533
loss = 0.050355732, step = 16600 (11.329 sec)
global_step/sec: 17.6026
loss = 0.14212108, step = 16800 (11.362 sec)
Saving checkpoints for 16921 into /workspace/finetune/results/model/model.ckpt.
Calling model_fn.
Done calling model_fn.
Starting evaluation at 2026-04-28T04:49:49Z
Graph was finalized.
2026-04-28 04:49:49.668155: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1669] Found device 0 with properties: name: NVIDIA H100 major: 9 minor: 0 memoryClockRate(GHz): 1.98 pciBusID: 0000:ad:00.0
2026-04-28 04:49:49.668199: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2026-04-28 04:49:49.668238: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2026-04-28 04:49:49.668246: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2026-04-28 04:49:49.668254: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2026-04-28 04:49:49.668262: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.11
2026-04-28 04:49:49.668268: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.11
2026-04-28 04:49:49.668274: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2026-04-28 04:49:49.668591: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1797] Adding visible gpu devices: 0
2026-04-28 04:49:49.668623: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1209] Device interconnect StreamExecutor with strength 1 edge matrix:
2026-04-28 04:49:49.668628: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1215] 0
2026-04-28 04:49:49.668633: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1228] 0: N
2026-04-28 04:49:49.668917: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1354] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 60325 MB memory) -> physical GPU (device: 0, name: NVIDIA H100, pci bus id: 0000:ad:00.0, compute capability: 9.0)
Restoring parameters from /workspace/finetune/results/model/model.ckpt-16921
Running local_init_op.
Done running local_init_op.
Evaluation [10/100]
Evaluation [20/100]
Evaluation [30/100]
Evaluation [40/100]
Evaluation [50/100]
Evaluation [60/100]
Evaluation [70/100]
Evaluation [80/100]
Evaluation [90/100]
Evaluation [100/100]
Finished evaluation at 2026-04-28-04:49:54
Saving dict for global step 16921: acc = 0.99804306, f1 = 0.9946128, global_step = 16921, loss = 0.13094354, precision = 0.99445856, recall = 0.9947671
Saving 'checkpoint_path' summary for global step 16921: /workspace/finetune/results/model/model.ckpt-16921
global_step/sec: 12.0425
loss = 0.069523394, step = 17000 (16.608 sec)
global_step/sec: 16.8534
loss = 0.03588748, step = 17200 (11.867 sec)
global_step/sec: 17.389
loss = 0.060875297, step = 17400 (11.502 sec)
global_step/sec: 17.5332
loss = 0.093826056, step = 17600 (11.407 sec)
global_step/sec: 17.5162
loss = 0.04980874, step = 17800 (11.418 sec)
global_step/sec: 17.5958
loss = 0.08436108, step = 18000 (11.367 sec)
global_step/sec: 17.6499
loss = 0.07031548, step = 18200 (11.332 sec)
global_step/sec: 17.4897
loss = 0.08018917, step = 18400 (11.435 sec)
global_step/sec: 17.6163
loss = 0.028933227, step = 18600 (11.353 sec)
global_step/sec: 17.3619
loss = 0.06399143, step = 18800 (11.519 sec)
global_step/sec: 17.5151
loss = 0.104281485, step = 19000 (11.419 sec)
global_step/sec: 17.5888
loss = 0.10930133, step = 19200 (11.371 sec)
global_step/sec: 17.6307
loss = 0.056066155, step = 19400 (11.344 sec)
global_step/sec: 17.7029
loss = 0.19223732, step = 19600 (11.298 sec)
global_step/sec: 17.8645
loss = 0.13925654, step = 19800 (11.195 sec)
global_step/sec: 17.6568
loss = 0.05859512, step = 20000 (11.327 sec)
global_step/sec: 17.7123
loss = 0.06567615, step = 20200 (11.291 sec)
global_step/sec: 17.8937
loss = 0.106875956, step = 20400 (11.177 sec)
global_step/sec: 17.6091
loss = 0.03964001, step = 20600 (11.358 sec)
global_step/sec: 17.4414
loss = 0.09287375, step = 20800 (11.467 sec)
global_step/sec: 17.6155
loss = 0.03273791, step = 21000 (11.354 sec)
global_step/sec: 17.8087
loss = 0.0277282, step = 21200 (11.230 sec)
global_step/sec: 17.8569
loss = 0.07149267, step = 21400 (11.200 sec)
global_step/sec: 17.9393
loss = 0.05456364, step = 21600 (11.149 sec)
global_step/sec: 17.7899
loss = 0.011094809, step = 21800 (11.242 sec)
global_step/sec: 17.5774
loss = 0.04235983, step = 22000 (11.378 sec)
global_step/sec: 17.8274
loss = 0.056289375, step = 22200 (11.218 sec)
global_step/sec: 17.8135
loss = 0.06535977, step = 22400 (11.228 sec)
global_step/sec: 17.5839
loss = 0.11343026, step = 22600 (11.374 sec)
global_step/sec: 17.8426
loss = 0.035084844, step = 22800 (11.209 sec)
global_step/sec: 17.9744
loss = 0.02807188, step = 23000 (11.127 sec)
global_step/sec: 17.6131
loss = 0.12693703, step = 23200 (11.355 sec)
global_step/sec: 17.8415
loss = 0.064061224, step = 23400 (11.210 sec)
global_step/sec: 17.7546
loss = 0.090565205, step = 23600 (11.265 sec)
global_step/sec: 17.7205
loss = 0.0683707, step = 23800 (11.286 sec)
global_step/sec: 17.8176
loss = 0.031541705, step = 24000 (11.225 sec)
global_step/sec: 17.906
loss = 0.04308015, step = 24200 (11.169 sec)
global_step/sec: 17.643
loss = 0.12729162, step = 24400 (11.336 sec)
global_step/sec: 17.5792
loss = 0.05107689, step = 24600 (11.377 sec)
global_step/sec: 17.6869
loss = 0.037356377, step = 24800 (11.308 sec)
Saving checkpoints for 25000 into /workspace/finetune/results/model/model.ckpt.
Calling model_fn.
Done calling model_fn.
Starting evaluation at 2026-04-28T04:57:32Z
Graph was finalized.
2026-04-28 04:57:32.870073: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1669] Found device 0 with properties: name: NVIDIA H100 major: 9 minor: 0 memoryClockRate(GHz): 1.98 pciBusID: 0000:ad:00.0
2026-04-28 04:57:32.870119: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2026-04-28 04:57:32.870154: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2026-04-28 04:57:32.870162: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2026-04-28 04:57:32.870170: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2026-04-28 04:57:32.870177: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.11
2026-04-28 04:57:32.870184: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.11
2026-04-28 04:57:32.870191: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2026-04-28 04:57:32.870457: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1797] Adding visible gpu devices: 0
2026-04-28 04:57:32.870492: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1209] Device interconnect StreamExecutor with strength 1 edge matrix:
2026-04-28 04:57:32.870497: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1215] 0
2026-04-28 04:57:32.870502: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1228] 0: N
2026-04-28 04:57:32.870790: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1354] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 60325 MB memory) -> physical GPU (device: 0, name: NVIDIA H100, pci bus id: 0000:ad:00.0, compute capability: 9.0)
Restoring parameters from /workspace/finetune/results/model/model.ckpt-25000
Running local_init_op.
Done running local_init_op.
Evaluation [10/100]
Evaluation [20/100]
Evaluation [30/100]
Evaluation [40/100]
Evaluation [50/100]
Evaluation [60/100]
Evaluation [70/100]
Evaluation [80/100]
Evaluation [90/100]
Evaluation [100/100]
Finished evaluation at 2026-04-28-04:57:37
Saving dict for global step 25000: acc = 0.99808574, f1 = 0.9946638, global_step = 25000, loss = 0.12328317, precision = 0.9939835, recall = 0.995345
Saving 'checkpoint_path' summary for global step 25000: /workspace/finetune/results/model/model.ckpt-25000
Loss for final step: 0.035551548.
Calling model_fn.
Done calling model_fn.
Graph was finalized.
2026-04-28 04:57:37.865806: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1669] Found device 0 with properties: name: NVIDIA H100 major: 9 minor: 0 memoryClockRate(GHz): 1.98 pciBusID: 0000:ad:00.0
2026-04-28 04:57:37.865856: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2026-04-28 04:57:37.865881: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2026-04-28 04:57:37.865889: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2026-04-28 04:57:37.865897: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2026-04-28 04:57:37.865906: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.11
2026-04-28 04:57:37.865912: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.11
2026-04-28 04:57:37.865920: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2026-04-28 04:57:37.866178: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1797] Adding visible gpu devices: 0
2026-04-28 04:57:37.866207: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1209] Device interconnect StreamExecutor with strength 1 edge matrix:
2026-04-28 04:57:37.866212: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1215] 0
2026-04-28 04:57:37.866217: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1228] 0: N
2026-04-28 04:57:37.866511: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1354] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 60325 MB memory) -> physical GPU (device: 0, name: NVIDIA H100, pci bus id: 0000:ad:00.0, compute capability: 9.0)
Restoring parameters from /workspace/finetune/results/model/model.ckpt-25000
Running local_init_op.
Done running local_init_op.
[predict] wrote /workspace/finetune/results/score/train.preds.txt
Calling model_fn.
Done calling model_fn.
Graph was finalized.
2026-04-28 04:58:27.768805: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1669] Found device 0 with properties: name: NVIDIA H100 major: 9 minor: 0 memoryClockRate(GHz): 1.98 pciBusID: 0000:ad:00.0
2026-04-28 04:58:27.768847: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2026-04-28 04:58:27.768867: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2026-04-28 04:58:27.768874: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2026-04-28 04:58:27.768879: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2026-04-28 04:58:27.768884: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.11
2026-04-28 04:58:27.768889: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.11
2026-04-28 04:58:27.768895: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2026-04-28 04:58:27.769123: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1797] Adding visible gpu devices: 0
2026-04-28 04:58:27.769147: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1209] Device interconnect StreamExecutor with strength 1 edge matrix:
2026-04-28 04:58:27.769151: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1215] 0
2026-04-28 04:58:27.769155: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1228] 0: N
2026-04-28 04:58:27.769420: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1354] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 60325 MB memory) -> physical GPU (device: 0, name: NVIDIA H100, pci bus id: 0000:ad:00.0, compute capability: 9.0)
Restoring parameters from /workspace/finetune/results/model/model.ckpt-25000
Running local_init_op.
Done running local_init_op.
[predict] wrote /workspace/finetune/results/score/testa.preds.txt
Calling model_fn.
Done calling model_fn.
Graph was finalized.
2026-04-28 04:58:34.490921: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1669] Found device 0 with properties: name: NVIDIA H100 major: 9 minor: 0 memoryClockRate(GHz): 1.98 pciBusID: 0000:ad:00.0
2026-04-28 04:58:34.490967: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2026-04-28 04:58:34.491002: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2026-04-28 04:58:34.491008: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2026-04-28 04:58:34.491014: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2026-04-28 04:58:34.491020: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.11
2026-04-28 04:58:34.491025: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.11
2026-04-28 04:58:34.491030: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2026-04-28 04:58:34.491264: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1797] Adding visible gpu devices: 0
2026-04-28 04:58:34.491287: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1209] Device interconnect StreamExecutor with strength 1 edge matrix:
2026-04-28 04:58:34.491292: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1215] 0
2026-04-28 04:58:34.491296: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1228] 0: N
2026-04-28 04:58:34.491562: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1354] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 60325 MB memory) -> physical GPU (device: 0, name: NVIDIA H100, pci bus id: 0000:ad:00.0, compute capability: 9.0)
Restoring parameters from /workspace/finetune/results/model/model.ckpt-25000
Running local_init_op.
Done running local_init_op.
[predict] wrote /workspace/finetune/results/score/testb.preds.txt