upload inference results on 500k-sample test set using checkpoint 200

Files changed (2)

inference/graphcodebert-robust/inference.log +17 -139
inference/graphcodebert-robust/submission.csv +0 -0

inference/graphcodebert-robust/inference.log CHANGED

@@ -1,6 +1,6 @@
-2026-04-16 09:16:18,911 - INFO - Loading model and tokenizer from: ./checkpoints/graphcodebert-robust/checkpoint-200
-2026-04-16 09:16:20,848 - INFO - ===== Model Architecture =====
-2026-04-16 09:16:20,851 - INFO -
 RobertaForSequenceClassification(
   (roberta): RobertaModel(
     (embeddings): RobertaEmbeddings(
@@ -45,139 +45,17 @@ RobertaForSequenceClassification(
     (out_proj): Linear(in_features=768, out_features=2, bias=True)
   )
 )
-2026-04-16 09:16:20,853 - INFO - ===== Parameter Summary =====
-2026-04-16 09:16:20,855 - INFO - Total Parameters:         124,647,170
-2026-04-16 09:16:20,857 - INFO - Trainable Parameters:     124,647,170
-2026-04-16 09:16:20,858 - INFO - Non-trainable Parameters: 0
-2026-04-16 09:16:20,859 - INFO - ===== Tokenizer Summary =====
-2026-04-16 09:16:20,874 - INFO - Vocab size: 50265 | Special tokens: ['<s>', '</s>', '<unk>', '<pad>', '<mask>']
-2026-04-16 09:16:20,876 - INFO - ===== End of Architecture Log =====
-2026-04-16 09:16:21,287 - INFO - Loading dataset: DaniilOr/SemEval-2026-Task13 (A)
-2026-04-16 09:16:28,538 - INFO - Tokenizing dataset...
-2026-04-16 09:16:29,324 - INFO - Running inference on 1000 examples...
-2026-04-16 09:16:55,877 - INFO - Calculating classification metrics...
-2026-04-16 09:16:55,902 - INFO - ------------------------------
-2026-04-16 09:16:55,904 - INFO - METRICS FOR SPLIT: test
-2026-04-16 09:16:55,905 - INFO - Accuracy:  0.5030
-2026-04-16 09:16:55,907 - INFO - Precision: 0.6228
-2026-04-16 09:16:55,909 - INFO - Recall:    0.5030
-2026-04-16 09:16:55,910 - INFO - F1-Score:  0.5438
-2026-04-16 09:16:55,912 - INFO - ------------------------------
-2026-04-16 09:16:55,918 - INFO - Confusion Matrix:
-[[422 355]
- [142  81]]
-2026-04-16 09:16:55,921 - INFO - ✅ Predictions saved to test/inference/graphcodebert-robust/submission.csv
-2026-04-16 10:06:49,138 - INFO - Loading model and tokenizer from: ./output_checkpoints/graphcodebert-robust/checkpoint-1000
-2026-04-16 10:06:49,327 - INFO - ===== Model Architecture =====
-2026-04-16 10:06:49,331 - INFO -
-RobertaForSequenceClassification(
-  (roberta): RobertaModel(
-    (embeddings): RobertaEmbeddings(
-      (word_embeddings): Embedding(50265, 768, padding_idx=1)
-      (position_embeddings): Embedding(514, 768, padding_idx=1)
-      (token_type_embeddings): Embedding(1, 768)
-      (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
-      (dropout): Dropout(p=0.2, inplace=False)
-    )
-    (encoder): RobertaEncoder(
-      (layer): ModuleList(
-        (0-11): 12 x RobertaLayer(
-          (attention): RobertaAttention(
-            (self): RobertaSdpaSelfAttention(
-              (query): Linear(in_features=768, out_features=768, bias=True)
-              (key): Linear(in_features=768, out_features=768, bias=True)
-              (value): Linear(in_features=768, out_features=768, bias=True)
-              (dropout): Dropout(p=0.2, inplace=False)
-            )
-            (output): RobertaSelfOutput(
-              (dense): Linear(in_features=768, out_features=768, bias=True)
-              (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
-              (dropout): Dropout(p=0.2, inplace=False)
-            )
-          )
-          (intermediate): RobertaIntermediate(
-            (dense): Linear(in_features=768, out_features=3072, bias=True)
-            (intermediate_act_fn): GELUActivation()
-          )
-          (output): RobertaOutput(
-            (dense): Linear(in_features=3072, out_features=768, bias=True)
-            (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
-            (dropout): Dropout(p=0.2, inplace=False)
-          )
-        )
-      )
-    )
-  )
-  (classifier): RobertaClassificationHead(
-    (dense): Linear(in_features=768, out_features=768, bias=True)
-    (dropout): Dropout(p=0.2, inplace=False)
-    (out_proj): Linear(in_features=768, out_features=2, bias=True)
-  )
-)
-2026-04-16 10:06:49,337 - INFO - ===== Parameter Summary =====
-2026-04-16 10:06:49,340 - INFO - Total Parameters:         124,647,170
-2026-04-16 10:06:49,343 - INFO - Trainable Parameters:     124,647,170
-2026-04-16 10:06:49,346 - INFO - Non-trainable Parameters: 0
-2026-04-16 10:06:49,349 - INFO - ===== Tokenizer Summary =====
-2026-04-16 10:06:49,366 - INFO - Vocab size: 50265 | Special tokens: ['<s>', '</s>', '<unk>', '<pad>', '<mask>']
-2026-04-16 10:06:49,369 - INFO - ===== End of Architecture Log =====
-2026-04-16 10:06:49,539 - INFO - Loading dataset: dzungpham/SemEval-2026-TaskA-dataset (default)
-2026-04-16 10:08:44,659 - INFO - Loading model and tokenizer from: ./output_checkpoints/graphcodebert-robust/checkpoint-1000
-2026-04-16 10:08:44,856 - INFO - ===== Model Architecture =====
-2026-04-16 10:08:44,861 - INFO -
-RobertaForSequenceClassification(
-  (roberta): RobertaModel(
-    (embeddings): RobertaEmbeddings(
-      (word_embeddings): Embedding(50265, 768, padding_idx=1)
-      (position_embeddings): Embedding(514, 768, padding_idx=1)
-      (token_type_embeddings): Embedding(1, 768)
-      (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
-      (dropout): Dropout(p=0.2, inplace=False)
-    )
-    (encoder): RobertaEncoder(
-      (layer): ModuleList(
-        (0-11): 12 x RobertaLayer(
-          (attention): RobertaAttention(
-            (self): RobertaSdpaSelfAttention(
-              (query): Linear(in_features=768, out_features=768, bias=True)
-              (key): Linear(in_features=768, out_features=768, bias=True)
-              (value): Linear(in_features=768, out_features=768, bias=True)
-              (dropout): Dropout(p=0.2, inplace=False)
-            )
-            (output): RobertaSelfOutput(
-              (dense): Linear(in_features=768, out_features=768, bias=True)
-              (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
-              (dropout): Dropout(p=0.2, inplace=False)
-            )
-          )
-          (intermediate): RobertaIntermediate(
-            (dense): Linear(in_features=768, out_features=3072, bias=True)
-            (intermediate_act_fn): GELUActivation()
-          )
-          (output): RobertaOutput(
-            (dense): Linear(in_features=3072, out_features=768, bias=True)
-            (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
-            (dropout): Dropout(p=0.2, inplace=False)
-          )
-        )
-      )
-    )
-  )
-  (classifier): RobertaClassificationHead(
-    (dense): Linear(in_features=768, out_features=768, bias=True)
-    (dropout): Dropout(p=0.2, inplace=False)
-    (out_proj): Linear(in_features=768, out_features=2, bias=True)
-  )
-)
-2026-04-16 10:08:44,865 - INFO - ===== Parameter Summary =====
-2026-04-16 10:08:44,867 - INFO - Total Parameters:         124,647,170
-2026-04-16 10:08:44,870 - INFO - Trainable Parameters:     124,647,170
-2026-04-16 10:08:44,874 - INFO - Non-trainable Parameters: 0
-2026-04-16 10:08:44,876 - INFO - ===== Tokenizer Summary =====
-2026-04-16 10:08:44,893 - INFO - Vocab size: 50265 | Special tokens: ['<s>', '</s>', '<unk>', '<pad>', '<mask>']
-2026-04-16 10:08:44,896 - INFO - ===== End of Architecture Log =====
-2026-04-16 10:08:45,082 - INFO - Loading dataset: dzungpham/SemEval-2026-TaskA-dataset (default)
-2026-04-16 10:08:51,304 - WARNING - Default loading failed due to schema mismatch: An error occurred while generating the dataset
-2026-04-16 10:08:51,307 - INFO - Attempting to load split 'test' using data_files...
-2026-04-16 10:08:55,114 - INFO - Tokenizing dataset...
-2026-04-16 10:14:03,634 - INFO - Running inference on 500000 examples...

+2026-04-16 10:42:50,167 - INFO - Loading model and tokenizer from: checkpoints/graphcodebert-robust/checkpoint-200
+2026-04-16 10:42:50,469 - INFO - ===== Model Architecture =====
+2026-04-16 10:42:50,471 - INFO -
 RobertaForSequenceClassification(
   (roberta): RobertaModel(
     (embeddings): RobertaEmbeddings(
     (out_proj): Linear(in_features=768, out_features=2, bias=True)
   )
 )
+2026-04-16 10:42:50,475 - INFO - ===== Parameter Summary =====
+2026-04-16 10:42:50,478 - INFO - Total Parameters:         124,647,170
+2026-04-16 10:42:50,480 - INFO - Trainable Parameters:     124,647,170
+2026-04-16 10:42:50,483 - INFO - Non-trainable Parameters: 0
+2026-04-16 10:42:50,485 - INFO - ===== Tokenizer Summary =====
+2026-04-16 10:42:50,501 - INFO - Vocab size: 50265 | Special tokens: ['<s>', '</s>', '<unk>', '<pad>', '<mask>']
+2026-04-16 10:42:50,503 - INFO - ===== End of Architecture Log =====
+2026-04-16 10:42:50,964 - INFO - Loading dataset: dzungpham/SemEval-2026-TaskA-dataset (default)
+2026-04-16 10:43:17,351 - WARNING - Default loading failed due to schema mismatch: An error occurred while generating the dataset
+2026-04-16 10:43:17,353 - INFO - Attempting to load split 'test' using data_files...
+2026-04-16 10:43:21,380 - INFO - Tokenizing dataset...
+2026-04-16 10:48:34,156 - INFO - Running inference on 500000 examples...
+2026-04-16 15:41:53,221 - WARNING - No 'label' column found in dataset. Skipping metric calculation.
+2026-04-16 15:41:59,383 - INFO - ✅ Predictions saved to test/inference/graphcodebert-robust/submission.csv

inference/graphcodebert-robust/submission.csv CHANGED

The diff for this file is too large to render.