| {"max_seq_len": 256, "embed_dim": 256, "num_heads": 4, "num_layers": 2, "dropout": 0.1, "batch_size": 32, "num_epochs": 10, "learning_rate": 5e-05, "weight_decay": 0.01} |
| {"max_seq_len": 256, "embed_dim": 256, "num_heads": 4, "num_layers": 2, "dropout": 0.1, "batch_size": 32, "num_epochs": 10, "learning_rate": 5e-05, "weight_decay": 0.01} |