mmaguero committed
Commit d08c49e · verified · 1 Parent(s): e5ac4bf

Upload folder using huggingface_hub

models/mpqa/sdp_byGenAI_Llama3.1_8b_Inst_desc/80/model_vi_mpqa_roberta_6.train.log ADDED
@@ -0,0 +1,69 @@
+ 2025-10-28 03:25:38 INFO
+ ---------------------+-------------------------------
+ Param                | Value
+ ---------------------+-------------------------------
+ lr                   | 5e-05
+ mu                   | 0.9
+ nu                   | 0.999
+ eps                  | 1e-06
+ weight_decay         | 0.0
+ lr_rate              | 1
+ patience             | 30
+ update-steps         | 1
+ warmup               | 0.0
+ update_steps         | 1
+ mode                 | train
+ path                 | /home/marvin/structured-sentiment-analysis-bis/models/mpqa/sdp_byGenAI_Llama3.1_8b_Inst_desc/80/model_vi_mpqa_roberta_6
+ device               | 1
+ seed                 | 1
+ threads              | 16
+ local_rank           | -1
+ feat                 | None
+ build                | True
+ checkpoint           | False
+ encoder              | bert
+ max_len              | None
+ buckets              | 32
+ train                | /home/marvin/structured-sentiment-analysis-bis/sentiment_graphs/mpqa/head_final/sdp_byGenAI_Llama3.1_8b_Inst_desc/80/train.conllu
+ dev                  | /home/marvin/structured-sentiment-analysis-bis/sentiment_graphs/mpqa/head_final/sdp_byGenAI_Llama3.1_8b_Inst_desc/80/dev.conllu
+ test                 | /home/marvin/structured-sentiment-analysis-bis/sentiment_graphs/mpqa/head_final/sdp/test.conllu
+ embed                | data/glove.6B.100d.txt
+ unk                  | unk
+ n_embed              | 100
+ n_embed_proj         | 125
+ bert                 | roberta-base
+ inference            | mfvi
+ ---------------------+-------------------------------
+
+ 2025-10-28 03:25:38 INFO Building the fields
+ 2025-10-28 03:25:39 INFO CoNLL(
+   (words): SubwordField(pad=<pad>, unk=<unk>, bos=<s>)
+   (labels): ChartField()
+ )
+ 2025-10-28 03:25:39 INFO Building the model
+ 2025-10-28 03:25:40 INFO VISemanticDependencyModel(
+   (encoder): TransformerEmbedding(roberta-base, n_layers=4, n_out=768, stride=256, pooling=mean, pad_index=1, requires_grad=True)
+   (encoder_dropout): Dropout(p=0.33, inplace=False)
+   (edge_mlp_d): MLP(n_in=768, n_out=600, dropout=0.25)
+   (edge_mlp_h): MLP(n_in=768, n_out=600, dropout=0.25)
+   (label_mlp_d): MLP(n_in=768, n_out=600, dropout=0.33)
+   (label_mlp_h): MLP(n_in=768, n_out=600, dropout=0.33)
+   (edge_attn): Biaffine(n_in=600, bias_x=True, bias_y=True)
+   (label_attn): Biaffine(n_in=600, bias_x=True, bias_y=True)
+   (criterion): CrossEntropyLoss()
+   (pair_mlp_d): MLP(n_in=768, n_out=150, dropout=0.25)
+   (pair_mlp_h): MLP(n_in=768, n_out=150, dropout=0.25)
+   (pair_mlp_g): MLP(n_in=768, n_out=150, dropout=0.25)
+   (sib_attn): Triaffine(n_in=150, bias_x=True, bias_y=True)
+   (cop_attn): Triaffine(n_in=150, bias_x=True, bias_y=True)
+   (grd_attn): Triaffine(n_in=150, bias_x=True, bias_y=True)
+   (inference): SemanticDependencyMFVI(max_iter=3)
+ )
+
+ 2025-10-28 03:25:40 INFO Loading the data
+ 2025-10-28 03:25:42 INFO
+ train: Dataset(n_sentences=1, n_batches=1, n_buckets=1)
+ dev:   Dataset(n_sentences=1, n_batches=1, n_buckets=1)
+ test:  Dataset(n_sentences=2112, n_batches=32, n_buckets=32)
+
+ 2025-10-28 03:25:43 INFO Epoch 1 / 5000:
models/mpqa/sdp_byGenAI_Llama3.1_8b_Inst_desc/80/model_vi_mpqa_xlm_6.train.log ADDED
@@ -0,0 +1,69 @@
+ 2025-10-28 03:25:46 INFO
+ ---------------------+-------------------------------
+ Param                | Value
+ ---------------------+-------------------------------
+ lr                   | 5e-05
+ mu                   | 0.9
+ nu                   | 0.999
+ eps                  | 1e-06
+ weight_decay         | 0.0
+ lr_rate              | 1
+ patience             | 30
+ update-steps         | 1
+ warmup               | 0.0
+ update_steps         | 1
+ mode                 | train
+ path                 | /home/marvin/structured-sentiment-analysis-bis/models/mpqa/sdp_byGenAI_Llama3.1_8b_Inst_desc/80/model_vi_mpqa_xlm_6
+ device               | 1
+ seed                 | 1
+ threads              | 16
+ local_rank           | -1
+ feat                 | None
+ build                | True
+ checkpoint           | False
+ encoder              | bert
+ max_len              | None
+ buckets              | 32
+ train                | /home/marvin/structured-sentiment-analysis-bis/sentiment_graphs/mpqa/head_final/sdp_byGenAI_Llama3.1_8b_Inst_desc/80/train.conllu
+ dev                  | /home/marvin/structured-sentiment-analysis-bis/sentiment_graphs/mpqa/head_final/sdp_byGenAI_Llama3.1_8b_Inst_desc/80/dev.conllu
+ test                 | /home/marvin/structured-sentiment-analysis-bis/sentiment_graphs/mpqa/head_final/sdp/test.conllu
+ embed                | data/glove.6B.100d.txt
+ unk                  | unk
+ n_embed              | 100
+ n_embed_proj         | 125
+ bert                 | xlm-roberta-base
+ inference            | mfvi
+ ---------------------+-------------------------------
+
+ 2025-10-28 03:25:46 INFO Building the fields
+ 2025-10-28 03:25:47 INFO CoNLL(
+   (words): SubwordField(pad=<pad>, unk=<unk>, bos=<s>)
+   (labels): ChartField()
+ )
+ 2025-10-28 03:25:47 INFO Building the model
+ 2025-10-28 03:25:49 INFO VISemanticDependencyModel(
+   (encoder): TransformerEmbedding(xlm-roberta-base, n_layers=4, n_out=768, stride=256, pooling=mean, pad_index=1, requires_grad=True)
+   (encoder_dropout): Dropout(p=0.33, inplace=False)
+   (edge_mlp_d): MLP(n_in=768, n_out=600, dropout=0.25)
+   (edge_mlp_h): MLP(n_in=768, n_out=600, dropout=0.25)
+   (label_mlp_d): MLP(n_in=768, n_out=600, dropout=0.33)
+   (label_mlp_h): MLP(n_in=768, n_out=600, dropout=0.33)
+   (edge_attn): Biaffine(n_in=600, bias_x=True, bias_y=True)
+   (label_attn): Biaffine(n_in=600, bias_x=True, bias_y=True)
+   (criterion): CrossEntropyLoss()
+   (pair_mlp_d): MLP(n_in=768, n_out=150, dropout=0.25)
+   (pair_mlp_h): MLP(n_in=768, n_out=150, dropout=0.25)
+   (pair_mlp_g): MLP(n_in=768, n_out=150, dropout=0.25)
+   (sib_attn): Triaffine(n_in=150, bias_x=True, bias_y=True)
+   (cop_attn): Triaffine(n_in=150, bias_x=True, bias_y=True)
+   (grd_attn): Triaffine(n_in=150, bias_x=True, bias_y=True)
+   (inference): SemanticDependencyMFVI(max_iter=3)
+ )
+
+ 2025-10-28 03:25:49 INFO Loading the data
+ 2025-10-28 03:25:52 INFO
+ train: Dataset(n_sentences=1, n_batches=1, n_buckets=1)
+ dev:   Dataset(n_sentences=1, n_batches=1, n_buckets=1)
+ test:  Dataset(n_sentences=2112, n_batches=32, n_buckets=32)
+
+ 2025-10-28 03:25:52 INFO Epoch 1 / 5000: