kejian/cpsc-debug

kejian committed on Feb 27, 2023

Commit 9cabb4f (1 parent: 0b439d9)

update model card README.md

Files changed (1)

README.md +22 -7

@@ -82,7 +82,7 @@ The following hyperparameters were used during training:
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_ratio: 0.01
-- training_steps: 45776
+- training_steps: 42724
 - mixed_precision_training: Native AMP
 ### Framework versions
@@ -156,13 +156,28 @@ The following hyperparameters were used during training:
                                                           'top_k': 0,
                                                           'top_p': 0.9},
                                       'name': 'unconditional',
-                                      'num_samples': 512,
-                                      'prefix': '<|aligned|>'}],
+                                      'num_samples': 2560,
+                                      'prefix': '<|aligned|>'},
+                                     {'generate_kwargs': {'bad_words_ids': [[50257],
+                                                                            [50258],
+                                                                            [50259],
+                                                                            [50260]],
+                                                          'do_sample': True,
+                                                          'max_length': 128,
+                                                          'min_length': 10,
+                                                          'temperature': 0.7,
+                                                          'top_k': 0,
+                                                          'top_p': 0.9},
+                                      'name': 'challenging_rtp',
+                                      'num_samples': 1024,
+                                      'prefix': '<|aligned|>',
+                                      'prompt_before_control': True,
+                                      'prompts_path': 'resources/challenging_rtp.jsonl'}],
                 'scorer_config': {'device': 'cuda:0'}},
  'kl_gpt3_callback': {'force_call_on': [22888],
                       'gpt3_kwargs': {'model_name': 'davinci'},
                       'max_tokens': 64,
-                      'num_samples': 4096,
+                      'num_samples': 1024,
                       'prefix': '<|aligned|>',
                       'should_insert_prefix': True},
  'model': {'from_scratch': True,
@@ -184,8 +199,8 @@ The following hyperparameters were used during training:
               'hub_strategy': 'all_checkpoints',
               'learning_rate': 0.0005,
               'logging_first_step': True,
-              'logging_steps': 1,
-              'num_tokens': 3000000000.0,
+              'logging_steps': 500,
+              'num_tokens': 2800000000.0,
               'output_dir': 'training_output_2',
               'per_device_train_batch_size': 8,
               'push_to_hub': True,
@@ -197,4 +212,4 @@ The following hyperparameters were used during training:
               'weight_decay': 0.1}}
 # Wandb URL:
-https://wandb.ai/kejian/uncategorized/runs/1llp96zs
+https://wandb.ai/kejian/uncategorized/runs/2296ywzg
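
The `training_steps` and `num_tokens` values change together in this commit, and both pairs are consistent with dividing the token budget by a fixed 65,536 tokens per optimizer step and rounding down. That per-step figure is inferred from the numbers in the diff, not stated in the README, and `TOKENS_PER_STEP` and `steps_for_budget` are hypothetical names used only for this sketch:

```python
# Assumption: 65,536 tokens are consumed per optimizer step.
# This value is inferred from the diff; the README does not state it.
TOKENS_PER_STEP = 65_536


def steps_for_budget(num_tokens: float, tokens_per_step: int = TOKENS_PER_STEP) -> int:
    """Return how many whole optimizer steps fit in a token budget."""
    return int(num_tokens // tokens_per_step)


if __name__ == "__main__":
    # num_tokens before and after the commit, as shown in the diff.
    for label, num_tokens in [("before", 3_000_000_000.0), ("after", 2_800_000_000.0)]:
        print(label, steps_for_budget(num_tokens))
```

Under that assumption the function returns 45776 for the old budget and 42724 for the new one, matching the `training_steps` lines on each side of the diff.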