craa committed on Dec 5, 2025

Commit

f879408

verified ·

1 Parent(s): 267ed01

Upload folder using huggingface_hub

This view is limited to 50 files because it contains too many changes. See raw diff

Files changed (50)

resemble_to_hit_frequency_1001/README.md +172 -0
resemble_to_hit_frequency_1001/all_results.json +16 -0
resemble_to_hit_frequency_1001/checkpoint-100000/config.json +31 -0
resemble_to_hit_frequency_1001/checkpoint-100000/generation_config.json +6 -0
resemble_to_hit_frequency_1001/checkpoint-100000/merges.txt +0 -0
resemble_to_hit_frequency_1001/checkpoint-100000/model.safetensors +3 -0
resemble_to_hit_frequency_1001/checkpoint-100000/optimizer.pt +3 -0
resemble_to_hit_frequency_1001/checkpoint-100000/rng_state.pth +3 -0
resemble_to_hit_frequency_1001/checkpoint-100000/scaler.pt +3 -0
resemble_to_hit_frequency_1001/checkpoint-100000/scheduler.pt +3 -0
resemble_to_hit_frequency_1001/checkpoint-100000/special_tokens_map.json +5 -0
resemble_to_hit_frequency_1001/checkpoint-100000/tokenizer.json +0 -0
resemble_to_hit_frequency_1001/checkpoint-100000/tokenizer_config.json +20 -0
resemble_to_hit_frequency_1001/checkpoint-100000/trainer_state.json +0 -0
resemble_to_hit_frequency_1001/checkpoint-100000/training_args.bin +3 -0
resemble_to_hit_frequency_1001/checkpoint-100000/vocab.json +0 -0
resemble_to_hit_frequency_1001/checkpoint-40000/config.json +31 -0
resemble_to_hit_frequency_1001/checkpoint-40000/generation_config.json +6 -0
resemble_to_hit_frequency_1001/checkpoint-40000/merges.txt +0 -0
resemble_to_hit_frequency_1001/checkpoint-40000/model.safetensors +3 -0
resemble_to_hit_frequency_1001/checkpoint-40000/optimizer.pt +3 -0
resemble_to_hit_frequency_1001/checkpoint-40000/rng_state.pth +3 -0
resemble_to_hit_frequency_1001/checkpoint-40000/scaler.pt +3 -0
resemble_to_hit_frequency_1001/checkpoint-40000/scheduler.pt +3 -0
resemble_to_hit_frequency_1001/checkpoint-40000/special_tokens_map.json +5 -0
resemble_to_hit_frequency_1001/checkpoint-40000/tokenizer.json +0 -0
resemble_to_hit_frequency_1001/checkpoint-40000/tokenizer_config.json +20 -0
resemble_to_hit_frequency_1001/checkpoint-40000/trainer_state.json +0 -0
resemble_to_hit_frequency_1001/checkpoint-40000/training_args.bin +3 -0
resemble_to_hit_frequency_1001/checkpoint-40000/vocab.json +0 -0
resemble_to_hit_frequency_1001/checkpoint-70000/config.json +31 -0
resemble_to_hit_frequency_1001/checkpoint-70000/generation_config.json +6 -0
resemble_to_hit_frequency_1001/checkpoint-70000/merges.txt +0 -0
resemble_to_hit_frequency_1001/checkpoint-70000/model.safetensors +3 -0
resemble_to_hit_frequency_1001/checkpoint-70000/optimizer.pt +3 -0
resemble_to_hit_frequency_1001/checkpoint-70000/rng_state.pth +3 -0
resemble_to_hit_frequency_1001/checkpoint-70000/scaler.pt +3 -0
resemble_to_hit_frequency_1001/checkpoint-70000/scheduler.pt +3 -0
resemble_to_hit_frequency_1001/checkpoint-70000/special_tokens_map.json +5 -0
resemble_to_hit_frequency_1001/checkpoint-70000/tokenizer.json +0 -0
resemble_to_hit_frequency_1001/checkpoint-70000/tokenizer_config.json +20 -0
resemble_to_hit_frequency_1001/checkpoint-70000/trainer_state.json +0 -0
resemble_to_hit_frequency_1001/checkpoint-70000/training_args.bin +3 -0
resemble_to_hit_frequency_1001/checkpoint-70000/vocab.json +0 -0
resemble_to_hit_frequency_1001/checkpoint-80000/config.json +31 -0
resemble_to_hit_frequency_1001/checkpoint-80000/generation_config.json +6 -0
resemble_to_hit_frequency_1001/checkpoint-80000/merges.txt +0 -0
resemble_to_hit_frequency_1001/checkpoint-80000/model.safetensors +3 -0
resemble_to_hit_frequency_1001/checkpoint-80000/optimizer.pt +3 -0
resemble_to_hit_frequency_1001/checkpoint-80000/rng_state.pth +3 -0

resemble_to_hit_frequency_1001/README.md ADDED Viewed

	@@ -0,0 +1,172 @@

+---
+library_name: transformers
+tags:
+- generated_from_trainer
+metrics:
+- accuracy
+model-index:
+- name: exceptions_exp2_resemble_to_hit_frequency_1001
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/craaaa/exceptions_exp2/runs/jcuxvn5d)
+# exceptions_exp2_resemble_to_hit_frequency_1001
+This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 3.5557
+- Accuracy: 0.3700
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 0.0006
+- train_batch_size: 16
+- eval_batch_size: 16
+- seed: 1001
+- gradient_accumulation_steps: 5
+- total_train_batch_size: 80
+- optimizer: Use OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.98) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
+- lr_scheduler_type: linear
+- lr_scheduler_warmup_steps: 100
+- num_epochs: 50.0
+- mixed_precision_training: Native AMP
+### Training results
+| Training Loss | Epoch   | Step   | Accuracy | Validation Loss |
+|:-------------:|:-------:|:------:|:--------:|:---------------:|
+| 4.8439        | 0.2914  | 1000   | 0.2509   | 4.7752          |
+| 4.329         | 0.5828  | 2000   | 0.3000   | 4.2779          |
+| 4.1511        | 0.8741  | 3000   | 0.3167   | 4.0918          |
+| 3.9927        | 1.1655  | 4000   | 0.3258   | 3.9809          |
+| 3.9166        | 1.4569  | 5000   | 0.3327   | 3.9076          |
+| 3.8693        | 1.7483  | 6000   | 0.3379   | 3.8500          |
+| 3.7327        | 2.0396  | 7000   | 0.3422   | 3.8088          |
+| 3.7487        | 2.3310  | 8000   | 0.3450   | 3.7771          |
+| 3.7285        | 2.6224  | 9000   | 0.3478   | 3.7496          |
+| 3.7212        | 2.9138  | 10000  | 0.3504   | 3.7192          |
+| 3.6257        | 3.2051  | 11000  | 0.3525   | 3.7095          |
+| 3.6324        | 3.4965  | 12000  | 0.3538   | 3.6917          |
+| 3.6403        | 3.7879  | 13000  | 0.3554   | 3.6738          |
+| 3.5471        | 4.0793  | 14000  | 0.3564   | 3.6670          |
+| 3.5627        | 4.3706  | 15000  | 0.3577   | 3.6559          |
+| 3.5772        | 4.6620  | 16000  | 0.3592   | 3.6416          |
+| 3.5756        | 4.9534  | 17000  | 0.3605   | 3.6272          |
+| 3.4948        | 5.2448  | 18000  | 0.3604   | 3.6304          |
+| 3.5348        | 5.5361  | 19000  | 0.3613   | 3.6220          |
+| 3.5111        | 5.8275  | 20000  | 0.3627   | 3.6120          |
+| 3.4294        | 6.1189  | 21000  | 0.3626   | 3.6138          |
+| 3.468         | 6.4103  | 22000  | 0.3633   | 3.6076          |
+| 3.463         | 6.7016  | 23000  | 0.3641   | 3.5994          |
+| 3.4931        | 6.9930  | 24000  | 0.3651   | 3.5879          |
+| 3.4302        | 7.2844  | 25000  | 0.3649   | 3.5963          |
+| 3.4563        | 7.5758  | 26000  | 0.3654   | 3.5880          |
+| 3.4508        | 7.8671  | 27000  | 0.3665   | 3.5766          |
+| 3.3689        | 8.1585  | 28000  | 0.3662   | 3.5876          |
+| 3.409         | 8.4499  | 29000  | 0.3672   | 3.5788          |
+| 3.4213        | 8.7413  | 30000  | 0.3675   | 3.5736          |
+| 3.3257        | 9.0326  | 31000  | 0.3677   | 3.5775          |
+| 3.3722        | 9.3240  | 32000  | 0.3677   | 3.5775          |
+| 3.3982        | 9.6154  | 33000  | 0.3684   | 3.5674          |
+| 3.4083        | 9.9068  | 34000  | 0.3689   | 3.5608          |
+| 3.3294        | 10.1981 | 35000  | 0.3687   | 3.5735          |
+| 3.3591        | 10.4895 | 36000  | 0.3689   | 3.5666          |
+| 3.3894        | 10.7809 | 37000  | 0.3694   | 3.5561          |
+| 3.2944        | 11.0723 | 38000  | 0.3693   | 3.5667          |
+| 3.3381        | 11.3636 | 39000  | 0.3691   | 3.5636          |
+| 3.3591        | 11.6550 | 40000  | 0.3700   | 3.5557          |
+| 3.357         | 11.9464 | 41000  | 0.3706   | 3.5481          |
+| 3.3056        | 12.2378 | 42000  | 0.3699   | 3.5649          |
+| 3.3235        | 12.5291 | 43000  | 0.3705   | 3.5532          |
+| 3.3355        | 12.8205 | 44000  | 0.3705   | 3.5489          |
+| 3.2631        | 13.1119 | 45000  | 0.3701   | 3.5618          |
+| 3.3058        | 13.4033 | 46000  | 0.3707   | 3.5577          |
+| 3.3095        | 13.6946 | 47000  | 0.3713   | 3.5500          |
+| 3.3335        | 13.9860 | 48000  | 0.3718   | 3.5389          |
+| 3.2777        | 14.2774 | 49000  | 0.3712   | 3.5574          |
+| 3.313         | 14.5688 | 50000  | 0.3716   | 3.5501          |
+| 3.3186        | 14.8601 | 51000  | 0.3722   | 3.5429          |
+| 3.2468        | 15.1515 | 52000  | 0.3714   | 3.5522          |
+| 3.2767        | 15.4429 | 53000  | 0.3715   | 3.5495          |
+| 3.2932        | 15.7343 | 54000  | 0.3725   | 3.5390          |
+| 3.2057        | 16.0256 | 55000  | 0.3721   | 3.5518          |
+| 3.2652        | 16.3170 | 56000  | 0.3720   | 3.5508          |
+| 3.2886        | 16.6084 | 57000  | 0.3725   | 3.5458          |
+| 3.2946        | 16.8998 | 58000  | 0.3731   | 3.5339          |
+| 3.2236        | 17.1911 | 59000  | 0.3719   | 3.5545          |
+| 3.268         | 17.4825 | 60000  | 0.3725   | 3.5442          |
+| 3.2783        | 17.7739 | 61000  | 0.3730   | 3.5379          |
+| 3.1921        | 18.0653 | 62000  | 0.3721   | 3.5566          |
+| 3.2344        | 18.3566 | 63000  | 0.3727   | 3.5482          |
+| 3.2715        | 18.6480 | 64000  | 0.3732   | 3.5396          |
+| 3.2683        | 18.9394 | 65000  | 0.3736   | 3.5344          |
+| 3.2061        | 19.2308 | 66000  | 0.3728   | 3.5469          |
+| 3.2432        | 19.5221 | 67000  | 0.3731   | 3.5429          |
+| 3.2545        | 19.8135 | 68000  | 0.3736   | 3.5355          |
+| 3.1751        | 20.1049 | 69000  | 0.3732   | 3.5499          |
+| 3.2055        | 20.3963 | 70000  | 0.3729   | 3.5491          |
+| 3.2364        | 20.6876 | 71000  | 0.3736   | 3.5382          |
+| 3.2603        | 20.9790 | 72000  | 0.3741   | 3.5327          |
+| 3.1853        | 21.2704 | 73000  | 0.3733   | 3.5471          |
+| 3.2229        | 21.5618 | 74000  | 0.3736   | 3.5428          |
+| 3.2503        | 21.8531 | 75000  | 0.3743   | 3.5331          |
+| 3.1733        | 22.1445 | 76000  | 0.3730   | 3.5514          |
+| 3.1996        | 22.4359 | 77000  | 0.3737   | 3.5417          |
+| 3.2241        | 22.7273 | 78000  | 0.3742   | 3.5335          |
+| 3.1207        | 23.0186 | 79000  | 0.3738   | 3.5484          |
+| 3.1867        | 23.3100 | 80000  | 0.3737   | 3.5505          |
+| 3.1736        | 23.6014 | 81000  | 0.3738   | 3.5509          |
+| 3.1804        | 23.8928 | 82000  | 0.3740   | 3.5459          |
+| 3.166         | 24.1841 | 83000  | 0.3734   | 3.5539          |
+| 3.193         | 24.4755 | 84000  | 0.3740   | 3.5468          |
+| 3.2055        | 24.7669 | 85000  | 0.3745   | 3.5372          |
+| 3.125         | 25.0583 | 86000  | 0.3741   | 3.5497          |
+| 3.1831        | 25.3497 | 87000  | 0.3737   | 3.5500          |
+| 3.1967        | 25.6410 | 88000  | 0.3745   | 3.5418          |
+| 3.198         | 25.9324 | 89000  | 0.3752   | 3.5323          |
+| 3.1476        | 26.2238 | 90000  | 0.3741   | 3.5518          |
+| 3.1699        | 26.5152 | 91000  | 0.3745   | 3.5434          |
+| 3.1932        | 26.8065 | 92000  | 0.3746   | 3.5361          |
+| 3.129         | 27.0979 | 93000  | 0.3737   | 3.5533          |
+| 3.1585        | 27.3893 | 94000  | 0.3741   | 3.5470          |
+| 3.1692        | 27.6807 | 95000  | 0.3751   | 3.5368          |
+| 3.2099        | 27.9720 | 96000  | 0.3755   | 3.5339          |
+| 3.1312        | 28.2634 | 97000  | 0.3748   | 3.5470          |
+| 3.1501        | 28.5548 | 98000  | 0.3748   | 3.5463          |
+| 3.1644        | 28.8462 | 99000  | 0.3749   | 3.5356          |
+| 3.1114        | 29.1375 | 100000 | 0.3742   | 3.5504          |
+| 3.1421        | 29.4289 | 101000 | 0.3746   | 3.5466          |
+| 3.1544        | 29.7203 | 102000 | 0.3751   | 3.5421          |
+| 3.1002        | 30.0117 | 103000 | 0.3748   | 3.5492          |
+| 3.1165        | 30.3030 | 104000 | 0.3747   | 3.5489          |
+| 3.1392        | 30.5944 | 105000 | 0.3749   | 3.5429          |
+| 3.1409        | 30.8858 | 106000 | 0.3754   | 3.5419          |
+| 3.101         | 31.1772 | 107000 | 0.3743   | 3.5554          |
+| 3.1215        | 31.4685 | 108000 | 0.3751   | 3.5474          |
+| 3.1317        | 31.7599 | 109000 | 0.3754   | 3.5389          |
+### Framework versions
+- Transformers 4.55.2
+- Pytorch 2.8.0+cu128
+- Datasets 4.0.0
+- Tokenizers 0.21.4
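
Reviewer note: the batch-size and epoch figures in this model card can be cross-checked with a little arithmetic. The sketch below is illustrative only. It assumes a single training device, so that the effective batch size is `train_batch_size * gradient_accumulation_steps`, and it takes the training-set size (274556) from `all_results.json` in this same commit. The function names are hypothetical and are not part of the Trainer API.

```python
import math

# Values copied from the model card and all_results.json in this commit.
TRAIN_BATCH_SIZE = 16
GRAD_ACCUM_STEPS = 5
TRAIN_SAMPLES = 274556


def effective_batch_size(per_device: int, accum_steps: int, num_devices: int = 1) -> int:
    """Samples consumed per optimizer step (num_devices=1 is an assumption)."""
    return per_device * accum_steps * num_devices


def steps_per_epoch(num_samples: int, batch_size: int) -> int:
    """Optimizer steps needed to see every sample once (approximation)."""
    return math.ceil(num_samples / batch_size)


def epoch_at_step(step: int, steps_in_epoch: int) -> float:
    """Fractional epoch reached after a given number of optimizer steps."""
    return step / steps_in_epoch


batch = effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS)
per_epoch = steps_per_epoch(TRAIN_SAMPLES, batch)
print(batch, per_epoch, round(epoch_at_step(1000, per_epoch), 4))
```

Under these assumptions the result matches `total_train_batch_size: 80` and the `Epoch` column of the results table (for example, step 1000 at epoch 0.2914 and step 109000 at epoch 31.7599).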

resemble_to_hit_frequency_1001/all_results.json ADDED Viewed

	@@ -0,0 +1,16 @@

+{
+    "epoch": 31.75990675990676,
+    "eval_accuracy": 0.37000906155199714,
+    "eval_loss": 3.5557005405426025,
+    "eval_runtime": 179.2033,
+    "eval_samples": 16642,
+    "eval_samples_per_second": 92.867,
+    "eval_steps_per_second": 5.809,
+    "perplexity": 35.01233894134304,
+    "total_flos": 2.278434118828032e+18,
+    "train_loss": 0.8388565721424348,
+    "train_runtime": 57880.4577,
+    "train_samples": 274556,
+    "train_samples_per_second": 237.175,
+    "train_steps_per_second": 2.965
+}
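
Reviewer note: the `perplexity` field appears to be the exponential of `eval_loss`. The sketch below checks that relationship. It assumes the loss is a mean per-token cross-entropy measured in nats, which is the usual convention for causal language model training scripts but is not stated in this commit. The helper name `perplexity_from_loss` is hypothetical.

```python
import math


def perplexity_from_loss(mean_nll_nats: float) -> float:
    """Perplexity is exp of the mean negative log-likelihood (in nats)."""
    return math.exp(mean_nll_nats)


# Values copied from all_results.json above.
eval_loss = 3.5557005405426025
reported_perplexity = 35.01233894134304

computed = perplexity_from_loss(eval_loss)
# Compare with a loose tolerance rather than expecting bit-for-bit equality.
print(computed, abs(computed - reported_perplexity) < 1e-3)
```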

resemble_to_hit_frequency_1001/checkpoint-100000/config.json ADDED Viewed

	@@ -0,0 +1,31 @@

+{
+  "activation_function": "gelu_new",
+  "architectures": [
+    "GPT2LMHeadModel"
+  ],
+  "attn_pdrop": 0.1,
+  "bos_token_id": 50256,
+  "embd_pdrop": 0.1,
+  "eos_token_id": 50256,
+  "initializer_range": 0.02,
+  "layer_norm_epsilon": 1e-05,
+  "model_type": "gpt2",
+  "n_embd": 768,
+  "n_head": 12,
+  "n_inner": null,
+  "n_layer": 12,
+  "n_positions": 1024,
+  "reorder_and_upcast_attn": false,
+  "resid_pdrop": 0.1,
+  "scale_attn_by_inverse_layer_idx": false,
+  "scale_attn_weights": true,
+  "summary_activation": null,
+  "summary_first_dropout": 0.1,
+  "summary_proj_to_labels": true,
+  "summary_type": "cls_index",
+  "summary_use_proj": true,
+  "torch_dtype": "float32",
+  "transformers_version": "4.55.2",
+  "use_cache": true,
+  "vocab_size": 50257
+}

resemble_to_hit_frequency_1001/checkpoint-100000/generation_config.json ADDED Viewed

	@@ -0,0 +1,6 @@

+{
+  "_from_model_config": true,
+  "bos_token_id": 50256,
+  "eos_token_id": 50256,
+  "transformers_version": "4.55.2"
+}

resemble_to_hit_frequency_1001/checkpoint-100000/merges.txt ADDED Viewed

The diff for this file is too large to render. See raw diff

resemble_to_hit_frequency_1001/checkpoint-100000/model.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:d706b47c9f6b93aeff172dc9cc98d1d4f02e428548e1cd6d2f9f5d971c15f718
+size 497774208
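
Reviewer note: the `model.safetensors` size is what one would expect for the GPT-2 configuration in `config.json` stored as `float32`. The sketch below counts parameters from the config values. It assumes the standard GPT-2 layout: learned token and position embeddings, an output head tied to the token embeddings, biases on all linear layers, and `n_inner` defaulting to `4 * n_embd` when it is `null`. The function name is hypothetical.

```python
def gpt2_param_count(vocab_size: int, n_positions: int, n_embd: int, n_layer: int) -> int:
    """Count trainable parameters for a standard GPT-2 with a tied output head."""
    n_inner = 4 * n_embd  # assumed default when "n_inner" is null
    embeddings = vocab_size * n_embd + n_positions * n_embd
    layer_norm = 2 * n_embd  # weight + bias
    # Fused QKV projection plus output projection, each with a bias.
    attention = (n_embd * 3 * n_embd + 3 * n_embd) + (n_embd * n_embd + n_embd)
    # Two-layer feed-forward network, each layer with a bias.
    mlp = (n_embd * n_inner + n_inner) + (n_inner * n_embd + n_embd)
    block = 2 * layer_norm + attention + mlp
    return embeddings + n_layer * block + layer_norm  # plus the final layer norm


# Values copied from checkpoint-100000/config.json.
params = gpt2_param_count(vocab_size=50257, n_positions=1024, n_embd=768, n_layer=12)
weight_bytes = params * 4  # float32, per "torch_dtype"
print(params, weight_bytes, 497774208 - weight_bytes)
```

Under these assumptions the weights account for all but a few kilobytes of the 497774208-byte file; the remainder is plausibly the safetensors header, though that is not verified here.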

resemble_to_hit_frequency_1001/checkpoint-100000/optimizer.pt ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:ac12ee3ae920171644193b3ae18c9546fc29124474764a564f919bd3871516a3
+size 995644811

resemble_to_hit_frequency_1001/checkpoint-100000/rng_state.pth ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:0b065933488c1882c9d600b7066a2c84b6d692078b4e5fd84a9b02c8988166aa
+size 14709

resemble_to_hit_frequency_1001/checkpoint-100000/scaler.pt ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:e271799a9390b95549db5f989a0b1f5db31c050fe1eed3dcacd98c670ca2acea
+size 1383

resemble_to_hit_frequency_1001/checkpoint-100000/scheduler.pt ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:9d9ce937f1a86b552d6453748b26fd9569a479e68a1ff7fbfe56babfb657289b
+size 1465

resemble_to_hit_frequency_1001/checkpoint-100000/special_tokens_map.json ADDED Viewed

	@@ -0,0 +1,5 @@

+{
+  "bos_token": "<|endoftext|>",
+  "eos_token": "<|endoftext|>",
+  "unk_token": "<|endoftext|>"
+}

resemble_to_hit_frequency_1001/checkpoint-100000/tokenizer.json ADDED Viewed

The diff for this file is too large to render. See raw diff

resemble_to_hit_frequency_1001/checkpoint-100000/tokenizer_config.json ADDED Viewed

	@@ -0,0 +1,20 @@

+{
+  "add_prefix_space": false,
+  "added_tokens_decoder": {
+    "50256": {
+      "content": "<|endoftext|>",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    }
+  },
+  "bos_token": "<|endoftext|>",
+  "clean_up_tokenization_spaces": false,
+  "eos_token": "<|endoftext|>",
+  "extra_special_tokens": {},
+  "model_max_length": 1024,
+  "tokenizer_class": "GPT2Tokenizer",
+  "unk_token": "<|endoftext|>"
+}

resemble_to_hit_frequency_1001/checkpoint-100000/trainer_state.json ADDED Viewed

The diff for this file is too large to render. See raw diff

resemble_to_hit_frequency_1001/checkpoint-100000/training_args.bin ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:5f3b5e564131c6bd973fd19100d8e638e97bf255d9fe8247f1f0bef2f37d2a7d
+size 5969

resemble_to_hit_frequency_1001/checkpoint-100000/vocab.json ADDED Viewed

The diff for this file is too large to render. See raw diff

resemble_to_hit_frequency_1001/checkpoint-40000/config.json ADDED Viewed

	@@ -0,0 +1,31 @@

+{
+  "activation_function": "gelu_new",
+  "architectures": [
+    "GPT2LMHeadModel"
+  ],
+  "attn_pdrop": 0.1,
+  "bos_token_id": 50256,
+  "embd_pdrop": 0.1,
+  "eos_token_id": 50256,
+  "initializer_range": 0.02,
+  "layer_norm_epsilon": 1e-05,
+  "model_type": "gpt2",
+  "n_embd": 768,
+  "n_head": 12,
+  "n_inner": null,
+  "n_layer": 12,
+  "n_positions": 1024,
+  "reorder_and_upcast_attn": false,
+  "resid_pdrop": 0.1,
+  "scale_attn_by_inverse_layer_idx": false,
+  "scale_attn_weights": true,
+  "summary_activation": null,
+  "summary_first_dropout": 0.1,
+  "summary_proj_to_labels": true,
+  "summary_type": "cls_index",
+  "summary_use_proj": true,
+  "torch_dtype": "float32",
+  "transformers_version": "4.55.2",
+  "use_cache": true,
+  "vocab_size": 50257
+}

resemble_to_hit_frequency_1001/checkpoint-40000/generation_config.json ADDED Viewed

	@@ -0,0 +1,6 @@

+{
+  "_from_model_config": true,
+  "bos_token_id": 50256,
+  "eos_token_id": 50256,
+  "transformers_version": "4.55.2"
+}

resemble_to_hit_frequency_1001/checkpoint-40000/merges.txt ADDED Viewed

The diff for this file is too large to render. See raw diff

resemble_to_hit_frequency_1001/checkpoint-40000/model.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:f3c2723b6125fb5aea4a034fec227039f3fa7a7c981bb7c17201065bb277f99b
+size 497774208

resemble_to_hit_frequency_1001/checkpoint-40000/optimizer.pt ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:1546ea1d34cbb97087958f44fb089e52cbc76a68e19e5c8d65f3031003b31de6
+size 995644811

resemble_to_hit_frequency_1001/checkpoint-40000/rng_state.pth ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:9e33aab7f6b7fd686a9a8c9ed9227f8deba41aefd7921647aed83a98b95906ae
+size 14709

resemble_to_hit_frequency_1001/checkpoint-40000/scaler.pt ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:6b77932b66184e2e46043956fd77fe7e31579f2f10c911de298adc95f1a94147
+size 1383

resemble_to_hit_frequency_1001/checkpoint-40000/scheduler.pt ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:03ddff66d7e805d7abc492e4a794238902809431f2520d6bac1160d49bd811f9
+size 1465

resemble_to_hit_frequency_1001/checkpoint-40000/special_tokens_map.json ADDED Viewed

	@@ -0,0 +1,5 @@

+{
+  "bos_token": "<|endoftext|>",
+  "eos_token": "<|endoftext|>",
+  "unk_token": "<|endoftext|>"
+}

resemble_to_hit_frequency_1001/checkpoint-40000/tokenizer.json ADDED Viewed

The diff for this file is too large to render. See raw diff

resemble_to_hit_frequency_1001/checkpoint-40000/tokenizer_config.json ADDED Viewed

	@@ -0,0 +1,20 @@

+{
+  "add_prefix_space": false,
+  "added_tokens_decoder": {
+    "50256": {
+      "content": "<|endoftext|>",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    }
+  },
+  "bos_token": "<|endoftext|>",
+  "clean_up_tokenization_spaces": false,
+  "eos_token": "<|endoftext|>",
+  "extra_special_tokens": {},
+  "model_max_length": 1024,
+  "tokenizer_class": "GPT2Tokenizer",
+  "unk_token": "<|endoftext|>"
+}

resemble_to_hit_frequency_1001/checkpoint-40000/trainer_state.json ADDED Viewed

The diff for this file is too large to render. See raw diff

resemble_to_hit_frequency_1001/checkpoint-40000/training_args.bin ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:da74df075fd423c738522d5437b215c7a612efc739a944adeda36d7244347510
+size 5969

resemble_to_hit_frequency_1001/checkpoint-40000/vocab.json ADDED Viewed

The diff for this file is too large to render. See raw diff

resemble_to_hit_frequency_1001/checkpoint-70000/config.json ADDED Viewed

	@@ -0,0 +1,31 @@

+{
+  "activation_function": "gelu_new",
+  "architectures": [
+    "GPT2LMHeadModel"
+  ],
+  "attn_pdrop": 0.1,
+  "bos_token_id": 50256,
+  "embd_pdrop": 0.1,
+  "eos_token_id": 50256,
+  "initializer_range": 0.02,
+  "layer_norm_epsilon": 1e-05,
+  "model_type": "gpt2",
+  "n_embd": 768,
+  "n_head": 12,
+  "n_inner": null,
+  "n_layer": 12,
+  "n_positions": 1024,
+  "reorder_and_upcast_attn": false,
+  "resid_pdrop": 0.1,
+  "scale_attn_by_inverse_layer_idx": false,
+  "scale_attn_weights": true,
+  "summary_activation": null,
+  "summary_first_dropout": 0.1,
+  "summary_proj_to_labels": true,
+  "summary_type": "cls_index",
+  "summary_use_proj": true,
+  "torch_dtype": "float32",
+  "transformers_version": "4.55.2",
+  "use_cache": true,
+  "vocab_size": 50257
+}

resemble_to_hit_frequency_1001/checkpoint-70000/generation_config.json ADDED Viewed

	@@ -0,0 +1,6 @@

+{
+  "_from_model_config": true,
+  "bos_token_id": 50256,
+  "eos_token_id": 50256,
+  "transformers_version": "4.55.2"
+}

resemble_to_hit_frequency_1001/checkpoint-70000/merges.txt ADDED Viewed

The diff for this file is too large to render. See raw diff

resemble_to_hit_frequency_1001/checkpoint-70000/model.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:d1ef750ab30528c76883b09a47405a5df000a511f7675cf003afc64f2cdc208d
+size 497774208

resemble_to_hit_frequency_1001/checkpoint-70000/optimizer.pt ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:dfdf8d2176aae8c294c749ba90bcab11b1f0b2ccde027fe170c5c2321a173402
+size 995644811

resemble_to_hit_frequency_1001/checkpoint-70000/rng_state.pth ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:61c2394a8b93464d76918bc7ce9b4fe0f1d8b46188cccb11784463056ac5e2ab
+size 14709

resemble_to_hit_frequency_1001/checkpoint-70000/scaler.pt ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:38bed55f9b494feee1dbae64700fd9e7d6f8f078f397f019f21d448e03599f58
+size 1383

resemble_to_hit_frequency_1001/checkpoint-70000/scheduler.pt ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:3399e666571bf5876d26a27563f0f173f6b152d0de761286960316a967cb8949
+size 1465

resemble_to_hit_frequency_1001/checkpoint-70000/special_tokens_map.json ADDED Viewed

	@@ -0,0 +1,5 @@

+{
+  "bos_token": "<|endoftext|>",
+  "eos_token": "<|endoftext|>",
+  "unk_token": "<|endoftext|>"
+}

resemble_to_hit_frequency_1001/checkpoint-70000/tokenizer.json ADDED Viewed

The diff for this file is too large to render. See raw diff

resemble_to_hit_frequency_1001/checkpoint-70000/tokenizer_config.json ADDED Viewed

	@@ -0,0 +1,20 @@

+{
+  "add_prefix_space": false,
+  "added_tokens_decoder": {
+    "50256": {
+      "content": "<|endoftext|>",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    }
+  },
+  "bos_token": "<|endoftext|>",
+  "clean_up_tokenization_spaces": false,
+  "eos_token": "<|endoftext|>",
+  "extra_special_tokens": {},
+  "model_max_length": 1024,
+  "tokenizer_class": "GPT2Tokenizer",
+  "unk_token": "<|endoftext|>"
+}

resemble_to_hit_frequency_1001/checkpoint-70000/trainer_state.json ADDED Viewed

The diff for this file is too large to render. See raw diff

resemble_to_hit_frequency_1001/checkpoint-70000/training_args.bin ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:da74df075fd423c738522d5437b215c7a612efc739a944adeda36d7244347510
+size 5969

resemble_to_hit_frequency_1001/checkpoint-70000/vocab.json ADDED Viewed

The diff for this file is too large to render. See raw diff

resemble_to_hit_frequency_1001/checkpoint-80000/config.json ADDED Viewed

	@@ -0,0 +1,31 @@

+{
+  "activation_function": "gelu_new",
+  "architectures": [
+    "GPT2LMHeadModel"
+  ],
+  "attn_pdrop": 0.1,
+  "bos_token_id": 50256,
+  "embd_pdrop": 0.1,
+  "eos_token_id": 50256,
+  "initializer_range": 0.02,
+  "layer_norm_epsilon": 1e-05,
+  "model_type": "gpt2",
+  "n_embd": 768,
+  "n_head": 12,
+  "n_inner": null,
+  "n_layer": 12,
+  "n_positions": 1024,
+  "reorder_and_upcast_attn": false,
+  "resid_pdrop": 0.1,
+  "scale_attn_by_inverse_layer_idx": false,
+  "scale_attn_weights": true,
+  "summary_activation": null,
+  "summary_first_dropout": 0.1,
+  "summary_proj_to_labels": true,
+  "summary_type": "cls_index",
+  "summary_use_proj": true,
+  "torch_dtype": "float32",
+  "transformers_version": "4.55.2",
+  "use_cache": true,
+  "vocab_size": 50257
+}

resemble_to_hit_frequency_1001/checkpoint-80000/generation_config.json ADDED Viewed

	@@ -0,0 +1,6 @@

+{
+  "_from_model_config": true,
+  "bos_token_id": 50256,
+  "eos_token_id": 50256,
+  "transformers_version": "4.55.2"
+}

resemble_to_hit_frequency_1001/checkpoint-80000/merges.txt ADDED Viewed

The diff for this file is too large to render. See raw diff

resemble_to_hit_frequency_1001/checkpoint-80000/model.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:4c275a5b72dc95bcfa2fdd46d715fa5b8bcd02f506efc8ae3583e5c884c05ca2
+size 497774208

resemble_to_hit_frequency_1001/checkpoint-80000/optimizer.pt ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:bba4e5a59763de2021c9330fe9ad98c87bbcc8b8f0623a146bd781d60b2af20b
+size 995644811

resemble_to_hit_frequency_1001/checkpoint-80000/rng_state.pth ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:99a352814df19583935ac7e4bdbcb64054e676c7890eb95920e85653105eea6e
+size 14709