Commit · d440d7b
1 Parent(s): e6c7f97

updated with stories train data

Files changed:
- README.md +36 -38
- config.json +1 -2
- generation_config.json +1 -1
- model.safetensors +1 -1
- optimizer.pth → training_args.bin +2 -2
README.md CHANGED
@@ -1,57 +1,55 @@
 ---
 library_name: transformers
-license:
+license: mit
+base_model: gpt2
+tags:
+- generated_from_trainer
+model-index:
+- name: JAT-GPT2-trainer
+  results: []
 ---
 
-# 🧠 JAT-GPT: Just Another Tiny GPT
-
-- **Architecture**: GPT-2
-- **Library**: Hugging Face 🤗 Transformers
-- **Parameters**: 74 million (size isn't everything... right?)
-- **Training Objective**: Learn to predict the next word – and sometimes even the *right* one!
-- **Pretrained on**: A secret* dataset (*"secret" means the dataset was just some text I could find lying around)
-- **Training Purpose**: Solely educational. Also for flexing on friends who haven't trained a language model from scratch.
-
-## Quirks
-
-- "Please lower your expectations."
-- Can hallucinate confidently, but in a very short and polite way.
-- Can generate random words after a few tokens.
-
-## Limitations
-
-- Only pretrained.
-- Understands context... if it fits within a few tokens.
-- Cannot replace ChatGPT. (But look how cute it is!)
-
-## Why?
-
-- To cry less when training real models later.
-- To appreciate just how powerful modern LLMs are by comparison.
-
-## How to use
-
-```python
-from transformers import AutoModelForCausalLM, AutoTokenizer
-
-tokenizer = AutoTokenizer.from_pretrained("itsme-nishanth/JAT-GPT")
-model = AutoModelForCausalLM.from_pretrained("itsme-nishanth/JAT-GPT")
-
-input_ids = tokenizer.encode("Hi there,", return_tensors="pt")
-output = model.generate(input_ids, max_length=20, do_sample=True)
-print(tokenizer.decode(output[0]))
-
-# Use a pipeline as a high-level helper
-from transformers import pipeline
-pipe = pipeline("text-generation", model="itsme-nishanth/JAT-GPT")
-```
+
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+
+# JAT-GPT2-trainer
+
+This model is a fine-tuned version of [gpt2](https://huggingface.co/gpt2) on an unknown dataset.
+
+## Model description
+
+More information needed
+
+## Intended uses & limitations
+
+More information needed
+
+## Training and evaluation data
+
+More information needed
+
+## Training procedure
+
+### Training hyperparameters
+
+The following hyperparameters were used during training:
+- learning_rate: 2e-05
+- train_batch_size: 32
+- eval_batch_size: 32
+- seed: 42
+- optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9,0.999), epsilon=1e-08, and no additional optimizer arguments
+- lr_scheduler_type: linear
+- lr_scheduler_warmup_steps: 500
+- num_epochs: 10
+- mixed_precision_training: Native AMP
+
+### Training results
+
+### Framework versions
+
+- Transformers 4.53.2
+- Pytorch 2.6.0+cu124
+- Datasets 4.0.0
+- Tokenizers 0.21.2
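The hyperparameters listed in the new card map onto the Hugging Face Trainer API roughly as follows. This is a minimal sketch, not the training script from this commit: `output_dir` is an assumption (chosen to match the model-index name), and the dataset, tokenizer, and Trainer wiring are omitted.

```python
# Sketch of a TrainingArguments setup matching the card's hyperparameters.
# Assumption: output_dir; the actual training script is not part of this commit.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="JAT-GPT2-trainer",   # assumed; matches the model-index name
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    optim="adamw_torch",             # OptimizerNames.ADAMW_TORCH
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=10,
    fp16=True,                       # "Native AMP" mixed precision
)
```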
config.json CHANGED
@@ -1,5 +1,4 @@
 {
-  "_name_or_path": "itsme-nishanth/JAT-GPT",
   "activation_function": "gelu_new",
   "architectures": [
     "GPT2LMHeadModel"
@@ -33,7 +32,7 @@
     }
   },
   "torch_dtype": "float32",
-  "transformers_version": "4.
+  "transformers_version": "4.53.2",
   "use_cache": true,
   "vocab_size": 50257
 }
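Since only metadata changed here, a quick way to confirm the published config is to load it without downloading the weights. A minimal sketch, assuming the repo id taken from the old card's usage snippet:

```python
# Inspect the hosted config; no model download needed.
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("itsme-nishanth/JAT-GPT")
print(cfg.architectures)  # expect ['GPT2LMHeadModel']
print(cfg.vocab_size)     # expect 50257
```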
generation_config.json CHANGED
@@ -2,5 +2,5 @@
   "_from_model_config": true,
   "bos_token_id": 50256,
   "eos_token_id": 50256,
-  "transformers_version": "4.
+  "transformers_version": "4.53.2"
 }
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:19fc462d738ec4f0753b036c27325368b17fd10d32e339cb41fdd1b7f6cec357
 size 71475528
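A downloaded copy can be checked against this LFS pointer by recomputing the oid (a SHA-256 of the file contents) and the size. A minimal sketch, assuming the file sits in the current directory:

```python
# Verify a local file against its git-LFS pointer (oid and size).
import hashlib
import os

path = "model.safetensors"  # assumed local path
sha = hashlib.sha256()
with open(path, "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):  # read in 1 MiB chunks
        sha.update(chunk)

print(sha.hexdigest() == "19fc462d738ec4f0753b036c27325368b17fd10d32e339cb41fdd1b7f6cec357")
print(os.path.getsize(path) == 71475528)
```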
optimizer.pth → training_args.bin RENAMED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:4b260083aec1104bea74d678c6f37dc3f32586a5c95b740c77adc7d2de070456
+size 5304
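training_args.bin is the torch-pickled TrainingArguments object that the Trainer saves alongside checkpoints, which is consistent with its small size (5304 bytes). If the file is trusted, it can be loaded for inspection; a sketch, assuming a local download and PyTorch 2.6, where the default weights_only=True must be disabled for pickled objects:

```python
# Inspect the saved training arguments. Only do this for files you trust:
# torch.load with weights_only=False can execute arbitrary pickled code.
import torch

args = torch.load("training_args.bin", weights_only=False)
print(args.learning_rate)     # expect 2e-05
print(args.num_train_epochs)  # expect 10
```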
|