iko-01 committed on
Commit ae0c1ea · verified · 1 Parent(s): 6cdd990

Update README.md

Files changed (1): README.md +70 -3
README.md CHANGED
@@ -1,3 +1,70 @@
- ---
- license: mit
- ---
+
+ ---
+ language:
+ - en
+ license: mit
+ base_model: gpt2
+ tags:
+ - text-generation
+ - gpt2
+ - cosmopedia
+ - educational
+ - synthetic-data
+ model_name: CosmoGPT2-Mini
+ datasets:
+ - Dhiraj45/cosmopedia-v2
+ metrics:
+ - loss
+ ---
+
+ # CosmoGPT2-Mini 🚀
+
+ ## Description
+ **CosmoGPT2-Mini** is a fine-tuned version of the classic [GPT-2](https://huggingface.co/gpt2) model. It was trained on a subset of the **Cosmopedia v2** dataset, which consists of synthetic textbooks, blog posts, and educational content.
+
+ The goal of this model is to adapt GPT-2 to generate more informative, educational-style text than the base model.
+
+ ## Model Details
+ - **Developed by:** younes MA
+ - **Model type:** Causal Language Model
+ - **Base Model:** GPT-2 (Small)
+ - **Language:** English
+ - **Training Precision:** `bfloat16` (optimized for stability and speed)
+
+ ## Training Data
+ The model was trained on **30,000 samples** from the `Dhiraj45/cosmopedia-v2` dataset, which provides high-quality synthetic data covering a range of academic and general-knowledge topics.
+
+ ## Training Hyperparameters
+ - **Epochs:** 1
+ - **Max Steps:** 1000
+ - **Batch Size:** 2 (with Gradient Accumulation Steps: 8)
+ - **Learning Rate:** 5e-5
+ - **Optimizer:** AdamW (fused)
+ - **Precision:** `bf16`
+ - **Max Sequence Length:** 512 tokens
+
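The batch and step settings above imply an effective batch size and token budget that are easy to sanity-check; a minimal sketch (assuming the standard behavior where `max_steps` ends training even if the epoch is unfinished):

```python
# Derived quantities from the listed hyperparameters.
per_device_batch = 2
grad_accum_steps = 8
max_steps = 1000
seq_len = 512
num_samples = 30_000

effective_batch = per_device_batch * grad_accum_steps  # samples per optimizer step
tokens_per_step = effective_batch * seq_len            # tokens per optimizer step
steps_per_epoch = num_samples // effective_batch       # steps needed for one full pass

print(effective_batch)   # 16
print(tokens_per_step)   # 8192
print(steps_per_epoch)   # 1875
```

Note that `max_steps=1000` is smaller than the 1875 steps a full epoch would take, so training stops before the model sees every sample once.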
+ ## How to use
+ You can use this model directly with a pipeline for text generation:
+
+ ```python
+ from transformers import pipeline
+
+ generator = pipeline("text-generation", model="iko-01/CosmoGPT2-Mini")
+ prompt = "The concept of gravity can be explained as"
+ result = generator(prompt, max_length=100, num_return_sequences=1)
+
+ print(result[0]['generated_text'])
+ ```
+
+ ## Intended Use & Limitations
+ - **Intended Use:** Experimental purposes, educational text generation, and studying fine-tuning on synthetic data.
+ - **Limitations:** Because this is GPT-2 Small fine-tuned on a limited subset (30k samples), it may still hallucinate or generate repetitive text. It is not intended for production-level academic advice.
+
+ ## Training Results
+ The model was trained on a T4 GPU (or equivalent) using the settings above.
+ - **Final Training Loss:** 2.837890
+ - **Evaluation Loss:** 2.686130
+
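Assuming the evaluation loss is the usual per-token cross-entropy in nats, it maps to perplexity via `exp(loss)`; a quick check:

```python
import math

eval_loss = 2.686130            # evaluation loss reported above
perplexity = math.exp(eval_loss)
print(round(perplexity, 1))     # ≈ 14.7
```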
+ ---
+ **Note:** This model is part of a training experiment using the Cosmopedia dataset.