Update README.md via script
README.md CHANGED
```diff
@@ -23,14 +23,14 @@ The fully merged model weights and tokenizer are updated periodically at the roo
 - **Dynamic Dataset Source:** The script iterates through a wide array of Hugging Face Hub datasets.
 - **Rapid Iteration Strategy:** Training per dataset configuration is brief (`max_steps=1`), prioritizing breadth of exposure over depth on any single dataset.
 ## Training Progress
-- **Datasets Processed (Successfully trained on at least one config):**
-- **Text Examples Streamed (Total):**
-- **Tokens Processed (Total):**
-- **Last Successful Model Update:** 2025-05-08 15:
+- **Datasets Processed (Successfully trained on at least one config):** 5
+- **Text Examples Streamed (Total):** 30
+- **Tokens Processed (Total):** 15360
+- **Last Successful Model Update:** 2025-05-08 15:42:00 UTC
 ### Evaluation Snapshot (Approximate)
 
-- **Current Perplexity (wikitext Subset):**
-- **Perplexity Change:** `-
+- **Current Perplexity (wikitext Subset):** 284.82
+- **Perplexity Change:** `-1.29` ⬇️ (vs previous cycle's perplexity)
 
 #### Generated Examples (Qualitative Assessment)
 
@@ -41,7 +41,7 @@ The fully merged model weights and tokenizer are updated periodically at the roo
 | Creative Prompt | `Describe a friendly robot that love...` | `We are pleased to announce the launch of... ` |
 | Question Answering (Basic) | `What is the main color of a ripe ba...` | `As an example we've been using the same ... ` |
 | Code Generation (Simple Python) | `Write a Python function that takes ...` | `We are looking forward to seeing us in t... ` |
-| Reasoning (Simple) | `If a train leaves station A at 10:0...` | `
+| Reasoning (Simple) | `If a train leaves station A at 10:0...` | `This is a big task force to get ready fo... ` |
 
 #### Standard Benchmarks (via `lighteval`)
 _Note: Running standard benchmarks requires a dedicated setup using the `lighteval` harness. The table below shows scores if available in `evaluation_stats.json`, otherwise `N/A`._
```
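The diff above documents a breadth-first regime: each cycle streams a handful of examples per dataset configuration and runs a single optimizer step (`max_steps=1`) before moving on. Below is a minimal sketch of that kind of loop, assuming a Hugging Face `transformers`/`datasets` stack; the model name, dataset list, slice size, and batch size are illustrative placeholders, not values taken from the actual update script.

```python
# Sketch of one breadth-first cycle: stream a few examples per dataset
# config, train for exactly one step, then move to the next config.
# MODEL and CONFIGS are placeholders, not the repo's real settings.
from datasets import Dataset, load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments)

MODEL = "gpt2"  # stand-in for the repo's merged checkpoint
CONFIGS = [("wikitext", "wikitext-2-raw-v1")]  # stand-in for "a wide array"

tokenizer = AutoTokenizer.from_pretrained(MODEL)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token
model = AutoModelForCausalLM.from_pretrained(MODEL)

def tokenize(batch):
    enc = tokenizer(batch["text"], truncation=True, padding="max_length",
                    max_length=512)
    # Causal LM: labels mirror the inputs (padding left in for brevity).
    enc["labels"] = [ids.copy() for ids in enc["input_ids"]]
    return enc

for name, config in CONFIGS:
    # streaming=True avoids downloading the whole dataset each cycle.
    stream = load_dataset(name, config, split="train", streaming=True)
    rows = [r for r in stream.take(16) if r["text"].strip()]
    train = Dataset.from_list(rows).map(tokenize, batched=True,
                                        remove_columns=["text"])
    args = TrainingArguments(output_dir="ckpt", max_steps=1,  # one step only
                             per_device_train_batch_size=2, report_to="none")
    Trainer(model=model, args=args, train_dataset=train).train()
```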
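The snapshot's perplexity (284.82, down 1.29 versus the previous cycle) is a single token-weighted figure over a wikitext subset. Here is a hedged sketch of how such a number can be computed; the split, subset size, and context length below are assumptions, since the README does not state which ones the script uses.

```python
# Token-weighted perplexity over a small wikitext slice (assumed setup).
import math
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "gpt2"  # stand-in for the repo's merged checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL).eval()

ds = load_dataset("wikitext", "wikitext-2-raw-v1", split="test")
texts = [r["text"] for r in ds if r["text"].strip()][:20]  # assumed subset size

total_nll, total_tokens = 0.0, 0
with torch.no_grad():
    for text in texts:
        enc = tokenizer(text, return_tensors="pt", truncation=True,
                        max_length=512)
        n = enc["input_ids"].size(1) - 1  # loss is averaged over shifted targets
        if n < 1:
            continue
        out = model(**enc, labels=enc["input_ids"])
        total_nll += out.loss.item() * n
        total_tokens += n

print(f"perplexity ~= {math.exp(total_nll / total_tokens):.2f}")
```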
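The benchmarks note implies a simple fallback rule: a table cell gets a score only when `evaluation_stats.json` provides one, otherwise `N/A`. A tiny sketch of that rule follows; only the file name comes from the README, while the JSON layout and the task key are hypothetical.

```python
# Hypothetical fallback for the benchmark table: read a score from
# evaluation_stats.json if present, otherwise emit "N/A".
import json
from pathlib import Path

def benchmark_cell(stats_path="evaluation_stats.json", task="hellaswag"):
    """Return a formatted score for one table cell, or 'N/A'."""
    path = Path(stats_path)
    if not path.exists():
        return "N/A"
    stats = json.loads(path.read_text())
    score = stats.get("benchmarks", {}).get(task)  # assumed layout
    return f"{score:.3f}" if isinstance(score, (int, float)) else "N/A"

print(benchmark_cell())  # -> "N/A" until lighteval results are written
```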