Update README.md
README.md
CHANGED
|
@@ -31,7 +31,7 @@ The model was evaluated at step **79,100**. The final validation metrics indicat
|
|
| 31 |
|
| 32 |
| Metric | Value | Description |
|
| 33 |
| :--- | :--- | :--- |
|
| 34 |
-
| **Step** |
|
| 35 |
| **Train Loss** | 3.2190 | Total training loss (CE + Aux) |
|
| 36 |
| **Train CE** | 3.0987 | Cross-Entropy loss on training data |
|
| 37 |
| **Val Loss** | 3.2028 | Total validation loss |
|
|
@@ -43,6 +43,25 @@ The model was evaluated at step **79,100**. The final validation metrics indicat
|
|
| 43 |
|
| 44 |

|
| 45 |
|
| 46 |
## 🛠️ Repository Contents
|
| 47 |
|
| 48 |
This repository contains checkpoints compatible with both major frameworks:
|
|
@@ -76,5 +95,5 @@ If you find this model or the associated research useful, please cite:
|
|
| 76 |
year = {2025},
|
| 77 |
publisher = {Hugging Face},
|
| 78 |
journal = {Hugging Face Repository},
|
| 79 |
-
howpublished = {\url{[https://huggingface.co/
|
| 80 |
}
|
|
|
|
| 31 |
|
| 32 |
| Metric | Value | Description |
|
| 33 |
| :--- | :--- | :--- |
|
| 34 |
+
| **Step** | 79,100 | Total training steps |
|
| 35 |
| **Train Loss** | 3.2190 | Total training loss (CE + Aux) |
|
| 36 |
| **Train CE** | 3.0987 | Cross-Entropy loss on training data |
|
| 37 |
| **Val Loss** | 3.2028 | Total validation loss |
|
|
|
|
| 43 |
|
| 44 |

|
| 45 |
|
| 46 |
+
## 📝 Generation Example
|
| 47 |
+
|
| 48 |
+
The following example demonstrates the model's generation capabilities after training.
|
| 49 |
+
|
| 50 |
+
**Prompt:**
|
| 51 |
+
> Many software systems become difficult to maintain not because the problems are hard, but because unnecessary complexity accumulates over time. Extra abstractions, premature optimizations, and unclear design choices often make systems fragile.
|
| 52 |
+
> Experienced engineers tend to favor simple designs that are easy to understand, test, and evolve. Simplicity reflects clarity of thought and strong fundamentals rather than lack of sophistication.
|
| 53 |
+
|
| 54 |
+
**Model Prediction:**
|
| 55 |
+
> is a key design feature in many software systems, from the ubiquitous Open Source Unix to the free software system to the proprietary Java APIs that allow a developer to do any other job at work.
|
| 56 |
+
>
|
| 57 |
+
> While this is an obvious disadvantage for software development, the reality is that there are many aspects of software that are highly important to a programmer's day-to-day life. This is why even a moderately experienced programmer should never be concerned about this.
|
| 58 |
+
>
|
| 59 |
+
> The best way to learn about your code is by going through its source code. That way, it's always safe to do something new when writing code. This gives your programmer freedom and confidence.
|
| 60 |
+
>
|
| 61 |
+
> One of the most popular techniques for coding small functions in your computer is "code reuse." The same technique can be used by programmers in any number of different ways. Some programmers might write code to get the job done, and others develop it to get it to the end. They use the same tools as most programmers to get the job done.
|
| 62 |
+
>
|
| 63 |
+
> The best way to learn about your code is by going through its source code. That way, it's always safe to do something new when writing code. That technique can help you get the job done. It provides a way to write code that is easy to understand and maintain, and makes debugging easier.
|
| 64 |
+
|
| 65 |
## 🛠️ Repository Contents
|
| 66 |
|
| 67 |
This repository contains checkpoints compatible with both major frameworks:
|
|
|
|
| 95 |
year = {2025},
|
| 96 |
publisher = {Hugging Face},
|
| 97 |
journal = {Hugging Face Repository},
|
| 98 |
+
howpublished = {\url{https://huggingface.co/QuarkML/Q-MoE-400}}
|
| 99 |
}
|