Update README.md
README.md CHANGED

@@ -4,35 +4,35 @@ tags:
- mergekit
- lazymergekit
- abacaj/phi-2-super
base_model:
- abacaj/phi-2-super
---
# phi-2-DLEC

The DLEC (Distributive Layer Expansion Curve) methodology offers a novel approach to improving neural network models by focusing on the strategic duplication of certain effective layers. Developed with the aim of enhancing model performance, DLEC carefully identifies and amplifies the impact of key layers within the model's architecture.

Below is an overview of the method and its implementation, particularly how it integrates with the Hugging Face Transformers library and uses PyTorch and BitsAndBytes for efficient operation.

## Overview

1. **Setting up:** First, the script ensures all necessary components are in place, from libraries to the model and dataset.
2. **Database for activations:** A SQLite database is established to track layer activations, providing a clear view into how individual neurons react and which layers are most influential: these are our "beneficial layers."
3. **Analyzing and identifying:** By analyzing the activation data, the script pinpoints which layers are most valuable to the model's performance.
4. **Configuring DLEC:** A configuration is then created, guiding how the model should incorporate duplicates of these beneficial layers to boost effectiveness without unnecessarily increasing complexity.
5. **Reconfiguring and running the model:** Finally, the model is adjusted according to DLEC's insights, focusing enhancement on the identified layers.
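The activation-tracking steps above can be sketched as follows. This is a minimal illustration, not the actual DLEC script: the table schema, the `mean_abs_activation` metric, and the hard-coded sample numbers are all assumptions standing in for statistics that would really be collected via PyTorch forward hooks during inference.

```python
import sqlite3

# Hypothetical schema: one row per (layer, batch) recording the mean
# absolute activation observed for that layer. In the real pipeline these
# numbers would come from forward hooks; here they are hard-coded samples.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE activations (layer INTEGER, mean_abs_activation REAL)")
sample_stats = [
    (0, 0.12), (1, 0.45), (2, 0.91), (3, 0.88),
    (4, 0.30), (5, 0.77), (6, 0.15), (7, 0.52),
]
conn.executemany("INSERT INTO activations VALUES (?, ?)", sample_stats)

# "Beneficial layers": the layers with the highest average activation.
top_k = 3
rows = conn.execute(
    "SELECT layer FROM activations "
    "GROUP BY layer ORDER BY AVG(mean_abs_activation) DESC LIMIT ?",
    (top_k,),
).fetchall()
beneficial_layers = sorted(r[0] for r in rows)
print(beneficial_layers)  # layers 2, 3 and 5 score highest in this sample
```

Keeping the statistics in SQLite rather than in memory means the analysis step can be rerun and inspected after the (expensive) forward passes are done.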
## Key Features

- **Selective layer duplication:** DLEC doesn't just add more layers; it doubles down on the ones that really matter. This methodical selection ensures we make the most of the model's capabilities without wasteful expansion.
- **Smart resource management:** By homing in on specific areas for improvement, DLEC aims to make better use of computational and memory resources, promoting more efficient learning without adding undue complexity to the model.

This approach is about making informed, strategic enhancements to model architecture, prioritizing efficiency and effectiveness in utilizing neural network capabilities.
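Selective duplication can be pictured as building an expanded layer order in which only the beneficial layers repeat. The helper below (`expand_layers` is a hypothetical name, not part of DLEC or mergekit) shows the idea for an 8-layer model:

```python
# Hypothetical sketch of selective layer duplication: given the number of
# layers in the base model and the set of "beneficial layers" identified
# earlier, build the expanded layer order in which only those layers repeat.
def expand_layers(num_layers, beneficial_layers):
    order = []
    for i in range(num_layers):
        order.append(i)
        if i in beneficial_layers:
            order.append(i)  # duplicate the beneficial layer in place
    return order

print(expand_layers(8, {2, 5}))  # [0, 1, 2, 2, 3, 4, 5, 5, 6, 7]
```

In practice this layer order would be expressed as a mergekit-style slice configuration over the base model, so the network grows only where the activation analysis says it pays off.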
**Note:** This method is still in development. I do not expect it to be game-changing, nor will I oversell it; it is purely done for fun. Please let me know how the model works for you.
## 🧩 Configuration