---
license: cc-by-nc-2.0
---
|
|
| by David, Fernando and Eric |
|
|
| Join our Discord! https://discord.gg/vT3sktQ3zb |
|
|
An experiment in "lasering" each expert to denoise the model and enhance its capabilities.
|
|
This model is half the size of Mixtral 8x7B Instruct, with essentially the same level of performance (we are working to improve its MMLU score).
|
|
Models used (all lasered with laserRMT, except for the base model):
|
|
|
|
* mlabonne/Marcoro14-7B-slerp (base)
* cognitivecomputations/dolphin-2.6-mistral-7b-dpo
* beowolx/CodeNinja-1.0-OpenChat-7B
* Q-bert/MetaMath-Cybertron-Starling
* WizardLM/WizardMath-7B-V1.1
It follows the laserRMT implementation at https://github.com/cognitivecomputations/laserRMT.
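For context, a LASER intervention replaces a weight matrix with a low-rank approximation obtained by truncating its singular value decomposition. The sketch below is a minimal illustration of that idea, not the laserRMT code itself; the function name and the fixed `rank` argument are assumptions for the example.

```python
import torch

def laser_approximate(weight: torch.Tensor, rank: int) -> torch.Tensor:
    """Replace a weight matrix with its best rank-`rank` approximation,
    discarding the trailing singular components as presumed noise."""
    u, s, vh = torch.linalg.svd(weight.float(), full_matrices=False)
    # Keep only the leading `rank` singular triplets and reconstruct.
    return (u[:, :rank] * s[:rank]) @ vh[:rank, :]
```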
Here, we scan the model's layers and identify those with the lowest signal-to-noise ratios (i.e., the ones most affected by noise) as targets for LASER interventions, still using the Marchenko-Pastur law to estimate each layer's ratio, as sketched below.
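Below is a minimal sketch of that selection step, assuming the SNR of a matrix is the singular-value energy above versus below the Marchenko-Pastur bulk edge; the median-based noise estimate and all names are illustrative and may differ from laserRMT's actual estimator.

```python
import torch

def marchenko_pastur_edge(sigma: float, m: int, n: int) -> float:
    """Largest singular value expected from an m x n pure-noise matrix
    whose entries have standard deviation `sigma`."""
    return sigma * (m ** 0.5 + n ** 0.5)

def layer_snr(weight: torch.Tensor) -> float:
    """Signal-to-noise ratio of one weight matrix: singular values above
    the noise edge count as signal, the rest as noise."""
    s = torch.linalg.svdvals(weight.float())
    m, n = weight.shape
    # Rough noise-scale estimate from the median singular value
    # (an assumption; laserRMT's estimator may differ).
    sigma = s.median().item() / (max(m, n) ** 0.5)
    edge = marchenko_pastur_edge(sigma, m, n)
    signal = s[s > edge].sum()
    noise = s[s <= edge].sum().clamp(min=1e-8)
    return (signal / noise).item()
```

Layers whose SNR comes out lowest would then be the candidates for the rank-reduction intervention sketched earlier.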
We intend this to be the first in a family of experiments carried out at Cognitive Computations.
In this experiment we observed very high truthfulness and strong reasoning capabilities.