Update app.py
app.py
CHANGED
@@ -231,7 +231,7 @@ with gr.Blocks(theme=gr.themes.Base(), css=simplified_css) as demo:
 ---

 ### 1. Introduction & Experimental Setup
-The objective of this study was to evaluate and optimize the zero-shot reasoning capabilities of a Small Language Model (
+The objective of this study was to evaluate and optimize the zero-shot reasoning capabilities of a Small Language Model (google/gemma-4-E2B) on multiple-choice question answering.

 * **Dataset:** The CAIS/MMLU (Massive Multitask Language Understanding) benchmark, specifically utilizing randomized validation splits across diverse academic disciplines.
 * **Methodology:** We compared traditional heuristic prompt engineering methods against a dynamic, model-agnostic routing framework that switches between standard token generation and sequence likelihood evaluation (Perplexity).
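The Methodology bullet mentions sequence likelihood evaluation (Perplexity) as an alternative to standard token generation. The sketch below illustrates only that scoring idea, under stated assumptions: it is not taken from app.py, the function names `perplexity` and `pick_by_perplexity` are hypothetical, and the log-probabilities are made-up values standing in for what a model might assign to each answer option's tokens. The routing logic that decides when to use this path is not shown in the diff and is not reproduced here.

```python
import math


def perplexity(token_logprobs):
    """Perplexity of a sequence given its per-token natural-log probabilities."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    # exp of the negative mean log-probability; lower means more likely.
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def pick_by_perplexity(option_logprobs):
    """Return the label of the answer option with the lowest perplexity."""
    return min(option_logprobs, key=lambda label: perplexity(option_logprobs[label]))


# Hypothetical per-token log-probs for four multiple-choice options.
scores = {
    "A": [-2.3, -1.9, -2.8],
    "B": [-0.4, -0.7, -0.5],
    "C": [-1.5, -2.2],
    "D": [-3.0, -2.6, -2.9, -3.1],
}
best = pick_by_perplexity(scores)
print(best)
```

Averaging the log-probabilities before exponentiating normalizes for length, so an option with more tokens is not penalized merely for being longer.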