Update app.py
app.py CHANGED

@@ -79,11 +79,11 @@ def compare_models(text, direction):
 # 4. UI & REVISED RESEARCH REPORT
 # ==========================================
 with gr.Blocks(theme=gr.themes.Soft()) as demo:
-    gr.Markdown("# ⚛️ Physics-
+    gr.Markdown("# ⚛️ Physics-ACPT: Domain-Adaptive Translation")
 
     with gr.Tabs():
         with gr.TabItem("Comparison Demo"):
-            gr.Markdown("Direct comparison: **Base Llama-3-8B** vs. **Physics-
+            gr.Markdown("Direct comparison: **Base Llama-3-8B** vs. **Physics-ACPT** (Unsupervised Adaptation).")
 
             with gr.Row():
                 with gr.Column():
@@ -96,7 +96,7 @@ with gr.Blocks(theme=gr.themes.Soft()) as demo:
                     gr.Markdown("### 📉 Base Llama-3-8B")
                     out_base = gr.Textbox(label="General Purpose Output", lines=6, interactive=False)
                 with gr.Column():
-                    gr.Markdown("### 🚀 Physics-
+                    gr.Markdown("### 🚀 Physics-ACPT (ACPT Mode)")
                     out_cft = gr.Textbox(label="Domain-Refined Output", lines=6, interactive=False)
 
             btn.click(fn=compare_models, inputs=[input_text, direction], outputs=[out_base, out_cft])
@@ -106,7 +106,7 @@ with gr.Blocks(theme=gr.themes.Soft()) as demo:
 ## Methodology: Zero-Shot Domain Adaptation via Anchored CPT
 
 ### **Training Objective**
-This model was developed to achieve specialized domain translation in Physics using **Continued Pre-Training (CPT)** on independent monolingual manifolds (5,000 ArXiv EN abstracts / 5,000 Wiki DE articles). No parallel domain-specific corpora were utilized.
+This model was developed to achieve specialized domain translation in Physics using solo **Continued Pre-Training (CPT)** on independent monolingual manifolds (5,000 ArXiv EN abstracts / 5,000 Wiki DE articles). No parallel domain-specific corpora were utilized.
 
 ### **Quantization vs. Adapter Precision**
 It is important to note a performance distinction between the **full-precision LoRA adapter** and this **quantized GGUF deployment**:
@@ -114,20 +114,20 @@ with gr.Blocks(theme=gr.themes.Soft()) as demo:
 - **GGUF Deployment:** The 4-bit quantization (Q4_K_M) required for efficient CPU deployment introduces a slight probabilistic "blurring." In the demo above, the model may select a "safer" technical term (e.g., ***'rückläufige'***) rather than the most aggressive jargon.
 
 ### **Persistent Domain Traces**
-Despite quantization, the Physics-
+Despite quantization, the Physics-ACPT model maintains significant "Domain Traces" that outperform the base Llama-3-8B model:
 
 1. **Rejection of Hallucinations:**
    - *Input:* "Ground state degeneracy"
    - *Base Model:* Produces **"Degenerenz"** (A linguistic hallucination; a non-existent German word).
-   - *Physics-
+   - *Physics-ACPT:* Selects **"Degenerierung"** (A valid, research-oriented German technical term).
 
 2. **Technical Adjective Selection:**
    - *Input:* "Reverse shock wave"
    - *Base Model:* Uses **"rückwärtige"** (A casual, general-purpose word for 'at the back').
-   - *Physics-
+   - *Physics-ACPT:* Uses **"rückläufige"** (A specific scientific term for 'retrograde/receding').
 
 ### **Conclusion**
-These results validate the **Semantic Triangulation** hypothesis. By aligning the "Functional Bridge" (Anchor) with "Domain Knowledge" (Monolingual CPT), the model shifts its internal probability away from colloquial "guesses" and toward authentic scientific vocabulary, even under the constraints of 4-bit quantization.
+These results validate the **Semantic Triangulation** hypothesis. By aligning the "Functional Bridge" (Anchor) with "Domain Knowledge" (Monolingual CPT) via causal language modeling, the model shifts its internal probability away from colloquial "guesses" and toward authentic scientific vocabulary, even under the constraints of 4-bit quantization.
 
 if __name__ == "__main__":
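The "probabilistic blurring" that the report attributes to 4-bit quantization can be illustrated with a toy sketch. This is not the actual Q4_K_M scheme (which packs weights into super-blocks with per-block scales and minimums); `quantize_4bit` is a hypothetical helper that only shows the core effect: snapping each weight to one of 16 levels introduces a bounded per-weight error, and summed over a dot product these errors can nudge near-tied token scores, which is why the demo may prefer a "safer" term like 'rückläufige'.

```python
# Toy sketch of 4-bit round-to-nearest quantization over one block of
# weights. NOT the real llama.cpp Q4_K_M kernel -- just the basic idea
# that 4-bit storage perturbs every weight it touches.

def quantize_4bit(block):
    """Snap each weight to the nearest of 16 evenly spaced levels."""
    lo, hi = min(block), max(block)
    step = (hi - lo) / 15  # 2**4 = 16 levels -> 15 intervals
    return [lo + round((w - lo) / step) * step for w in block]

weights = [0.13, -0.52, 0.31, 0.97, -0.08, 0.44]
deq = quantize_4bit(weights)

# Round-to-nearest bounds each weight's error by half a quantization step.
step = (max(weights) - min(weights)) / 15
errors = [abs(w - q) for w, q in zip(weights, deq)]
assert max(errors) <= step / 2 + 1e-12
```

In a real model these per-weight perturbations accumulate across the output projection, so two candidate tokens whose full-precision logits differ by less than the accumulated error can swap ranks after quantization.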