# 🏛️ The Janus Interface: Research & Technical Analysis

**Project Status:** Research Prototype v2.0 (Gold Standard)

### 1. Research Motivation: The Privacy-Utility Paradox

In regulated domains (Healthcare, Legal, Finance), Generative AI adoption is stalled by a fundamental conflict:

* **Utility:** Large Cloud Models (GPT-4, Claude) offer superior reasoning but require sending data off-premise.
* **Function:** A secure, offline engine that accepts the JanusScript and a Local SQL Database record.
* **Output:** It merges the abstract logic with the concrete identity to generate the final, human-readable document.
### 3. Data Engineering: The "Gold Standard" Pipeline

To achieve high fidelity without using private patient data, we developed a **Synthesized Data Pipeline**:

1. **Synthesis:** We generated **306 high-quality clinical scenarios** using Large Language Models (LLMs).
2. **Alignment:** Unlike previous iterations, where headers were random, this dataset enforced strict alignment between the Identity Header (Age/Sex) and the Clinical Narrative.
3. **Result:** This eliminated the "hallucination" issues seen in earlier tests, where the model would confuse patient gender or age due to conflicting training signals.
* **Base Model:** Microsoft Phi-3.5-mini-instruct (3.8B parameters).
* **Framework:** **Unsloth** (optimized QLoRA).
* **Technique:** **DoRA (Weight-Decomposed Low-Rank Adaptation)**.
    * *Why DoRA?* Standard LoRA struggles with strict syntax/coding tasks; DoRA updates both magnitude and direction vectors, allowing the model to learn the strict `JanusScript` grammar effectively.
* **Loss Masking:** We used `train_on_responses_only`; the model was **never** trained on the input text, only on the output. This prevents the model from memorizing patient PII from the training set.
* **Hyperparameters:** Rank 16, Alpha 16, Learning Rate 2e-4, 2 Epochs (~78 steps for the final checkpoint).
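Conceptually, response-only training assigns the ignore label to every prompt token, so only response tokens contribute to the loss. A toy sketch of the idea (the token ids and prompt length are made up; `-100` is the conventional ignore index for PyTorch-style cross-entropy):

```python
IGNORE_INDEX = -100  # conventional "ignore" label for PyTorch-style cross-entropy

# Toy sketch of response-only loss masking: prompt tokens get IGNORE_INDEX,
# so the model is never trained to reproduce the (potentially sensitive) input.
def mask_prompt_labels(token_ids: list[int], prompt_len: int) -> list[int]:
    return [IGNORE_INDEX] * prompt_len + token_ids[prompt_len:]

tokens = [101, 102, 103, 104, 105]  # hypothetical ids: 3 prompt + 2 response tokens
labels = mask_prompt_labels(tokens, prompt_len=3)
# labels == [-100, -100, -100, 104, 105]
```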

### 5. Validated Output Examples

*Due to hardware constraints on the Free Tier CPU, live inference may be slow. Below are validated outputs from the model running on a T4 GPU.*

#### **Phase 1: The Scout (Logic Extraction)**

*Input (Raw Note):* "Pt admitted for appendicitis. 45M. CT showed inflamed appendix. Taken to OR for Lap Appy. Uncomplicated. Discharged home on Percocet."

*Output (JanusScript):*

```javascript
Hx(Appendicitis.Suspected);
Sx(Pain.RightLowerQuadrant);
Dx(Appendicitis.Confirmed);
Lab(CT).result(InflamedAppendix);
Tx(Surgery).action(LaparoscopicAppendectomy);
Crs(PostOp).status(Uncomplicated);
Tx(Meds).action(Percocet);
Plan(Discharge.Home);
```

#### **Phase 2: The Vault (Reconstruction)**

*Input (Protocol):* `Dx(Pneumonia); Tx(Meds).action(Antibiotics.IV); Crs(Response).result(Improved); Plan(Discharge.Home);`

*Input (Secure DB):* `{"pt_name": "Walter White", "pt_mrn": "HEISEN-01", ...}`

*Output (Final Document):*

```text
DISCHARGE SUMMARY
Patient: Walter White (MRN: HEISEN-01)
DOB: 1958-09-07 (Sex: M)
Attending: Dr. Goodman (Internal Medicine)
Dates: 2025-05-01 to 2025-05-05

**History & Diagnosis:**
The patient is a 67-year-old male who presented with fever, productive cough, and shortness of breath. Diagnosis was Pneumonia.

**Hospital Course:**
He was treated with IV antibiotics. His respiratory status improved, and he was able to maintain oxygen saturation on room air.

**Discharge Plan:**
The patient is discharged home.
```

### 6. Conclusion

* **Zero-Trust Validation:** The "Vault" successfully reconstructs documents using only the database for identity.
* **Semantic Expansion:** The model demonstrates the ability to take a concise code (`Dx(Pneumonia)`) and expand it into fluent medical narrative ("Patient presented with symptoms consistent with Pneumonia...").
"""
# ==============================================================================
# 6. LAUNCHER
# ==============================================================================