<div align="center">
<img width="1200" height="475" alt="GHBanner" src="https://github.com/user-attachments/assets/0aa67016-6eaf-458a-adb2-6e31a0763ed6" />
</div>
## Run Locally

**Prerequisites:** Node.js

1. Install dependencies:
   `npm install`
2. Set `GEMINI_API_KEY` in [.env.local](.env.local) to your Gemini API key
3. Run the app:
   `npm run dev`
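For reference, `.env.local` needs just the one entry (replace the placeholder with your actual key):

```
GEMINI_API_KEY=your-api-key-here
```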
## Workflow Overview
```mermaid
flowchart TD
    subgraph S1 [Phase 1: Data Ingestion]
        A[User Selects Working Group] -->|SA1-6, RAN1-2| B[Fetch Meetings via POST]
        B --> C[User Selects Meeting]
        C --> D[Filter Docs by Metadata]
        D --> E[Extract Raw Text]
    end
    subgraph S2 [Phase 2: Refinement & Caching]
        E --> F{Text in Cache?}
        F -- Yes --> G[Retrieve Cached Refinement]
        F -- No --> H[LLM Processing]
        H --> I[Task: Dense Chunking & 'What's New']
        I --> J[Store in Dataset]
        J --> G
    end
    subgraph S3 [Phase 3: Pattern Analysis]
        G --> K[User Selects Pattern/Prompt]
        K --> L{Result in Cache?}
        L -- Yes --> M[Retrieve Analysis]
        L -- No --> N[Execute Pattern]
        N --> O[Multi-Model Verification]
        O --> P[Store Result]
    end
    S1 --> S2 --> S3
```
### Detailed Process Specification

#### Phase 1: Data Ingestion & Extraction

The user navigates a strict hierarchy to isolate the relevant source text.

1. **Working Group Selection:** The user selects one group from the allowlist: `['SA1', 'SA2', 'SA3', 'SA4', 'SA5', 'SA6', 'RAN1', 'RAN2']`.
2. **Meeting Retrieval:** The system sends a `POST` request containing the selected Working Group to the meetings endpoint and retrieves the meeting list.
3. **Document Filtering:** The user selects a meeting, then filters the resulting file list using the available metadata.
4. **Text Extraction:** The system extracts raw content from the filtered files into a list of texts.
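The retrieval step (steps 1–2 above) can be sketched as follows. The endpoint URL and request body shape are assumptions for illustration; the source does not specify them:

```typescript
// Hypothetical sketch of the meeting-retrieval step.
// WORKING_GROUPS mirrors the allowlist; MEETINGS_ENDPOINT is a placeholder URL.
const WORKING_GROUPS = ['SA1', 'SA2', 'SA3', 'SA4', 'SA5', 'SA6', 'RAN1', 'RAN2'] as const;
type WorkingGroup = (typeof WORKING_GROUPS)[number];

const MEETINGS_ENDPOINT = 'https://example.com/api/meetings'; // placeholder

async function fetchMeetings(group: WorkingGroup): Promise<unknown[]> {
  // Reject anything outside the allowlist before touching the network.
  if (!WORKING_GROUPS.includes(group)) {
    throw new Error(`Working group ${group} is not in the allowlist`);
  }
  const res = await fetch(MEETINGS_ENDPOINT, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ workingGroup: group }),
  });
  if (!res.ok) throw new Error(`Meeting fetch failed: ${res.status}`);
  return res.json();
}
```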
#### Phase 2: Content Refinement (with Caching)

Raw text is processed into high-value summaries to reduce noise.

* **Cache Check:** Before processing, check the dataset for an existing `(text_hash, refined_output)` pair to avoid duplicate processing.
* **LLM Processing:** If not cached, pass the text to the selected LLM (a default is provided; the user can change it).
* **Prompt Objective:**
  1. Create information-dense chunks (minimizing near-duplicates).
  2. Generate a "What's New" paragraph wrapped in `SUGGESTION START` and `SUGGESTION END` tags.
* **Storage:** Save the input text and the LLM output to the dataset.
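The hash-keyed cache check above can be sketched like this. The in-memory `Map` stands in for the dataset, which the real app presumably persists:

```typescript
import { createHash } from 'node:crypto';

// In-memory stand-in for the (text_hash, refined_output) dataset.
const refinementCache = new Map<string, string>();

function textHash(text: string): string {
  return createHash('sha256').update(text).digest('hex');
}

async function refine(
  text: string,
  llm: (t: string) => Promise<string>,
): Promise<string> {
  const key = textHash(text);
  const cached = refinementCache.get(key);
  if (cached !== undefined) return cached; // cache hit: skip the LLM call
  const refined = await llm(text);         // cache miss: run LLM processing
  refinementCache.set(key, refined);       // store for future runs
  return refined;
}
```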
#### Phase 3: Pattern Analysis & Verification

Refined text is analyzed with user-defined patterns.

* **Pattern Selection:** The user applies a specific prompt/pattern to the refined documents.
* **Cache Check:** Check the results database for an existing `(document_id, pattern_id)` entry.
* **Execution & Verification:**
  * Run the selected pattern against the documents.
  * **Verifier Mode:** Optionally run the same input across multiple models in parallel and compare the results to catch inaccuracies.
* **Storage:** Save the final analysis in the database to avoid future re-computation.
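Verifier Mode can be sketched as a fan-out over the configured models. The model interface and the exact-match agreement check are illustrative assumptions; a real comparison might be fuzzier:

```typescript
// Hypothetical model interface: each model maps a prompt to an answer string.
type Model = (prompt: string) => Promise<string>;

// Run the same input across all models in parallel and report agreement.
async function verifyAcrossModels(prompt: string, models: Model[]) {
  const answers = await Promise.all(models.map((m) => m(prompt)));
  const agreed = answers.every((a) => a === answers[0]);
  return { answers, agreed };
}
```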