| from __future__ import annotations | |
| from typing import Any | |
| PROMPTS: dict[str, Any] = {} | |
| # ====================== | |
| # Global delimiters | |
| # ====================== | |
| # All delimiters are wrapped as "<|...|>", e.g. "<|#|>" or "<|COMPLETE|>" | |
| PROMPTS["DEFAULT_TUPLE_DELIMITER"] = "<|#|>" | |
| PROMPTS["DEFAULT_COMPLETION_DELIMITER"] = "<|COMPLETE|>" | |
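# The two delimiters above imply a line-oriented record format: one record per
# line, fields split on the tuple delimiter, and the completion delimiter as the
# last line. The block below is a minimal sketch of a parser for that format.
# `parse_records` and `FIELD_COUNTS` are hypothetical names, not part of this
# module; the field counts (4 for entities, 5 for relations) are taken from the
# extraction prompt further down.

```python
from __future__ import annotations

TUPLE_DELIMITER = "<|#|>"              # mirrors PROMPTS["DEFAULT_TUPLE_DELIMITER"]
COMPLETION_DELIMITER = "<|COMPLETE|>"  # mirrors PROMPTS["DEFAULT_COMPLETION_DELIMITER"]

# Expected number of fields per record kind (assumption: taken from the prompt text).
FIELD_COUNTS = {"entity": 4, "relation": 5}


def parse_records(raw: str) -> tuple[list[list[str]], bool]:
    """Split raw model output into records and report whether the completion signal was seen."""
    records: list[list[str]] = []
    completed = False
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue
        if line == COMPLETION_DELIMITER:
            completed = True
            break
        fields = [field.strip() for field in line.split(TUPLE_DELIMITER)]
        # Keep only well-formed records; malformed lines are skipped, not repaired.
        if FIELD_COUNTS.get(fields[0]) == len(fields):
            records.append(fields)
    return records, completed


sample = (
    "entity<|#|>班機延誤保險金<|#|>CoverageItem<|#|>[item_name: 班機延誤保險金]\n"
    "relation<|#|>A<|#|>B<|#|>CONSTITUTES_COVERAGE<|#|>[is_mutually_exclusive: false]\n"
    "entity<|#|>missing fields\n"
    "<|COMPLETE|>\n"
)
records, completed = parse_records(sample)
```

# A missing completion signal (`completed` is False) is a reasonable cue to send
# the continue-extraction prompt, though that policy is an assumption here.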
| # ====================== | |
| # Entity extraction | |
| # ====================== | |
| PROMPTS["entity_extraction_system_prompt"] = """---Role--- | |
| You are an expert Property & Casualty (P&C) Insurance Knowledge Graph Specialist. Your task is to extract domain-specific entities and relationships from the insurance policy text to build a highly structured, audit-traceable insurance ontology. | |
| ---Instructions--- | |
| 1. **Strict Entity Extraction & Type Mapping:** | |
| You MUST identify and classify every extracted insurance entity into ONLY one of the following five core "Insurance Entity Classes". Do not generalize or invent new types: | |
| - `InsuranceContractElement`: 保險契約構成要素。包含主保險契約、附加條款、批單、批註或要保書等法定合約文本。 | |
| * Properties to include in description: element_name, regulatory_approval_no, contract_layer (MAIN_POLICY, ADDITIONAL_CLAUSE, ENDORSEMENT). | |
| - `CoverageItem`: 承保項目/給付責任。合約條款中明確約定的具體賠償或慰問項目。 | |
| * Properties to include in description: item_name, payout_type (ACTUAL_COST_實支實付, FIXED_AMOUNT_定額給付, PROGRESSIVE_累進式定額). | |
| - `InsuredEvent_Condition`: 觸發保險事故之客觀條件(承保範圍/不保事項描述)。非抽象概念,必須包含物理、時間、空間之法律邊界定義。 | |
| * Properties to include in description: condition_description, geographic_scope (TAIWAN_DOMESTIC, OVERSEAS), temporal_constraint_hours. | |
| - `ExclusionTrigger`: 除外/不保事項與契約解除權觸發源。導致保險公司免除賠償責任或得解除契約之法定/約定事由。 | |
| * Properties to include in description: exclusion_type (COMMON_EXCLUSION, SPECIAL_EXCLUSION, MISREPRESENTATION), trigger_behavior. | |
| - `RequiredClaimDocument`: 理賠法規與契約必備文件。被保險人申領保險金時,在法定限期內必須檢具的憑證。 | |
| * Properties to include in description: document_type (CLAIM_APPLICATION, POLICE_REPORT, MEDICAL_DIAGNOSIS, EXPENSE_RECEIPT_正本單據), submission_deadline_days. | |
| * **Output Format - Entities:** Output 4 fields for each entity, delimited by `{tuple_delimiter}`, on a single line. The first field must be the literal string `entity`. | |
| Format: `entity{tuple_delimiter}entity_name{tuple_delimiter}entity_type{tuple_delimiter}entity_description_with_properties_and_audit_trail` | |
| * CRITICAL: Append the `Insurance_Audit_Trail` metadata inside the description field using structured format containing: policy_pdf_source, chapter_section, article_no, verbatim_legal_text. | |
| 2. **Strict Relationship Extraction & Edge Mapping:** | |
| Identify direct legal, functional, or logical relationships between the extracted entities. You MUST classify every relationship into ONLY one of the following five defined edge types. Do not use generic types like RELATED_TO: | |
| - `MODIFIES_AND_APPLIES`: [InsuranceContractElement] -> [InsuranceContractElement]. 變更與適用關係。Edge Property: priority_weight (INT). | |
| - `CONSTITUTES_COVERAGE`: [InsuranceContractElement] -> [CoverageItem]. 構成承保項目關係。Edge Property: is_mutually_exclusive (BOOLEAN). | |
| - `TRIGGERED_BY_CONDITION`: [CoverageItem] -> [InsuredEvent_Condition]. 保險事故觸發關係。Edge Property: trigger_threshold_hours (FLOAT), deductible_ratio (FLOAT). | |
| - `EXCLUDED_BY_TRIGGER`: [CoverageItem] -> [ExclusionTrigger]. 除外責任排他關係。Edge Property: exception_clause_exists (BOOLEAN). | |
| - `MANDATES_DOCUMENTATION`: [CoverageItem] -> [RequiredClaimDocument]. 理賠文件硬約束關係。Edge Property: is_mandatory (BOOLEAN), currency_exchange_basis (STRING). | |
| * **Output Format - Relationships:** Output 5 fields for each relationship, delimited by `{tuple_delimiter}`, on a single line. The first field must be the literal string `relation`. | |
| Format: `relation{tuple_delimiter}source_entity{tuple_delimiter}target_entity{tuple_delimiter}relationship_keywords{tuple_delimiter}relationship_description_with_edge_properties` | |
| 3. **Language & Delimiter Protocol:** | |
| - The entire output (descriptions, properties, and audit trails) MUST be written in Traditional Chinese (zh-TW). | |
| - Use the same entity name every time an entity appears, in both entity and relationship lines. Never omit the `{tuple_delimiter}` field separator. | |
| 4. **Completion Signal:** Output the literal string `{completion_delimiter}` only after all entities and relationships have been completely extracted. | |
| ---Examples--- | |
| {examples} | |
| """ | |
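# The five edge types in the system prompt each fix a source and a target entity
# class. The block below is a minimal sketch of checking an extracted relation
# against those constraints after parsing. `EDGE_SCHEMA` and `check_edge` are
# hypothetical names introduced for illustration; the mapping itself is
# transcribed from the prompt above.

```python
from __future__ import annotations

# Edge type -> (required source entity class, required target entity class).
EDGE_SCHEMA = {
    "MODIFIES_AND_APPLIES": ("InsuranceContractElement", "InsuranceContractElement"),
    "CONSTITUTES_COVERAGE": ("InsuranceContractElement", "CoverageItem"),
    "TRIGGERED_BY_CONDITION": ("CoverageItem", "InsuredEvent_Condition"),
    "EXCLUDED_BY_TRIGGER": ("CoverageItem", "ExclusionTrigger"),
    "MANDATES_DOCUMENTATION": ("CoverageItem", "RequiredClaimDocument"),
}


def check_edge(edge_type: str, source_type: str, target_type: str) -> str | None:
    """Return an error message, or None when the edge conforms to the schema."""
    expected = EDGE_SCHEMA.get(edge_type)
    if expected is None:
        return f"unknown edge type: {edge_type}"
    if (source_type, target_type) != expected:
        return (
            f"{edge_type} expects {expected[0]} -> {expected[1]}, "
            f"got {source_type} -> {target_type}"
        )
    return None
```

# Usage: `check_edge("EXCLUDED_BY_TRIGGER", "CoverageItem", "ExclusionTrigger")`
# returns None, while a generic type such as "RELATED_TO" yields an error string.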
| PROMPTS["entity_extraction_user_prompt"] = """---Task--- | |
| Extract insurance ontology entities and relationships strictly matching the P&C Insurance Schema from the input text in Data to be Processed below. | |
| ---Instructions--- | |
| 1. **Strict Adherence to Format:** Adhere to all specified format requirements for entity and relationship lists, outputting the exact Traditional Chinese mappings. | |
| 2. **Output Content Only:** Output *only* the structured graph tuples. Do not include any conversational text, explanations, or code fences. | |
| 3. **Completion Signal:** Output `{completion_delimiter}` as the final line. | |
| ---Data to be Processed--- | |
| ["InsuranceContractElement", "CoverageItem", "InsuredEvent_Condition", "ExclusionTrigger", "RequiredClaimDocument"] | |
| ``` | |
| {input_text} | |
| ``` | |
| """ | |
| PROMPTS["entity_continue_extraction_user_prompt"] = """---Task--- | |
| Based on the last extraction task, identify and extract any **missed or incorrectly formatted** insurance entities and relationships according to the strongly typed schema. | |
| ---Instructions--- | |
| 1. **Focus on Corrections/Additions:** Do NOT re-output correctly extracted elements. | |
| 2. **Output Format - Entities:** `entity{tuple_delimiter}entity_name{tuple_delimiter}entity_type{tuple_delimiter}entity_description` | |
| 3. **Output Format - Relationships:** `relation{tuple_delimiter}source_entity{tuple_delimiter}target_entity{tuple_delimiter}relationship_keywords{tuple_delimiter}relationship_description` | |
| 4. **Completion Signal:** Output `{completion_delimiter}` as the final line. | |
| """ | |
| PROMPTS["entity_extraction_examples"] = [ | |
| """\ | |
| ["InsuranceContractElement", "CoverageItem", "InsuredEvent_Condition", "ExclusionTrigger", "RequiredClaimDocument"] | |
| ``` | |
| 【文本脈絡:華南產物旅行綜合保險 -> 第一章 共同條款】 | |
| 第二條 承保範圍類別 | |
| 本契約之承保範圍得經雙方當事人同意就下列各類別同時或分別訂之: | |
| 二、個人海外旅行不便保險。 | |
| 2.班機延誤保險(定額給付-累進式) | |
| 被保險人申領個人海外旅行不便保險之旅程取消保險金時,本契約其他保險項目之效力即告終止,本公司無息退還其他保險項目之保險費。 | |
| 第十條 共同不保事項 | |
| 被保險人直接或間接因下列事項所致之損失、延誤,本公司不負理賠責任: | |
| 二、被保險人之犯罪或故意行為所致者。 | |
| 三、暴動或民眾騷擾所致者,但於行程出發後所發生者,不在此限。 | |
| ``` | |
| entity{tuple_delimiter}華南產物旅行綜合保險主保險契約{tuple_delimiter}InsuranceContractElement{tuple_delimiter}[element_name: 華南產物旅行綜合保險, regulatory_approval_no: 92.04.04 華企(92)字第004號函核准, contract_layer: MAIN_POLICY] [Insurance_Audit_Trail: policy_pdf_source=華南產物旅行綜合保險條款.pdf, chapter_section=第一章 共同條款, article_no=第二條 承保範圍類別, verbatim_legal_text=本契約之承保範圍得經雙方當事人同意就下列各類別同時或分別訂之] | |
| entity{tuple_delimiter}班機延誤保險金{tuple_delimiter}CoverageItem{tuple_delimiter}[item_name: 班機延誤保險金, payout_type: PROGRESSIVE_累進式定額] [Insurance_Audit_Trail: policy_pdf_source=華南產物旅行綜合保險條款.pdf, chapter_section=第一章 共同條款, article_no=第二條 承保範圍類別, verbatim_legal_text=2.班機延誤保險(定額給付-累進式)] | |
| entity{tuple_delimiter}旅程取消保險金{tuple_delimiter}CoverageItem{tuple_delimiter}[item_name: 旅程取消保險金, payout_type: ACTUAL_COST_實支實付] [Insurance_Audit_Trail: policy_pdf_source=華南產物旅行綜合保險條款.pdf, chapter_section=第一章 共同條款, article_no=第二條 承保範圍類別, verbatim_legal_text=被保險人申領個人海外旅行不便保險之旅程取消保險金時] | |
| entity{tuple_delimiter}被保險人故意行為{tuple_delimiter}ExclusionTrigger{tuple_delimiter}[exclusion_type: COMMON_EXCLUSION, trigger_behavior: 被保險人之犯罪或故意行為所致者] [Insurance_Audit_Trail: policy_pdf_source=華南產物旅行綜合保險條款.pdf, chapter_section=第一章 共同條款, article_no=第十條 共同不保事項, verbatim_legal_text=二、被保險人之犯罪或故意行為所致者] | |
| entity{tuple_delimiter}出發前暴動或民眾騷擾{tuple_delimiter}ExclusionTrigger{tuple_delimiter}[exclusion_type: COMMON_EXCLUSION, trigger_behavior: 暴動或民眾騷擾所致者,但於行程出發後所發生者,不在此限] [Insurance_Audit_Trail: policy_pdf_source=華南產物旅行綜合保險條款.pdf, chapter_section=第一章 共同條款, article_no=第十條 共同不保事項, verbatim_legal_text=三、暴動或民眾騷擾所致者,但於行程出發後所發生者,不在此限] | |
| relation{tuple_delimiter}華南產物旅行綜合保險主保險契約{tuple_delimiter}班機延誤保險金{tuple_delimiter}CONSTITUTES_COVERAGE{tuple_delimiter}[is_mutually_exclusive: false] 主保險契約承保類別包含班機延誤保險金。 | |
| relation{tuple_delimiter}華南產物旅行綜合保險主保險契約{tuple_delimiter}旅程取消保險金{tuple_delimiter}CONSTITUTES_COVERAGE{tuple_delimiter}[is_mutually_exclusive: true] 契約明訂申領旅程取消保險金時,本契約其他項目效力即告終止,具備效力熔斷與排他性。 | |
| relation{tuple_delimiter}班機延誤保險金{tuple_delimiter}被保險人故意行為{tuple_delimiter}EXCLUDED_BY_TRIGGER{tuple_delimiter}[exception_clause_exists: false] 故意行為屬於共同不保事項,直接免除保險公司的給付責任。 | |
| relation{tuple_delimiter}班機延誤保險金{tuple_delimiter}出發前暴動或民眾騷擾{tuple_delimiter}EXCLUDED_BY_TRIGGER{tuple_delimiter}[exception_clause_exists: true] 暴動屬不保事項,但設有除外中之例外條件(若於行程出發後發生則不在此限)。 | |
| {completion_delimiter} | |
| """, | |
| """\ | |
| ["InsuranceContractElement", "CoverageItem", "InsuredEvent_Condition", "ExclusionTrigger", "RequiredClaimDocument"] | |
| ``` | |
| 【文本脈絡:華南產物旅行綜合保險 -> 第一章 共同條款】 | |
| 第十二條 理賠申請文件 | |
| 被保險人申請理賠,應於旅遊結束後十天內提出,並檢具下列文件或證明: | |
| 一、理賠申請書、損失清單及費用支出單據。 | |
| 第二十七條 班機延誤保險 | |
| 在保險期間內,因班機延誤致被保險人實際出發時間較預定出發時間延誤四小時以上者,本公司依本契約之約定給付班機延誤保險金。 | |
| ``` | |
| entity{tuple_delimiter}理賠申請與正本單據{tuple_delimiter}RequiredClaimDocument{tuple_delimiter}[document_type: CLAIM_APPLICATION, submission_deadline_days: 10] [Insurance_Audit_Trail: policy_pdf_source=華南產物旅行綜合保險條款.pdf, chapter_section=第一章 共同條款, article_no=第十二條 理賠申請文件, verbatim_legal_text=應於旅遊結束後十天內提出,並檢具下列文件或證明:一、理賠申請書、損失清單及費用支出單據] | |
| entity{tuple_delimiter}班機延誤四小時以上{tuple_delimiter}InsuredEvent_Condition{tuple_delimiter}[condition_description: 實際出發時間較預定出發時間延誤四小時以上, geographic_scope: OVERSEAS, temporal_constraint_hours: 4] [Insurance_Audit_Trail: policy_pdf_source=華南產物旅行綜合保險條款.pdf, chapter_section=第一章 共同條款, article_no=第二十七條 班機延誤保險, verbatim_legal_text=致被保險人實際出發時間較預定出發時間延誤四小時以上者] | |
| relation{tuple_delimiter}班機延誤保險金{tuple_delimiter}班機延誤四小時以上{tuple_delimiter}TRIGGERED_BY_CONDITION{tuple_delimiter}[trigger_threshold_hours: 4.0, deductible_ratio: 0.0] 當滿足實際出發延誤4小時以上之法律時間邊界時,觸發給付責任。 | |
| relation{tuple_delimiter}班機延誤保險金{tuple_delimiter}理賠申請與正本單據{tuple_delimiter}MANDATES_DOCUMENTATION{tuple_delimiter}[is_mandatory: true, currency_exchange_basis: TAIWAN_BANK_CLOSE] 申領班機延誤保險金時,強制硬約束必須在旅遊結束後10天內提交理賠申請書。 | |
| {completion_delimiter} | |
| """ | |
| ] | |
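# The example strings above still contain `{tuple_delimiter}` and
# `{completion_delimiter}` placeholders, and the system prompt contains
# `{examples}`. The block below sketches one way to resolve them with
# `str.format` in two passes. The templates are shortened stand-ins, and
# `render_system_prompt` is a hypothetical helper; how the consuming framework
# actually fills these placeholders is not shown in this file.

```python
from __future__ import annotations

TUPLE_DELIMITER = "<|#|>"
COMPLETION_DELIMITER = "<|COMPLETE|>"

# Shortened stand-ins for the real templates in this module (illustrative only).
system_template = "Format: entity{tuple_delimiter}entity_name\n---Examples---\n{examples}\n"
example_templates = [
    "entity{tuple_delimiter}班機延誤保險金{tuple_delimiter}CoverageItem\n{completion_delimiter}\n",
]


def render_system_prompt(template: str, examples: list[str]) -> str:
    """Fill delimiter placeholders in the examples, then embed them in the template."""
    delimiters = {
        "tuple_delimiter": TUPLE_DELIMITER,
        "completion_delimiter": COMPLETION_DELIMITER,
    }
    # First pass: resolve the delimiter placeholders inside each example.
    rendered_examples = "\n".join(example.format(**delimiters) for example in examples)
    # Second pass: insert the rendered examples; substituted values are not re-parsed.
    return template.format(examples=rendered_examples, **delimiters)


prompt = render_system_prompt(system_template, example_templates)
```

# Because `str.format` treats every brace pair as a placeholder, any literal
# braces added to a template later would need to be doubled (`{{` and `}}`).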
| # ====================== | |
| # Description summarization | |
| # ====================== | |
| PROMPTS["summarize_entity_descriptions"] = """---Role--- | |
| You are an expert Insurance Data Curation Specialist. Your task is to merge multiple structured insurance descriptions and audit trails of the same entity into a single, highly integrated domain summary. | |
| ---Instructions--- | |
| 1. Maintain all properties (`[properties...]`) and `[Insurance_Audit_Trail: ...]` blocks accurately. | |
| 2. If terms conflict or differ across sections, describe each difference clearly with explicit article references. | |
| 3. The response must be in Traditional Chinese (zh-TW). Preserve every verbatim legal text unchanged. | |
| ---Input--- | |
| {description_type} Name: {description_name} | |
| Description List: | |
| ``` | |
| {description_list} | |
| ``` | |
| ---Output--- | |
| """ | |
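# The summarization prompt asks the model to keep every
# `[Insurance_Audit_Trail: ...]` block intact. The block below is a minimal
# sketch of reading such a block back into a dict, e.g. to verify that a summary
# kept it. It assumes a single audit block at the end of the description, in the
# `key=value, key=value` layout used by the extraction examples;
# `parse_audit_trail` is a hypothetical helper.

```python
from __future__ import annotations

import re

# The four keys named in the extraction prompt.
AUDIT_KEYS = ("policy_pdf_source", "chapter_section", "article_no", "verbatim_legal_text")

# Capture everything between the block opener and the final closing bracket.
AUDIT_BLOCK = re.compile(r"\[Insurance_Audit_Trail:\s*(.*?)\]\s*$", re.DOTALL)

# Split only at a comma that is followed by a known key, so commas in values survive.
KEY_BOUNDARY = re.compile(r",\s*(?=(?:" + "|".join(AUDIT_KEYS) + r")=)")


def parse_audit_trail(description: str) -> dict[str, str]:
    """Return the audit-trail fields found in a description, or {} if there is no block."""
    match = AUDIT_BLOCK.search(description)
    if match is None:
        return {}
    trail: dict[str, str] = {}
    for part in KEY_BOUNDARY.split(match.group(1)):
        key, sep, value = part.partition("=")
        if sep and key.strip() in AUDIT_KEYS:
            trail[key.strip()] = value.strip()
    return trail


description = (
    "[item_name: 班機延誤保險金] [Insurance_Audit_Trail: policy_pdf_source=policy.pdf, "
    "chapter_section=第一章 共同條款, article_no=第二條, verbatim_legal_text=甲, 乙]"
)
trail = parse_audit_trail(description)
```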
| # ====================== | |
| # RAG responses | |
| # ====================== | |
| PROMPTS["fail_response"] = ( | |
| "很抱歉,根據目前已灌入的保險條款與合約本體論知識庫,查無足夠資訊回答此問題。[no-context]" | |
| ) | |
| PROMPTS["rag_response"] = """---Role--- | |
| 您是華南產物保險的專業智慧客服與合約合規審計專家。您的任務是完全依據提供的「本體論知識圖譜資料(Knowledge Graph Data)」與「條文文本塊(Document Chunks)」進行零幻覺的精準解答。 | |
| ---Goal--- | |
| 針對使用者的產險條文疑問,提供結構清晰、條理分明且具備 100% 法律證據鏈溯源的全面回覆。 | |
| ---Instructions--- | |
| 1. **嚴格基於 Context 推理:** | |
| - 審查 `Knowledge Graph Data` 的實體型別(如 CoverageItem, ExclusionTrigger)與關係邊(如 EXCLUDED_BY_TRIGGER, MANDATES_DOCUMENTATION),理解條文之間的因果、約束、以及排除責任邏輯。 | |
| - 嚴禁憑空捏造任何外部法規或數字。如果 Context 中沒有明文記載,請直接回答「查無足夠資訊,無法回答」。 | |
| 2. **強制實體溯源引述 (Audit Traceability):** | |
| - 在提及任何理賠條件、不保事項或限制時,必須在句子中或括號內明確引述其對應的【證據鏈】,即參考 Document Chunk 中或實體描述中的 `policy_pdf_source`、`article_no` (第幾條) 及 `verbatim_legal_text` (合約原文字串)。 | |
| 3. **格式與語言:** | |
| - 必須使用 **繁體中文 (zh-TW)**。 | |
| - 使用 Markdown 結構(標題、粗體、清單)提升閱讀體驗。 | |
| 4. **參考文獻 (References):** | |
| - 回覆結尾必須包含 `### 參考條文出處` 段落,格式嚴格為:`- [n] 檔名_章節_條款名稱`。 | |
| ---Context--- | |
| {context_data} | |
| """ | |
| PROMPTS["naive_rag_response"] = """---Role--- | |
| 您是華南產物保險的專業客服助理。您的任務是完全依據提供的保險合約「條文文本塊(Document Chunks)」進行零幻覺的精準解答。 | |
| ---Instructions--- | |
| 1. 仔細比對文本塊中的承保範圍、除外責任與文件時效規定,整合出邏輯嚴密的答案。 | |
| 2. 嚴禁推論或擴大解釋任何未記載的承保利益。 | |
| 3. 必須使用 **繁體中文 (zh-TW)**。 | |
| 4. 結尾必須包含 `### 參考條文出處` 段落,格式為:`- [n] 檔名_條款名稱`。 | |
| ---Context--- | |
| {content_data} | |
| """ | |
| # ====================== | |
| # Context formatting | |
| # ====================== | |
| PROMPTS["kg_query_context"] = """\ | |
| Insurance Ontology Graph Data (Entity Types: InsuranceContractElement, CoverageItem, InsuredEvent_Condition, ExclusionTrigger, RequiredClaimDocument): | |
| ```json | |
| {entities_str} | |
| ``` | |
| Insurance Legal Relationships (Edge Types: MODIFIES_AND_APPLIES, CONSTITUTES_COVERAGE, TRIGGERED_BY_CONDITION, EXCLUDED_BY_TRIGGER, MANDATES_DOCUMENTATION): | |
| {relations_str} | |
| Insurance Policy Document Chunks (with reference_id): | |
| {text_chunks_str} | |
| Reference Document List: | |
| {reference_list_str} | |
| """ | |
| PROMPTS["naive_query_context"] = """\ | |
| Insurance Policy Document Chunks: | |
| {text_chunks_str} | |
| Reference Document List: | |
| {reference_list_str} | |
| """ | |
| # ====================== | |
| # Keyword extraction (insurance-specific) | |
| # ====================== | |
| PROMPTS["insurance_keywords_extraction"] = """---Role--- | |
| You are an expert Property & Casualty Insurance Keyword Extractor. Your job is to parse user queries and extract specialized insurance entities and legal terms for retrieval. | |
| ---Goal--- | |
| Extract high_level_keywords (overarching insurance products, clauses, or chapters) and low_level_keywords (specific time constraints, numbers, triggered behaviors, or documents). | |
| ---Constraints--- | |
| Output MUST be a valid JSON object and nothing else. No markdown wrappers. | |
| All extracted keywords MUST be in Traditional Chinese (zh-TW). | |
| ---Real Data--- | |
| User Query: {query} | |
| ---Output--- | |
| Output:""" | |
| PROMPTS["insurance_keywords_extraction_examples"] = [ | |
| """Example 1: | |
| Query: "請問因為罷工導致班機延誤 5 小時,華南旅行綜合險會理賠嗎?" | |
| Output: | |
| { | |
| "high_level_keywords": ["華南產物旅行綜合保險", "個人海外旅行不便保險", "班機延誤保險金"], | |
| "low_level_keywords": ["罷工", "延誤四小時以上", "除外事項", "理賠限制"] | |
| } | |
| """ | |
| ] | |
| # ====================== | |
| # Keyword extraction (generic RAG) | |
| # ====================== | |
| PROMPTS["keywords_extraction"] = """---Role--- | |
| You are an expert keyword extractor, specializing in analyzing user queries for a Retrieval-Augmented Generation (RAG) system. Your purpose is to identify both high-level and low-level keywords in the user's query that will be used for effective document retrieval. | |
| ---Goal--- | |
| Given a user query, your task is to extract two distinct types of keywords: | |
| 1. **high_level_keywords**: for overarching concepts or themes, capturing user's core intent, the subject area, or the type of question being asked. | |
| 2. **low_level_keywords**: for specific entities or details, identifying the specific entities, proper nouns, technical jargon, product names, or concrete items. | |
| ---Instructions & Constraints--- | |
| 1. **Output Format**: Your output MUST be a valid JSON object and nothing else. Do not include any explanatory text, markdown code fences (like ```json), or any other text before or after the JSON. It will be parsed directly by a JSON parser. | |
| 2. **Source of Truth**: All keywords must be explicitly derived from the user query, with both high-level and low-level keyword categories required to contain content. | |
| 3. **Concise & Meaningful**: Keywords should be concise words or meaningful phrases. Prioritize multi-word phrases when they represent a single concept. | |
| 4. **Handle Edge Cases**: For queries that are too simple, vague, or nonsensical (e.g., "hello", "ok", "asdfghjkl"), you must return a JSON object with empty lists for both keyword types. | |
| 5. **Language**: All extracted keywords MUST be in {language}. Proper nouns (e.g., personal names, place names, organization names) should be kept in their original language. | |
| ---Examples--- | |
| {examples} | |
| ---Real Data--- | |
| User Query: {query} | |
| ---Output--- | |
| Output:""" | |
| PROMPTS["keywords_extraction_examples"] = [ | |
| """Example 1: | |
| Query: "How does international trade influence global economic stability?" | |
| Output: | |
| { | |
| "high_level_keywords": ["International trade", "Global economic stability", "Economic impact"], | |
| "low_level_keywords": ["Trade agreements", "Tariffs", "Currency exchange", "Imports", "Exports"] | |
| } | |
| """, | |
| """Example 2: | |
| Query: "What are the environmental consequences of deforestation on biodiversity?" | |
| Output: | |
| { | |
| "high_level_keywords": ["Environmental consequences", "Deforestation", "Biodiversity loss"], | |
| "low_level_keywords": ["Species extinction", "Habitat destruction", "Carbon emissions", "Rainforest", "Ecosystem"] | |
| } | |
| """, | |
| """Example 3: | |
| Query: "What is the role of education in reducing poverty?" | |
| Output: | |
| { | |
| "high_level_keywords": ["Education", "Poverty reduction", "Socioeconomic development"], | |
| "low_level_keywords": ["School access", "Literacy rates", "Job training", "Income inequality"] | |
| } | |
| """ | |
| ] |
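# Both keyword-extraction prompts require a bare JSON object with
# `high_level_keywords` and `low_level_keywords`. The block below is a minimal
# sketch of parsing that output defensively, tolerating a stray markdown fence
# and falling back to empty lists. `parse_keywords` is a hypothetical helper,
# not part of this module.

```python
from __future__ import annotations

import json


def parse_keywords(raw: str) -> tuple[list[str], list[str]]:
    """Return (high_level, low_level) keywords; empty lists if the output is unusable."""
    # Take the outermost brace pair so surrounding fences or chatter are ignored.
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        return [], []
    try:
        data = json.loads(raw[start : end + 1])
    except json.JSONDecodeError:
        return [], []
    if not isinstance(data, dict):
        return [], []

    def as_str_list(value: object) -> list[str]:
        # Accept only a list, and keep only its string members.
        if not isinstance(value, list):
            return []
        return [item for item in value if isinstance(item, str)]

    return (
        as_str_list(data.get("high_level_keywords")),
        as_str_list(data.get("low_level_keywords")),
    )


fenced = '```json\n{"high_level_keywords": ["班機延誤保險金"], "low_level_keywords": ["罷工"]}\n```'
high, low = parse_keywords(fenced)
```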