{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Creating working examples of KGLLM" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Collecting transformers\n", " Using cached transformers-4.48.1-py3-none-any.whl.metadata (44 kB)\n", "Collecting torch\n", " Using cached torch-2.5.1-cp310-cp310-manylinux1_x86_64.whl.metadata (28 kB)\n", "Collecting accelerate\n", " Using cached accelerate-1.3.0-py3-none-any.whl.metadata (19 kB)\n", "Collecting filelock (from transformers)\n", " Using cached filelock-3.17.0-py3-none-any.whl.metadata (2.9 kB)\n", "Collecting huggingface-hub<1.0,>=0.24.0 (from transformers)\n", " Using cached huggingface_hub-0.27.1-py3-none-any.whl.metadata (13 kB)\n", "Collecting numpy>=1.17 (from transformers)\n", " Using cached numpy-2.2.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (62 kB)\n", "Requirement already satisfied: packaging>=20.0 in /home/mnktsts/repo/Fujitsu-LLM-KG/README/.venv/lib/python3.10/site-packages (from transformers) (24.2)\n", "Collecting pyyaml>=5.1 (from transformers)\n", " Using cached PyYAML-6.0.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (2.1 kB)\n", "Collecting regex!=2019.12.17 (from transformers)\n", " Using cached regex-2024.11.6-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (40 kB)\n", "Collecting requests (from transformers)\n", " Using cached requests-2.32.3-py3-none-any.whl.metadata (4.6 kB)\n", "Collecting tokenizers<0.22,>=0.21 (from transformers)\n", " Using cached tokenizers-0.21.0-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (6.7 kB)\n", "Collecting safetensors>=0.4.1 (from transformers)\n", " Using cached safetensors-0.5.2-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (3.8 kB)\n", "Collecting tqdm>=4.27 (from transformers)\n", " Using cached tqdm-4.67.1-py3-none-any.whl.metadata (57 kB)\n", "Requirement 
already satisfied: typing-extensions>=4.8.0 in /home/mnktsts/repo/Fujitsu-LLM-KG/README/.venv/lib/python3.10/site-packages (from torch) (4.12.2)\n", "Collecting networkx (from torch)\n", " Using cached networkx-3.4.2-py3-none-any.whl.metadata (6.3 kB)\n", "Collecting jinja2 (from torch)\n", " Using cached jinja2-3.1.5-py3-none-any.whl.metadata (2.6 kB)\n", "Collecting fsspec (from torch)\n", " Using cached fsspec-2024.12.0-py3-none-any.whl.metadata (11 kB)\n", "Collecting nvidia-cuda-nvrtc-cu12==12.4.127 (from torch)\n", " Using cached nvidia_cuda_nvrtc_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)\n", "Collecting nvidia-cuda-runtime-cu12==12.4.127 (from torch)\n", " Using cached nvidia_cuda_runtime_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)\n", "Collecting nvidia-cuda-cupti-cu12==12.4.127 (from torch)\n", " Using cached nvidia_cuda_cupti_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.6 kB)\n", "Collecting nvidia-cudnn-cu12==9.1.0.70 (from torch)\n", " Using cached nvidia_cudnn_cu12-9.1.0.70-py3-none-manylinux2014_x86_64.whl.metadata (1.6 kB)\n", "Collecting nvidia-cublas-cu12==12.4.5.8 (from torch)\n", " Using cached nvidia_cublas_cu12-12.4.5.8-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)\n", "Collecting nvidia-cufft-cu12==11.2.1.3 (from torch)\n", " Using cached nvidia_cufft_cu12-11.2.1.3-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)\n", "Collecting nvidia-curand-cu12==10.3.5.147 (from torch)\n", " Using cached nvidia_curand_cu12-10.3.5.147-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)\n", "Collecting nvidia-cusolver-cu12==11.6.1.9 (from torch)\n", " Using cached nvidia_cusolver_cu12-11.6.1.9-py3-none-manylinux2014_x86_64.whl.metadata (1.6 kB)\n", "Collecting nvidia-cusparse-cu12==12.3.1.170 (from torch)\n", " Using cached nvidia_cusparse_cu12-12.3.1.170-py3-none-manylinux2014_x86_64.whl.metadata (1.6 kB)\n", "Collecting nvidia-nccl-cu12==2.21.5 (from torch)\n", " Using cached 
nvidia_nccl_cu12-2.21.5-py3-none-manylinux2014_x86_64.whl.metadata (1.8 kB)\n", "Collecting nvidia-nvtx-cu12==12.4.127 (from torch)\n", " Using cached nvidia_nvtx_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.7 kB)\n", "Collecting nvidia-nvjitlink-cu12==12.4.127 (from torch)\n", " Using cached nvidia_nvjitlink_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)\n", "Collecting triton==3.1.0 (from torch)\n", " Using cached triton-3.1.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (1.3 kB)\n", "Collecting sympy==1.13.1 (from torch)\n", " Using cached sympy-1.13.1-py3-none-any.whl.metadata (12 kB)\n", "Collecting mpmath<1.4,>=1.1.0 (from sympy==1.13.1->torch)\n", " Using cached mpmath-1.3.0-py3-none-any.whl.metadata (8.6 kB)\n", "Requirement already satisfied: psutil in /home/mnktsts/repo/Fujitsu-LLM-KG/README/.venv/lib/python3.10/site-packages (from accelerate) (6.1.1)\n", "Collecting MarkupSafe>=2.0 (from jinja2->torch)\n", " Using cached MarkupSafe-3.0.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (4.0 kB)\n", "Collecting charset-normalizer<4,>=2 (from requests->transformers)\n", " Using cached charset_normalizer-3.4.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (35 kB)\n", "Collecting idna<4,>=2.5 (from requests->transformers)\n", " Using cached idna-3.10-py3-none-any.whl.metadata (10 kB)\n", "Collecting urllib3<3,>=1.21.1 (from requests->transformers)\n", " Using cached urllib3-2.3.0-py3-none-any.whl.metadata (6.5 kB)\n", "Collecting certifi>=2017.4.17 (from requests->transformers)\n", " Using cached certifi-2024.12.14-py3-none-any.whl.metadata (2.3 kB)\n", "Using cached transformers-4.48.1-py3-none-any.whl (9.7 MB)\n", "Using cached torch-2.5.1-cp310-cp310-manylinux1_x86_64.whl (906.4 MB)\n", "Using cached nvidia_cublas_cu12-12.4.5.8-py3-none-manylinux2014_x86_64.whl (363.4 MB)\n", "Using cached nvidia_cuda_cupti_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl 
(13.8 MB)\n", "Using cached nvidia_cuda_nvrtc_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl (24.6 MB)\n", "Using cached nvidia_cuda_runtime_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl (883 kB)\n", "Using cached nvidia_cudnn_cu12-9.1.0.70-py3-none-manylinux2014_x86_64.whl (664.8 MB)\n", "Using cached nvidia_cufft_cu12-11.2.1.3-py3-none-manylinux2014_x86_64.whl (211.5 MB)\n", "Using cached nvidia_curand_cu12-10.3.5.147-py3-none-manylinux2014_x86_64.whl (56.3 MB)\n", "Using cached nvidia_cusolver_cu12-11.6.1.9-py3-none-manylinux2014_x86_64.whl (127.9 MB)\n", "Using cached nvidia_cusparse_cu12-12.3.1.170-py3-none-manylinux2014_x86_64.whl (207.5 MB)\n", "Using cached nvidia_nccl_cu12-2.21.5-py3-none-manylinux2014_x86_64.whl (188.7 MB)\n", "Using cached nvidia_nvjitlink_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl (21.1 MB)\n", "Using cached nvidia_nvtx_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl (99 kB)\n", "Using cached sympy-1.13.1-py3-none-any.whl (6.2 MB)\n", "Using cached triton-3.1.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (209.5 MB)\n", "Using cached accelerate-1.3.0-py3-none-any.whl (336 kB)\n", "Using cached huggingface_hub-0.27.1-py3-none-any.whl (450 kB)\n", "Using cached fsspec-2024.12.0-py3-none-any.whl (183 kB)\n", "Using cached numpy-2.2.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (16.4 MB)\n", "Using cached PyYAML-6.0.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (751 kB)\n", "Using cached regex-2024.11.6-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (781 kB)\n", "Using cached safetensors-0.5.2-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (461 kB)\n", "Using cached tokenizers-0.21.0-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.0 MB)\n", "Using cached tqdm-4.67.1-py3-none-any.whl (78 kB)\n", "Using cached filelock-3.17.0-py3-none-any.whl (16 kB)\n", "Using cached jinja2-3.1.5-py3-none-any.whl (134 kB)\n", "Using cached networkx-3.4.2-py3-none-any.whl 
(1.7 MB)\n", "Using cached requests-2.32.3-py3-none-any.whl (64 kB)\n", "Using cached certifi-2024.12.14-py3-none-any.whl (164 kB)\n", "Using cached charset_normalizer-3.4.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (146 kB)\n", "Using cached idna-3.10-py3-none-any.whl (70 kB)\n", "Using cached MarkupSafe-3.0.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (20 kB)\n", "Using cached mpmath-1.3.0-py3-none-any.whl (536 kB)\n", "Using cached urllib3-2.3.0-py3-none-any.whl (128 kB)\n", "Installing collected packages: mpmath, urllib3, tqdm, sympy, safetensors, regex, pyyaml, nvidia-nvtx-cu12, nvidia-nvjitlink-cu12, nvidia-nccl-cu12, nvidia-curand-cu12, nvidia-cufft-cu12, nvidia-cuda-runtime-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-cupti-cu12, nvidia-cublas-cu12, numpy, networkx, MarkupSafe, idna, fsspec, filelock, charset-normalizer, certifi, triton, requests, nvidia-cusparse-cu12, nvidia-cudnn-cu12, jinja2, nvidia-cusolver-cu12, huggingface-hub, torch, tokenizers, transformers, accelerate\n", "Successfully installed MarkupSafe-3.0.2 accelerate-1.3.0 certifi-2024.12.14 charset-normalizer-3.4.1 filelock-3.17.0 fsspec-2024.12.0 huggingface-hub-0.27.1 idna-3.10 jinja2-3.1.5 mpmath-1.3.0 networkx-3.4.2 numpy-2.2.2 nvidia-cublas-cu12-12.4.5.8 nvidia-cuda-cupti-cu12-12.4.127 nvidia-cuda-nvrtc-cu12-12.4.127 nvidia-cuda-runtime-cu12-12.4.127 nvidia-cudnn-cu12-9.1.0.70 nvidia-cufft-cu12-11.2.1.3 nvidia-curand-cu12-10.3.5.147 nvidia-cusolver-cu12-11.6.1.9 nvidia-cusparse-cu12-12.3.1.170 nvidia-nccl-cu12-2.21.5 nvidia-nvjitlink-cu12-12.4.127 nvidia-nvtx-cu12-12.4.127 pyyaml-6.0.2 regex-2024.11.6 requests-2.32.3 safetensors-0.5.2 sympy-1.13.1 tokenizers-0.21.0 torch-2.5.1 tqdm-4.67.1 transformers-4.48.1 triton-3.1.0 urllib3-2.3.0\n", "Note: you may need to restart the kernel to use updated packages.\n" ] } ], "source": [ "pip install transformers torch accelerate" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": 
"stderr", "output_type": "stream", "text": [ "/home/mnktsts/repo/Fujitsu-LLM-KG/README/.venv/lib/python3.10/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n", " from .autonotebook import tqdm as notebook_tqdm\n" ] } ], "source": [ "from kgllm_utils import Fujitsu_LLM_KG, extract_turtle, extract_answer" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Common pre-trained LLM" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Loading checkpoint shards: 100%|██████████| 39/39 [00:35<00:00, 1.11it/s]\n" ] } ], "source": [ "kgllm = Fujitsu_LLM_KG(\"../../Fujitsu-LLM-KG-8x7B_cpt\")" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "## Knowledge Graph\n", "```turtle\n", "#@rationale: 宗像聡は富士通に2010年から勤めています。\n", "<#宗像聡>\n", " rel:employer [\n", " rdf:object <#富士通>;\n", " rel:start_time <#2010>\n", " ].\n", "\n", "#@rationale: 彼はFujitsu-LLM-KG-8x7B_cptを開発しました。\n", "<#宗像聡>\n", " rel:notable_work <#Fujitsu-LLM-KG-8x7B_cpt>.\n", "```\n" ] } ], "source": [ "prompt = \"\"\"\n", "[INST]\n", "Generate \"Knowledge Graph\" in RDF Turtle format based on the given \"Source\".\n", "\n", "## Source\n", "```txt\n", "宗像聡は富士通に2010年から勤めています。\n", "彼はFujitsu-LLM-KG-8x7B_cptを開発しました。\n", "```\n", "\n", "## Strategy\n", "Extract all verifiable facts in \"Source\" as knowledge triples.\n", "[/INST]\n", "\"\"\".strip()\n", "\n", "generated = kgllm.generate(prompt)\n", "print(generated)" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "## Text\n", "```txt\n", "Satoshi Munakata, who started working for Fujitsu in 2010, is the author of the notable work Fujitsu-LLM-KG-8x7B cpt.\n", "```\n" ] } ], "source": [ "prompt 
= \"\"\"\n", "[INST]\n", "Generate \"Text\" to explain the given knowledge triples in \"Source\".\n", "\n", "## Source\n", "```turtle\n", "<#Satoshi Munakata>\n", " rel:notable_work <#Fujitsu-LLM-KG-8x7B_cpt>;\n", " rel:employer [\n", " rdf:object <#Fujitsu>;\n", " rel:start_time <#2010>\n", " ].\n", "```\n", "\n", "## Strategy\n", "Explain the knowledge triples in \"Source\" without omission, but concisely and fluently.\n", "[/INST]\n", "\"\"\".strip()\n", "\n", "generated = kgllm.generate(prompt)\n", "print(generated)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Knowledge graph generation LLM (Japanese version)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Loading checkpoint shards: 100%|██████████| 39/39 [00:27<00:00, 1.41it/s]\n" ] } ], "source": [ "kgllm = Fujitsu_LLM_KG(\"../../Fujitsu-LLM-KG-8x7B_inst-gen_ja\")" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " ## Knowledge Graph\n", "```turtle\n", "#@rationale: s0. <#マリー・テレーズ・ドートリッシュ:Person[0]>(<#1638 年 9 月 10 日:Date[6]>日-<#1683 年 7月 30 日:Date[7]>)は、<#フランス:Location[1]>王<#ルイ 14 世:Person[2]>の王妃。\n", "\n", "<#ルイ 14 世:Person[2]>\n", " rel:CountryOfCitizenship <#フランス:Location[1]>.\n", "\n", "#@rationale: s1. 父は<#スペイン:Location[3]>王<#フェリペ 4 世:Person[4]>、母は<#フランス:Location[1]>王<#アンリ 4 世:Person[8]>と<#マリー・ド・メディシス:Person[9]>の娘<#イサベル・デ・ボルボン:Person[5]>。\n", "\n", "<#フェリペ 4 世:Person[4]>\n", " rel:CountryOfCitizenship <#スペイン:Location[3]>.\n", "\n", "<#アンリ 4 世:Person[8]>\n", " rel:CountryOfCitizenship <#フランス:Location[1]>.\n", "\n", "<#イサベル・デ・ボルボン:Person[5]>\n", " rel:CountryOfCitizenship <#フランス:Location[1]>.\n", "\n", "#@rationale: s2. <#スペイン:Location[3]>名は<#マリア・テレサ:Person[0]>。\n", "\n", "#@rationale: s3. 
<#ルイ 14 世:Person[2]>の父<#ルイ 13 世:Person[10]>は<#イサベル:Person[5]>の兄、母<#アンヌ・ドートリッシュ:Person[11]>は<#フェリペ 4 世:Person[4]>の姉であり、<#ルイ 14世:Person[2]>と<#マリー・テレーズ:Person[0]>とは父方・母方双方で従兄妹に当たる。\n", "\n", "<#ルイ 13 世:Person[10]>\n", " rel:CountryOfCitizenship <#フランス:Location[1]>.\n", "\n", "<#アンヌ・ドートリッシュ:Person[11]>\n", " rel:CountryOfCitizenship <#フランス:Location[1]>.\n", "\n", "#@rationale: s4. <#神聖ローマ:Location[12]>皇帝<#レオポルト 1 世:Person[13]>の皇后<#マルガリータ・テレサ:Person[14]>は異母妹、<#スペイン:Location[3]>・<#ハプスブルク家:Organization[15]>最後の王<#カルロス 2世:Person[16]>は異母弟である。\n", "\n", "<#カルロス 2世:Person[16]>\n", " rel:CountryOfCitizenship <#スペイン:Location[3]>.\n", "```\n" ] } ], "source": [ "\n", "prompt = \"\"\"\n", "[INST]\n", "Generate \"Knowledge Graph\" in RDF Turtle format based on the given \"Source\".\n", "\n", "## Source\n", "```txt\n", "s0. <#マリー・テレーズ・ドートリッシュ:Person[0]>(<#1638 年 9 月 10 日:Date[6]>日-<#1683 年 7月 30 日:Date[7]>)は、<#フランス:Location[1]>王<#ルイ 14 世:Person[2]>の王妃。\n", "s1. 父は<#スペイン:Location[3]>王<#フェリペ 4 世:Person[4]>、母は<#フランス:Location[1]>王<#アンリ 4 世:Person[8]>と<#マリー・ド・メディシス:Person[9]>の娘<#イサベル・デ・ボルボン:Person[5]>。\n", "s2. <#スペイン:Location[3]>名は<#マリア・テレサ:Person[0]>。\n", "s3. <#ルイ 14 世:Person[2]>の父<#ルイ 13 世:Person[10]>は<#イサベル:Person[5]>の兄、母<#アンヌ・ドートリッシュ:Person[11]>は<#フェリペ 4 世:Person[4]>の姉であり、<#ルイ 14世:Person[2]>と<#マリー・テレーズ:Person[0]>とは父方・母方双方で従兄妹に当たる。\n", "s4. <#神聖ローマ:Location[12]>皇帝<#レオポルト 1 世:Person[13]>の皇后<#マルガリータ・テレサ:Person[14]>は異母妹、<#スペイン:Location[3]>・<#ハプスブルク家:Organization[15]>最後の王<#カルロス 2世:Person[16]>は異母弟である。\n", "```\n", "## Strategy\n", "Extract all verifiable facts about relation \"CountryOfCitizenship\" in \"Source\" as knowledge triples.\n", "* Don't speculate, extract only the facts as stated in \"Source\" **WITHOUT OMMISION**. \n", "The relation \"CountryOfCitizenship\" (P27) means \"the object is a country that recognizes the subject as its citizen\".\n", "* The existence of a relation is restricted by the type of subject and object. 
You must select subjects and objects of triples from the bracketed eigenexpressions, such as \"<#name:type[id]>\", without modification.\n", "[/INST]\n", " \"\"\".strip()\n", "\n", "generated = kgllm.generate(prompt)\n", "print(generated)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Knowledge graph generation LLM (English version)" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Loading checkpoint shards: 100%|██████████| 39/39 [00:27<00:00, 1.42it/s]\n" ] } ], "source": [ "kgllm = Fujitsu_LLM_KG(\"../../Fujitsu-LLM-KG-8x7B_inst-gen_en\")" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " ## Knowledge Graph\n", "```turtle\n", "#@rationale: s0. <#John Gregory Myre:Person[0]> is an <#American:Location[1]> journalist and <#NPR:Organization[2]>'s digital editor for international news.\n", "\n", "<#John Gregory Myre:Person[0]>\n", " rel:country_of_citizenship <#American:Location[1]>.\n", "\n", "#@rationale: s1. Previously, he was a foreign correspondent for The <#Associated Press:Organization[3]> and The <#New York Times:Organization[4]> for <#20 years:Time[5]>.\n", "\n", "#@rationale: s2. He reported from <#Jerusalem:Location[6]> from <#2003:Time[7]> to <#2007:Time[8]> for The <#New York Times:Organization[4]>.\n", "\n", "#@rationale: s3. With the <#AP:Organization[3]>, <#Myre:Person[0]> was based in <#Jerusalem:Location[6]>, <#Moscow:Location[9]>, <#Nicosia:Location[10]>, <#Islamabad:Location[11]> and <#Johannesburg:Location[12]>.\n", "\n", "#@rationale: s4. 
He covered many major international stories, including <#Nelson Mandela:Person[13]>'s release from prison in <#1990:Time[14]>, the <#Gulf War:Miscellaneous[15]> of <#1991:Time[16]>, the rise of the <#Taliban:Organization[17]>, the early days of <#Vladimir Putin:Person[18]> as <#Russia:Location[19]>'s leader, the <#Israeli:Location[20]>-<#Palestinian:Location[21]> fighting that erupted in <#2000:Time[22]> and the <#U.S.:Location[23]> war in <#Afghanistan:Location[24]>.\n", "\n", "<#Vladimir Putin:Person[18]>\n", " rel:country_of_citizenship <#Russia:Location[19]>.\n", "\n", "#@rationale: s5. He is also a scholar at the <#Middle East Institute:Organization[25]> in <#Washington:Location[26]>.\n", "```\n" ] } ], "source": [ "prompt = \"\"\"\n", "[INST]\n", "Generate \"Knowledge Graph\" in RDF Turtle format based on the given \"Source\".\n", " \n", "## Source\n", "```txt\n", "s0. <#John Gregory Myre:Person[0]> is an <#American:Location[1]> journalist and <#NPR:Organization[2]>'s digital editor for international news.\n", "s1. Previously, he was a foreign correspondent for The <#Associated Press:Organization[3]> and The <#New York Times:Organization[4]> for <#20 years:Time[5]>.\n", "s2. He reported from <#Jerusalem:Location[6]> from <#2003:Time[7]> to <#2007:Time[8]> for The <#New York Times:Organization[4]>.\n", "s3. With the <#AP:Organization[3]>, <#Myre:Person[0]> was based in <#Jerusalem:Location[6]>, <#Moscow:Location[9]>, <#Nicosia:Location[10]>, <#Islamabad:Location[11]> and <#Johannesburg:Location[12]>.\n", "s4. 
He covered many major international stories, including <#Nelson Mandela:Person[13]>'s release from prison in <#1990:Time[14]>, the <#Gulf War:Miscellaneous[15]> of <#1991:Time[16]>, the rise of the <#Taliban:Organization[17]>, the early days of <#Vladimir Putin:Person[18]> as <#Russia:Location[19]>'s leader, the <#Israeli:Location[20]>-<#Palestinian:Location[21]> fighting that erupted in <#2000:Time[22]> and the <#U.S.:Location[23]> war in <#Afghanistan:Location[24]>.\n", "s5. He is also a scholar at the <#Middle East Institute:Organization[25]> in <#Washington:Location[26]>.\n", " ```\n", " \n", "## Strategy\n", "Extract all verifiable facts about relation \"country_of_citizenship\" in \"Source\" as knowledge triples.\n", "* Don't speculate, extract only the facts as stated in \"Source\" **WITHOUT OMMISION**.\n", "The relation \"country_of_citizenship\" (P27) means \"the object is a country that recognizes the subject as its citizen\".\n", "* The existence of a relation is restricted by the type of subject and object.\n", "You must select subjects and objects of triples from the bracketed eigenexpressions, such as \"<#name:type[id]>\", without modification.\n", "[/INST]\n", "\"\"\".strip()\n", "\n", "generated = kgllm.generate(prompt)\n", "print(generated)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Knowledge graph inference LLM ver.2" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Loading checkpoint shards: 100%|██████████| 39/39 [00:50<00:00, 1.31s/it]\n" ] } ], "source": [ "kgllm = Fujitsu_LLM_KG(\"../../Fujitsu-LLM-KG-8x7B_inst-infer_v2\")" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " ## Knowledge Graph\n", "```turtle\n", "<#お正月>\n", " rel:has_duration .\n", "<#お盆>\n", " rel:has_duration .\n", "<#Answer>\n", " rel:is <#お正月>;\n", " rel:when [\n", " \n", " rel:greater_than .\n", " 
];\n", " rel:else <#お盆>.\n", "```\n" ] } ], "source": [ "question = \"お正月とお盆とでは、どちらがより長期間の休日ですか?\"\n", "\n", "prompt_schema = f\"\"\"\n", "[INST]\n", "Generate \"Knowledge Graph\" in RDF Turtle format based on the given \"Source\".\n", "\n", "## Source\n", "```txt\n", "{question}\n", "```\n", " \n", "## Strategy\n", "Extract graph schema needed to answer the question in above \"Source\" as knowledge triples without omission.\n", "[/INST]\n", "\"\"\".strip()\n", "\n", "generated_schema = kgllm.generate(prompt_schema)\n", "print(generated_schema)" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " ## Knowledge Graph\n", "```turtle\n", "<#お正月>\n", " #@rationale: Title: 第20条「休日」 - ③年末年始(1月1日~1月3日)\n", " rel:has_duration <#3>;\n", "<#お盆>\n", " #@rationale: Title: 第20条「休日」 - ④夏季休日(8月13日~8月16日)\n", " rel:has_duration <#4>.\n", "<#Answer>\n", " rel:is <#お正月>;\n", " rel:when [\n", " <#3>\n", " rel:greater_than <#4>.\n", " ];\n", " rel:else <#お盆>.\n", "```\n" ] } ], "source": [ "prompt_kg = f\"\"\"\n", "[INST]\n", "Generate \"Knowledge Graph\" in RDF Turtle format based on the given \"Source\".\n", "\n", "## Source\n", "```txt\n", "Title: 第19条「労働時間及び休憩時間」\n", "\n", "1 労働者代表と1年単位の変形労働時間制に関する労使協定を締結した場合、当該協定の適用を受ける労働者について、1週間の所定労働時間は、対象期間を平均して1週間当たり40時間とする。\n", "2 1年単位の変形労働時間制を適用しない労働者について、1週間の所定労働時間は40時間、1日の所定労働時間は8時間とする。\n", "\n", "----\n", "Title: 第20条「休日」\n", "\n", "1 1年単位の変形労働時間制の適用を受ける労働者の休日については、1年単位の変形労働時間制に関する労使協定の定めるところにより、対象期間の初日を起算日とする1週間ごとに1日以上、1年間に125日以上となるように指定する。その場合、年間休日カレンダーに定め、対象期間の初日の30日前までに各労働者に通知する。\n", "2 1年単位の変形労働時間制を適用しない労働者の休日については、以下のとおり指定し、月間休日カレンダーに定め、対象期間の初日の30日前までに各労働者に通知する。\n", " ①日曜日(前条第3号の特定期間を除く。)\n", " ②国民の祝日(日曜日と重なったときは翌日)\n", " ③年末年始(1月1日~1月3日)\n", " ④夏季休日(8月13日~8月16日)\n", " ⑤その他会社が指定する日\n", "\n", "----\n", "Title: 第21条「時間外及び休日労働等」\n", "\n", "1 業務の都合により、第19条の所定労働時間を超え、又は第20条の所定休日に労働させることがある。\n", "2 
前項の場合、法定労働時間を超える労働又は法定休日における労働については、あらかじめ会社は労働者の過半数代表者と書面による労使協定を締結するとともに、これを所轄の労働基準監督署長に届け出るものとする。\n", "3 妊娠中の女性、産後1年を経過しない女性労働者(以下「妊産婦」という)であって請求した者及び18歳未満の者については、第2項による時間外労働又は休日若しくは深夜(午後10時から午前5時まで)労働に従事させない。\n", "4 災害その他避けることのできない事由によって臨時の必要がある場合には、第1項から前項までの制限を超えて、所定労働時間外又は休日に労働させることがある。ただし、この場合であっても、請求のあった妊産婦については、所定労働時間外労働又は休日労働に従事させない。\n", "\n", "----\n", "Title: 第26条「育児時間」\n", "\n", "1 1歳に満たない子を養育する女性労働者から請求があったときは、休憩時間のほか1日について2回、1回について30分の育児時間を与える。\n", "```\n", " \n", "## Strategy\n", "Extract all verifiable facts in above \"Source\", that match the following graph schema, as knowledge triples.\n", "```turtle\n", "{extract_turtle(generated_schema)}\n", "```\n", "The extracted facts are needed to answer the question \"{question}\".\n", "However, if no useful facts are found in the above \"Source\", do not output any triples.\n", "[/INST]\n", "\"\"\".strip()\n", "\n", "generated_kg = kgllm.generate(prompt_kg)\n", "print(generated_kg)" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " ## Explore Path\n", "```path\n", "お正月は3日間の休日です.\n", "お盆は4日間の休日です.\n", "Answer is 4 because 4 is greater than 3.\n", "```\n", "\n", "## Answer\n", "```txt\n", "お盆\n", "```\n" ] } ], "source": [ "prompt_answer = f\"\"\"\n", "[INST]\n", "Explore \"Knowledge Graph\" entity-to-entity then finally answer \"Question\".\n", "\n", "## Knowledge Graph\n", "```turtle\n", "{extract_turtle(generated_kg)}\n", "```\n", "\n", "## Question\n", "{question}\n", "\n", "## Strategy\n", "Answer briefly in one line.\n", "[/INST]\n", "\"\"\".strip()\n", "\n", "generated_answer = kgllm.generate(prompt_answer)\n", "print(generated_answer)" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "お盆\n" ] } ], "source": [ "final_answer = extract_answer(generated_answer)[0]\n", "print(final_answer)" ] }, { "cell_type": "markdown", "metadata": 
{}, "source": [ "## Knowledge graph inference LLM ver.1" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Loading checkpoint shards: 100%|██████████| 39/39 [00:51<00:00, 1.32s/it]\n" ] } ], "source": [ "kgllm = Fujitsu_LLM_KG(\"../../Fujitsu-LLM-KG-8x7B_inst-infer_v1\")" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " ## Knowledge Graph\n", "```turtle\n", "<#長岡京>\n", " rel:遷都される前の都 <#?>.\n", "\n", "<#?>\n", " rel:現在の県 <#?>;\n", " rel:現在の市 <#?>.\n", "```\n" ] } ], "source": [ "question = \"長岡京に遷都される前の都の所在地は現在の何県の何市にあたるでしょうか?\"\n", "\n", "prompt_schema = f\"\"\"\n", "[INST]\n", "Generate \"Knowledge Graph\" in RDF Turtle format based on the given \"Source\".\n", "\n", "## Source\n", "```txt\n", "{question}\n", "```\n", "\n", "## Strategy\n", "Extract graph schema needed to answer the question in above \"Source\" as knowledge triples without omission.\n", "[/INST]\n", "\"\"\"\n", "\n", "generated_schema = kgllm.generate(prompt_schema)\n", "print(generated_schema)" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " ## Knowledge Graph\n", "```turtle\n", "<#長岡京>\n", " rel:遷都される前の都 <#平城京>.\n", "\n", "<#平城京>\n", " rel:現在の県 <#奈良県>;\n", " rel:現在の市 <#奈良市>.\n", "```\n" ] } ], "source": [ "prompt_kg = f\"\"\"\n", "[INST]\n", "Generate \"Knowledge Graph\" in RDF Turtle format based on the given \"Source\".\n", " \n", "## Source\n", "```txt\n", "平安京\n", "\n", "平安京(へいあんきょう/たいらのみやこ)または平安城(へいあんじょう)は、日本における古代最後の宮都。794年(延暦13年)から1869年(明治2年)までの日本の首都。\n", "桓武天皇により、長岡京に代わる都として山背国(山城国)愛宕・葛野の両郡にまたがる地が選ばれ、中国の洛陽城や長安城を模して793年(延暦12年)から建設された。翌794年(延暦13年)に遷都。北部中央に宮城・平安宮(大内裏)が建設され、以降歴代の皇居が置かれた。\n", "遷都以来、平清盛により断行された福原遷都(1180年)の期間を除いて、東京奠都まで1100年近くに亘って都として機能し、1869年(明治2年)まで続いた。今日の京都市街が形成されるに至る。\n", "----\n", "平城京\n", "\n", 
"平城京(へいじょうきょう/へいぜいきょう/ならのみやこ)は、奈良時代の日本の首都。710年に藤原京から遷都するにあたり、唐の都長安城を模倣して大和国に建造された都城。現在の奈良県奈良市、大和郡山市に存在する。\n", "中央北域に宮城・平城宮(大内裏)を置き、東西8坊 (約 4.3 km) の面積をもち、中央を南北に走る朱雀大路によって左京・右京に二分され、さらに南北・東西を大路・小路によって碁盤の目のように整然と区画され、全域が72坊に区画設定されていた。\n", "大阪湾や淡路島(八島のひとつ)にも近い奈良盆地(奈良県奈良市の西部の一部、中心部及び大和郡山市北部)には、5世紀中頃にはすでに天皇陵である佐紀盾列古墳群が作られ、またのちには大神神社、7世紀には興福寺も建立されているが、京となった8世紀には、東大寺や巨大な仏像である東大寺盧舎那仏像、法華寺などが建立された。本州の政治・文化の中心地となるに至って外京(げきょう)に位置した門前町が、今に続く奈良の町を形成する中心となった。\n", "----\n", "長岡京\n", "\n", "長岡京(ながおかきょう)は、山城国乙訓郡にあった奈良時代末期(または平安時代初期)の都城(現在の京都府向日市、長岡京市、京都市西京区)。宮域跡は向日市鶏冠井町(かいでちょう)に位置し、「長岡宮跡」として国の史跡に指定されている。\n", "延暦3年(784年)11月11日、第50代桓武天皇により平城京から遷都され、延暦13年(794年)10月22日に平安京に遷都されるまで機能した。\n", "```\n", "\n", "## Strategy\n", "Extract all verifiable facts in above \"Source\", that match the following graph schema, as knowledge triples without omission.\n", "```turtle\n", "{extract_turtle(generated_schema)}\n", "```\n", "The extracted facts are needed to answer the question \"{question}\".\n", "[/INST]\n", "\"\"\"\n", "\n", "generated_kg = kgllm.generate(prompt_kg)\n", "print(generated_kg)" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " ## Explore Path\n", "```path\n", "長岡京 → rel:遷都される前の都 → 平城京\n", "平城京 → rel:現在の県 → 奈良県\n", "平城京 → rel:現在の市 → 奈良市\n", "```\n", "\n", "## Answer\n", "```txt\n", "奈良県奈良市\n", "```\n" ] } ], "source": [ "prompt_answer = f\"\"\"\n", "[INST]\n", "Explore \"Knowledge Graph\" entity-to-entity then finally answer \"Question\".\n", "\n", "## Knowledge Graph\n", "```turtle\n", "{extract_turtle(generated_kg)}\n", "```\n", " \n", "## Question\n", "{question}\n", " \n", "## Strategy\n", "Answer briefly in one line.\n", "[/INST]\n", "\"\"\"\n", "\n", "generated_answer = kgllm.generate(prompt_answer)\n", "print(generated_answer)" ] } ], "metadata": { "kernelspec": { "display_name": ".venv", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", 
"version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.12" } }, "nbformat": 4, "nbformat_minor": 2 }