{ "cells": [ { "cell_type": "markdown", "id": "e26838fe", "metadata": {}, "source": [ "# 📝 Multilingual Text Summarization (French + English)\n", "\n", "## 📘 Context\n", "\n", "Text summarization condenses long documents into short passages that preserve their key points. Transformer-based sequence-to-sequence models such as BART and T5 can generate fluent abstractive summaries, and fine-tuned checkpoints are available for several languages.\n", "\n", "This notebook demonstrates how to perform automatic summarization in:\n", "- 🇬🇧 **English**, using `facebook/bart-large-cnn`\n", "- 🇫🇷 **French**, using `plguillou/t5-base-fr-sum-cnndm`\n", "\n", "## 🎯 Objectives\n", "\n", "- Load and compare language-specific summarization models\n", "- Generate and display summaries for both English and French input texts\n", "- Test edge cases and observe model behavior" ] }, { "cell_type": "markdown", "id": "32637c47", "metadata": {}, "source": [ "## Packages" ] }, { "cell_type": "code", "execution_count": 1, "id": "682e051c", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: transformers in c:\\users\\issa kabore\\anaconda3\\lib\\site-packages (from -r requirements.txt (line 1)) (4.51.3)\n", "Collecting torch\n", " Downloading torch-2.7.0-cp39-cp39-win_amd64.whl (212.4 MB)\n", " -------------------------------------- 212.4/212.4 MB 1.5 MB/s eta 0:00:00\n", "Collecting langdetect\n", " Downloading langdetect-1.0.9.tar.gz (981 kB)\n", " ------------------------------------- 981.5/981.5 kB 12.5 MB/s eta 0:00:00\n", " Preparing metadata (setup.py): started\n", " Preparing metadata (setup.py): finished with status 'done'\n", "Collecting gradio\n", " Downloading gradio-4.44.1-py3-none-any.whl (18.1 MB)\n", " --------------------------------------- 18.1/18.1 MB 13.1 MB/s eta 0:00:00\n", "Requirement already satisfied: numpy>=1.17 in c:\\users\\issa kabore\\anaconda3\\lib\\site-packages (from transformers->-r requirements.txt (line 1)) (1.21.5)\n", 
"Requirement already satisfied: safetensors>=0.4.3 in c:\\users\\issa kabore\\anaconda3\\lib\\site-packages (from transformers->-r requirements.txt (line 1)) (0.5.3)\n", "Requirement already satisfied: tqdm>=4.27 in c:\\users\\issa kabore\\anaconda3\\lib\\site-packages (from transformers->-r requirements.txt (line 1)) (4.64.1)\n", "Requirement already satisfied: packaging>=20.0 in c:\\users\\issa kabore\\anaconda3\\lib\\site-packages (from transformers->-r requirements.txt (line 1)) (21.3)\n", "Requirement already satisfied: filelock in c:\\users\\issa kabore\\anaconda3\\lib\\site-packages (from transformers->-r requirements.txt (line 1)) (3.6.0)\n", "Requirement already satisfied: requests in c:\\users\\issa kabore\\anaconda3\\lib\\site-packages (from transformers->-r requirements.txt (line 1)) (2.28.1)\n", "Requirement already satisfied: pyyaml>=5.1 in c:\\users\\issa kabore\\anaconda3\\lib\\site-packages (from transformers->-r requirements.txt (line 1)) (6.0)\n", "Requirement already satisfied: huggingface-hub<1.0,>=0.30.0 in c:\\users\\issa kabore\\anaconda3\\lib\\site-packages (from transformers->-r requirements.txt (line 1)) (0.30.2)\n", "Requirement already satisfied: tokenizers<0.22,>=0.21 in c:\\users\\issa kabore\\anaconda3\\lib\\site-packages (from transformers->-r requirements.txt (line 1)) (0.21.1)\n", "Requirement already satisfied: regex!=2019.12.17 in c:\\users\\issa kabore\\anaconda3\\lib\\site-packages (from transformers->-r requirements.txt (line 1)) (2022.7.9)\n", "Requirement already satisfied: jinja2 in c:\\users\\issa kabore\\anaconda3\\lib\\site-packages (from torch->-r requirements.txt (line 2)) (2.11.3)\n", "Requirement already satisfied: networkx in c:\\users\\issa kabore\\anaconda3\\lib\\site-packages (from torch->-r requirements.txt (line 2)) (2.8.4)\n", "Requirement already satisfied: fsspec in c:\\users\\issa kabore\\anaconda3\\lib\\site-packages (from torch->-r requirements.txt (line 2)) (2025.3.2)\n", "Collecting 
typing-extensions>=4.10.0\n", " Downloading typing_extensions-4.13.2-py3-none-any.whl (45 kB)\n", " ---------------------------------------- 45.8/45.8 kB ? eta 0:00:00\n", "Collecting sympy>=1.13.3\n", " Downloading sympy-1.14.0-py3-none-any.whl (6.3 MB)\n", " ---------------------------------------- 6.3/6.3 MB 14.9 MB/s eta 0:00:00\n", "Requirement already satisfied: six in c:\\users\\issa kabore\\anaconda3\\lib\\site-packages (from langdetect->-r requirements.txt (line 3)) (1.16.0)\n", "Collecting tomlkit==0.12.0\n", " Downloading tomlkit-0.12.0-py3-none-any.whl (37 kB)\n", "Collecting aiofiles<24.0,>=22.0\n", " Downloading aiofiles-23.2.1-py3-none-any.whl (15 kB)\n", "Requirement already satisfied: matplotlib~=3.0 in c:\\users\\issa kabore\\anaconda3\\lib\\site-packages (from gradio->-r requirements.txt (line 4)) (3.5.2)\n", "Collecting pydantic>=2.0\n", " Downloading pydantic-2.11.4-py3-none-any.whl (443 kB)\n", " -------------------------------------- 443.9/443.9 kB 6.9 MB/s eta 0:00:00\n", "Collecting semantic-version~=2.0\n", " Downloading semantic_version-2.10.0-py2.py3-none-any.whl (15 kB)\n", "Requirement already satisfied: anyio<5.0,>=3.0 in c:\\users\\issa kabore\\anaconda3\\lib\\site-packages (from gradio->-r requirements.txt (line 4)) (3.5.0)\n", "Collecting ffmpy\n", " Downloading ffmpy-0.5.0-py3-none-any.whl (6.0 kB)\n", "Requirement already satisfied: pandas<3.0,>=1.0 in c:\\users\\issa kabore\\anaconda3\\lib\\site-packages (from gradio->-r requirements.txt (line 4)) (1.4.4)\n", "Collecting ruff>=0.2.2\n", " Downloading ruff-0.11.8-py3-none-win_amd64.whl (11.6 MB)\n", " --------------------------------------- 11.6/11.6 MB 13.1 MB/s eta 0:00:00\n", "Collecting typer<1.0,>=0.12\n", " Downloading typer-0.15.3-py3-none-any.whl (45 kB)\n", " ---------------------------------------- 45.3/45.3 kB ? 
eta 0:00:00\n", "Collecting pydub\n", " Downloading pydub-0.25.1-py2.py3-none-any.whl (32 kB)\n", "Collecting python-multipart>=0.0.9\n", " Downloading python_multipart-0.0.20-py3-none-any.whl (24 kB)\n", "Collecting fastapi<1.0\n", " Downloading fastapi-0.115.12-py3-none-any.whl (95 kB)\n", " ---------------------------------------- 95.2/95.2 kB 5.3 MB/s eta 0:00:00\n", "Requirement already satisfied: pillow<11.0,>=8.0 in c:\\users\\issa kabore\\anaconda3\\lib\\site-packages (from gradio->-r requirements.txt (line 4)) (9.2.0)\n", "Requirement already satisfied: markupsafe~=2.0 in c:\\users\\issa kabore\\anaconda3\\lib\\site-packages (from gradio->-r requirements.txt (line 4)) (2.0.1)\n", "Collecting orjson~=3.0\n", " Downloading orjson-3.10.18-cp39-cp39-win_amd64.whl (134 kB)\n", " -------------------------------------- 134.5/134.5 kB 4.0 MB/s eta 0:00:00\n", "Collecting httpx>=0.24.1\n", " Downloading httpx-0.28.1-py3-none-any.whl (73 kB)\n", " ---------------------------------------- 73.5/73.5 kB ? eta 0:00:00\n", "Collecting importlib-resources<7.0,>=1.3\n", " Downloading importlib_resources-6.5.2-py3-none-any.whl (37 kB)\n", "Collecting uvicorn>=0.14.0\n", " Downloading uvicorn-0.34.2-py3-none-any.whl (62 kB)\n", " ---------------------------------------- 62.5/62.5 kB 3.5 MB/s eta 0:00:00\n", "Collecting gradio-client==1.3.0\n", " Downloading gradio_client-1.3.0-py3-none-any.whl (318 kB)\n", " ------------------------------------- 318.7/318.7 kB 19.3 MB/s eta 0:00:00\n", "Collecting urllib3~=2.0\n", " Downloading urllib3-2.4.0-py3-none-any.whl (128 kB)\n", " ---------------------------------------- 128.7/128.7 kB ? eta 0:00:00\n", "Collecting websockets<13.0,>=10.0\n", " Downloading websockets-12.0-cp39-cp39-win_amd64.whl (124 kB)\n", " ---------------------------------------- 125.0/125.0 kB ? 
eta 0:00:00\n", "Requirement already satisfied: idna>=2.8 in c:\\users\\issa kabore\\anaconda3\\lib\\site-packages (from anyio<5.0,>=3.0->gradio->-r requirements.txt (line 4)) (3.3)\n", "Requirement already satisfied: sniffio>=1.1 in c:\\users\\issa kabore\\anaconda3\\lib\\site-packages (from anyio<5.0,>=3.0->gradio->-r requirements.txt (line 4)) (1.2.0)\n", "Collecting starlette<0.47.0,>=0.40.0\n", " Downloading starlette-0.46.2-py3-none-any.whl (72 kB)\n", " ---------------------------------------- 72.0/72.0 kB 3.9 MB/s eta 0:00:00\n", "Collecting httpcore==1.*\n", " Downloading httpcore-1.0.9-py3-none-any.whl (78 kB)\n", " ---------------------------------------- 78.8/78.8 kB ? eta 0:00:00\n", "Requirement already satisfied: certifi in c:\\users\\issa kabore\\anaconda3\\lib\\site-packages (from httpx>=0.24.1->gradio->-r requirements.txt (line 4)) (2022.9.14)\n", "Collecting h11>=0.16\n", " Downloading h11-0.16.0-py3-none-any.whl (37 kB)\n", "Requirement already satisfied: zipp>=3.1.0 in c:\\users\\issa kabore\\anaconda3\\lib\\site-packages (from importlib-resources<7.0,>=1.3->gradio->-r requirements.txt (line 4)) (3.8.0)\n", "Requirement already satisfied: pyparsing>=2.2.1 in c:\\users\\issa kabore\\anaconda3\\lib\\site-packages (from matplotlib~=3.0->gradio->-r requirements.txt (line 4)) (3.0.9)\n", "Requirement already satisfied: fonttools>=4.22.0 in c:\\users\\issa kabore\\anaconda3\\lib\\site-packages (from matplotlib~=3.0->gradio->-r requirements.txt (line 4)) (4.25.0)\n", "Requirement already satisfied: kiwisolver>=1.0.1 in c:\\users\\issa kabore\\anaconda3\\lib\\site-packages (from matplotlib~=3.0->gradio->-r requirements.txt (line 4)) (1.4.2)\n", "Requirement already satisfied: python-dateutil>=2.7 in c:\\users\\issa kabore\\anaconda3\\lib\\site-packages (from matplotlib~=3.0->gradio->-r requirements.txt (line 4)) (2.8.2)\n", "Requirement already satisfied: cycler>=0.10 in c:\\users\\issa kabore\\anaconda3\\lib\\site-packages (from 
matplotlib~=3.0->gradio->-r requirements.txt (line 4)) (0.11.0)\n", "Requirement already satisfied: pytz>=2020.1 in c:\\users\\issa kabore\\anaconda3\\lib\\site-packages (from pandas<3.0,>=1.0->gradio->-r requirements.txt (line 4)) (2022.1)\n", "Collecting pydantic-core==2.33.2\n", " Downloading pydantic_core-2.33.2-cp39-cp39-win_amd64.whl (2.0 MB)\n", " ---------------------------------------- 2.0/2.0 MB 41.4 MB/s eta 0:00:00\n", "Collecting annotated-types>=0.6.0\n", " Downloading annotated_types-0.7.0-py3-none-any.whl (13 kB)\n", "Collecting typing-inspection>=0.4.0\n", " Downloading typing_inspection-0.4.0-py3-none-any.whl (14 kB)\n", "Requirement already satisfied: mpmath<1.4,>=1.1.0 in c:\\users\\issa kabore\\anaconda3\\lib\\site-packages (from sympy>=1.13.3->torch->-r requirements.txt (line 2)) (1.2.1)\n", "Requirement already satisfied: colorama in c:\\users\\issa kabore\\anaconda3\\lib\\site-packages (from tqdm>=4.27->transformers->-r requirements.txt (line 1)) (0.4.5)\n", "Requirement already satisfied: click>=8.0.0 in c:\\users\\issa kabore\\anaconda3\\lib\\site-packages (from typer<1.0,>=0.12->gradio->-r requirements.txt (line 4)) (8.0.4)\n", "Collecting shellingham>=1.3.0\n", " Downloading shellingham-1.5.4-py2.py3-none-any.whl (9.8 kB)\n", "Collecting rich>=10.11.0\n", " Downloading rich-14.0.0-py3-none-any.whl (243 kB)\n", " ------------------------------------- 243.2/243.2 kB 14.6 MB/s eta 0:00:00\n", "Requirement already satisfied: charset-normalizer<3,>=2 in c:\\users\\issa kabore\\anaconda3\\lib\\site-packages (from requests->transformers->-r requirements.txt (line 1)) (2.0.4)\n", "Collecting requests\n", " Downloading requests-2.32.3-py3-none-any.whl (64 kB)\n", " ---------------------------------------- 64.9/64.9 kB ? 
eta 0:00:00\n", "Collecting pygments<3.0.0,>=2.13.0\n", " Downloading pygments-2.19.1-py3-none-any.whl (1.2 MB)\n", " ---------------------------------------- 1.2/1.2 MB 38.0 MB/s eta 0:00:00\n", "Collecting markdown-it-py>=2.2.0\n", " Downloading markdown_it_py-3.0.0-py3-none-any.whl (87 kB)\n", " ---------------------------------------- 87.5/87.5 kB ? eta 0:00:00\n", "Collecting anyio<5.0,>=3.0\n", " Downloading anyio-4.9.0-py3-none-any.whl (100 kB)\n", " ---------------------------------------- 100.9/100.9 kB ? eta 0:00:00\n", "Collecting exceptiongroup>=1.0.2\n", " Downloading exceptiongroup-1.2.2-py3-none-any.whl (16 kB)\n", "Collecting mdurl~=0.1\n", " Downloading mdurl-0.1.2-py3-none-any.whl (10.0 kB)\n", "Building wheels for collected packages: langdetect\n", " Building wheel for langdetect (setup.py): started\n", " Building wheel for langdetect (setup.py): finished with status 'done'\n", " Created wheel for langdetect: filename=langdetect-1.0.9-py3-none-any.whl size=993225 sha256=b37ed7c002d96ce87bc295fafe28e5a92854509e09bf0d18175aae952cdcd3ea\n", " Stored in directory: c:\\users\\issa kabore\\appdata\\local\\pip\\cache\\wheels\\d1\\c1\\d9\\7e068de779d863bc8f8fc9467d85e25cfe47fa5051fff1a1bb\n", "Successfully built langdetect\n", "Installing collected packages: pydub, websockets, urllib3, typing-extensions, tomlkit, sympy, shellingham, semantic-version, ruff, python-multipart, pygments, orjson, mdurl, langdetect, importlib-resources, h11, ffmpy, exceptiongroup, annotated-types, aiofiles, uvicorn, typing-inspection, torch, requests, pydantic-core, markdown-it-py, httpcore, anyio, starlette, rich, pydantic, httpx, typer, gradio-client, fastapi, gradio\n", " Attempting uninstall: urllib3\n", " Found existing installation: urllib3 1.26.11\n", " Uninstalling urllib3-1.26.11:\n", " Successfully uninstalled urllib3-1.26.11\n", " Attempting uninstall: typing-extensions\n", " Found existing installation: typing_extensions 4.3.0\n", " Uninstalling 
typing_extensions-4.3.0:\n", " Successfully uninstalled typing_extensions-4.3.0\n", " Attempting uninstall: tomlkit\n", " Found existing installation: tomlkit 0.11.1\n", " Uninstalling tomlkit-0.11.1:\n", " Successfully uninstalled tomlkit-0.11.1\n", " Attempting uninstall: sympy\n", " Found existing installation: sympy 1.10.1\n", " Uninstalling sympy-1.10.1:\n", " Successfully uninstalled sympy-1.10.1\n", " Attempting uninstall: pygments\n", " Found existing installation: Pygments 2.11.2\n", " Uninstalling Pygments-2.11.2:\n", " Successfully uninstalled Pygments-2.11.2\n", " Attempting uninstall: requests\n", " Found existing installation: requests 2.28.1\n", " Uninstalling requests-2.28.1:\n", " Successfully uninstalled requests-2.28.1\n", " Attempting uninstall: anyio\n", " Found existing installation: anyio 3.5.0\n", " Uninstalling anyio-3.5.0:\n", " Successfully uninstalled anyio-3.5.0\n", "Successfully installed aiofiles-23.2.1 annotated-types-0.7.0 anyio-4.9.0 exceptiongroup-1.2.2 fastapi-0.115.12 ffmpy-0.5.0 gradio-4.44.1 gradio-client-1.3.0 h11-0.16.0 httpcore-1.0.9 httpx-0.28.1 importlib-resources-6.5.2 langdetect-1.0.9 markdown-it-py-3.0.0 mdurl-0.1.2 orjson-3.10.18 pydantic-2.11.4 pydantic-core-2.33.2 pydub-0.25.1 pygments-2.19.1 python-multipart-0.0.20 requests-2.32.3 rich-14.0.0 ruff-0.11.8 semantic-version-2.10.0 shellingham-1.5.4 starlette-0.46.2 sympy-1.14.0 tomlkit-0.12.0 torch-2.7.0 typer-0.15.3 typing-extensions-4.13.2 typing-inspection-0.4.0 urllib3-2.4.0 uvicorn-0.34.2 websockets-12.0\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. 
This behaviour is the source of the following dependency conflicts.\n", "spyder 5.2.2 requires pyqt5<5.13, which is not installed.\n", "spyder 5.2.2 requires pyqtwebengine<5.13, which is not installed.\n", "jupyter-server 1.18.1 requires anyio<4,>=3.1.0, but you have anyio 4.9.0 which is incompatible.\n", "conda-repo-cli 1.0.24 requires clyent==1.2.1, but you have clyent 1.2.2 which is incompatible.\n", "conda-repo-cli 1.0.24 requires nbformat==5.4.0, but you have nbformat 5.5.0 which is incompatible.\n", "conda-repo-cli 1.0.24 requires requests==2.28.1, but you have requests 2.32.3 which is incompatible.\n", "botocore 1.27.28 requires urllib3<1.27,>=1.25.4, but you have urllib3 2.4.0 which is incompatible.\n" ] } ], "source": [ "# !pip install transformers sentencepiece\n", "!pip install -r requirements.txt" ] }, { "cell_type": "code", "execution_count": 2, "id": "b7763e5e", "metadata": {}, "outputs": [], "source": [ "# import loguru\n", "\n", "from transformers import pipeline, AutoTokenizer, AutoModelForSeq2SeqLM\n", "import textwrap # Text wrapping and filling\n", "\n", "import gradio as gr\n", "from langdetect import detect" ] }, { "cell_type": "code", "execution_count": null, "id": "80f389cd", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "base", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.13" } }, "nbformat": 4, "nbformat_minor": 5 }