hwang2006 committed on
Commit
20e57f9
·
verified ·
1 Parent(s): 291f622

Upload summarymaker files including src, examples, etc.

assets/flask_gui.png ADDED
assets/gradio_gui.png ADDED
assets/gradio_gui_2.png ADDED
examples/test.txt ADDED
@@ -0,0 +1,13 @@
+ This is a test article. It contains multiple sentences that we want to summarize. The text should be long enough to generate a meaningful summary.
+
+ Text summarization is an important task in natural language processing (NLP). It involves the creation of a shortened version of a text document while preserving its essential information and overall meaning. There are two main types of text summarization: extractive and abstractive.
+
+ Extractive summarization involves selecting key sentences or phrases directly from the original text and combining them to form a summary. This method relies on identifying the most important parts of the text and is relatively straightforward to implement. However, the resulting summary may not always be coherent or flow naturally, as it is simply a collection of extracted sentences.
+
+ On the other hand, abstractive summarization generates new sentences that convey the main ideas of the original text. This method requires a deeper understanding of the text and the ability to generate natural language that captures the essence of the content. Abstractive summarization is more challenging but can produce more cohesive and readable summaries.
+
+ In recent years, advancements in machine learning and deep learning have significantly improved the performance of text summarization models. Transformer-based models, such as BERT, GPT-3, and T5, have demonstrated remarkable capabilities in generating high-quality summaries. These models are trained on large datasets and leverage attention mechanisms to understand the context and relationships between words in a text.
+
+ Despite these advancements, text summarization remains a complex task, with challenges such as handling long documents, maintaining factual accuracy, and avoiding redundancy. Researchers continue to explore new techniques and approaches to address these challenges and enhance the effectiveness of summarization systems.
+
+ Overall, text summarization has a wide range of applications, including news aggregation, content curation, document summarization, and more. As technology continues to evolve, we can expect further improvements in the quality and efficiency of summarization methods, making it easier to distill valuable information from vast amounts of text.
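The extractive approach described in `examples/test.txt` can be illustrated with a small word-frequency scorer. This is only a sketch for orientation: the package in this commit uses an abstractive transformer pipeline instead, and the names `STOP_WORDS`, `split_sentences`, and `extractive_summary` are hypothetical and appear nowhere in the repository.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "it",
              "that", "this", "as", "or", "on", "for", "are", "be"}


def split_sentences(text):
    """Split on sentence-ending punctuation followed by whitespace (rough heuristic)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def _content_words(text):
    """Lower-case word tokens with stop words removed."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]


def extractive_summary(text, num_sentences=2):
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    freq = Counter(_content_words(text))

    def score(sentence):
        tokens = _content_words(sentence)
        # Average frequency, so long sentences are not favoured automatically.
        return sum(freq[w] for w in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:num_sentences])
    return " ".join(sentences[i] for i in keep)
```

Because the output is stitched together from sentences copied verbatim, it shows the coherence limitation the test article mentions: nothing smooths the transitions between the selected sentences.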
examples/test_article.md ADDED
@@ -0,0 +1,21 @@
+ The Rise of Artificial Intelligence in Healthcare
+
+ Artificial intelligence has emerged as a transformative force in modern healthcare, revolutionizing everything from diagnostic procedures to patient care management. In recent years, healthcare providers and institutions worldwide have increasingly adopted AI-powered solutions to enhance their services and improve patient outcomes. The integration of AI technologies has not only streamlined administrative tasks but has also enabled more accurate disease detection and personalized treatment plans.
+
+ One of the most significant applications of AI in healthcare is in medical imaging analysis. Machine learning algorithms can now process X-rays, MRIs, and CT scans with remarkable accuracy, often detecting subtle abnormalities that human radiologists might miss. These AI systems have been particularly successful in identifying early signs of cancer, cardiovascular diseases, and neurological disorders. For example, studies have shown that AI-powered mammogram analysis can detect breast cancer with an accuracy rate comparable to, and sometimes exceeding, that of experienced radiologists.
+
+ The implementation of AI in predictive healthcare has also shown promising results. By analyzing vast amounts of patient data, AI systems can identify patterns and risk factors that might indicate potential health issues before they become severe. This predictive capability allows healthcare providers to intervene early, potentially preventing serious medical conditions and reducing the overall cost of healthcare. Hospitals using these systems have reported significant improvements in patient outcomes and reductions in readmission rates.
+
+ Electronic health records (EHRs) have been another area where AI has made substantial contributions. Natural language processing algorithms can now efficiently parse through thousands of medical records, extracting relevant information and identifying patterns that might be clinically significant. This capability has not only improved the quality of patient care but has also facilitated medical research by making vast amounts of clinical data more accessible and analyzable.
+
+ In the pharmaceutical industry, AI has accelerated the drug discovery process dramatically. Machine learning models can analyze molecular structures and predict their potential therapeutic effects, significantly reducing the time and cost associated with developing new medications. This has been particularly evident during global health crises, where AI-powered systems have helped identify potential treatments by analyzing existing drugs for new applications.
+
+ Despite these advancements, the integration of AI in healthcare faces several challenges. Privacy concerns regarding patient data, the need for regulatory frameworks, and questions about the reliability of AI systems in critical medical decisions remain important issues to address. Healthcare providers must also invest in training their staff to work effectively alongside AI systems, ensuring that these technologies enhance rather than replace human medical expertise.
+
+ The economic implications of AI in healthcare are substantial. While the initial investment in AI technologies can be significant, the long-term benefits often justify the cost. Improved efficiency, reduced medical errors, and better patient outcomes can lead to significant cost savings for healthcare institutions. Studies suggest that AI applications in healthcare could result in annual savings of billions of dollars across the industry.
+
+ Looking ahead, the role of AI in healthcare is expected to expand further. Emerging technologies like quantum computing could enhance AI capabilities, enabling even more sophisticated medical applications. Personalized medicine, powered by AI analysis of genetic and environmental factors, could become the standard approach to treatment. Additionally, AI-powered robotic surgery systems continue to evolve, promising greater precision and improved outcomes in surgical procedures.
+
+ Human oversight remains crucial in the implementation of AI in healthcare. While these systems can process vast amounts of data and identify patterns more efficiently than humans, medical professionals must ultimately make the final decisions regarding patient care. This partnership between human expertise and artificial intelligence represents the future of healthcare, where technology enhances rather than replaces the critical role of healthcare providers.
+
+ As we move forward, continued research and development in AI healthcare applications will likely reveal new possibilities for improving patient care. The key to successful implementation lies in striking the right balance between technological innovation and human medical expertise, ensuring that AI serves as a tool to enhance healthcare delivery while maintaining the essential human element in medical care.
examples/test_article.txt ADDED
@@ -0,0 +1,11 @@
+ The Rise of Artificial Intelligence in Healthcare
+ Artificial intelligence has emerged as a transformative force in modern healthcare, revolutionizing everything from diagnostic procedures to patient care management. In recent years, healthcare providers and institutions worldwide have increasingly adopted AI-powered solutions to enhance their services and improve patient outcomes. The integration of AI technologies has not only streamlined administrative tasks but has also enabled more accurate disease detection and personalized treatment plans.
+ One of the most significant applications of AI in healthcare is in medical imaging analysis. Machine learning algorithms can now process X-rays, MRIs, and CT scans with remarkable accuracy, often detecting subtle abnormalities that human radiologists might miss. These AI systems have been particularly successful in identifying early signs of cancer, cardiovascular diseases, and neurological disorders. For example, studies have shown that AI-powered mammogram analysis can detect breast cancer with an accuracy rate comparable to, and sometimes exceeding, that of experienced radiologists.
+ The implementation of AI in predictive healthcare has also shown promising results. By analyzing vast amounts of patient data, AI systems can identify patterns and risk factors that might indicate potential health issues before they become severe. This predictive capability allows healthcare providers to intervene early, potentially preventing serious medical conditions and reducing the overall cost of healthcare. Hospitals using these systems have reported significant improvements in patient outcomes and reductions in readmission rates.
+ Electronic health records (EHRs) have been another area where AI has made substantial contributions. Natural language processing algorithms can now efficiently parse through thousands of medical records, extracting relevant information and identifying patterns that might be clinically significant. This capability has not only improved the quality of patient care but has also facilitated medical research by making vast amounts of clinical data more accessible and analyzable.
+ In the pharmaceutical industry, AI has accelerated the drug discovery process dramatically. Machine learning models can analyze molecular structures and predict their potential therapeutic effects, significantly reducing the time and cost associated with developing new medications. This has been particularly evident during global health crises, where AI-powered systems have helped identify potential treatments by analyzing existing drugs for new applications.
+ Despite these advancements, the integration of AI in healthcare faces several challenges. Privacy concerns regarding patient data, the need for regulatory frameworks, and questions about the reliability of AI systems in critical medical decisions remain important issues to address. Healthcare providers must also invest in training their staff to work effectively alongside AI systems, ensuring that these technologies enhance rather than replace human medical expertise.
+ The economic implications of AI in healthcare are substantial. While the initial investment in AI technologies can be significant, the long-term benefits often justify the cost. Improved efficiency, reduced medical errors, and better patient outcomes can lead to significant cost savings for healthcare institutions. Studies suggest that AI applications in healthcare could result in annual savings of billions of dollars across the industry.
+ Looking ahead, the role of AI in healthcare is expected to expand further. Emerging technologies like quantum computing could enhance AI capabilities, enabling even more sophisticated medical applications. Personalized medicine, powered by AI analysis of genetic and environmental factors, could become the standard approach to treatment. Additionally, AI-powered robotic surgery systems continue to evolve, promising greater precision and improved outcomes in surgical procedures.
+ Human oversight remains crucial in the implementation of AI in healthcare. While these systems can process vast amounts of data and identify patterns more efficiently than humans, medical professionals must ultimately make the final decisions regarding patient care. This partnership between human expertise and artificial intelligence represents the future of healthcare, where technology enhances rather than replaces the critical role of healthcare providers.
+ As we move forward, continued research and development in AI healthcare applications will likely reveal new possibilities for improving patient care. The key to successful implementation lies in striking the right balance between technological innovation and human medical expertise, ensuring that AI serves as a tool to enhance healthcare delivery while maintaining the essential human element in medical care.
src/summarizer/__init__.py ADDED
@@ -0,0 +1 @@
+ """Text summarization package."""
src/summarizer/cli.py ADDED
@@ -0,0 +1,42 @@
+ import click
+ from .summarizer import process_text
+ from .utils import extract_from_url, read_file
+ import warnings
+
+ #warnings.filterwarnings("ignore")
+ #warnings.filterwarnings("ignore", module="torch")
+ #warnings.filterwarnings("ignore", module="numpy")
+
+ @click.command()
+ @click.option('--url', help='URL to extract text from')
+ @click.option('--file', help='Text file path to summarize', type=click.Path(exists=True))
+ @click.option('--model', default='t5-base', help='Transformer model to use')
+ @click.option('--max-length', default=180, help='Maximum length of summary')
+ def main(url, file, model, max_length):
+     """Summarize text from a URL or file."""
+     try:
+         if url:
+             click.echo(f"Fetching text from URL: {url}")
+             text = extract_from_url(url)
+         elif file:
+             click.echo(f"Reading file: {file}")
+             text = read_file(file)
+         else:
+             raise click.UsageError("Please provide either --url or --file")
+
+         if not text or len(text.strip()) < 50:
+             raise click.UsageError("Not enough text content to summarize")
+
+         click.echo("Starting summarization process...")
+         summary = process_text(text, model=model, max_length=max_length)
+         click.echo("\nSummary:")
+         click.echo("=" * 80)
+         click.echo(summary)
+         click.echo("=" * 80)
+
+     except Exception as e:
+         click.echo(f"Error: {str(e)}", err=True)
+         raise click.Abort()
+
+ if __name__ == "__main__":
+     main()
src/summarizer/summarizer.py ADDED
@@ -0,0 +1,23 @@
+ from transformers import pipeline
+ import os
+
+ os.environ['TF_CPP_MIN_LOG_LEVEL'] = "3"
+
+ def process_text(text, model="t5-base", max_length=180):
+     """
+     Process and summarize the input text.
+
+     Args:
+         text (str): Input text to summarize
+         model (str): Name of the transformer model to use
+         max_length (int): Maximum length of the summary
+
+     Returns:
+         str: Summarized text
+     """
+     try:
+         summarizer = pipeline("summarization", model=model)
+         result = summarizer(text, max_length=max_length)
+         return result[0]["summary_text"]
+     except Exception as e:
+         raise Exception(f"Summarization failed: {str(e)}")
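As committed, `process_text` builds a new pipeline on every call, so the web apps would reload the model for each request. One possible refinement, sketched here under stated assumptions, is to cache the loaded summarizer per model name with `functools.lru_cache`. The names `build_summarizer`, `get_summarizer`, and `process_text_cached` are hypothetical, and `build_summarizer` is a stand-in that truncates characters so the sketch runs without downloading a model; in the real package it would presumably wrap `pipeline("summarization", model=model_name)`, where `max_length` counts tokens rather than characters.

```python
from functools import lru_cache


def build_summarizer(model_name):
    """Stand-in for an expensive model load (assumption, not the repository's code)."""
    def summarize(text, max_length=180):
        # Placeholder behaviour: truncate characters instead of running a model.
        return [{"summary_text": text[:max_length]}]
    return summarize


@lru_cache(maxsize=4)
def get_summarizer(model_name):
    """Build the summarizer once per model name and reuse it afterwards."""
    return build_summarizer(model_name)


def process_text_cached(text, model="t5-base", max_length=180):
    """Same call shape as process_text, but the summarizer object is reused."""
    result = get_summarizer(model)(text, max_length=max_length)
    return result[0]["summary_text"]
```

The small `maxsize` is a deliberate choice in this sketch: each cached entry would hold a full model in memory, so an unbounded cache could grow quickly when users type arbitrary model names.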
src/summarizer/tests.ipynb ADDED
The diff for this file is too large to render. See raw diff
 
src/summarizer/utils.py ADDED
@@ -0,0 +1,82 @@
+ import requests
+ from bs4 import BeautifulSoup
+ import time
+
+ def read_file(file_path):
+     """
+     Read text content from a file.
+
+     Args:
+         file_path (str): Path to the text file
+
+     Returns:
+         str: File content
+     """
+     try:
+         with open(file_path, 'r', encoding='utf-8') as f:
+             content = f.read().strip()
+         if not content:
+             raise Exception("File is empty")
+         return content
+     except UnicodeDecodeError:
+         # Try with different encodings if utf-8 fails
+         try:
+             with open(file_path, 'r', encoding='latin-1') as f:
+                 content = f.read().strip()
+             if not content:
+                 raise Exception("File is empty")
+             return content
+         except Exception as e:
+             raise Exception(f"Failed to read file with alternative encoding: {str(e)}")
+     except Exception as e:
+         raise Exception(f"File reading failed: {str(e)}")
+
+ def extract_from_url(url):
+     """
+     Extract text content from a URL.
+
+     Args:
+         url (str): URL to extract text from
+
+     Returns:
+         str: Extracted text content
+     """
+     try:
+         headers = {
+             'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36'
+         }
+
+         # Add retry mechanism
+         max_retries = 3
+         for attempt in range(max_retries):
+             try:
+                 response = requests.get(url, headers=headers, timeout=10)
+                 response.raise_for_status()
+                 break
+             except requests.RequestException as e:
+                 if attempt == max_retries - 1:
+                     raise
+                 time.sleep(1)
+
+         soup = BeautifulSoup(response.text, 'html.parser')
+         # Try to get text from articles first
+         article_text = ""
+         articles = soup.find_all(['article', 'main'])
+         if articles:
+             for article in articles:
+                 paragraphs = article.find_all("p")
+                 article_text += " ".join(p.text.strip() for p in paragraphs if p.text.strip())
+
+         # If no article text found, fall back to all paragraphs
+         if not article_text:
+             paragraphs = soup.find_all("p")
+             article_text = " ".join(p.text.strip() for p in paragraphs if p.text.strip())
+
+         if not article_text:
+             raise Exception("No text content found on the page")
+
+         return article_text
+     except requests.RequestException as e:
+         raise Exception(f"Failed to fetch URL: {str(e)}")
+     except Exception as e:
+         raise Exception(f"URL extraction failed: {str(e)}")
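The retry loop in `extract_from_url` could also be pulled out into a reusable helper. The sketch below is one way to do that with the standard library only. It is not the committed code: `fetch_with_retry` and its parameters are hypothetical names, and it swaps the fixed one-second pause for exponential backoff.

```python
import time


def fetch_with_retry(fetch, max_retries=3, delay=1.0, retry_on=(Exception,), sleep=time.sleep):
    """Call fetch() until it succeeds or max_retries attempts have failed.

    The last exception is re-raised unchanged so callers can still inspect it.
    """
    if max_retries < 1:
        raise ValueError("max_retries must be at least 1")
    for attempt in range(max_retries):
        try:
            return fetch()
        except retry_on:
            if attempt == max_retries - 1:
                raise
            # Exponential backoff: delay, 2 * delay, 4 * delay, ...
            sleep(delay * (2 ** attempt))
```

Under these assumptions the caller would pass a function that performs both `requests.get(...)` and `response.raise_for_status()`, with `retry_on=(requests.RequestException,)`, so HTTP error statuses are retried just as they are in the original loop. Injecting `sleep` keeps the helper testable without real waiting.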
src/summarizer/webapp/app.py ADDED
@@ -0,0 +1,80 @@
+ from flask import Flask, request, render_template
+ from summarizer.summarizer import process_text # Adjust import path
+ from summarizer.utils import extract_from_url, read_file # Adjust import path
+ import logging
+ import os
+
+ # Set up logging
+ logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
+
+ app = Flask(__name__)
+
+ # Limit request body (and therefore file upload) size to 1 MB
+ app.config['MAX_CONTENT_LENGTH'] = 1 * 1024 * 1024
+
+ @app.route('/')
+ def index():
+     # Render the template with an empty summary by default
+     return render_template('index.html', summary="")
+
+ @app.route('/summarize', methods=['POST'])
+ def summarize():
+     try:
+         choice = request.form.get('choice')
+         url = request.form.get('url')
+         file = request.files.get('file')
+         text = request.form.get('text')
+         model = request.form.get('model') or 't5-base'
+         max_length = request.form.get('max_length')
+
+         # Validate max_length
+         try:
+             max_length = int(max_length) if max_length else 180
+             if max_length <= 0:
+                 raise ValueError("Max length must be positive.")
+         except ValueError:
+             return render_template('index.html', error="Invalid maximum length", summary="")
+
+         # Ensure the selected input type was actually provided
+         if (choice == 'url' and not url) or (choice == 'file' and not file) or (choice == 'text' and not text):
+             return render_template('index.html', error="Please provide the selected input type.", summary="")
+
+         input_text = ""
+         if choice == 'url':
+             if not url.startswith(('http://', 'https://')):
+                 return render_template('index.html', error="Invalid URL format.", summary="")
+             try:
+                 input_text = extract_from_url(url)
+             except Exception as e:
+                 logging.error(f"URL extraction failed: {str(e)}")
+                 return render_template('index.html', error="URL extraction failed.", summary="")
+         elif choice == 'file':
+             if not file.filename.endswith('.txt'):
+                 return render_template('index.html', error="Only .txt files are supported.", summary="")
+             try:
+                 input_text = file.read().decode('utf-8')
+             except Exception as e:
+                 logging.error(f"File reading failed: {str(e)}")
+                 return render_template('index.html', error="File reading failed.", summary="")
+         elif choice == 'text':
+             input_text = text
+
+         if not input_text or len(input_text.strip()) < 50:
+             return render_template('index.html', error="Not enough text content to summarize", summary="")
+
+         try:
+             summary = process_text(input_text, model=model, max_length=max_length)
+         except Exception as e:
+             logging.error(f"Summarization failed: {str(e)}")
+             return render_template('index.html', error="Summarization failed.", summary="")
+
+         return render_template('index.html', summary=summary, url=url, model=model, max_length=max_length, text=text)
+
+     except Exception as e:
+         logging.error(f"Unexpected error: {str(e)}")
+         return render_template('index.html', error="An unexpected error occurred.", summary="")
+
+ if __name__ == '__main__':
+     # Use a secure production-ready WSGI server for deployment, e.g., Gunicorn
+     #app.run(debug=True)
+     app.run(host="0.0.0.0", port=5000)
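The form validation in `summarize()` can be factored into small pure functions that are easy to test without a running Flask app. The following is a minimal sketch, not part of the commit: `parse_max_length`, `is_txt_filename`, and the `upper` cap of 1024 are assumptions introduced for illustration (the committed code enforces no upper bound and matches `.txt` case-sensitively).

```python
def parse_max_length(raw, default=180, upper=1024):
    """Convert a form value to a positive int, falling back to default when blank.

    upper is a hypothetical cap; the committed code has no upper bound.
    """
    if raw is None or not str(raw).strip():
        return default
    value = int(str(raw).strip())  # raises ValueError for non-numeric input
    if not 0 < value <= upper:
        raise ValueError(f"max_length must be between 1 and {upper}")
    return value


def is_txt_filename(filename):
    """Case-insensitive extension check, so NOTES.TXT is accepted as well."""
    return bool(filename) and filename.lower().endswith(".txt")
```

In the view, a single `except ValueError` around `parse_max_length(request.form.get('max_length'))` would then cover both non-numeric and out-of-range input, mirroring the existing "Invalid maximum length" branch.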
src/summarizer/webapp/app.py.bak ADDED
@@ -0,0 +1,62 @@
+ from flask import Flask, request, render_template
+ from summarizer.summarizer import process_text # Adjust import path
+ from summarizer.utils import extract_from_url, read_file # Adjust import path
+
+ app = Flask(__name__)
+
+ @app.route('/')
+ def index():
+     return render_template('index.html')
+
+ @app.route('/summarize', methods=['POST'])
+ def summarize():
+     if request.method == 'POST':
+         choice = request.form.get('choice')
+         url = request.form.get('url')
+         file = request.files.get('file')
+         text = request.form.get('text')
+         model = request.form.get('model') or 't5-base'
+         max_length = request.form.get('max_length')
+
+         # Use default max_length if the field is empty
+         if not max_length:
+             max_length = 180
+         else:
+             # Convert max_length to integer if it's not empty
+             try:
+                 max_length = int(max_length)
+             except ValueError:
+                 return render_template('index.html', error="Invalid maximum length")
+
+         # Ensure only one input is provided based on the choice
+         if (choice == 'url' and not url) or (choice == 'file' and not file) or (choice == 'text' and not text):
+             return render_template('index.html', error="Please provide the selected input type.")
+
+         input_text = ""
+         if choice == 'url':
+             try:
+                 input_text = extract_from_url(url)
+             except Exception as e:
+                 return render_template('index.html', error=f"URL extraction failed: {str(e)}")
+         elif choice == 'file':
+             try:
+                 input_text = file.read().decode('utf-8')
+             except Exception as e:
+                 return render_template('index.html', error=f"File reading failed: {str(e)}")
+         elif choice == 'text':
+             input_text = text
+
+         if not input_text or len(input_text.strip()) < 50:
+             return render_template('index.html', error="Not enough text content to summarize")
+
+         try:
+             summary = process_text(input_text, model=model, max_length=max_length)
+         except Exception as e:
+             return render_template('index.html', error=f"Summarization failed: {str(e)}")
+
+         return render_template('index.html', summary=summary, url=url, model=model, max_length=max_length, text=text)
+
+     return render_template('index.html')
+
+ if __name__ == '__main__':
+     app.run(debug=True)
src/summarizer/webapp/app.py.bak2 ADDED
@@ -0,0 +1,85 @@
+ from flask import Flask, request, render_template
+ from summarizer.summarizer import process_text # Adjust import path
+ from summarizer.utils import extract_from_url, read_file # Adjust import path
+ import logging
+ import os
+
+ # Set up logging
+ logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
+
+ app = Flask(__name__)
+
+ # Limit file upload size to 1 MB
+ app.config['MAX_CONTENT_LENGTH'] = 1 * 1024 * 1024
+
+ @app.route('/')
+ def index():
+     return render_template('index.html')
+
+ @app.route('/summarize', methods=['POST'])
+ def summarize():
+     try:
+         choice = request.form.get('choice')
+         url = request.form.get('url')
+         file = request.files.get('file')
+         text = request.form.get('text')
+         model = request.form.get('model') or 't5-base'
+         max_length = request.form.get('max_length')
+
+         # Validate max_length
+         try:
+             max_length = int(max_length) if max_length else 180
+             if max_length <= 0:
+                 raise ValueError("Max length must be positive.")
+         except ValueError:
+             return render_template('index.html', error="Invalid maximum length")
+
+         # Ensure only one input is provided
+         if (choice == 'url' and not url) or (choice == 'file' and not file) or (choice == 'text' and not text):
+             return render_template('index.html', error="Please provide the selected input type.")
+
+         input_text = ""
+         if choice == 'url':
+             if not url.startswith(('http://', 'https://')):
+                 return render_template('index.html', error="Invalid URL format.")
+             try:
+                 input_text = extract_from_url(url)
+             except Exception as e:
+                 logging.error(f"URL extraction failed: {str(e)}")
+                 return render_template('index.html', error="URL extraction failed.")
+         elif choice == 'file':
+             if not file.filename.endswith('.txt'):
+                 return render_template('index.html', error="Only .txt files are supported.")
+             try:
+                 input_text = file.read().decode('utf-8')
+             except Exception as e:
+                 logging.error(f"File reading failed: {str(e)}")
+                 return render_template('index.html', error="File reading failed.")
+         elif choice == 'text':
+             input_text = text
+
+         if not input_text or len(input_text.strip()) < 50:
+             return render_template('index.html', error="Not enough text content to summarize")
+
+         try:
+             summary = process_text(input_text, model=model, max_length=max_length)
+         except Exception as e:
+             logging.error(f"Summarization failed: {str(e)}")
+             return render_template('index.html', error="Summarization failed.")
+
+         return render_template('index.html', summary=summary, url=url, model=model, max_length=max_length, text=text)
+
+     except Exception as e:
+         logging.error(f"Unexpected error: {str(e)}")
+         return render_template('index.html', error="An unexpected error occurred.")
+
+ if __name__ == '__main__':
+     # Use a secure production-ready WSGI server for deployment, e.g., Gunicorn
+     app.run(debug=True)
+
+ # Updated HTML Template (index.html):
+ # 1. Provide a dropdown menu for model selection.
+ # 2. Style the page for better UX.
+ # 3. Ensure accessibility improvements with ARIA roles and labels.
+
+ # Note: Additional details for index.html updates can be provided upon request.
src/summarizer/webapp/gradio_app.py ADDED
@@ -0,0 +1,74 @@
+ import os
+ import gradio as gr
+ from summarizer.summarizer import process_text # Adjust import path
+ #from summarizer import process_text # Adjust import path
+ from summarizer.utils import extract_from_url # Adjust import path
+ #from utils import extract_from_url # Adjust import path
+
+ # Set the Gradio temporary directory
+ os.environ['GRADIO_TEMP_DIR'] = os.path.expanduser('~/.gradio_tmp')
+
+ # Create the temporary directory if it does not exist
+ os.makedirs(os.environ['GRADIO_TEMP_DIR'], exist_ok=True)
+
+ def summarize_text(choice, url, file_path, text, model_name, max_length):
+     input_text = ""
+     if choice == "URL":
+         try:
+             input_text = extract_from_url(url)
+         except Exception as e:
+             return f"URL extraction failed: {str(e)}"
+     elif choice == "File":
+         if file_path is not None:
+             try:
+                 with open(file_path.name, 'r', encoding='utf-8') as f:
+                     input_text = f.read()
+             except Exception as e:
+                 return f"File reading failed: {str(e)}"
+         else:
+             return "File reading failed: No file uploaded"
+     elif choice == "Text":
+         input_text = text
+
+     if not input_text or len(input_text.strip()) < 50:
+         return "Not enough text content to summarize"
+
+     try:
+         summary = process_text(input_text, model=model_name, max_length=max_length)
+         return summary
+     except Exception as e:
+         return f"Summarization failed: {str(e)}"
+
+ def update_visibility(choice):
+     return (
+         gr.update(visible=(choice == "URL"), value=""),
+         gr.update(visible=(choice == "File"), value=None),
+         gr.update(visible=(choice == "Text"), value="")
+     )
+
+ def main():
+     choices = ["Text", "URL", "File"]
+     with gr.Blocks() as demo:
+         gr.Markdown("# SummaryMaker") # Add title here
+         choice = gr.Dropdown(choices, label="Choose input text type", value="Text")
+         url = gr.Textbox(label="URL to Summarize", visible=False)
+         file = gr.File(label="Upload File", visible=False)
+         text = gr.Textbox(label="Text to Summarize", lines=10, visible=True) # Visible by default
+         model = gr.Textbox(label="Model", value="t5-base")
+         max_length = gr.Slider(label="Max Length", minimum=50, maximum=500, value=180, step=10)
+         summary = gr.Textbox(label="Summary")
+
+         choice.change(fn=update_visibility, inputs=choice, outputs=[url, file, text])
+
+         gr.Button("Summarize").click(
+             summarize_text,
+             inputs=[choice, url, file, text, model, max_length],
+             outputs=[summary]
+         )
+
+     #demo.launch()
+     # Ensure the Gradio app binds to '0.0.0.0' to be accessible from outside the container
+     demo.launch(server_name="0.0.0.0", server_port=7860)
+
+ if __name__ == "__main__":
+     main()
src/summarizer/webapp/templates/index.html ADDED
@@ -0,0 +1,90 @@
1
+ <!DOCTYPE html>
2
+ <html lang="en">
3
+ <head>
4
+ <meta charset="UTF-8">
5
+ <meta name="viewport" content="width=device-width, initial-scale=1.0">
6
+ <title>SummaryMaker</title>
7
+ <script>
8
+ function toggleInput() {
9
+ const choice = document.getElementById('choice').value;
10
+ const urlInput = document.getElementById('urlInput');
11
+ const fileInput = document.getElementById('fileInput');
12
+ const textInput = document.getElementById('textInput');
13
+
14
+ // Clear the summary box when a new input type is selected
15
+ document.getElementById('summary').value = '';
16
+
17
+ if (choice === 'url') {
18
+ urlInput.style.display = 'block';
19
+ fileInput.style.display = 'none';
20
+ textInput.style.display = 'none';
21
+ document.getElementById('file').value = "";
22
+ document.getElementById('text').value = "";
23
+ } else if (choice === 'file') {
24
+ urlInput.style.display = 'none';
25
+ fileInput.style.display = 'block';
26
+ textInput.style.display = 'none';
27
+ document.getElementById('url').value = "";
28
+ document.getElementById('text').value = "";
29
+ } else if (choice === 'text') {
30
+ urlInput.style.display = 'none';
31
+ fileInput.style.display = 'none';
32
+ textInput.style.display = 'block';
33
+ document.getElementById('url').value = "";
34
+ document.getElementById('file').value = "";
35
+ } else {
36
+ urlInput.style.display = 'none';
37
+ fileInput.style.display = 'none';
38
+ textInput.style.display = 'none';
39
+ }
40
+ }
41
+
42
+
43
+ function clearSummary() {
44
+ document.getElementById('summary').value = '';
45
+ }
46
+ </script>
47
+ </head>
48
+ <body>
49
+ <h1>SummaryMaker</h1>
50
+ <form action="/summarize" method="post" enctype="multipart/form-data" onsubmit="clearSummary()">
51
+ <label for="choice">Choose input text type:</label><br>
52
+ <select id="choice" name="choice" onchange="toggleInput()">
53
+ <option value="">--Select--</option>
54
+ <option value="url">URL</option>
55
+ <option value="file">File</option>
56
+ <option value="text">Text</option>
57
+ </select><br><br>
58
+
59
+ <div id="urlInput" style="display: none;">
60
+ <label for="url">URL to Summarize:</label><br>
61
+ <input type="text" name="url" id="url" value="{{ url }}"><br><br>
62
+ </div>
63
+
64
+ <div id="fileInput" style="display: none;">
65
+ <label for="file">Upload File:</label><br>
66
+ <input type="file" name="file" id="file"><br><br>
67
+ </div>
68
+
69
+ <div id="textInput" style="display: none;">
70
+ <label for="text">Text to Summarize:</label><br>
71
+ <textarea name="text" id="text" rows="10" cols="50">{{ text }}</textarea><br><br>
72
+ </div>
73
+
74
+ <label for="model">Model:</label>
75
+ <input type="text" name="model" id="model" value="{{ model or 't5-base' }}"><br>
76
+
77
+ <label for="max_length">Max Length:</label>
78
+ <input type="number" name="max_length" id="max_length" value="{{ max_length or 180 }}"><br><br>
79
+
80
+ <input type="submit" value="Summarize">
81
+ </form>
82
+ {% if error %}
83
+ <p style="color: red;">{{ error }}</p>
84
+ {% endif %}
85
+ <div>
86
+ <h2>Summary:</h2>
87
+ <textarea id="summary" rows="10" cols="50" readonly>{{ summary }}</textarea>
88
+ </div>
89
+ </body>
90
+ </html>
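The form above posts `choice`, `url`, `text`, `model`, and `max_length` fields (plus an uploaded `file`) to `/summarize`. The handler for that route is not part of this hunk, so the following is only a sketch of how the posted fields might be validated on the server. `parse_form` is a hypothetical helper that takes a plain dict; the defaults `t5-base` and `180` are the ones the template itself falls back to.

```python
def parse_form(form):
    """Validate posted form fields and return (choice, model, max_length)."""
    choice = form.get("choice", "")
    # The <select> offers exactly these three values besides the empty placeholder.
    if choice not in ("url", "file", "text"):
        raise ValueError("Please select an input type: url, file, or text")

    # Fall back to the same defaults the template renders.
    model = form.get("model") or "t5-base"
    try:
        max_length = int(form.get("max_length") or 180)
    except ValueError:
        raise ValueError("max_length must be an integer") from None
    if max_length <= 0:
        raise ValueError("max_length must be positive")
    return choice, model, max_length


print(parse_form({"choice": "text", "model": "t5-small", "max_length": "60"}))
```

Validating on the server matters because the JavaScript in the page only hides inputs; it does not stop a client from posting an empty or unknown `choice`.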
src/summarizer/webapp/templates/index.html.bak ADDED
@@ -0,0 +1,87 @@
+ <!DOCTYPE html>
+ <html lang="en">
+ <head>
+     <meta charset="UTF-8">
+     <meta name="viewport" content="width=device-width, initial-scale=1.0">
+     <title>SummaryMaker</title>
+     <script>
+         function toggleInput() {
+             const choice = document.getElementById('choice').value;
+             const urlInput = document.getElementById('urlInput');
+             const fileInput = document.getElementById('fileInput');
+             const textInput = document.getElementById('textInput');
+             if (choice === 'url') {
+                 urlInput.style.display = 'block';
+                 fileInput.style.display = 'none';
+                 textInput.style.display = 'none';
+                 document.getElementById('file').value = "";
+                 document.getElementById('text').value = "";
+             } else if (choice === 'file') {
+                 urlInput.style.display = 'none';
+                 fileInput.style.display = 'block';
+                 textInput.style.display = 'none';
+                 document.getElementById('url').value = "";
+                 document.getElementById('text').value = "";
+             } else if (choice === 'text') {
+                 urlInput.style.display = 'none';
+                 fileInput.style.display = 'none';
+                 textInput.style.display = 'block';
+                 document.getElementById('url').value = "";
+                 document.getElementById('file').value = "";
+             } else {
+                 urlInput.style.display = 'none';
+                 fileInput.style.display = 'none';
+                 textInput.style.display = 'none';
+             }
+         }
+
+         function clearSummary() {
+             document.getElementById('summary').value = '';
+         }
+     </script>
+ </head>
+ <body>
+     <h1>SummaryMaker</h1>
+     <form action="/summarize" method="post" enctype="multipart/form-data" onsubmit="clearSummary()">
+         <label for="choice">Choose input text type:</label><br>
+         <select id="choice" name="choice" onchange="toggleInput()">
+             <option value="">--Select--</option>
+             <option value="url">URL</option>
+             <option value="file">File</option>
+             <option value="text">Text</option>
+         </select><br><br>
+
+         <div id="urlInput" style="display: none;">
+             <label for="url">URL to Summarize:</label><br>
+             <input type="text" name="url" id="url" value="{{ url }}"><br><br>
+         </div>
+
+         <div id="fileInput" style="display: none;">
+             <label for="file">Upload File:</label><br>
+             <input type="file" name="file" id="file"><br><br>
+         </div>
+
+         <div id="textInput" style="display: none;">
+             <label for="text">Text to Summarize:</label><br>
+             <textarea name="text" id="text" rows="10" cols="50">{{ text }}</textarea><br><br>
+         </div>
+
+         <label for="model">Model:</label>
+         <input type="text" name="model" id="model" value="{{ model or 't5-base' }}"><br>
+
+         <label for="max_length">Max Length:</label>
+         <input type="number" name="max_length" id="max_length" value="{{ max_length or 180 }}"><br><br>
+
+         <input type="submit" value="Summarize">
+     </form>
+     {% if error %}
+         <p style="color: red;">{{ error }}</p>
+     {% endif %}
+     <div>
+         {% if summary %}
+             <h2>Summary:</h2>
+             <textarea id="summary" rows="10" cols="50" readonly>{{ summary }}</textarea>
+         {% endif %}
+     </div>
+ </body>
+ </html>
src/summarizer/webapp/templates/index.html.bak2 ADDED
@@ -0,0 +1,85 @@
+ <!DOCTYPE html>
+ <html lang="en">
+ <head>
+     <meta charset="UTF-8">
+     <meta name="viewport" content="width=device-width, initial-scale=1.0">
+     <title>SummaryMaker</title>
+     <script>
+         function toggleInput() {
+             const choice = document.getElementById('choice').value;
+             const urlInput = document.getElementById('urlInput');
+             const fileInput = document.getElementById('fileInput');
+             const textInput = document.getElementById('textInput');
+             if (choice === 'url') {
+                 urlInput.style.display = 'block';
+                 fileInput.style.display = 'none';
+                 textInput.style.display = 'none';
+                 document.getElementById('file').value = "";
+                 document.getElementById('text').value = "";
+             } else if (choice === 'file') {
+                 urlInput.style.display = 'none';
+                 fileInput.style.display = 'block';
+                 textInput.style.display = 'none';
+                 document.getElementById('url').value = "";
+                 document.getElementById('text').value = "";
+             } else if (choice === 'text') {
+                 urlInput.style.display = 'none';
+                 fileInput.style.display = 'none';
+                 textInput.style.display = 'block';
+                 document.getElementById('url').value = "";
+                 document.getElementById('file').value = "";
+             } else {
+                 urlInput.style.display = 'none';
+                 fileInput.style.display = 'none';
+                 textInput.style.display = 'none';
+             }
+         }
+
+         function clearSummary() {
+             document.getElementById('summary').value = '';
+         }
+     </script>
+ </head>
+ <body>
+     <h1>SummaryMaker</h1>
+     <form action="/summarize" method="post" enctype="multipart/form-data" onsubmit="clearSummary()">
+         <label for="choice">Choose input text type:</label><br>
+         <select id="choice" name="choice" onchange="toggleInput()">
+             <option value="">--Select--</option>
+             <option value="url">URL</option>
+             <option value="file">File</option>
+             <option value="text">Text</option>
+         </select><br><br>
+
+         <div id="urlInput" style="display: none;">
+             <label for="url">URL to Summarize:</label><br>
+             <input type="text" name="url" id="url" value="{{ url }}"><br><br>
+         </div>
+
+         <div id="fileInput" style="display: none;">
+             <label for="file">Upload File:</label><br>
+             <input type="file" name="file" id="file"><br><br>
+         </div>
+
+         <div id="textInput" style="display: none;">
+             <label for="text">Text to Summarize:</label><br>
+             <textarea name="text" id="text" rows="10" cols="50">{{ text }}</textarea><br><br>
+         </div>
+
+         <label for="model">Model:</label>
+         <input type="text" name="model" id="model" value="{{ model or 't5-base' }}"><br>
+
+         <label for="max_length">Max Length:</label>
+         <input type="number" name="max_length" id="max_length" value="{{ max_length or 180 }}"><br><br>
+
+         <input type="submit" value="Summarize">
+     </form>
+     {% if error %}
+         <p style="color: red;">{{ error }}</p>
+     {% endif %}
+     <div>
+         <h2>Summary:</h2>
+         <textarea id="summary" rows="10" cols="50" readonly>{{ summary }}</textarea>
+     </div>
+ </body>
+ </html>
tests/__init__.py ADDED
@@ -0,0 +1 @@
+ # Empty file
tests/conftest.py ADDED
@@ -0,0 +1,19 @@
+ import pytest
+ import tempfile
+ import os
+
+ @pytest.fixture
+ def sample_text():
+     return """
+     Artificial intelligence has emerged as a transformative force in modern healthcare,
+     revolutionizing everything from diagnostic procedures to patient care management.
+     In recent years, healthcare providers and institutions worldwide have increasingly
+     adopted AI-powered solutions to enhance their services and improve patient outcomes.
+     """
+
+ @pytest.fixture
+ def sample_text_file(sample_text):
+     with tempfile.NamedTemporaryFile(mode='w', delete=False, suffix='.txt') as f:
+         f.write(sample_text)
+     yield f.name
+     os.unlink(f.name)
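The `sample_text_file` fixture follows a create, yield, clean up pattern: the file is written and closed, its path is handed to the test, and the file is removed afterwards. The same lifecycle can be sketched outside pytest with `contextlib.contextmanager`. The name `temp_text_file` is hypothetical, and the `try`/`finally` is an addition (not in the fixture above) that ensures cleanup even if the consumer raises.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def temp_text_file(text, suffix=".txt"):
    """Write text to a named temporary file, yield its path, then delete it."""
    # delete=False keeps the file on disk after the handle is closed,
    # so other code can reopen it by name.
    with tempfile.NamedTemporaryFile(mode="w", delete=False, suffix=suffix) as f:
        f.write(text)
    try:
        yield f.name
    finally:
        os.unlink(f.name)


with temp_text_file("hello") as path:
    with open(path) as fh:
        content = fh.read()
    existed = os.path.exists(path)

gone = not os.path.exists(path)
print(content, existed, gone)
```

Closing the handle before yielding matters: it flushes the written text, so a reader that opens the path sees the full content.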
tests/test_cli.py ADDED
@@ -0,0 +1,67 @@
+ from click.testing import CliRunner
+ #from summarizer.cli import main
+ from summarizer.cli import main
+
+ def test_cli_with_file(sample_text_file, sample_text, mocker):
+     # Patch the name where cli.py looks it up, assuming cli.py does: from .summarizer import process_text
+     mock_process = mocker.patch('summarizer.cli.process_text')
+     mock_process.return_value = "Summarized text"
+
+     runner = CliRunner()
+     result = runner.invoke(main, ['--file', sample_text_file])
+
+     #print("CLI Output:\n", result.output) # Print the output for debugging
+     #print("sample text:\n", sample_text)
+
+     assert result.exit_code == 0
+     assert "Summarized text" in result.output
+     mock_process.assert_called_once_with(sample_text.strip(), model="t5-base", max_length=180)
+
+ def test_cli_with_url(mocker):
+     #mock_extract = mocker.patch('summarizer.utils.extract_from_url')
+     #mock_process = mocker.patch('summarizer.summarizer.process_text')
+     mock_extract = mocker.patch('summarizer.cli.extract_from_url')
+     mock_process = mocker.patch('summarizer.cli.process_text')
+
+     mock_extract.return_value = """
+     This domain is for use in illustrative examples in documents. You may use this
+     domain in literature without prior coordination or asking for permission. More information...
+     """
+     mock_process.return_value = "Summarized text"
+
+     runner = CliRunner()
+     result = runner.invoke(main, ['--url', 'http://example.com'])
+     #result = runner.invoke(main, ['--url', 'https://en.wikipedia.org/wiki/Seoul'])
+
+     #print("CLI Output:\n", result.output) # Print the output for debugging
+     #result.output = """
+     #Fetching text from URL: http://example.com
+     #Starting summarization process...
+     #
+     #Summary:
+     #================================================================================
+     #Summarized text
+     #================================================================================
+     #"""
+
+     assert result.exit_code == 0
+     assert "Summarized text" in result.output
+
+     mock_extract.assert_called_once_with('http://example.com')
+     #mock_extract.assert_called_once_with('https://en.wikipedia.org/wiki/Seoul')
+     #mock_process.assert_called_once_with("Extracted text", model='t5-base', max_length=180)
+     mock_process.assert_called_once_with(mock_extract.return_value, model='t5-base', max_length=180)
+
+ def test_cli_no_input():
+     runner = CliRunner()
+     result = runner.invoke(main, [])
+
+     assert result.exit_code != 0
+     assert "Please provide either --url or --file" in result.output
+
+ def test_cli_invalid_file():
+     runner = CliRunner()
+     result = runner.invoke(main, ['--file', 'nonexistent.txt'])
+
+     assert result.exit_code != 0
+     assert "Error" in result.output
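The CLI tests patch `summarizer.cli.process_text` and `summarizer.cli.extract_from_url` rather than the modules where those functions are defined. The reason is that `from module import name` copies a reference into the importing module, so a patch has to target the place where the name is looked up. The sketch below reproduces this with two throwaway in-memory modules; `demo_lib` and `demo_cli` are hypothetical stand-ins, not part of the project.

```python
import sys
import types
from unittest import mock

# A stand-in for the module that defines the function.
lib = types.ModuleType("demo_lib")
lib.process_text = lambda text: "real"
sys.modules["demo_lib"] = lib

# A stand-in for the CLI module, which imports the function by name.
cli = types.ModuleType("demo_cli")
exec(
    "from demo_lib import process_text\n"
    "def run(text):\n"
    "    return process_text(text)\n",
    cli.__dict__,
)
sys.modules["demo_cli"] = cli

# Patching the defining module does not affect the reference held by demo_cli.
with mock.patch("demo_lib.process_text", return_value="mocked"):
    patched_at_source = cli.run("x")

# Patching the name inside demo_cli is what run() actually sees.
with mock.patch("demo_cli.process_text", return_value="mocked"):
    patched_at_use = cli.run("x")

print(patched_at_source, patched_at_use)
```

This is also why the commented-out patches of `summarizer.utils.extract_from_url` and `summarizer.summarizer.process_text` in `test_cli_with_url` would leave the real functions in place when the CLI runs.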
tests/test_example.py ADDED
@@ -0,0 +1,4 @@
+ def test_example(sample_text_file):
+     with open(sample_text_file, 'r') as f:
+         content = f.read()
+     assert "Artificial intelligence" in content
tests/test_summarizer.py ADDED
@@ -0,0 +1,66 @@
+ import pytest
+ #from summarizer.summarizer import process_text
+ from summarizer.summarizer import process_text
+
+
+ def test_process_text_success(mocker, sample_text):
+     """
+     Using a pipeline is a two-step process:
+
+         # Step 1: Create the pipeline
+         summarizer = pipeline("summarization", model="t5-base")
+         # Step 2: Use the pipeline
+         summary = summarizer(text)
+
+         # This works because it matches Step 1 (creating the pipeline)
+         mock_pipeline.assert_called_once_with("summarization", model="t5-base")
+
+         # This fails: the text is passed in Step 2, to the object that pipeline() returns
+         mock_pipeline.assert_called_once_with(sample_text, model="t5-base")
+
+     """
+     # If using: from transformers import pipeline in summarizer.py
+     # Patch the name as imported into summarizer.py; this mock stands in for Step 1
+     mock_pipeline = mocker.patch('summarizer.summarizer.pipeline')
+
+     # If using: import transformers
+     #mock_pipeline = mocker.patch('summarizer.summarizer.transformers.pipeline')
+
+
+     mock_summarizer = mock_pipeline.return_value
+     mock_summarizer.return_value = [{'summary_text': 'Test summary'}]
+     #mock_pipeline.return_value.return_value = [{'summary_text': 'Test summary'}]
+
+     result = process_text(sample_text.strip())
+
+     #print("result: ", result) # for debugging purposes
+     assert result == 'Test summary'
+     mock_pipeline.assert_called_once_with("summarization", model="t5-base")
+     mock_summarizer.assert_called_once_with(sample_text.strip(), max_length=180)
+     #mock_pipeline.assert_called_once_with(sample_text, model="t5-base")
+
+ def test_process_text_with_custom_model(mocker, sample_text):
+     mock_pipeline = mocker.patch('summarizer.summarizer.pipeline')
+     mock_summarizer = mock_pipeline.return_value
+     mock_summarizer.return_value = [{'summary_text': 'Test summary'}]
+
+     custom_model = "t5-small"
+     result = process_text(sample_text.strip(), model=custom_model)
+
+     print(result) # print the result for debugging purposes
+
+     assert result == 'Test summary'
+     #mock_pipeline.assert_called_once_with("summarization", model=custom_model)
+     mock_summarizer.assert_called_once_with(sample_text.strip(), max_length=180)
+
+ def test_process_text_failure(mocker, sample_text):
+     mock_pipeline = mocker.patch('summarizer.summarizer.pipeline')
+     mock_summarizer = mock_pipeline.return_value
+     mock_summarizer.return_value = [{'summary_text': 'Test summary'}]
+     mock_pipeline.side_effect = Exception("Model error")
+
+     with pytest.raises(Exception) as exc_info:
+         process_text(sample_text.strip())
+
+     print("Exception String: ", str(exc_info.value)) # for debugging purposes
+     assert "Summarization failed" in str(exc_info.value)
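The docstring in `test_process_text_success` distinguishes two calls: creating the pipeline and then calling the object it returns. The sketch below shows how a single `MagicMock` models both steps, using only `unittest.mock` and no `transformers` install. The local name `pipeline` is a stand-in mock, and `"some text"` is a placeholder input.

```python
from unittest.mock import MagicMock

# Stand-in for the patched `pipeline` factory.
pipeline = MagicMock(name="pipeline")
# pipeline(...) returns pipeline.return_value; calling THAT returns the list below.
pipeline.return_value.return_value = [{"summary_text": "Test summary"}]

# Step 1: create the summarizer (records a call on `pipeline`).
summarizer = pipeline("summarization", model="t5-base")
# Step 2: use it (records a call on `pipeline.return_value`).
output = summarizer("some text", max_length=180)
summary = output[0]["summary_text"]

# Each assertion targets the mock that received that particular call.
pipeline.assert_called_once_with("summarization", model="t5-base")
pipeline.return_value.assert_called_once_with("some text", max_length=180)
print(summary)
```

Asserting the text argument on `pipeline` itself would fail, because the text only ever reaches `pipeline.return_value`.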
tests/test_utils.py ADDED
@@ -0,0 +1,55 @@
+ import os
+ import tempfile
+ import pytest
+ from summarizer.utils import read_file, extract_from_url
+ import requests
+
+ def test_read_file_success(sample_text_file, sample_text):
+     content = read_file(sample_text_file)
+     assert content.strip() == sample_text.strip()
+
+ def test_read_file_nonexistent():
+     with pytest.raises(Exception) as exc_info:
+         read_file("nonexistent_file.txt")
+     assert "File reading failed" in str(exc_info.value)
+
+ def test_read_file_empty():
+     with tempfile.NamedTemporaryFile(mode='w', delete=False) as f:
+         pass
+     try:
+         with pytest.raises(Exception) as exc_info:
+             read_file(f.name)
+         assert "File is empty" in str(exc_info.value)
+     finally:
+         os.unlink(f.name)
+
+ def test_extract_from_url(requests_mock):
+     url = "http://example.com"
+     mock_html = """
+     <html>
+         <body>
+             <article>
+                 <p>First paragraph.</p>
+                 <p>Second paragraph.</p>
+             </article>
+         </body>
+     </html>
+     """
+     requests_mock.get(url, text=mock_html)
+     content = extract_from_url(url)
+     assert "First paragraph. Second paragraph." in content
+
+ def test_extract_from_url_no_content(requests_mock):
+     url = "http://example.com"
+     mock_html = "<html><body></body></html>"
+     requests_mock.get(url, text=mock_html)
+     with pytest.raises(Exception) as exc_info:
+         extract_from_url(url)
+     assert "No text content found" in str(exc_info.value)
+
+ def test_extract_from_url_connection_error(requests_mock):
+     url = "http://example.com"
+     requests_mock.get(url, exc=requests.exceptions.ConnectionError)
+     with pytest.raises(Exception) as exc_info:
+         extract_from_url(url)
+     assert "Failed to fetch URL" in str(exc_info.value)
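The URL tests above pin down the behavior expected of `extract_from_url`: paragraph text is joined with single spaces, and a page without text raises an error containing "No text content found". The real implementation lives in `summarizer/utils.py` and is not shown in this hunk, so the following is only a sketch of the parsing half, written against the standard library's `html.parser` instead of whatever parser the project uses. `ParagraphText` and `extract_paragraphs` are hypothetical names, and fetching the page is left out.

```python
from html.parser import HTMLParser


class ParagraphText(HTMLParser):
    """Collect the text inside each <p> element."""

    def __init__(self):
        super().__init__()
        self._in_p = False
        self._buf = []
        self.paragraphs = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._in_p = True
            self._buf = []

    def handle_data(self, data):
        if self._in_p:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "p" and self._in_p:
            text = "".join(self._buf).strip()
            if text:
                self.paragraphs.append(text)
            self._in_p = False


def extract_paragraphs(html):
    """Return paragraph text joined by spaces, or raise if there is none."""
    parser = ParagraphText()
    parser.feed(html)
    parser.close()
    if not parser.paragraphs:
        raise ValueError("No text content found")
    return " ".join(parser.paragraphs)


page = "<html><body><article><p>First paragraph.</p><p>Second paragraph.</p></article></body></html>"
print(extract_paragraphs(page))
```

Separating parsing from fetching in this way would let the parsing logic be tested with plain strings, while `requests_mock` covers only the network path.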