```python
# First, we will need to install the required packages if not already installed.
# You can do this by running: pip install Flask beautifulsoup4 requests

from flask import Flask, request, jsonify
from bs4 import BeautifulSoup
import requests

# Create Flask app
app = Flask(__name__)

# Define a route for the micro service to scrape URLs
@app.route('/scrape', methods=['POST'])
def scrape():
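    # Get JSON data from the request; silent=True yields None instead of an
    # error response when the body is missing or is not valid JSON
    data = request.get_json(silent=True)
    url = data.get('url') if isinstance(data, dict) else None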
    
    if not url:
        return jsonify({"error": "URL not provided"}), 400
    
    try:
        # Send a GET request to the URL; the timeout keeps a slow host from
        # blocking the request indefinitely
        response = requests.get(url, timeout=10)
        response.raise_for_status()  # Raise an error for 4xx/5xx responses

        # Parse the content using BeautifulSoup
        soup = BeautifulSoup(response.content, 'html.parser')
        
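        # Extract title of the page (.string is None when <title> is empty
        # or contains nested tags, so fall back in that case too)
        title = soup.title.string if soup.title and soup.title.string else 'No title found'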
        
        # Extract all paragraphs from the page
        paragraphs = [p.get_text() for p in soup.find_all('p')]
        
        # Prepare the output
        result = {
            "url": url,
            "title": title,
            "paragraphs": paragraphs
        }

        return jsonify(result), 200

    except requests.exceptions.RequestException as e:
        # Handle request exceptions
        return jsonify({"error": str(e)}), 500

# Run the application (debug=True enables the interactive debugger and
# auto-reload; use it for local development only)
if __name__ == '__main__':
    app.run(debug=True)
```

### Instructions to Run and Test the Microservice

1. **Save the code** in a file named `scrape_service.py`.
2. **Install required packages** (if not already installed) by running:
   ```
   pip install Flask beautifulsoup4 requests
   ```
3. **Run the Flask application**:
   ```
   python scrape_service.py
   ```
   The application will start at `http://127.0.0.1:5000/`.

4. **Test the service** with a POST request. You can use `curl` from the terminal or a tool like Postman. Here’s an example using `curl`:

   ```bash
   curl -X POST http://127.0.0.1:5000/scrape -H "Content-Type: application/json" -d '{"url": "https://www.example.com"}'
   ```

5. **Expected Output**: The service will respond with a JSON object containing the URL, the title of the page, and the paragraphs.

   Example response:
   ```json
   {
     "url": "https://www.example.com",
     "title": "Example Domain",
     "paragraphs": [
       "This domain is for use in illustrative examples...",
       "More information..."
     ]
   }
   ```
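
For testing from Python instead of `curl`, the same POST request can be built with the standard library. This is a minimal sketch, not part of the service itself: the function names `build_scrape_request` and `scrape` and the 15-second timeout are assumptions made for illustration, and the address is the Flask development server default from step 3.

```python
import json
import urllib.request

# Assumed address: the Flask development server default shown in step 3.
SERVICE_URL = "http://127.0.0.1:5000/scrape"


def build_scrape_request(target_url, service_url=SERVICE_URL):
    """Build (but do not send) the POST request the /scrape route expects."""
    body = json.dumps({"url": target_url}).encode("utf-8")
    return urllib.request.Request(
        service_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def scrape(target_url, service_url=SERVICE_URL, timeout=15):
    """Send the request and decode the JSON reply.

    Requires the service to be running. urlopen raises
    urllib.error.HTTPError when the service answers with 4xx or 5xx.
    """
    req = build_scrape_request(target_url, service_url)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the service running, `scrape("https://www.example.com")` should return a dictionary with the `url`, `title`, and `paragraphs` keys shown in the example response.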

This microservice uses Flask to accept a POST request with a URL, scrapes that URL using BeautifulSoup, and returns the page title and paragraphs as a JSON response. 

The service returns a 400 response with an error message when the request has no `url` field, and a 500 response carrying the exception text when the fetch fails (connection errors, malformed URLs, or 4xx/5xx status codes from the target).
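
One way to make the "missing or invalid" check explicit is to validate the payload before any network request is made, so that a malformed URL can be reported as a client error rather than surfacing as a fetch failure. The sketch below is a hypothetical helper, not part of the service code above; the name `validate_payload`, the error strings other than "URL not provided", and the choice to accept only `http` and `https` are all assumptions.

```python
from urllib.parse import urlparse


def validate_payload(data):
    """Return (url, error); exactly one of the two is None.

    Hypothetical helper for illustration only.
    """
    # The body may be absent, or valid JSON that is not an object
    if not isinstance(data, dict):
        return None, "Request body must be a JSON object"

    url = data.get("url")
    if not isinstance(url, str) or not url.strip():
        return None, "URL not provided"

    # Require an absolute http(s) URL with a host part
    url = url.strip()
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return None, "URL must be an absolute http(s) address"

    return url, None
```

Inside the route, the returned error string could be passed to `jsonify` with a 400 status, leaving the `try` block to deal only with failures that happen during the fetch.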