```python
# Install the required packages first if they are not already installed:
#   pip install Flask beautifulsoup4 requests
from flask import Flask, request, jsonify
from bs4 import BeautifulSoup
import requests

# Create the Flask app
app = Flask(__name__)


# Define a route for the microservice to scrape URLs
@app.route('/scrape', methods=['POST'])
def scrape():
    # Get JSON data from the request; tolerate a missing or non-object body
    data = request.get_json(silent=True)
    url = data.get('url') if isinstance(data, dict) else None
    if not url:
        return jsonify({"error": "URL not provided"}), 400
    try:
        # Send a GET request to the URL (the timeout avoids hanging indefinitely)
        response = requests.get(url, timeout=10)
        response.raise_for_status()  # Raise an error for 4xx/5xx responses
        # Parse the content using BeautifulSoup
        soup = BeautifulSoup(response.content, 'html.parser')
        # Extract the page title; .string is None for an empty or nested <title>
        has_title = soup.title and soup.title.string
        title = soup.title.string if has_title else 'No title found'
        # Extract the text of all paragraphs on the page
        paragraphs = [p.get_text() for p in soup.find_all('p')]
        # Prepare the output
        result = {
            "url": url,
            "title": title,
            "paragraphs": paragraphs
        }
        return jsonify(result), 200
    except requests.exceptions.RequestException as e:
        # Handle request exceptions (connection errors, timeouts, bad statuses)
        return jsonify({"error": str(e)}), 500


# Run the application (debug mode is for local development only)
if __name__ == '__main__':
    app.run(debug=True)
```
### Instructions to Run and Test the Microservice
1. **Save the code** in a file named `scrape_service.py`.
2. **Install required packages** (if not already installed) by running:
```bash
pip install Flask beautifulsoup4 requests
```
3. **Run the Flask application**:
```bash
python scrape_service.py
```
The application will start at `http://127.0.0.1:5000/`.
4. **Test the service** with a POST request. You can use `curl` from the terminal or a tool like Postman. Here’s an example using `curl`:
```bash
curl -X POST http://127.0.0.1:5000/scrape -H "Content-Type: application/json" -d '{"url": "https://www.example.com"}'
```
5. **Expected Output**: The service will respond with a JSON object containing the URL, the title of the page, and the paragraphs.
Example response:
```json
{
"url": "https://www.example.com",
"title": "Example Domain",
"paragraphs": [
"This domain is for use in illustrative examples...",
"More information..."
]
}
```
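The `curl` call in step 4 can also be issued from Python. The following is a minimal sketch using only the standard library. The helper names `build_scrape_request` and `post_scrape` are hypothetical, and the endpoint assumes the service is running locally on the default address shown in step 3.

```python
import json
import urllib.request

# Assumed local address of the service (taken from step 3 above).
SERVICE_URL = "http://127.0.0.1:5000/scrape"


def build_scrape_request(target_url, service_url=SERVICE_URL):
    """Build the POST request that asks the service to scrape target_url."""
    body = json.dumps({"url": target_url}).encode("utf-8")
    return urllib.request.Request(
        service_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_scrape(target_url, service_url=SERVICE_URL, timeout=10):
    """Send the request and return the decoded JSON response as a dict."""
    req = build_scrape_request(target_url, service_url)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the service running, `post_scrape("https://www.example.com")` should return a dictionary with the `url`, `title`, and `paragraphs` keys. Note that `urllib.request.urlopen` raises `urllib.error.HTTPError` for 4xx and 5xx responses, so the service's error replies surface as exceptions in this client.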
This microservice uses Flask to accept a POST request with a URL, fetches that URL with `requests`, parses the HTML with BeautifulSoup, and returns the page title and paragraphs as a JSON response.
It returns a 400 error when the URL is missing from the request body, and a 500 error containing the exception message when the request to the target URL fails (for example, a connection error, a malformed URL, or a non-2xx response).
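The handler only checks that a URL is present; anything else is passed straight to `requests.get`. One possible hardening step, sketched here under the assumption that only `http` and `https` targets should be accepted, is to validate the URL before fetching it. The helper `is_http_url` is hypothetical and not part of the original service.

```python
from urllib.parse import urlparse


def is_http_url(value):
    """Return True if value looks like an absolute http(s) URL with a host."""
    if not isinstance(value, str):
        return False
    try:
        parsed = urlparse(value.strip())
        host = parsed.hostname
    except ValueError:
        # urlparse rejects some malformed inputs, e.g. a broken IPv6 literal
        return False
    return parsed.scheme in ("http", "https") and bool(host)
```

In the `scrape` handler, this could be called right after the missing-URL check, returning a 400 response with an error message when `is_http_url(url)` is false. The check is purely syntactic: it does not verify that the host exists, and it does not block requests to internal addresses.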