title: Web_page_data_html
app_file: app.py
sdk: gradio
sdk_version: 5.47.2

Web Company Data Extractor

This Hugging Face demo extracts basic company-related information from unstructured web pages. You enter a URL, and the app:

The output includes:

This is a rule-based prototype for demonstration purposes. It does not replace professional-grade web data extraction or parsing libraries.
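As an illustration of what "rule-based" can mean here, the following is a minimal sketch that uses only the Python standard library. It is not the code in `app.py`: the function name `extract_company_signals`, the helper class `_HeadParser`, and the chosen fields (page title, meta description, email addresses) are assumptions made for the example.

```python
import re
from html.parser import HTMLParser

# Deliberately simple email pattern; an assumption for this sketch, not a full RFC 5322 matcher.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class _HeadParser(HTMLParser):
    """Collects the <title> text and the meta description, if present."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""
        self.description = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attr = dict(attrs)
            if (attr.get("name") or "").lower() == "description":
                self.description = (attr.get("content") or "").strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_company_signals(html: str) -> dict:
    """Apply a few fixed rules to raw HTML and return a plain dictionary."""
    parser = _HeadParser()
    parser.feed(html)
    return {
        "title": parser.title.strip(),
        "description": parser.description,
        "emails": sorted(set(EMAIL_RE.findall(html))),  # de-duplicated, stable order
    }
```

Each rule is a fixed pattern, so the sketch fails quietly (empty strings, empty list) on pages that do not match, which is the usual trade-off of a rule-based approach.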

This app demonstrates how unstructured web pages can be converted into structured company data. It is useful for illustrating:

Uses:

Outputs are plain Python dictionaries and strings to avoid serialization issues.
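One possible way to guarantee plain outputs is to normalize every result before returning it. The helper below is a sketch under that assumption; the name `to_plain` is hypothetical and does not come from `app.py`.

```python
import json


def to_plain(value):
    """Recursively convert a value into dicts, lists, strings, numbers, bools, or None."""
    if value is None or isinstance(value, (str, int, float, bool)):
        return value
    if isinstance(value, dict):
        # JSON object keys must be strings.
        return {str(key): to_plain(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_plain(item) for item in value]
    if isinstance(value, (set, frozenset)):
        # Sets are not JSON-serializable; emit a list in a stable order.
        return sorted((to_plain(item) for item in value), key=str)
    # Fallback for anything else (parser objects, dates, ...): use its string form.
    return str(value)


raw = {"name": "Acme", "emails": {"b@acme.example", "a@acme.example"}, "founded": 1999}
print(json.dumps(to_plain(raw)))
```

Passing `raw` directly to `json.dumps` would raise a `TypeError` because of the set; after normalization the result serializes without special handling.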

To run the app locally, install the dependencies and start the script:

pip install -r requirements.txt
python app.py

Gradio prints the local URL to the terminal when the app starts (by default http://127.0.0.1:7860).
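For readers unfamiliar with Gradio, a URL-in, dictionary-out interface can be wired up roughly as follows. This is a sketch, not the contents of `app.py`: `extract_from_url` is a hypothetical placeholder that only echoes its input, and `build_demo` is a name chosen for the example.

```python
def extract_from_url(url: str) -> dict:
    """Hypothetical placeholder: a real app would fetch and parse the page here."""
    return {"url": url.strip(), "status": "extraction not implemented in this sketch"}


def build_demo():
    # Imported lazily so the function above stays usable without Gradio installed.
    import gradio as gr

    return gr.Interface(
        fn=extract_from_url,
        inputs=gr.Textbox(label="URL"),
        outputs=gr.JSON(label="Extracted data"),
        title="Web Company Data Extractor",
    )
```

Calling `build_demo().launch()` starts the local server. Returning a plain dictionary lets the `gr.JSON` component render the result directly.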

This is not a complete production parser. It does not handle:

However, it demonstrates the core concept: turning unstructured web pages into structured company data.