Update pages/6_Semi_structured_data.py
pages/6_Semi_structured_data.py
CHANGED
@@ -427,4 +427,26 @@ elif file_type == "JSON":
 
     ''')
 
+elif file_type == "HTML":
+    st.title("HTML")
+    st.markdown('''
+- **HTML** stands for **HyperText Markup Language**.
+- It is the standard language for creating and structuring web content, using tags to define elements such as text, images, links, and other multimedia.
+    ''')
+    st.subheader("How do we read tabular data from a URL?")
+    st.code('''import pandas as pd
+data = pd.read_html("https://en.wikipedia.org/wiki/Indian_Premier_League")
+data
+    ''')
+    st.markdown('''
+- `pd.read_html` returns a list of DataFrames, one for each table found on the Indian Premier League page.
+- To get one particular table, pass `match` a word or phrase that appears only in that table.
+    ''')
+    st.code('''import pandas as pd
+data = pd.read_html("https://en.wikipedia.org/wiki/Indian_Premier_League", match="Mitchell Starc")
+data
+    ''')
+    st.markdown('''
+- The result is still a list, but it contains only the tables whose text matches "Mitchell Starc".
+    ''')
 
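Note on `match`: the filtering described in this change can be illustrated offline. The block below is a rough standard-library sketch of the idea, not pandas' actual implementation. `TableCollector`, `read_tables`, and `SAMPLE_HTML` are hypothetical names, the sample rows are placeholders rather than data from the Wikipedia page, and nested tables are not handled.

```python
import re
from html.parser import HTMLParser


class TableCollector(HTMLParser):
    """Collect every <table> as a list of rows; each row is a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.tables = []   # finished tables
        self._rows = None  # rows of the table being parsed, None outside a table
        self._row = None   # cells of the row being parsed
        self._cell = None  # text chunks of the cell being parsed

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self._rows = []
        elif tag == "tr" and self._rows is not None:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self._rows.append(self._row)
            self._row = None
        elif tag == "table" and self._rows is not None:
            self.tables.append(self._rows)
            self._rows = None


def read_tables(html, match=".+"):
    """Return all tables that have at least one cell matching the regex `match`."""
    parser = TableCollector()
    parser.feed(html)
    parser.close()
    pattern = re.compile(match)
    return [
        table
        for table in parser.tables
        if any(pattern.search(cell) for row in table for cell in row)
    ]


# Placeholder page with two tables (made-up rows, for illustration only).
SAMPLE_HTML = """
<table>
  <tr><th>Team</th><th>City</th></tr>
  <tr><td>Team A</td><td>City A</td></tr>
</table>
<table>
  <tr><th>Player</th><th>Role</th></tr>
  <tr><td>Mitchell Starc</td><td>Bowler</td></tr>
</table>
"""

all_tables = read_tables(SAMPLE_HTML)                       # no filter: every table
matched = read_tables(SAMPLE_HTML, match="Mitchell Starc")  # only tables with that text

print(len(all_tables))  # 2
print(len(matched))     # 1
print(matched[0])
```

With pandas itself the result has the same shape: `pd.read_html(...)` returns a list even when `match` narrows it to a single table, so the DataFrame is usually taken with `data[0]`. pandas also needs an HTML parser such as `lxml`, or `bs4` with `html5lib`, to be installed.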