Spaces: Harika22/Machine_learning

Harika22 committed on Dec 15, 2024

Commit 750e1d6 (verified) · 1 Parent(s): 95a82db

Update pages/6_Semi_structured_data.py

Files changed (1)

pages/6_Semi_structured_data.py CHANGED

@@ -374,6 +374,26 @@ elif file_type == "JSON":
     st.code('''
     output = '{"columns":["name","age","weight"],"index":[0,1,2],"data":[["harii",21,34],["sree",24,45],["gowtham",25,67]]}'
     ''')

     st.code('''
     output = '{"columns":["name","age","weight"],"index":[0,1,2],"data":[["harii",21,34],["sree",24,45],["gowtham",25,67]]}'
     ''')
+    st.subheader("**Issues in Structured JSON Format**")
+    st.markdown('''
+    - The structured JSON format reads a JSON string with a regular layout. When the data is heterogeneous, such as a dictionary of dictionaries or a list of dictionaries, pd.read_json() cannot flatten the nested values into columns
+    - To handle this, we use the semi-structured JSON format with pd.json_normalize(), which can handle nested structures
+    ''')
+    st.header("Semi-structured JSON Format")
+    st.markdown('''
+    - A semi-structured JSON format lacks a fixed schema, allowing irregular or nested structures
+    - pd.json_normalize() takes a dictionary or a list of dictionaries, where each dictionary acts as a single row
+    - pd.json_normalize() has several parameters that control how the JSON is converted into a DataFrame
+    - ◆ max_level ---> the maximum number of nesting levels to flatten into columns
+    - ◆ record_path ---> the path to a nested list of dictionaries, each of which becomes a row
+    - ◆ meta ---> the fields outside record_path to keep as additional columns
+    ''')
+    st.subheader("How to read Semi-structured JSON Format?")
+    st.code('''import pandas as pd
+    b = {"name":"a","marks":{"sem1":{"maths":22,"science":23},"sem2":{"maths":24,"science":25}}}
+    pd.json_normalize(b)
+    ''')
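
The three parameters named in the added page text can be illustrated with a short sketch. It reuses the dictionary `b` from the diff. The `students` list and its field names (`city`, `exams`, `subject`, `score`) are made-up illustrative data, not part of the commit.

```python
import pandas as pd

# Nested dictionary taken from the commit's own example.
b = {"name": "a", "marks": {"sem1": {"maths": 22, "science": 23},
                            "sem2": {"maths": 24, "science": 25}}}

# Default: every nesting level is flattened into dotted column names.
full = pd.json_normalize(b)

# max_level=1 stops after one level, so marks.sem1 and marks.sem2
# each keep a dictionary as their cell value.
shallow = pd.json_normalize(b, max_level=1)

# Hypothetical records (assumption, for illustration only):
# each student has a nested list of exam dictionaries.
students = [
    {"name": "a", "city": "x",
     "exams": [{"subject": "maths", "score": 22},
               {"subject": "science", "score": 23}]},
    {"name": "b", "city": "y",
     "exams": [{"subject": "maths", "score": 24}]},
]

# record_path expands each exam into its own row;
# meta copies the listed outer fields onto every such row.
rows = pd.json_normalize(students, record_path="exams", meta=["name", "city"])

print(full.columns.tolist())
print(shallow.columns.tolist())
print(rows)
```

With `record_path`, the number of output rows equals the total number of nested records rather than the number of top-level dictionaries, which is why `meta` is needed to keep track of which student each exam belongs to.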