Update app.py
app.py
CHANGED
@@ -215,7 +215,7 @@ st.markdown(
 "Inspired by Andrea Volpini's [work on content chunking](https://www.linkedin.com/pulse/understanding-chunking-google-ai-mode-practical-content-volpini-zseaf/)")
 st.info(
 """
-**How Layout-Based Chunking is Implemented Here**
+**How Layout-Based Chunking is Implemented Here**
 This app uses a sophisticated, two-step process to create meaningful chunks based on the document's visual and semantic structure:
 1. **Structural Preservation (HTML → Markdown):**
 The code first converts the webpage's HTML into Markdown. This is a critical step because it translates structural tags (`<h1>`, `<p>`, `<ul>`) into their Markdown equivalents (`#`, paragraph breaks, `*`). This preserves the document's original layout and hierarchy.
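For reviewers who want to see what the "Structural Preservation (HTML → Markdown)" step in the `st.info` text amounts to, here is a minimal sketch using only the Python standard library. The converter that `app.py` actually uses is not visible in this hunk, so the names `MarkdownSketch` and `html_to_markdown` are hypothetical, and the sketch handles only headings, paragraphs, and list items.

```python
from html.parser import HTMLParser


class MarkdownSketch(HTMLParser):
    """Hypothetical minimal converter: handles only <h1>-<h6>, <p>, and <li>."""

    HEADINGS = {f"h{n}": "#" * n for n in range(1, 7)}
    BLOCK_TAGS = set(HEADINGS) | {"p", "li"}

    def __init__(self):
        super().__init__()
        self.blocks = []    # finished Markdown blocks, in document order
        self.prefix = None  # Markdown prefix of the open block, None outside a block
        self.text = []      # text fragments collected for the open block

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS:
            self.prefix = self.HEADINGS[tag] + " "
        elif tag == "p":
            self.prefix = ""
        elif tag == "li":
            self.prefix = "* "
        else:
            return  # inline or unknown tags keep the current block open
        self.text = []

    def handle_data(self, data):
        if self.prefix is not None:
            self.text.append(data)

    def handle_endtag(self, tag):
        if self.prefix is not None and tag in self.BLOCK_TAGS:
            # Collapse runs of whitespace so each block is a single line.
            content = " ".join("".join(self.text).split())
            if content:
                self.blocks.append(self.prefix + content)
            self.prefix = None


def html_to_markdown(html):
    """Convert a small HTML fragment to Markdown (sketch, not the app's code)."""
    parser = MarkdownSketch()
    parser.feed(html)
    parser.close()
    lines = []
    for block in parser.blocks:
        # Keep consecutive list items adjacent; separate other blocks with a blank line.
        if lines and not (block.startswith("* ") and lines[-1].startswith("* ")):
            lines.append("")
        lines.append(block)
    return "\n".join(lines)


if __name__ == "__main__":
    print(html_to_markdown("<h1>Title</h1><p>Intro text.</p><ul><li>One</li><li>Two</li></ul>"))
```

The heading level, paragraph boundaries, and list membership all survive the conversion, which is what lets a later chunking step split on structure rather than on raw character counts.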