Create svg html English presentation for this system
# 🎯 DOCUMENT PROCESSING SYSTEM WITH SECTION CONTEXT RECOVERY
## 📌 SYSTEM OVERVIEW
### Purpose
A system that processes and stores markdown documents, with the ability to:
- Split documents into structured **sections**
- Chunk intelligently based on token counts
- **Recover the full context** of the original section at search time
- Run high-performance hybrid search (Vector + BM25)
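The overview does not say how the vector ranking and the BM25 ranking are merged. One common option is reciprocal rank fusion (RRF); the sketch below assumes that approach. The function name, the `k=60` constant, and the chunk ids are illustrative and are not taken from the system.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of ids into one list, best first."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, item_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for every id it contains
            scores[item_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism
    return sorted(scores, key=lambda item_id: (-scores[item_id], item_id))


vector_hits = ["chunk-a", "chunk-b", "chunk-c"]  # hypothetical vector ranking
bm25_hits = ["chunk-b", "chunk-d", "chunk-a"]    # hypothetical BM25 ranking
fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
```

An id that appears near the top of both lists ends up ahead of an id that ranks first in only one of them, which is the behaviour a hybrid search usually wants.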
### Architecture overview
```
┌─────────────────┐
│   FastAPI App   │ ← API endpoints
│    (main.py)    │
└────────┬────────┘
         │
    ┌────▼──────────────────────────────┐
    │  Celery Worker (worker.py)        │
    │  - Asynchronous processing        │
    │  - Job queue management           │
    └────┬──────────────────────────────┘
         │
    ┌────▼──────────────────────────────┐
    │  ProcessPipeline                  │
    │  - Workflow orchestration         │
    └────┬──────────────────────────────┘
         │
    ┌────▼──────────────────────────────┐
    │  MarkdownProcessor                │
    │  - TextCleaner (split sections)   │
    │  - RaptorProcessor (chunking)     │
    └────┬──────────────────────────────┘
         │
    ┌────▼──────────────────────────────┐
    │  Storage Layer                    │
    │  ┌──────────────┐ ┌─────────────┐ │
    │  │ PostgreSQL   │ │ Milvus      │ │
    │  │ (Sections)   │ │ (Vectors)   │ │
    │  └──────────────┘ └─────────────┘ │
    └───────────────────────────────────┘
```
---
## 🔄 DETAILED PROCESSING WORKFLOW
### 1️⃣ Intake stage (API Layer)
**Endpoint**: `POST /index-file/`
```python
# src/api/index_routes.py
async def process_extracted_file(request: MinioFileRequest):
    ...  # handler body not shown in this excerpt
```
**Processing flow**:
1. Receive a request containing a list of markdown files stored in SeaweedFS
2. Create a job ID for each file
3. Push each job onto the Celery queue
4. Return a response immediately (202 Accepted)
**Input**:
```json
{
  "list_path": [
    {
      "id": "uuid-file-1",
      "path": "extracted/document.md"
    }
  ]
}
```
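The accept-and-return pattern described in the processing flow can be sketched without any web framework. This is a minimal illustration, not the actual handler: `InMemoryQueue` stands in for the Celery queue, and `accept_index_request` and the response shape are hypothetical names chosen for the example.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class InMemoryQueue:
    """Stand-in for the Celery queue; it only records what was enqueued."""
    jobs: list = field(default_factory=list)

    def enqueue(self, job_id, file_id, path):
        self.jobs.append({"job_id": job_id, "file_id": file_id, "path": path})


def accept_index_request(payload, queue):
    """Create one job per file, enqueue it, and answer without waiting."""
    accepted = []
    for item in payload["list_path"]:
        job_id = str(uuid.uuid4())
        queue.enqueue(job_id, item["id"], item["path"])
        accepted.append({"file_id": item["id"], "job_id": job_id})
    # 202 signals that the work was accepted but has not finished yet
    return 202, {"status": "accepted", "jobs": accepted}


queue = InMemoryQueue()
status, body = accept_index_request(
    {"list_path": [{"id": "uuid-file-1", "path": "extracted/document.md"}]},
    queue,
)
```

The caller gets the job IDs back right away and can follow progress through the status events published by the worker.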
---
### 2️⃣ Asynchronous processing stage (Worker)
**File**: `worker.py`
**Task**: `process_extracted_file_task`
**Steps**:
1. **Update status**: publish a "processing" status through Redis
2. **Fetch the file**: retrieve the markdown content from SeaweedFS
3. **Process the content**: call ProcessPipeline
4. **Finish**: publish a "completed" status
```python
# worker.py (lines 147-194), abridged
import asyncio

@celery_app.task(bind=True, name="process_extracted_file_task")
def process_extracted_file_task(self, file_id, bucket_name,
                                markdown_object_name, original_file_path):
    # 1. Update status
    REDIS_SERVICE.publish_processing_status(file_id, original_file_path)
    # 2. Retrieve markdown (assumes the object to fetch is `markdown_object_name`)
    markdown_content = SEAWEEDFS_SERVICE.process_file(bucket_name, markdown_object_name)
    # 3. Process with pipeline (process_markdown_text is a coroutine,
    #    so this synchronous task has to run it to completion)
    result = asyncio.run(PROCESS_PIPELINE.process_markdown_text(
        file_id=file_id,
        text=markdown_content,
        document_path=original_file_path
    ))
    # 4. Publish completion
    REDIS_SERVICE.publish_done_status(file_id, original_file_path)
```
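The worker is a synchronous function while the pipeline method it calls is a coroutine, so the task has to bridge the two. The sketch below shows that bridge together with the status events. `RecordingStatus` and `fake_pipeline` are stand-ins for the Redis publisher and the pipeline, and the `"failed"` status is an assumption; the source only names "processing" and "completed".

```python
import asyncio


class RecordingStatus:
    """Stand-in for the Redis publisher; keeps events in a list."""

    def __init__(self):
        self.events = []

    def publish(self, file_id, status):
        self.events.append((file_id, status))


async def fake_pipeline(file_id, text):
    """Stand-in for the asynchronous pipeline call."""
    await asyncio.sleep(0)
    return {"file_id": file_id, "characters": len(text)}


def run_task(file_id, text, status):
    """Synchronous task body: publish, run the coroutine, publish again."""
    status.publish(file_id, "processing")
    try:
        # asyncio.run drives the coroutine to completion from sync code
        result = asyncio.run(fake_pipeline(file_id, text))
    except Exception:
        status.publish(file_id, "failed")  # assumed status name
        raise
    status.publish(file_id, "completed")
    return result


status = RecordingStatus()
result = run_task("uuid-file-1", "# Title\nbody", status)
```

Publishing a terminal status on both the success and the failure path keeps a client that listens for events from waiting forever on a crashed job.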
---
### 3️⃣ Pipeline processing stage
**File**: `src/pipelines/process_pipeline.py`
**Class**: `ProcessPipeline`
**Responsibility**: orchestrate the entire processing run
```python
async def process_markdown_text(self, file_id, text, document_path):
    # Delegate the actual work to MarkdownProcessor
    result = await self.markdown_processor.process_text(
        markdown_text=text,
        file_id=file_id,
        document_path=document_path,
        store_vectors=True,
        use_raptor=True
    )
    return result
```
---
## 🎯 KEY FEATURE: SECTION-BASED PROCESSING
### 4️⃣ Section splitting stage (TextCleaner)
**File**: `src/processors/cleaner.py`
**Class**: `TextCleaner`
#### 🔍 Analysing the document structure
**Main method**: `split_into_sections()`
**How it works**:
1. **Detect the structure**:
   - Detect markdown headers (`#`, `##`, `###`, ...)
   - Track page markers (`[PAGE:1]`, `[PAGE:2]`, ...)
   - Determine the boundaries between sections
2. **Create structured sections**:
```python
# src/processors/cleaner.py (lines 42-103), abridged
def split_into_sections(self, text, file_id, document_path):
    sections = []
    lines = text.splitlines()
    current_page = 1
    i = 0
    # Walk the document line by line
    while i < len(lines):
        line = lines[i]
        i += 1
        # Page marker: remember the current page and move on
        if page_match := re.search(self.page_marker_pattern, line):
            current_page = int(page_match.group(1))
            continue
        # Header: start a new section
        if header_match := re.match(r'^(#{1,6})\s+(.+)$', line):
            header_level = len(header_match.group(1))
            title = header_match.group(2).strip()
            # Collect content up to the next header
            content_lines = []
            while i < len(lines):
                next_line = lines[i]
                # is_next_header is a helper that is not shown in this excerpt
                if is_next_header(next_line, header_level):
                    break
                content_lines.append(next_line)
                i += 1
            # Create the section
            content = "\n".join(content_lines)
            self._create_section(sections, title, content, current_page)
    return sections
```
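For experimenting with the splitting rules outside the class, a self-contained version can be useful. The sketch below makes two simplifying assumptions that are not confirmed by the source: the page marker regex is inferred from the `[PAGE:1]` examples, and every header, whatever its level, starts a new section.

```python
import re

PAGE_MARKER = re.compile(r"\[PAGE:(\d+)\]")  # assumed marker format
HEADER = re.compile(r"^(#{1,6})\s+(.+)$")


def split_sections(text):
    """Return one dict per header, with the page on which the header appears."""
    sections = []
    current = None
    page = 1
    for line in text.splitlines():
        marker = PAGE_MARKER.search(line)
        if marker:
            # Page markers update the page counter and are not kept as content
            page = int(marker.group(1))
            continue
        header = HEADER.match(line)
        if header:
            current = {
                "level": len(header.group(1)),
                "title": header.group(2).strip(),
                "page": page,
                "lines": [],
            }
            sections.append(current)
        elif current is not None:
            # Text before the first header is dropped in this sketch
            current["lines"].append(line)
    return sections


sample = "[PAGE:1]\n# Intro\nHello\n[PAGE:2]\n## Details\nMore text\nEven more"
sections = split_sections(sample)
```

With the sample input this yields two sections, "Intro" on page 1 and "Details" on page 2, each carrying only the lines that follow its own header.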
#### 📦 Structure of the generated section
**Method**: `_create_section()`
```python
# src/processors/cleaner.py (lines 105-131)
def _create_section(self, sections, title, content, page):
    section_id = str(uuid.uuid4())  # ← unique ID for the section
    content = title + "\n" + content
    sections.append({
        "content": content,  # ← full content of the section
        "metadata": {
            "document_id": self.current_document_id,
            "document_path": self.current_document_path,
            "section_id": section_id,  # ← KEY POINT: section ID
            "page_info": {
                "index": page - 1,
                "total": self.current_total_pages
            },
            "section_content": content  # ← keeps the original content
        }
    })
```
**Example of a generated section**:
```json
{
  "content": "## Introduction\nThis is the introduction to the system...",
  "metadata": {
    "document_id": "550e8400-e29b-41d4-a716-446655440000",
    "document_path": "input/document.pdf",
    "section_id": "123e4567-e89b-12d3-a456-426614174000",
    "page_info": {
      "index": 0,
      "total": 10
    },
    "section_content": "## Introduction\nThis is the introduction..."
  }
}
```
---
### 5️⃣ Smart chunking stage (RaptorProcessor)
**File**: `src/services/raptor_processor.py`
**Class**: `RaptorProcessor`
#### 🎯 Goal of chunking
**Problem**: sections can be too long (exceeding the embedding model's token limit)
**Solution**: split sections into smaller chunks, **while keeping each chunk linked to its original section**
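The link from chunk to section is what allows a search hit on a small chunk to be expanded into the full section. A minimal sketch of that lookup follows. It assumes that a section is scored by its best chunk, and it uses a plain dict in place of the PostgreSQL sections table; `recover_sections` and the sample ids are hypothetical.

```python
def recover_sections(chunk_hits, section_store, limit=3):
    """Collapse chunk hits to their parent sections, best score first."""
    best = {}
    for hit in chunk_hits:
        section_id = hit["section_id"]
        # Keep the highest chunk score seen for each section (assumed rule)
        if section_id not in best or hit["score"] > best[section_id]:
            best[section_id] = hit["score"]
    ranked = sorted(best.items(), key=lambda pair: pair[1], reverse=True)
    return [
        {"section_id": sid, "score": score, "content": section_store[sid]}
        for sid, score in ranked[:limit]
    ]


# Stand-in for the sections table: section_id -> full section content
section_store = {
    "sec-1": "## Intro\nfull intro text",
    "sec-2": "## Usage\nfull usage text",
}
chunk_hits = [
    {"chunk_id": "c1", "section_id": "sec-1", "score": 0.71},
    {"chunk_id": "c2", "section_id": "sec-2", "score": 0.88},
    {"chunk_id": "c3", "section_id": "sec-1", "score": 0.93},
]
results = recover_sections(chunk_hits, section_store)
```

Two hits from the same section collapse into a single result, so the caller receives each full section once rather than several overlapping fragments.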
#### 📊 Chunking procedure
**Method**: `create_knowledge_base()`
```python
# src/services/raptor_processor.py (lines 82-157)
async def create_knowledge_base(self, sections):
    # STEP 1: Store every section in PostgreSQL
    for section in sections:
        content = section["content"]
        metadata = section["metadata"]
        self.postgres_service.store_section(
            section_id=metadata["section_id"],  # ← stored under its section_id
            document_id=metadata["document_id"],
            content=content  # ← full content of the section
        )
    # STEP 2: Convert sections into LangChain Documents
    documents = self._convert_sections_to_documents(sections)
    # STEP 3: Smart chunking
    all_chunks = []
    for doc in documents:
        token_count = self.count_tokens(doc.page_content)
        if token_count <= self.chunk_size:
            # Section is small enough → keep it whole
            all_chunks.append(doc)
        else:
            # Section is too large → split it
            split_docs = self.text_splitter.split_documents([doc])
            all_chunks.extend(split_docs)
    # STEP 4: Add a chunk_id to every chunk
    for chunk in all_chunks:
        chunk.metadata.update({
            "chunk_id": str(uuid.uuid4())  # ← the chunk's own ID
        })
        # Note: section_id is carried over unchanged from the original metadata
    # STEP 5: Validate and store in Milvus
    validated_chunks = self._normalize_and_validate_chunks(all_chunks)
    self.vector_storage_service.store_documents(validated_chunks)
```
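The keep-or-split decision and the metadata propagation can be tried out in isolation. The sketch below is only an approximation of the method above: whitespace-separated words stand in for model tokens, the overlapping window is an assumption about how the splitter behaves, and `chunk_section` with its parameters is a hypothetical helper.

```python
import uuid


def chunk_section(section, chunk_size=8, overlap=2):
    """Split one section into chunks that all carry the parent section_id."""
    tokens = section["content"].split()  # rough proxy for a real tokenizer
    if len(tokens) <= chunk_size:
        windows = [tokens]  # small section: keep it whole
    else:
        step = chunk_size - overlap
        windows = [
            tokens[start:start + chunk_size]
            for start in range(0, len(tokens) - overlap, step)
        ]
    return [
        {
            "text": " ".join(window),
            "metadata": {
                "section_id": section["metadata"]["section_id"],  # inherited
                "chunk_id": str(uuid.uuid4()),                    # per chunk
            },
        }
        for window in windows
    ]


section = {
    "content": " ".join(f"w{i}" for i in range(20)),
    "metadata": {"section_id": "sec-1"},
}
chunks = chunk_section(section)
```

Every chunk gets a fresh `chunk_id` but shares the section's `section_id`, which is the only piece of information needed later to fetch the full section.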
#### 🔑
- README.md +8 -5
- components/footer.js +100 -0
- components/navbar.js +168 -0
- index.html +266 -19
- script.js +38 -0
- style.css +33 -19
@@ -1,10 +1,13 @@
 ---
-title:
-
-
-
+title: DocuSense Explorer 🚀
+colorFrom: blue
+colorTo: pink
+emoji: 🐳
 sdk: static
 pinned: false
+tags:
+  - deepsite-v3
 ---
 
-
+# Welcome to your new DeepSite project!
+This project was created with [DeepSite](https://huggingface.co/deepsite).
@@ -0,0 +1,100 @@
+class CustomFooter extends HTMLElement {
+  connectedCallback() {
+    this.attachShadow({ mode: 'open' });
+    this.shadowRoot.innerHTML = `
+      <style>
+        footer {
+          @apply bg-gray-50 dark:bg-gray-800 border-t border-gray-200 dark:border-gray-700;
+        }
+
+        .container {
+          @apply max-w-7xl mx-auto px-4 sm:px-6 lg:px-8 py-8;
+        }
+
+        .footer-content {
+          @apply grid grid-cols-1 md:grid-cols-3 gap-8;
+        }
+
+        .footer-section h3 {
+          @apply text-lg font-semibold text-gray-900 dark:text-white mb-4;
+        }
+
+        .footer-links {
+          @apply space-y-3;
+        }
+
+        .footer-link {
+          @apply text-gray-600 dark:text-gray-300 hover:text-gray-900 dark:hover:text-white;
+        }
+
+        .footer-bottom {
+          @apply mt-8 pt-8 border-t border-gray-200 dark:border-gray-700
+          flex flex-col md:flex-row justify-between items-center;
+        }
+
+        .copyright {
+          @apply text-gray-500 dark:text-gray-400 text-sm;
+        }
+
+        .social-links {
+          @apply flex space-x-4 mt-4 md:mt-0;
+        }
+
+        .social-link {
+          @apply text-gray-400 dark:text-gray-300 hover:text-gray-500 dark:hover:text-gray-200;
+        }
+      </style>
+
+      <footer>
+        <div class="container">
+          <div class="footer-content">
+            <div class="footer-section">
+              <h3>DocuSense Explorer</h3>
+              <p class="text-gray-600 dark:text-gray-300">
+                Advanced document processing with intelligent context recovery for better search and retrieval.
+              </p>
+            </div>
+
+            <div class="footer-section">
+              <h3>Resources</h3>
+              <div class="footer-links">
+                <a href="#" class="footer-link">Documentation</a>
+                <a href="#" class="footer-link">API Reference</a>
+                <a href="#" class="footer-link">GitHub Repository</a>
+              </div>
+            </div>
+
+            <div class="footer-section">
+              <h3>Contact</h3>
+              <div class="footer-links">
+                <a href="#" class="footer-link">Email Us</a>
+                <a href="#" class="footer-link">Support Portal</a>
+                <a href="#" class="footer-link">Feedback</a>
+              </div>
+            </div>
+          </div>
+
+          <div class="footer-bottom">
+            <p class="copyright">
+              © ${new Date().getFullYear()} DocuSense Explorer. All rights reserved.
+            </p>
+
+            <div class="social-links">
+              <a href="#" class="social-link">
+                <i data-feather="github"></i>
+              </a>
+              <a href="#" class="social-link">
+                <i data-feather="twitter"></i>
+              </a>
+              <a href="#" class="social-link">
+                <i data-feather="linkedin"></i>
+              </a>
+            </div>
+          </div>
+        </div>
+      </footer>
+    `;
+  }
+}
+
+customElements.define('custom-footer', CustomFooter);
@@ -0,0 +1,168 @@
+class CustomNavbar extends HTMLElement {
+  connectedCallback() {
+    this.attachShadow({ mode: 'open' });
+    this.shadowRoot.innerHTML = `
+      <style>
+        nav {
+          @apply bg-white dark:bg-gray-800 shadow-sm;
+        }
+
+        .container {
+          @apply max-w-7xl mx-auto px-4 sm:px-6 lg:px-8;
+        }
+
+        .nav-content {
+          @apply flex justify-between items-center h-16;
+        }
+
+        .logo {
+          @apply flex-shrink-0 flex items-center text-blue-600 dark:text-blue-400 font-bold text-xl;
+        }
+
+        .nav-links {
+          @apply hidden md:ml-6 md:flex md:space-x-8;
+        }
+
+        .nav-link {
+          @apply inline-flex items-center px-1 pt-1 border-b-2 border-transparent
+          text-gray-500 dark:text-gray-300 hover:text-gray-700 dark:hover:text-gray-100
+          hover:border-gray-300 dark:hover:border-gray-500 text-sm font-medium;
+        }
+
+        .nav-link.active {
+          @apply border-blue-500 text-gray-900 dark:text-white;
+        }
+
+        .mobile-menu-button {
+          @apply inline-flex items-center justify-center p-2 rounded-md
+          text-gray-400 hover:text-gray-500 dark:hover:text-gray-300
+          hover:bg-gray-100 dark:hover:bg-gray-700 focus:outline-none;
+        }
+
+        .mobile-menu {
+          @apply md:hidden;
+        }
+
+        .mobile-menu-items {
+          @apply pt-2 pb-3 space-y-1;
+        }
+
+        .mobile-menu-link {
+          @apply block pl-3 pr-4 py-2 border-l-4 border-transparent
+          text-gray-500 dark:text-gray-300 hover:text-gray-700 dark:hover:text-gray-100
+          hover:bg-gray-50 dark:hover:bg-gray-700 hover:border-gray-300
+          dark:hover:border-gray-500 text-base font-medium;
+        }
+
+        .mobile-menu-link.active {
+          @apply border-blue-500 bg-blue-50 dark:bg-blue-900/20
+          text-blue-700 dark:text-blue-300;
+        }
+
+        .theme-toggle {
+          @apply p-2 rounded-full text-gray-400 hover:text-gray-500
+          dark:hover:text-gray-300 hover:bg-gray-100 dark:hover:bg-gray-700;
+        }
+
+        .hidden {
+          display: none;
+        }
+      </style>
+
+      <nav>
+        <div class="container">
+          <div class="nav-content">
+            <div class="flex items-center">
+              <a href="/" class="logo">
+                <i data-feather="file-text" class="mr-2"></i>
+                DocuSense
+              </a>
+            </div>
+
+            <div class="nav-links">
+              <a href="#overview" class="nav-link">Overview</a>
+              <a href="#demo" class="nav-link">Workflow</a>
+              <a href="#benefits" class="nav-link">Benefits</a>
+            </div>
+
+            <div class="flex items-center space-x-2">
+              <button class="theme-toggle" onclick="toggleDarkMode()">
+                <i data-feather="moon" class="dark:hidden"></i>
+                <i data-feather="sun" class="hidden dark:block"></i>
+              </button>
+
+              <button class="mobile-menu-button md:hidden" aria-expanded="false">
+                <i data-feather="menu"></i>
+              </button>
+            </div>
+          </div>
+        </div>
+
+        <!-- Mobile menu -->
+        <div class="mobile-menu hidden">
+          <div class="container">
+            <div class="mobile-menu-items">
+              <a href="#overview" class="mobile-menu-link">Overview</a>
+              <a href="#demo" class="mobile-menu-link">Workflow</a>
+              <a href="#benefits" class="mobile-menu-link">Benefits</a>
+            </div>
+          </div>
+        </div>
+      </nav>
+
+      <script>
+        // Mobile menu toggle
+        document.querySelector('.mobile-menu-button').addEventListener('click', function() {
+          const menu = document.querySelector('.mobile-menu');
+          menu.classList.toggle('hidden');
+
+          // Toggle icon between menu and x
+          const icon = this.querySelector('i');
+          if (menu.classList.contains('hidden')) {
+            icon.setAttribute('data-feather', 'menu');
+          } else {
+            icon.setAttribute('data-feather', 'x');
+          }
+          feather.replace();
+        });
+
+        // Close mobile menu when clicking a link
+        document.querySelectorAll('.mobile-menu-link').forEach(link => {
+          link.addEventListener('click', function() {
+            document.querySelector('.mobile-menu').classList.add('hidden');
+            document.querySelector('.mobile-menu-button i').setAttribute('data-feather', 'menu');
+            feather.replace();
+          });
+        });
+
+        // Update active link based on scroll position
+        window.addEventListener('scroll', function() {
+          const sections = ['overview', 'demo', 'benefits'];
+          let currentSection = '';
+
+          sections.forEach(section => {
+            const element = document.getElementById(section);
+            if (element) {
+              const rect = element.getBoundingClientRect();
+              if (rect.top <= 100 && rect.bottom >= 100) {
+                currentSection = section;
+              }
+            }
+          });
+
+          // Update active state
+          document.querySelectorAll('.nav-link, .mobile-menu-link').forEach(link => {
+            const href = link.getAttribute('href').substring(1);
+            if (href === currentSection) {
+              link.classList.add('active');
+            } else {
+              link.classList.remove('active');
+            }
+          });
+        });
+      </script>
+    `;
+  }
+}
+
+customElements.define('custom-navbar', CustomNavbar);
@@ -1,19 +1,266 @@
-<!
-<html>
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-<
| 1 |
+
<!DOCTYPE html>
|
| 2 |
+
<html lang="en">
|
| 3 |
+
<head>
|
| 4 |
+
<meta charset="UTF-8">
|
| 5 |
+
<meta name="viewport" content="width=device-width, initial-scale=1.0">
|
| 6 |
+
<title>DocuSense Explorer - Document Processing System</title>
|
| 7 |
+
<link rel="stylesheet" href="style.css">
|
| 8 |
+
<script src="https://cdn.tailwindcss.com"></script>
|
| 9 |
+
<script src="https://cdn.jsdelivr.net/npm/feather-icons/dist/feather.min.js"></script>
|
| 10 |
+
<script src="https://unpkg.com/feather-icons"></script>
|
| 11 |
+
<script src="components/navbar.js"></script>
|
| 12 |
+
<script src="components/footer.js"></script>
|
| 13 |
+
</head>
|
| 14 |
+
<body class="bg-gray-50 dark:bg-gray-900 min-h-screen flex flex-col">
|
| 15 |
+
<custom-navbar></custom-navbar>
|
| 16 |
+
|
| 17 |
+
<main class="flex-grow container mx-auto px-4 py-8">
|
| 18 |
+
<div class="max-w-5xl mx-auto">
|
| 19 |
+
<!-- Hero Section -->
|
| 20 |
+
<section class="mb-16 text-center">
|
| 21 |
+
<div class="bg-gradient-to-r from-blue-600 to-indigo-700 text-white p-8 rounded-xl shadow-2xl">
|
| 22 |
+
<h1 class="text-4xl md:text-5xl font-bold mb-4">Document Processing <br>with Context Recovery</h1>
|
| 23 |
+
<p class="text-xl mb-8 max-w-3xl mx-auto">Smart section-based processing with full context retrieval</p>
|
| 24 |
+
<div class="flex justify-center gap-4">
|
| 25 |
+
<a href="#overview" class="bg-white text-blue-700 px-6 py-3 rounded-lg font-medium hover:bg-blue-50 transition">Learn More</a>
|
| 26 |
+
<a href="#demo" class="border-2 border-white text-white px-6 py-3 rounded-lg font-medium hover:bg-white hover:text-blue-700 transition">See Demo</a>
|
| 27 |
+
</div>
|
| 28 |
+
</div>
|
| 29 |
+
</section>
|
| 30 |
+
|
| 31 |
+
<!-- System Overview -->
|
| 32 |
+
<section id="overview" class="mb-16">
|
| 33 |
+
<h2 class="text-3xl font-bold mb-8 text-center dark:text-white">System Architecture</h2>
|
| 34 |
+
|
| 35 |
+
<div class="relative">
|
| 36 |
+
<!-- Architecture SVG -->
|
| 37 |
+
<div class="bg-white dark:bg-gray-800 p-6 rounded-xl shadow-lg">
|
| 38 |
+
<img src="architecture.svg" alt="System Architecture Diagram" class="w-full h-auto rounded-lg">
|
| 39 |
+
</div>
|
| 40 |
+
|
| 41 |
+
<!-- Key Features -->
|
| 42 |
+
<div class="grid md:grid-cols-2 lg:grid-cols-3 gap-6 mt-8">
|
| 43 |
+
<div class="bg-white dark:bg-gray-800 p-6 rounded-xl shadow hover:shadow-lg transition">
|
| 44 |
+
<div class="bg-blue-100 dark:bg-blue-900 w-12 h-12 rounded-full flex items-center justify-center mb-4">
|
| 45 |
+
<i data-feather="layers" class="text-blue-600 dark:text-blue-300"></i>
|
| 46 |
+
</div>
|
| 47 |
+
<h3 class="text-xl font-semibold mb-2 dark:text-white">Section-Based Processing</h3>
|
| 48 |
+
<p class="text-gray-600 dark:text-gray-300">Documents are intelligently split into meaningful sections while preserving hierarchy.</p>
|
| 49 |
+
</div>
|
| 50 |
+
|
| 51 |
+
<div class="bg-white dark:bg-gray-800 p-6 rounded-xl shadow hover:shadow-lg transition">
|
| 52 |
+
<div class="bg-purple-100 dark:bg-purple-900 w-12 h-12 rounded-full flex items-center justify-center mb-4">
|
| 53 |
+
<i data-feather="cpu" class="text-purple-600 dark:text-purple-300"></i>
|
| 54 |
+
</div>
|
| 55 |
+
<h3 class="text-xl font-semibold mb-2 dark:text-white">Context Recovery</h3>
|
| 56 |
+
<p class="text-gray-600 dark:text-gray-300">Retrieve the full original section content even when searching small chunks.</p>
|
| 57 |
+
</div>
|
| 58 |
+
|
| 59 |
+
<div class="bg-white dark:bg-gray-800 p-6 rounded-xl shadow hover:shadow-lg transition">
|
| 60 |
+
<div class="bg-green-100 dark:bg-green-900 w-12 h-12 rounded-full flex items-center justify-center mb-4">
|
| 61 |
+
<i data-feather="zap" class="text-green-600 dark:text-green-300"></i>
|
| 62 |
+
</div>
|
| 63 |
+
<h3 class="text-xl font-semibold mb-2 dark:text-white">Hybrid Search</h3>
|
| 64 |
+
<p class="text-gray-600 dark:text-gray-300">Combines vector similarity and BM25 for precise and relevant results.</p>
|
| 65 |
+
</div>
|
| 66 |
+
</div>
|
| 67 |
+
</div>
|
| 68 |
+
</section>
|
| 69 |
+
|
| 70 |
+
<!-- Workflow Demo -->
|
| 71 |
+
<section id="demo" class="mb-16">
|
| 72 |
+
<h2 class="text-3xl font-bold mb-8 text-center dark:text-white">Workflow Demonstration</h2>
|
| 73 |
+
|
| 74 |
+
<div class="bg-white dark:bg-gray-800 rounded-xl shadow-lg overflow-hidden">
|
| 75 |
+
<div class="grid md:grid-cols-2 gap-0">
|
| 76 |
+
<!-- Step Navigation -->
|
| 77 |
+
<div class="bg-gray-50 dark:bg-gray-700 p-6">
|
| 78 |
+
<div class="sticky top-6">
|
| 79 |
+
<h3 class="text-xl font-semibold mb-4 dark:text-white">Process Steps</h3>
|
| 80 |
+
<div class="space-y-2">
|
| 81 |
+
<button class="workflow-step active" data-step="1">1. Document Upload</button>
|
| 82 |
+
<button class="workflow-step" data-step="2">2. Section Splitting</button>
|
| 83 |
+
<button class="workflow-step" data-step="3">3. Chunk Generation</button>
|
| 84 |
+
<button class="workflow-step" data-step="4">4. Storage</button>
|
| 85 |
+
<button class="workflow-step" data-step="5">5. Search Process</button>
|
| 86 |
+
</div>
|
| 87 |
+
</div>
|
| 88 |
+
</div>
|
| 89 |
+
|
| 90 |
+
<!-- Step Content -->
|
| 91 |
+
<div class="p-6">
|
| 92 |
+
<div class="workflow-content" data-step-content="1">
|
| 93 |
+
<h3 class="text-xl font-semibold mb-4 dark:text-white">Document Upload</h3>
|
| 94 |
+
<p class="mb-4 text-gray-600 dark:text-gray-300">Users upload documents through the API endpoint:</p>
|
| 95 |
+
<pre class="bg-gray-100 dark:bg-gray-900 rounded-lg p-4 mb-4 overflow-x-auto">
|
| 96 |
+
<code class="text-sm">
|
| 97 |
+
POST /index-file/
|
| 98 |
+
{
|
| 99 |
+
"list_path": [
|
| 100 |
+
{
|
| 101 |
+
"id": "uuid-file-1",
|
| 102 |
+
"path": "extracted/document.md"
|
| 103 |
+
}
|
| 104 |
+
]
|
| 105 |
+
}
|
| 106 |
+
</code>
|
| 107 |
+
</pre>
|
| 108 |
+
<p class="text-gray-600 dark:text-gray-300">The system creates a unique job ID for processing and returns immediately.</p>
|
| 109 |
+
</div>
|
| 110 |
+
|
| 111 |
+
<div class="workflow-content hidden" data-step-content="2">
|
| 112 |
+
<h3 class="text-xl font-semibold mb-4 dark:text-white">Section Splitting</h3>
|
| 113 |
+
<p class="mb-4 text-gray-600 dark:text-gray-300">Documents are split into hierarchical sections:</p>
|
| 114 |
+
<div class="bg-gray-100 dark:bg-gray-900 rounded-lg p-4 mb-4">
|
| 115 |
+
<div class="flex items-center mb-2">
|
| 116 |
+
<div class="w-2 h-2 bg-blue-500 rounded-full mr-2"></div>
|
| 117 |
+
<span class="font-mono text-sm"># Main Title</span>
|
| 118 |
+
</div>
|
| 119 |
+
<div class="flex items-center mb-2 ml-4">
|
| 120 |
+
<div class="w-2 h-2 bg-purple-500 rounded-full mr-2"></div>
|
| 121 |
+
<span class="font-mono text-sm">## Section 1</span>
|
| 122 |
+
</div>
|
| 123 |
+
<div class="flex items-center mb-2 ml-8">
|
| 124 |
+
<div class="w-2 h-2 bg-green-500 rounded-full mr-2"></div>
|
| 125 |
+
<span class="font-mono text-sm">### Subsection 1.1</span>
|
| 126 |
+
</div>
|
| 127 |
+
<div class="flex items-center ml-4">
|
| 128 |
+
<div class="w-2 h-2 bg-purple-500 rounded-full mr-2"></div>
|
| 129 |
+
<span class="font-mono text-sm">## Section 2</span>
|
| 130 |
+
</div>
|
| 131 |
+
</div>
|
| 132 |
+
<p class="text-gray-600 dark:text-gray-300">Each section maintains its page information and document context.</p>
|
| 133 |
+
</div>
|
| 134 |
+
|
| 135 |
+
<div class="workflow-content hidden" data-step-content="3">
|
| 136 |
+
<h3 class="text-xl font-semibold mb-4 dark:text-white">Chunk Generation</h3>
|
| 137 |
+
<p class="mb-4 text-gray-600 dark:text-gray-300">Large sections are split into optimally-sized chunks:</p>
|
| 138 |
+
<div class="flex items-start mb-4">
|
| 139 |
+
<div class="border-r-2 border-gray-300 dark:border-gray-600 pr-4 mr-4">
|
| 140 |
+
<div class="bg-blue-100 dark:bg-blue-900 px-3 py-2 rounded-lg mb-2">
|
| 141 |
+
<p class="text-sm">Section ID: 123e4567</p>
|
| 142 |
+
</div>
|
| 143 |
+
<div class="space-y-2">
|
| 144 |
+
<div class="bg-gray-200 dark:bg-gray-700 px-3 py-2 rounded-lg">
|
| 145 |
+
<p class="text-sm">Chunk 1 (Tokens: 512)</p>
|
| 146 |
+
</div>
|
| 147 |
+
<div class="bg-gray-200 dark:bg-gray-700 px-3 py-2 rounded-lg">
|
| 148 |
+
<p class="text-sm">Chunk 2 (Tokens: 498)</p>
|
| 149 |
+
</div>
|
| 150 |
+
</div>
|
| 151 |
+
</div>
|
| 152 |
+
<div>
|
| 153 |
+
<p class="text-gray-600 dark:text-gray-300 text-sm">Each chunk references its parent section while being small enough for efficient vector search.</p>
|
| 154 |
+
</div>
|
| 155 |
+
</div>
|
| 156 |
+
</div>
|
| 157 |
+
|
| 158 |
+
<div class="workflow-content hidden" data-step-content="4">
|
| 159 |
+
<h3 class="text-xl font-semibold mb-4 dark:text-white">Storage</h3>
|
| 160 |
+
<div class="grid grid-cols-1 md:grid-cols-2 gap-4 mb-4">
|
| 161 |
+
<div class="bg-gray-100 dark:bg-gray-900 p-4 rounded-lg">
|
| 162 |
+
<div class="flex items-center mb-2">
|
| 163 |
+
<i data-feather="database" class="mr-2 text-blue-500"></i>
|
| 164 |
+
<h4 class="font-medium">PostgreSQL</h4>
|
| 165 |
+
</div>
|
| 166 |
+
<p class="text-sm text-gray-600 dark:text-gray-300">Stores complete section content with metadata</p>
|
| 167 |
+
</div>
|
| 168 |
+
<div class="bg-gray-100 dark:bg-gray-900 p-4 rounded-lg">
|
| 169 |
+
<div class="flex items-center mb-2">
|
| 170 |
+
<i data-feather="box" class="mr-2 text-purple-500"></i>
|
| 171 |
+
<h4 class="font-medium">Milvus</h4>
|
| 172 |
+
</div>
|
| 173 |
+
<p class="text-sm text-gray-600 dark:text-gray-300">Stores vector embeddings of chunks with section references</p>
|
| 174 |
+
</div>
|
| 175 |
+
</div>
|
| 176 |
+
</div>
|
| 177 |
+
|
| 178 |
+
<div class="workflow-content hidden" data-step-content="5">
|
| 179 |
+
<h3 class="text-xl font-semibold mb-4 dark:text-white">Search Process</h3>
|
| 180 |
+
<ol class="space-y-4">
|
| 181 |
+
<li class="flex items-start">
|
| 182 |
+
<div class="bg-blue-500 text-white rounded-full w-6 h-6 flex items-center justify-center mr-3 mt-0.5 flex-shrink-0">1</div>
|
| 183 |
+
<div>
|
| 184 |
+
<p class="font-medium dark:text-white">Query Processing</p>
|
| 185 |
+
<p class="text-sm text-gray-600 dark:text-gray-300">Convert search query to embeddings and search Milvus</p>
|
| 186 |
+
</div>
|
| 187 |
+
</li>
|
| 188 |
+
<li class="flex items-start">
|
| 189 |
+
<div class="bg-blue-500 text-white rounded-full w-6 h-6 flex items-center justify-center mr-3 mt-0.5 flex-shrink-0">2</div>
|
| 190 |
+
<div>
|
| 191 |
+
<p class="font-medium dark:text-white">Result Aggregation</p>
|
| 192 |
+
<p class="text-sm text-gray-600 dark:text-gray-300">Group chunks by their section_id and score</p>
|
| 193 |
+
</div>
|
| 194 |
+
</li>
|
| 195 |
+
<li class="flex items-start">
|
| 196 |
+
<div class="bg-blue-500 text-white rounded-full w-6 h-6 flex items-center justify-center mr-3 mt-0.5 flex-shrink-0">3</div>
|
| 197 |
+
<div>
|
| 198 |
+
<p class="font-medium dark:text-white">Context Retrieval</p>
|
| 199 |
+
<p class="text-sm text-gray-600 dark:text-gray-300">Fetch full section content from PostgreSQL</p>
|
| 200 |
+
</div>
|
| 201 |
+
</li>
|
| 202 |
+
</ol>
|
| 203 |
+
</div>
|
| 204 |
+
</div>
|
| 205 |
+
</div>
|
| 206 |
+
</div>
|
| 207 |
+
</section>
+
+      <!-- Benefits Section -->
+      <section class="mb-16">
+        <h2 class="text-3xl font-bold mb-8 text-center dark:text-white">System Benefits</h2>
+
+        <div class="grid md:grid-cols-3 gap-6">
+          <div class="bg-white dark:bg-gray-800 p-6 rounded-xl shadow-lg border-t-4 border-blue-500">
+            <h3 class="text-xl font-semibold mb-3 dark:text-white">Full Context Recovery</h3>
+            <p class="text-gray-600 dark:text-gray-300">Retrieve the complete section content even when a search matches only a small chunk, so the surrounding context is preserved.</p>
+          </div>
+
+          <div class="bg-white dark:bg-gray-800 p-6 rounded-xl shadow-lg border-t-4 border-purple-500">
+            <h3 class="text-xl font-semibold mb-3 dark:text-white">Hybrid Search</h3>
+            <p class="text-gray-600 dark:text-gray-300">Combine semantic vector search with keyword-based BM25 for more accurate results.</p>
+          </div>
+
+          <div class="bg-white dark:bg-gray-800 p-6 rounded-xl shadow-lg border-t-4 border-green-500">
+            <h3 class="text-xl font-semibold mb-3 dark:text-white">Optimal Performance</h3>
+            <p class="text-gray-600 dark:text-gray-300">Fast vector search over small chunks, followed by direct retrieval of the full sections from PostgreSQL.</p>
+          </div>
+        </div>
+      </section>
+    </div>
+  </main>
+
+  <custom-footer></custom-footer>
+
+  <script>
+    feather.replace();
+
+    // Workflow step navigation
+    document.addEventListener('DOMContentLoaded', function() {
+      const steps = document.querySelectorAll('.workflow-step');
+      const contents = document.querySelectorAll('.workflow-content');
+
+      steps.forEach(step => {
+        step.addEventListener('click', function() {
+          // Remove active class from all steps
+          steps.forEach(s => s.classList.remove('active', 'bg-blue-500', 'text-white'));
+
+          // Add active class to clicked step
+          this.classList.add('active', 'bg-blue-500', 'text-white');
+
+          // Hide all content
+          contents.forEach(c => c.classList.add('hidden'));
+
+          // Show corresponding content
+          const stepNum = this.getAttribute('data-step');
+          document.querySelector(`[data-step-content="${stepNum}"]`).classList.remove('hidden');
+        });
+      });
+
+      // Activate first step by default
+      document.querySelector('.workflow-step').click();
+    });
+  </script>
+  <script src="https://huggingface.co/deepsite/deepsite-badge.js"></script>
+</body>
+</html>
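Steps 2 and 3 of the Search Process list in the page above (group chunks by `section_id`, then fetch the full section from PostgreSQL) can be sketched as follows. This is a minimal sketch under stated assumptions, not the project's code: the hit shape `{ chunkId, sectionId, score }`, the function names, the injected `fetchSection` callback, and the choice to score a section by its best-matching chunk are all hypothetical.

```javascript
// Assumed hit shape (hypothetical): { chunkId, sectionId, score }, higher score = better match.

// Group vector-search hits by section and rank sections by their best chunk score.
// Scoring a section by its maximum chunk score is an assumption; the source only
// says chunks are grouped "by their section_id and score".
function groupHitsBySection(hits) {
  const bySection = new Map();
  for (const hit of hits) {
    const entry = bySection.get(hit.sectionId) ??
      { sectionId: hit.sectionId, score: -Infinity, chunkIds: [] };
    entry.score = Math.max(entry.score, hit.score);
    entry.chunkIds.push(hit.chunkId);
    bySection.set(hit.sectionId, entry);
  }
  // Best-scoring section first.
  return [...bySection.values()].sort((a, b) => b.score - a.score);
}

// Attach the full section text to each ranked section.
// fetchSection is a caller-supplied async function standing in for the
// PostgreSQL lookup; its name and signature are hypothetical.
async function attachSectionContent(rankedSections, fetchSection) {
  return Promise.all(
    rankedSections.map(async (section) => ({
      ...section,
      content: await fetchSection(section.sectionId),
    }))
  );
}
```

Deduplicating by section before the database lookup means each section is fetched once, even when several of its chunks match the query.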
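The "Hybrid Search" card says vector and BM25 results are combined but does not say how. One common way to merge two ranked lists is reciprocal rank fusion (RRF), sketched below. RRF is a swapped-in illustration, not necessarily what this system uses; the function name, the input shape (arrays of ids ordered best first), and the constant `k = 60` are assumptions.

```javascript
// Reciprocal rank fusion: each list contributes 1 / (k + rank) to an id's score,
// where rank is the 1-based position of the id in that list.
// rankedLists: array of arrays of ids, each ordered best first (assumed shape).
function reciprocalRankFusion(rankedLists, k = 60) {
  const scores = new Map();
  for (const list of rankedLists) {
    list.forEach((id, index) => {
      const contribution = 1 / (k + index + 1);
      scores.set(id, (scores.get(id) ?? 0) + contribution);
    });
  }
  // Highest fused score first.
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id, score]) => ({ id, score }));
}

// Example: ids returned by a vector search and by a BM25 search.
const vectorRanking = ['a', 'b', 'c'];
const bm25Ranking = ['b', 'd', 'a'];
const fused = reciprocalRankFusion([vectorRanking, bm25Ranking]);
console.log(fused.map((entry) => entry.id));
```

RRF uses only ranks, so it avoids having to normalize vector distances and BM25 scores onto a common scale.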
@@ -0,0 +1,38 @@
+// Theme switcher functionality
+document.addEventListener('DOMContentLoaded', function() {
+  // Check for saved theme preference or use system preference
+  const prefersDark = window.matchMedia('(prefers-color-scheme: dark)').matches;
+  const savedTheme = localStorage.getItem('theme');
+
+  if (savedTheme === 'dark' || (!savedTheme && prefersDark)) {
+    document.documentElement.classList.add('dark');
+  }
+
+  // Smooth scrolling for anchor links
+  document.querySelectorAll('a[href^="#"]').forEach(anchor => {
+    anchor.addEventListener('click', function(e) {
+      e.preventDefault();
+
+      const targetId = this.getAttribute('href');
+      if (targetId === '#') return;
+
+      const targetElement = document.querySelector(targetId);
+      if (targetElement) {
+        targetElement.scrollIntoView({
+          behavior: 'smooth',
+          block: 'start'
+        });
+      }
+    });
+  });
+});
+
+// Function to toggle dark mode
+function toggleDarkMode() {
+  const html = document.documentElement;
+  html.classList.toggle('dark');
+
+  // Save preference to localStorage
+  const isDark = html.classList.contains('dark');
+  localStorage.setItem('theme', isDark ? 'dark' : 'light');
+}
@@ -1,28 +1,42 @@
-
-
-
+@tailwind base;
+@tailwind components;
+@tailwind utilities;
+
+:root {
+  --primary: #3b82f6; /* blue-500 */
+  --secondary: #8b5cf6; /* purple-500 */
+}
+
+.dark {
+  --primary: #60a5fa; /* blue-400 */
+  --secondary: #a78bfa; /* purple-400 */
 }

-
-
-
+/* Workflow Step Styling */
+.workflow-step {
+  @apply w-full text-left px-4 py-2 rounded-lg transition;
+  @apply bg-gray-100 dark:bg-gray-600 hover:bg-gray-200 dark:hover:bg-gray-500;
+  @apply text-gray-700 dark:text-gray-200 font-medium;
 }

-
-
-  font-size: 15px;
-  margin-bottom: 10px;
-  margin-top: 5px;
+.workflow-step.active {
+  @apply bg-blue-500 text-white;
 }

-
-
-
-
-  border: 1px solid lightgray;
-  border-radius: 16px;
+/* Custom scrollbar */
+::-webkit-scrollbar {
+  width: 8px;
+  height: 8px;
 }

-
-
+::-webkit-scrollbar-track {
+  @apply bg-gray-100 dark:bg-gray-800;
 }
+
+::-webkit-scrollbar-thumb {
+  @apply bg-gray-400 dark:bg-gray-600 rounded-full;
+}
+
+::-webkit-scrollbar-thumb:hover {
+  @apply bg-gray-500 dark:bg-gray-500;
+}