SPARKNET Demo Application
An interactive Streamlit demo showcasing SPARKNET's document intelligence capabilities.
Features
- Document Processing: Upload and process documents with OCR
- Field Extraction: Extract structured data with evidence grounding
- RAG Q&A: Interactive question answering with citations
- Classification: Automatic document type detection
- Analytics: Processing statistics and insights
- Live Processing: Real-time pipeline visualization
- Document Comparison: Compare multiple documents
Quick Start
1. Install Dependencies
# From project root
pip install -r demo/requirements.txt
# Or install all SPARKNET dependencies
pip install -r requirements.txt
2. Start Ollama (Optional, for live processing)
ollama serve
# Pull required models (in a second terminal; ollama serve keeps running in the foreground)
ollama pull llama3.2:3b
ollama pull nomic-embed-text
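To confirm that both models were pulled, the server's /api/tags endpoint can be queried and compared against the two names above. The following is a minimal sketch, not part of the demo: the helper names (`installed_models`, `missing_models`, `fetch_tags`) are hypothetical, and it assumes that a model pulled without an explicit tag is listed as `name:latest`.

```python
import json
import urllib.request

# Model names taken from the `ollama pull` commands above.
REQUIRED_MODELS = ("llama3.2:3b", "nomic-embed-text")


def installed_models(tags_payload: dict) -> set[str]:
    """Collect model names from an /api/tags response body."""
    return {m["name"] for m in tags_payload.get("models", [])}


def missing_models(installed: set[str], required=REQUIRED_MODELS) -> list[str]:
    """Return the required models that are not installed.

    Assumption: a name without an explicit tag matches its ':latest' entry.
    """
    missing = []
    for name in required:
        candidates = {name} if ":" in name else {name, f"{name}:latest"}
        if not candidates & installed:
            missing.append(name)
    return missing


def fetch_tags(base_url: str = "http://localhost:11434", timeout: float = 3.0) -> dict:
    """GET /api/tags from a running Ollama server (hypothetical helper)."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return json.load(resp)
```

With `ollama serve` running, `missing_models(installed_models(fetch_tags()))` should return an empty list once both pulls have finished.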
3. Run the Demo
# From project root
streamlit run demo/app.py
# Or set the port explicitly (8501 is Streamlit's default)
streamlit run demo/app.py --server.port 8501
4. Open in Browser
Navigate to http://localhost:8501
Demo Pages
| Page | Description |
|---|---|
| Home | Overview and feature cards |
| Document Processing | Upload/select documents for OCR processing |
| Field Extraction | Extract structured fields with evidence |
| RAG Q&A | Ask questions about indexed documents |
| Classification | Classify document types |
| Analytics | View processing statistics |
| Live Processing | Watch pipeline in real-time |
| Interactive RAG | Chat-style document Q&A |
| Document Comparison | Compare documents side by side |
Sample Documents
The demo uses patent pledge documents from the Dataset/ folder:
- Apple 11.11.2011.pdf
- IBM 11.01.2005.pdf
- Google 08.02.2012.pdf
- And more...
Screenshots
Home Page
┌─────────────────────────────────────────┐
│               🔥 SPARKNET               │
│ Agentic Document Intelligence Platform  │
├─────────────────────────────────────────┤
│  [Doc Processing] [Extraction] [RAG]    │
│                                         │
│  Feature cards with gradients...        │
└─────────────────────────────────────────┘
RAG Q&A
┌─────────────────────────────────────────┐
│  Ask a question...                      │
├─────────────────────────────────────────┤
│  User: What patents are covered?        │
│                                         │
│  Assistant: Based on the documents...   │
│  [View Sources]                         │
│  [1] Apple - Page 1: "..."              │
│  [2] IBM - Page 2: "..."                │
└─────────────────────────────────────────┘
Configuration
Environment Variables
# Ollama URL (default: http://localhost:11434)
export OLLAMA_BASE_URL=http://localhost:11434
# ChromaDB path (default: ./data/vectorstore)
export CHROMA_PERSIST_DIR=./data/vectorstore
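On the Python side, these two variables can be read with the documented defaults as fallbacks. This is only a sketch of one way to do it: `DemoSettings` and `load_settings` are hypothetical names, not taken from the SPARKNET code base, and stripping the trailing slash from the URL is an added convenience.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class DemoSettings:
    """Hypothetical container for the two documented settings."""
    ollama_base_url: str
    chroma_persist_dir: str


def load_settings(env: Mapping[str, str] = os.environ) -> DemoSettings:
    """Read both variables, falling back to the defaults listed above."""
    return DemoSettings(
        # A trailing slash is stripped so that paths can be appended safely.
        ollama_base_url=env.get("OLLAMA_BASE_URL", "http://localhost:11434").rstrip("/"),
        chroma_persist_dir=env.get("CHROMA_PERSIST_DIR", "./data/vectorstore"),
    )
```

Passing the environment as a parameter keeps the function easy to exercise with a plain dict, for example `load_settings({"CHROMA_PERSIST_DIR": "/tmp/vs"})`.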
Streamlit Config
Create .streamlit/config.toml:
[theme]
primaryColor = "#FF6B6B"
backgroundColor = "#FFFFFF"
secondaryBackgroundColor = "#F0F2F6"
textColor = "#262730"
[server]
maxUploadSize = 50  # in megabytes
Development
Adding New Pages
1. Create a new file in demo/pages/: demo/pages/4_{emoji}_New_Feature.py
2. Follow the naming convention: {order}_{emoji}_{name}.py
3. Import project modules:
   import sys
   from pathlib import Path
   PROJECT_ROOT = Path(__file__).parent.parent.parent
   sys.path.insert(0, str(PROJECT_ROOT))
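The naming convention can be checked mechanically before a page is added. The sketch below is illustrative only: `PAGE_PATTERN`, `PageInfo`, and `parse_page_filename` are hypothetical names, and it assumes the emoji part contains no underscore and that underscores in the name part stand for spaces in the page title.

```python
import re
from typing import NamedTuple, Optional

# Pattern for the {order}_{emoji}_{name}.py convention.
PAGE_PATTERN = re.compile(r"^(?P<order>\d+)_(?P<emoji>[^_]+)_(?P<name>.+)\.py$")


class PageInfo(NamedTuple):
    order: int
    emoji: str
    title: str


def parse_page_filename(filename: str) -> Optional[PageInfo]:
    """Split a page filename into its parts, or return None if it does not match."""
    match = PAGE_PATTERN.match(filename)
    if match is None:
        return None
    return PageInfo(
        order=int(match["order"]),
        emoji=match["emoji"],
        # Assumption: underscores in the name part represent spaces.
        title=match["name"].replace("_", " "),
    )
```

A file such as `4_🆕_New_Feature.py` (the emoji here is only a placeholder) parses to order 4 and title "New Feature", while `new_feature.py` is rejected because it lacks the numeric prefix.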
Customizing Styles
Edit the CSS in demo/app.py:
st.markdown("""
<style>
.main-header { ... }
.evidence-box { ... }
</style>
""", unsafe_allow_html=True)
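Once a class such as .evidence-box is styled, page code has to emit matching HTML. A small helper can build that markup and escape user-supplied text first, which matters because `unsafe_allow_html=True` renders raw HTML. This is a sketch under assumptions: `evidence_box` is a hypothetical helper, and the inner `<p>`/`<small>` structure is illustrative rather than the demo's actual markup.

```python
import html


def evidence_box(quote: str, source: str) -> str:
    """Render one evidence snippet as HTML using the .evidence-box class.

    Both arguments are escaped, since the result is meant to be passed to
    st.markdown(..., unsafe_allow_html=True).
    """
    return (
        '<div class="evidence-box">'
        f"<p>{html.escape(quote)}</p>"
        f"<small>{html.escape(source)}</small>"
        "</div>"
    )
```

In a page, the result would be rendered with `st.markdown(evidence_box(text, "Apple - Page 1"), unsafe_allow_html=True)`.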
Troubleshooting
"ModuleNotFoundError: No module named 'src'"
Make sure you're running from the project root:
cd /path/to/SPARKNET
streamlit run demo/app.py
Ollama Not Connected
- Check if Ollama is running:
  curl http://localhost:11434/api/tags
- Start Ollama:
  ollama serve
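The same check can be made from Python, which helps when curl is unavailable or when the demo should show a clear message instead of a stack trace. A minimal sketch using only the standard library; `check_ollama` is a hypothetical helper, not part of the demo.

```python
import urllib.error
import urllib.request


def check_ollama(base_url: str = "http://localhost:11434",
                 timeout: float = 2.0) -> tuple[bool, str]:
    """Probe GET /api/tags and report (reachable, detail)."""
    url = f"{base_url.rstrip('/')}/api/tags"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return True, f"HTTP {resp.status}"
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status.
        return False, f"HTTP {exc.code}"
    except (urllib.error.URLError, OSError) as exc:
        # Nothing is listening, the host is wrong, or the request timed out.
        return False, f"unreachable: {exc}"
```

`HTTPError` is caught before `URLError` because it is a subclass of it; reversing the order would hide the status code.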
ChromaDB Errors
Install ChromaDB:
pip install chromadb
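Before reinstalling, it can help to confirm which packages are actually missing from the active environment. The sketch below uses `importlib.util.find_spec` from the standard library; `missing_packages` is a hypothetical helper and only handles top-level import names.

```python
import importlib.util


def missing_packages(names: list[str]) -> list[str]:
    """Return the top-level import names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

For the demo, `missing_packages(["streamlit", "chromadb"])` returns an empty list when both are installed in the interpreter that runs Streamlit.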
License
Part of the SPARKNET project. See the main LICENSE file.