import streamlit as st from pathlib import Path from layout import gray_container, tool_container, key_concept, research_question def render(): """Module 3: OCR Technology and Historical Documents""" st.title("Module 3: OCR Technology and Historical Documents") col1, col2 = st.columns([1, 1]) with col1: traditional_content = """

Traditional OCR Approaches

Pattern Matching: Early OCR compared characters to templates
Feature Extraction: Identifying key features of characters
Statistical Models: Using probabilities to improve recognition

""" gray_container(traditional_content) modern_content = """

Modern AI-Enhanced OCR

Neural Networks: Deep learning models trained on vast datasets
Computer Vision: Advanced image processing techniques
Language Models: Contextual understanding to resolve ambiguities
Multimodal Models: Integration of text, layout, and visual understanding

""" gray_container(modern_content) with col2: challenges_content = """

Challenges with Historical Documents

Historical materials present unique difficulties:

Typography Variation: Non-standardized fonts and styles
Historical Language: Archaic vocabulary and grammar
Layout Complexity: Non-linear arrangements
Document Degradation: Fading, tears, stains, and damage
Material Artifacts: Paper texture, binding shadows, etc.

""" gray_container(challenges_content) # Display OCR processing diagram st.image("https://cdn.dribbble.com/users/412119/screenshots/16353886/media/82e593c60a5e4d460db917236eab6ece.jpg", caption="OCR processing layers") # Key concept section concept_content = """

Vision-Enhanced OCR

Modern OCR systems like those based on Mistral-7B-Vision combine:

Image understanding capabilities to process the visual aspects
Text recognition to extract characters accurately
Layout analysis to understand structure
Contextual language processing for improved accuracy

This multimodal approach dramatically improves OCR results on historical documents compared to traditional OCR.

""" key_concept(concept_content) # Technical details in a tool container tech_content = """

Technical Evolution of OCR

Traditional OCR Pipeline:

Preprocessing (binarization, noise removal)
Layout analysis (segmentation)
Character recognition (pattern matching)
Post-processing (spell checking)

Modern LLM-Vision Pipeline:

Image normalization
Image embedding via vision encoder
Integration with language model
Joint inference across modalities
Structured extraction of information

""" tool_container(tech_content) # Research question research_content = """

Consider This:

How might the capabilities of vision-language models change our approach to digitizing historical archives?

""" research_question(research_content) # Display history if available if 'processing_history' in st.session_state and st.session_state.processing_history: with st.expander("Your OCR Processing History"): st.markdown("You've already processed the following documents:") for item in st.session_state.processing_history: st.markdown(f"**{item['fileName']}**") col1, col2 = st.columns(2) with col1: st.write(f"**Topics:** {', '.join(item['result'].get('topics', ['Unknown']))}") with col2: st.write(f"**Vision model used:** {'Yes' if item['useVision'] else 'No'}")