garywelz committed on
Commit 29c97e7 · verified · 1 Parent(s): 9a02805

Upload 2 files

Files changed (2)
  1. README.md +120 -537
  2. index.html +238 -876
README.md CHANGED
@@ -1,566 +1,149 @@
1
  ---
2
- title: CopernicusAI - Research-Driven Podcast Generation Platform
3
- emoji: 🔬
4
- colorFrom: purple
5
  colorTo: blue
6
  sdk: static
7
- pinned: false
8
  license: mit
9
  ---
10
 
11
- # 🔬 CopernicusAI - Knowledge Engine for Scientific Discovery
12
 
13
- A collaborative research platform that transforms cutting-edge scientific research into accessible, multi-format tools for collective knowledge exploration. These are research instruments—like microscopes for observing the collective knowledge of humanity—enabling hypothesis formation, testing, and discovery across scientific disciplines.
14
-
15
- ## Summary
16
-
17
- **CopernicusAI** is an operational research platform that synthesizes scientific literature from 250+ million papers into AI-generated podcasts, integrates with a knowledge graph of 12,000+ indexed papers, and provides collaborative tools for research discovery. The system demonstrates production-ready multi-source research synthesis with full citation tracking and evidence-based content generation requiring minimum 3 research sources per episode.
18
-
19
- The platform includes a fully operational Research Tools Dashboard (deployed December 2025) with interactive knowledge graph visualization, vector search, and RAG capabilities, enabling researchers to explore, query, and synthesize scientific knowledge across disciplines.
20
-
21
- ## Prior Work: CopernicusAI Research Interface
22
-
23
- CopernicusAI is an active research prototype exploring AI-generated audio briefings as an interface for assisted scientific research.
24
-
25
- The system allows any user to generate, refine, and share AI-generated science podcasts based on structured prompts, enabling rapid orientation to a topic, iterative deepening, and personalized research briefings.
26
-
27
- Rather than functioning as a static content platform, CopernicusAI supports collectively generated and shared research artifacts, analogous to community-driven knowledge platforms (e.g., discussion forums), but grounded in scientific sources and metadata-aware workflows.
28
-
29
- This work demonstrates technical feasibility for:
30
- - AI-assisted research briefing and orientation
31
- - Iterative question refinement via conversational interfaces
32
- - Integration of text, audio, and metadata in research workflows
33
-
34
- ### Current Implementation (December 2025)
35
-
36
- The Research Tools Dashboard is **fully operational** and deployed to Google Cloud Run, providing unified access to all components with interactive knowledge graph visualization, vector search, RAG queries, and content browsing.
37
-
38
- ## 🎯 Mission & Vision
39
-
40
- Inspired by Nicolaus Copernicus who challenged accepted knowledge with evidence and rigorous analysis, **CopernicusAI** creates collaborative research tools that enable collective participation in scientific discovery. These platforms are instruments for exploring humanity's collective knowledge—tools for hypothesis formation, testing, and collaborative research, not just educational content.
41
-
42
- Just as a microscope enables observation of the microscopic world, CopernicusAI tools enable observation and exploration of humanity's collective knowledge. Subscribers collaborate to prompt, generate, and refine research content—sharing discoveries publicly or keeping them private. As large language models (LLMs) and AI systems gain unprecedented knowledge, CopernicusAI provides the infrastructure for human-AI collaborative knowledge exploration, with evidence-based truth-seeking as our guiding principle.
43
-
44
- ---
45
-
46
- ## 🌟 Core Platform Capabilities
47
-
48
- ### 🎙️ AI-Powered Podcast Generation
49
-
50
- **Production-Ready System:**
51
- - Collaborative platform where subscribers prompt and generate multi-voice AI podcasts (5-10 minutes) synthesizing research from multiple academic sources
52
- - Subscribers can share their podcasts publicly or keep them private
53
- - Evidence-based content generation requiring minimum 3 research sources per episode
54
- - Comprehensive research integration across 8+ academic databases
55
- - **64 episodes** generated across Biology, Chemistry, Computer Science, Mathematics, and Physics
56
- - Automated audio synthesis with professional multi-speaker dialogue
57
- - AI-generated episode thumbnails with scientific visualizations
58
- - RSS feed distribution compatible with Spotify, Apple Podcasts, Google Podcasts
59
-
60
- **Research Integration:**
61
- - Real-time discovery from PubMed, arXiv, NASA ADS, Zenodo, bioRxiv, CORE, Google Scholar, and News APIs
62
- - Parallel search across multiple databases for comprehensive coverage
63
- - Quality scoring and relevance ranking of research sources
64
- - Paradigm shift identification and interdisciplinary connection analysis
65
- - Automatic citation extraction and formatting
66
- - Source validation and authenticity verification
67
-
68
- ### 🤖 Advanced LLM Integration
69
-
70
- **Multi-Model Architecture:**
71
- - **Google Gemini 3** - Latest research analysis and content generation
72
- - **OpenAI GPT-4/GPT-3.5** - Content synthesis and quality validation
73
- - **Anthropic Claude 3** (Sonnet, Haiku via OpenRouter) - Alternative reasoning paths
74
- - **ElevenLabs TTS** - Multi-voice text-to-speech synthesis
75
- - Model selection based on task complexity and expertise level
76
- - Fallback chains for reliability and cost optimization
77
-
78
- **Capabilities:**
79
- - Multi-paper analysis and synthesis
80
- - Paradigm shift detection in research domains
81
- - Interdisciplinary connection identification
82
- - Entity extraction (genes, proteins, chemical compounds, mathematical concepts)
83
- - Citation tracking and cross-reference analysis
84
- - Content quality scoring and validation
85
-
86
- ### 📊 Research Resource Access
87
-
88
- **Comprehensive Academic Database Coverage:**
89
-
90
- Our research pipeline integrates with **8+ major academic databases**, providing access to:
91
-
92
- - **PubMed/NCBI** (~30+ million biomedical papers)
93
- - **arXiv** (~2+ million preprints in physics, mathematics, CS, quantitative biology)
94
- - **NASA ADS** (~15+ million astronomy/astrophysics papers)
95
- - **Zenodo** (100K+ open science datasets and publications)
96
- - **bioRxiv/medRxiv** (preprints in life sciences)
97
- - **CORE** (~200+ million open access papers)
98
- - **Google Scholar** (comprehensive academic search)
99
- - **News API** (current events and trending research topics)
100
- - **YouTube Data API** (academic videos, conference talks, lectures)
101
-
102
- **Total Access:** **250+ million research papers and academic resources** across all major scientific disciplines.
103
-
104
- ### 🎙️ Audio and Video Podcast Production
105
-
106
- **Operating Audio Podcast System:**
107
- Full production and distribution platform for subscriber-generated podcasts. Users can prompt, generate, publish, and distribute audio podcasts with RSS feed support for Spotify, Apple Podcasts, and Google Podcasts.
108
-
109
- - Multi-voice AI podcast generation
110
- - Research-driven content creation
111
- - RSS feed distribution
112
- - Public and private podcast options
113
- - Professional audio quality
114
-
115
- **Video Production (Future - Phase 2+):**
116
-
117
- Advanced video features planned for future development:
118
-
119
- **Planned Advanced Features (Phase 2-4):**
120
- - **Visual Content Integration:**
121
- - Automated extraction of figures and diagrams from research papers
122
- - Screen capture and processing of academic illustrations
123
- - Web scraping from scientific journal websites and preprint servers
124
- - JSON database integration for structured visual data
125
-
126
- - **Dynamic Visualization Generation:**
127
- - On-the-fly scientific animations (molecular structures, data flows, algorithms)
128
- - Real-time chart and graph generation from research data
129
- - Python-based animations using matplotlib, plotly, mayavi
130
- - Mathematical formula rendering (LaTeX → video)
131
-
132
- - **External Video Quoting:**
133
- - YouTube video segment extraction and integration
134
- - Time-stamped video quoting with proper attribution
135
- - Educational fair use compliance
136
- - Source video discovery during research phase
137
-
138
- - **Advanced Composition:**
139
- - Multi-layer video composition (background, content, overlays, effects)
140
- - Automatic subtitle generation from transcripts
141
- - Text overlay system (key concepts, citations, speaker identification)
142
- - Professional transitions and effects
143
- - Audio-visual synchronization
144
-
145
- **See:** [Science Video Database](https://huggingface.co/spaces/garywelz/sciencevideodb) - Companion project for research video content management.
146
-
147
- ### 📚 Research Papers Metadata Database (Phase 2)
148
-
149
- **Planned Implementation:**
150
- A centralized **metadata repository** (not a file archive) that provides:
151
-
152
- - **Structured JSON Objects:** Research paper metadata including:
153
- - DOI, arXiv ID, publication information
154
- - Abstracts and key findings
155
- - Extracted entities (genes, proteins, chemical compounds, equations)
156
- - Citation networks and cross-references
157
- - Paradigm shift indicators
158
- - Interdisciplinary connections
159
- - Quality scores and relevance metrics
160
-
161
- - **AI-Powered Preprocessing:**
162
- - LLM-based entity extraction and annotation
163
- - Automatic categorization by discipline and subdomain
164
- - Keyword extraction and semantic tagging
165
- - Citation tracking and relationship mapping
166
- - Quality assessment and validation
167
-
168
- - **Integration Features:**
169
- - DOI/arXiv ID resolution and metadata enrichment
170
- - Cross-reference linking between papers
171
- - Podcast-to-paper relationship tracking
172
- - Search and query capabilities
173
- - API access for programmatic retrieval
174
-
175
- **Technical Architecture:**
176
- - Firestore NoSQL database for flexible JSON storage
177
- - Google Cloud Functions for automated metadata processing
178
- - Vertex AI for entity extraction and analysis
179
- - RESTful API for external access
180
-
181
- **Benefits:**
182
- - Enables rapid research discovery across podcasts
183
- - Supports knowledge graph construction
184
- - Facilitates cross-disciplinary pattern recognition
185
- - Provides foundation for semantic search capabilities
186
-
187
- ---
188
-
189
- ## 🗄️ System Architecture
190
-
191
- ### Database Structure (Firestore)
192
-
193
- **Collections:**
194
- - **`subscribers`** - User accounts, preferences, subscription tiers, usage analytics
195
- - **`podcast_jobs`** - Generated podcasts with full metadata, source papers, engagement metrics
196
- - **`episodes`** - Published episodes with RSS distribution status
197
- - **`research_papers`** (Phase 2) - Paper metadata database with AI-extracted entities
198
-
199
- ### Storage Structure (Google Cloud Storage)
200
-
201
- - **`audio/`** - MP3 podcast files (multi-voice ElevenLabs synthesis)
202
- - **`videos/`** - MP4 video podcasts (current and future)
203
- - **`transcripts/`** - Full text transcripts with speaker markers
204
- - **`descriptions/`** - Markdown descriptions with academic references
205
- - **`thumbnails/`** - AI-generated episode artwork (DALL-E 3)
206
- - **`video-assets/`** - Extracted figures, animations, visual content
207
- - **`glmp-v2/`** - Genome Logic Modeling Project flowcharts (JSON)
208
-
209
- ### Backend Services (Google Cloud Run)
210
-
211
- **Microservices Architecture:**
212
- - **Podcast Generation Service** - Orchestrates research, content generation, and media production
213
- - **Research Pipeline Service** - Multi-API academic search and analysis
214
- - **Video Generation Service** - Video composition and encoding (Phase 1 complete)
215
- - **RSS Service** - Feed generation and distribution
216
- - **Episode Service** - Catalog management and metadata
217
-
218
- ---
219
-
220
- ## ⚙️ Technology Stack
221
-
222
- ### AI & Machine Learning
223
- - **Google Gemini 3** - Latest LLM for research analysis
224
- - **Google Vertex AI** - Enterprise-scale model deployment and orchestration (used throughout platform)
225
- - **OpenAI GPT-4/GPT-3.5** - Content synthesis and validation
226
- - **Anthropic Claude 3** - Alternative reasoning via OpenRouter
227
- - **ElevenLabs TTS** - Multi-voice text-to-speech synthesis
228
- - **DALL-E 3** - AI-generated scientific visualizations
229
- - **Google Cloud Vision API** - Image analysis and quality assessment
230
- - **Video Intelligence API** - Scene detection and content analysis
231
-
232
- ### Backend Infrastructure
233
- - **FastAPI** (Python) - RESTful API framework
234
- - **Google Cloud Run** - Serverless container deployment
235
- - **Firestore** - NoSQL document database
236
- - **Cloud Storage** - Media file storage and CDN
237
- - **Cloud Functions** - Event-driven processing
238
- - **Cloud Tasks** - Background job queuing
239
- - **Secret Manager** - API key and credential management
240
-
241
- ### Media Processing
242
- - **FFmpeg** - Video encoding and composition
243
- - **MoviePy** - Python video editing (planned)
244
- - **Matplotlib/Plotly** - Scientific visualization (planned)
245
- - **PyPDF2/pdfplumber** - PDF processing (planned)
246
-
247
- ### Frontend
248
- - **Next.js 15.5.7** - React framework
249
- - **Alpine.js** - Lightweight reactive UI
250
- - **Tailwind CSS** - Utility-first styling
251
- - **Vercel** - Frontend hosting and deployment
252
-
253
- ---
254
-
255
- ## 📈 Platform Capabilities
256
-
257
- ### Research Coverage
258
- - **250+ million research papers** accessible through integrated APIs
259
- - **8+ academic databases** integrated with parallel search
260
- - **Minimum 3 sources** required per episode for quality assurance
261
- - **Multi-paper analysis** for comprehensive coverage
262
-
263
- ### Platform Features
264
- - **Subscriber-driven content generation** - Users prompt and create podcasts
265
- - **RSS feed distribution** to major podcast platforms
266
- - **Public and private podcast options** - Share discoveries or keep them private
267
-
268
- ---
269
-
270
- ## 🔗 Live Platform & Resources
271
-
272
- ### Production Deployment
273
- - 🏠 **[Homepage - Browse Podcasts](https://www.copernicusai.fyi)** - Public podcast catalog
274
- - 📊 **[Creator Dashboard](https://www.copernicusai.fyi/subscriber-dashboard.html)** - Subscriber interface
275
- - 📡 **[RSS Feed](https://storage.googleapis.com/regal-scholar-453620-r7-podcast-storage/feeds/copernicus-mvp-rss-feed.xml)** - Podcast distribution feed
276
-
277
- ## 🧩 CopernicusAI Knowledge Engine Components
278
-
279
- The CopernicusAI Knowledge Engine is an integrated ecosystem of research and collaboration tools. The Knowledge Engine is **fully implemented and operational** (December 2025), with a working system deployed to Google Cloud Run. Currently, the platform includes five core components, with additional tools, databases, and collaboration features planned for future development:
280
-
281
- ### 🎯 Knowledge Engine Implementation (December 2025)
282
-
283
- **Fully Operational System:**
284
- - **Public Project Interface:** https://storage.googleapis.com/regal-scholar-453620-r7-podcast-storage/copernicusai-public-reviewer.html
285
- - **Research Tools Dashboard:** https://copernicus-frontend-phzp4ie2sq-uc.a.run.app/knowledge-engine
286
- - **Knowledge Graph:** Interactive visualization with 23,246 indexed papers, relationship extraction (citations, semantic similarity, categories), and graph query capabilities
287
- - **Vector Search:** Semantic search using Vertex AI embeddings across papers, podcasts, and processes
288
- - **RAG System:** Retrieval-augmented generation with citation support, context retrieval, and multi-modal content integration
289
- - **Unified Web Dashboard:** Production-ready interface with knowledge map visualization, search, RAG queries, content browsing, and statistics
290
- - **Architecture:** FastAPI backend, Next.js frontend, Firestore database, Vertex AI for embeddings and LLM capabilities, Model Context Protocol (MCP) server for AI assistant integration
291
- - **Deployment:** Fully deployed to Google Cloud Run, accessible 24/7
292
-
293
- ### Core Components
294
-
295
- 1. **🔬 CopernicusAI (This Platform)** - Core synthesis and distribution component
296
- - AI-powered research synthesis and podcast generation
297
- - Multi-API research integration (250+ million papers)
298
- - Subscriber-driven content creation and sharing
299
- - RSS feed distribution and platform management
300
-
301
- 2. **🛠️ Programming Framework** - Foundational meta-tool
302
- - Universal method for process analysis across any discipline
303
- - LLM-powered extraction and Mermaid visualization
304
- - Domain-agnostic methodology for complex process analysis
305
- - [Explore Framework →](https://huggingface.co/spaces/garywelz/programming_framework)
306
-
307
- 3. **🧬 GLMP - Genome Logic Modeling Project** - Specialized biological application
308
- - First application of Programming Framework to biology
309
- - 50+ biological processes visualized as interactive flowcharts
310
- - JSON-based structured data in Google Cloud Storage
311
- - [Explore GLMP →](https://huggingface.co/spaces/garywelz/glmp)
312
-
313
- 4. **📚 Research Paper Metadata Database** - Core data infrastructure
314
- - Centralized metadata repository for scientific research papers
315
- - AI-powered preprocessing and entity extraction
316
- - Citation network analysis and relationship mapping
317
- - Foundation for knowledge graph construction
318
- - [Explore Metadata Database →](https://huggingface.co/spaces/garywelz/metadata_database)
319
-
320
- 5. **🎬 Science Video Database** - Multi-modal content component
321
- - Curated searchable database of scientific video content
322
- - Transcript-based search across multiple disciplines
323
- - Integration with YouTube and other video sources
324
- - [Explore Video Database →](https://huggingface.co/spaces/garywelz/sciencevideodb)
325
- - [Live Demo →](https://scienceviddb-web-204731194849.us-central1.run.app/)
326
-
327
- ### Future Components
328
-
329
- The Knowledge Engine is designed to grow and evolve. Additional tools, databases, and collaboration components will be added as the project develops, expanding capabilities for AI-assisted scientific research and knowledge discovery.
330
-
331
- ---
332
-
333
- ## 🔌 API Documentation
334
-
335
- **Base URL:** `https://copernicus-podcast-api-phzp4ie2sq-uc.a.run.app`
336
-
337
- ### Podcast Generation Endpoints
338
- - `POST /generate-podcast-with-subscriber` - Generate new podcast from research topic
339
- - `GET /api/subscribers/podcasts/{id}` - Retrieve podcast details
340
- - `POST /api/subscribers/podcasts/submit-to-rss` - Publish to RSS feed
341
-
342
- ### Research Endpoints
343
- - `POST /api/papers/upload` - Upload paper metadata (Phase 2)
344
- - `GET /api/papers/{paper_id}` - Retrieve paper metadata
345
- - `POST /api/papers/query` - Query papers by discipline, keywords
346
- - `POST /api/papers/{id}/link-podcast/{id}` - Link paper to podcast
347
-
348
- ### Admin Endpoints
349
- - `GET /api/admin/subscribers` - List all subscribers and statistics
350
- - `POST /api/admin/podcasts/fix-missing-titles` - Content maintenance
351
- - `GET /api/admin/podcasts/catalog` - Full podcast catalog
352
-
353
- ---
354
-
355
- ## 🚀 Development Roadmap
356
-
357
- ### ✅ Phase 1: Core Platform (Complete)
358
- - Multi-API research integration
359
- - AI podcast generation with multi-voice synthesis
360
- - RSS feed distribution
361
- - Subscriber platform
362
- - Basic video generation (static)
363
-
364
- ### 🔄 Phase 2: Content Enhancement (In Progress)
365
- - **Research Papers Metadata Database** - JSON-based metadata repository
366
- - **Visual Content Extraction** - Figures from papers, web scraping
367
- - **YouTube Video Quoting** - External video integration with attribution
368
- - **Advanced Video Features** - Multi-layer composition, animations
369
-
370
- ### 📋 Phase 3: Advanced Visualizations (Planned)
371
- - Scientific animation generation (matplotlib, plotly)
372
- - Real-time data visualization
373
- - Mathematical formula rendering
374
- - Dynamic graph and network visualizations
375
-
376
- ### ✅ Phase 4: Knowledge Integration (Implemented - December 2025)
377
- - **Knowledge Graph:** Fully operational with interactive visualization, 12,000+ papers indexed
378
- - **Vector Search:** Semantic search implemented using Vertex AI embeddings
379
- - **RAG System:** Retrieval-augmented generation with citations operational
380
- - **Cross-Disciplinary Pattern Discovery:** Relationship extraction across papers, concepts, and categories
381
- - **AI-Powered Content Recommendations:** Integrated into unified web dashboard
382
- - **Public Project Interface:** https://storage.googleapis.com/regal-scholar-453620-r7-podcast-storage/copernicusai-public-reviewer.html
383
- - **Research Tools Dashboard:** https://copernicus-frontend-phzp4ie2sq-uc.a.run.app/knowledge-engine
384
-
385
- ---
386
-
387
- ## 🔬 Collaborative Research Tools
388
-
389
- **These platforms enable collective participation and collaboration across diverse user communities:**
390
-
391
- - **Researchers** - Tools for hypothesis formation and testing, rapid synthesis of cross-disciplinary findings
392
- - **Collaborators** - Collective knowledge exploration and refinement
393
- - **Subscribers** - Prompt, generate, and share podcasts (public or private)
394
- - **Community** - User suggestions, comments, and collaborative flowchart improvement (GLMP)
395
-
396
- **Key Innovations:**
397
- - **Multi-Source Validation** - Requires minimum 3 research sources per episode
398
- - **Evidence-Based Generation** - No content generated without research backing
399
- - **Paradigm Shift Detection** - Identifies revolutionary vs. incremental research
400
- - **Interdisciplinary Connections** - Reveals cross-domain insights
401
- - **Collaborative Participation** - User-driven content generation and sharing
402
- - **Reproducibility** - Full citation tracking and source attribution
403
-
404
- > *Like a microscope enables observation of the microscopic world, these tools enable observation and exploration of humanity's collective knowledge.*
405
-
406
- ---
407
 
408
  ## 📚 Prior Work & Research Contributions
409
 
410
  ### Overview
411
- This platform represents **prior work** that demonstrates foundational research and development achievements in AI-powered scientific knowledge synthesis, collaborative research tools, and multi-modal content generation. These contributions establish the technical foundation and proof-of-concept for the broader **CopernicusAI Knowledge Engine** initiative.
412
-
413
- ### Research Contributions
414
-
415
- **1. AI-Powered Research Synthesis System**
416
- - Developed and deployed a production-ready system for multi-source research synthesis using LLMs
417
- - Demonstrated integration of 8+ academic databases (250+ million papers) with parallel search capabilities
418
- - Implemented evidence-based content generation requiring minimum 3 research sources per output
419
- - Achieved operational deployment with 64+ generated podcast episodes across 5 scientific disciplines
420
-
421
- **2. Multi-Model LLM Architecture**
422
- - Designed and implemented intelligent model selection framework using Google Gemini 3, OpenAI GPT-4, and Anthropic Claude 3
423
- - Developed fallback chains for reliability and cost optimization
424
- - Demonstrated paradigm shift detection and interdisciplinary connection identification in research domains
425
- - Implemented entity extraction (genes, proteins, chemical compounds, mathematical concepts) from research literature
426
-
427
- **3. Collaborative Research Platform Infrastructure**
428
- - Built subscriber-driven content generation system enabling public/private research sharing
429
- - Implemented RSS feed distribution compatible with major podcast platforms
430
- - Developed microservices architecture on Google Cloud Run with Firestore and Cloud Storage
431
- - Created RESTful API framework for programmatic access to research synthesis capabilities
432
-
433
- **4. Integration with Knowledge Engine Components**
434
- - Established integration pathways with GLMP (Genome Logic Modeling Project) for biological process visualization
435
- - Designed architecture for Research Papers Metadata Database (Phase 2)
436
- - Planned integration with Science Video Database for multi-modal content
437
- - Created framework for Programming Framework integration across disciplines
438
-
439
- ### Technical Achievements
440
-
441
- **Production Deployment:**
442
- - Live platform: https://www.copernicusai.fyi
443
- - Operational API: https://copernicus-podcast-api-phzp4ie2sq-uc.a.run.app
444
- - RSS feed distribution: Active and functional
445
- - Multi-voice audio synthesis: ElevenLabs TTS integration operational
446
-
447
- **Research Infrastructure:**
448
- - 250+ million research papers accessible via integrated APIs
449
- - 8+ academic database integrations (PubMed, arXiv, NASA ADS, Zenodo, bioRxiv, CORE, Google Scholar, News API)
450
- - **23,246 papers indexed** with full metadata and vector embeddings in Knowledge Engine
451
- - Automated citation extraction and formatting
452
- - Quality scoring and relevance ranking systems
453
- - **Knowledge Graph:** Fully operational with relationship extraction and interactive visualization
454
- - **Vector Search:** Semantic search across papers, podcasts, and processes
455
- - **RAG System:** Operational with citation support and multi-modal content integration
456
-
457
- **Scalability & Architecture:**
458
- - Serverless microservices architecture (Google Cloud Run)
459
- - NoSQL database (Firestore) for flexible metadata storage
460
- - Cloud Storage for media files and structured data
461
- - Event-driven processing with Cloud Functions and Cloud Tasks
462
-
463
- ### Position Within CopernicusAI Knowledge Engine
464
-
465
- This platform serves as the **core synthesis and distribution component** of the CopernicusAI Knowledge Engine. The Knowledge Engine is an integrated ecosystem of research and collaboration tools that work together to assist scientists in their workflow, from research discovery through knowledge synthesis to multi-format content generation.
466
-
467
- **Current Components:**
468
- 1. **CopernicusAI** (This platform) - Core synthesis and distribution component for AI-powered research synthesis and podcast generation
469
- 2. **Research Tools Dashboard** (✅ Implemented December 2025) - Fully operational web interface with knowledge graph visualization, vector search, RAG queries, content browsing, and statistics. Live at: https://copernicus-frontend-phzp4ie2sq-uc.a.run.app/knowledge-engine
470
- 3. **Public Project Interface** (✅ Implemented January 2025) - Comprehensive public-facing page providing access to all CopernicusAI Knowledge Engine components. Live at: https://storage.googleapis.com/regal-scholar-453620-r7-podcast-storage/copernicusai-public-reviewer.html
471
- 4. **Programming Framework** - Foundational meta-tool providing universal process analysis methodology
472
- 5. **GLMP (Genome Logic Modeling Project)** - Specialized biological application demonstrating domain-specific use of the Programming Framework
473
- 6. **Research Paper Metadata Database** - Core data infrastructure providing structured metadata and citation networks
474
- 7. **Science Video Database** - Multi-modal content component enabling video-based learning and research discovery
475
-
476
- **Future Development:**
477
- The Knowledge Engine is designed to grow and evolve. Additional tools, databases, and collaboration components will be added as the project develops, expanding capabilities for AI-assisted scientific research and knowledge discovery.
478
-
479
- ### Academic & Research Impact
480
-
481
- **Publications & Presentations:**
482
- - Platform architecture and methodology suitable for academic publication
483
- - Open-source components available for research community use
484
- - Publicly accessible research tools demonstrating AI-human collaboration in scientific knowledge synthesis
485
-
486
- **Research Applications:**
487
- - Supports hypothesis formation and testing through rapid multi-source synthesis
488
- - Enables cross-disciplinary pattern recognition and connection identification
489
- - Facilitates reproducible research communication with full citation tracking
490
- - Provides infrastructure for collaborative knowledge exploration
491
-
492
- **Educational Contributions:**
493
- - 64+ research-driven podcast episodes across Biology, Chemistry, Computer Science, Mathematics, and Physics
494
- - Evidence-based content requiring minimum 3 academic sources
495
- - Public and private sharing options for research dissemination
496
- - Integration with major podcast platforms for broad accessibility
497
-
498
- ### Citation Information
499
-
500
- **For Grant Proposals:**
501
- When citing this work as prior research, please reference:
502
-
503
- - **Platform Name:** CopernicusAI - Knowledge Engine for Scientific Discovery
504
- - **URL:** https://huggingface.co/spaces/garywelz/copernicusai
505
- - **Live Platform:** https://www.copernicusai.fyi
506
- - **Primary Developer:** Gary Welz
507
- - **Year:** 2024-2025
508
- - **License:** MIT
509
-
510
- **Suggested Citation Format:**
511
  ```
512
- Welz, G. (2025). CopernicusAI: Knowledge Engine for Scientific Discovery.
513
- Hugging Face Space. https://huggingface.co/spaces/garywelz/copernicusai
514
  ```
515
 
516
- ## 🌐 Grant Support & Collaboration
517
 
518
- **Grant Applications Supported:**
519
- This platform is designed to support grant applications to:
520
- - **NSF (National Science Foundation)** - Science education and research infrastructure
521
- - **DOE (Department of Energy)** - Scientific computing and data science
522
- - **SAIR Foundation** - AI research and development initiatives
523
 
524
- **Research Contributions:**
525
- - Open-source components and methodologies
526
- - Publicly accessible research tools
527
- - Educational content for broader scientific literacy
528
- - Infrastructure for reproducible research communication
529
 
530
- **Collaboration Opportunities:**
531
- - Integration with academic institutions
532
- - Partnership with research organizations
533
- - Open data initiatives
534
- - Educational program development
535
 
536
- ---
 
537
 
538
- ## How to Cite This Work
539
 
540
- Welz, G. (2024–2025). *CopernicusAI: AI-Generated Audio Briefings as a Research Interface*.
541
- Hugging Face Spaces. https://huggingface.co/spaces/garywelz/copernicusai
542
 
543
- ---
544
 
545
- ## 📄 License & Attribution
546
 
547
- **License:** MIT
548
 
549
- **Attributions:**
550
- - Built with Google Cloud Platform, Gemini AI, OpenAI, Anthropic Claude, and ElevenLabs
551
- - Research data from PubMed, arXiv, NASA ADS, Zenodo, bioRxiv, CORE, and Google Scholar
552
- - Academic paper metadata from respective publishers
 
553
 
554
  ---
555
 
556
- ## 📧 Contact & Support
557
-
558
- For questions, collaboration inquiries, or grant application support:
559
- - **Hugging Face Space:** [https://huggingface.co/spaces/garywelz/copernicusai](https://huggingface.co/spaces/garywelz/copernicusai)
560
- - **Platform:** [https://www.copernicusai.fyi](https://www.copernicusai.fyi)
561
-
562
- ---
563
 
564
- **© 2025 CopernicusAI. All rights reserved.**
565
 
566
- *Advancing scientific knowledge through AI-powered research communication and discovery.*
 
1
  ---
2
+ title: GLMP - Genome Logic Modeling Project
3
+ emoji: 🧬
4
+ colorFrom: green
5
  colorTo: blue
6
  sdk: static
7
+ pinned: true
8
  license: mit
9
  ---
10
 
11
+ # 🧬 GLMP - Genome Logic Modeling Project
12
 
13
+ A microscope for biological processes. GLMP applies the Programming Framework to visualize complex biochemical processes as interactive flowcharts, revealing the logic of life at the molecular level.
14
 
15
  ## 📚 Prior Work & Research Contributions
16
 
17
  ### Overview
18
+ The Genome Logic Modeling Project (GLMP) is **prior work** demonstrating the first application of the Programming Framework to biological process visualization. It establishes a methodology for transforming complex biochemical processes into structured, interactive flowcharts using LLM-powered analysis and Mermaid diagrams.
19
+
20
+ ### 🔬 Research Contributions
21
+ - **Biological Process Visualization:** 50+ processes mapped across 6 major categories
22
+ - **LLM-Powered Analysis:** Automated extraction using Google Gemini 2.0 Flash
23
+ - **Interactive Visualization:** Mermaid.js-based dynamic flowchart system
24
+ - **Knowledge Engine Integration:** Links to CopernicusAI and Programming Framework
25
+
26
+ ### ⚙️ Technical Achievements
27
+ - **Structured Database:** JSON format in Google Cloud Storage
28
+ - **Process Coverage:** Central Dogma, Metabolism, Signaling, Proteins, Photosynthesis, DNA Repair
29
+ - **Scalable Architecture:** GCS-based storage with web viewer integration
30
+ - **Metadata-Rich Format:** Categories, versions, references, source papers
31
+
32
+ ### 🎯 Position Within CopernicusAI Knowledge Engine
33
+ GLMP serves as a **specialized application component** of the CopernicusAI Knowledge Engine, demonstrating how the Programming Framework can be applied to domain-specific scientific visualization. It integrates with:
34
+
35
+ - Programming Framework (meta-tool)
36
+ - CopernicusAI (main knowledge engine)
37
+ - **Research Tools Dashboard** (✅ Implemented December 2025) - Fully operational web interface with knowledge graph visualization, vector search, RAG queries, and content browsing. Live at: https://copernicus-frontend-phzp4ie2sq-uc.a.run.app/knowledge-engine
38
+ - Research Papers Metadata Database
39
+ - Science Video Database
40
+ - Multi-modal learning integration
41
+
42
+ This work establishes a proof-of-concept for domain-specific applications of the Programming Framework, demonstrating its utility in biological sciences and potential for extension to other scientific disciplines. The Knowledge Engine now provides a unified interface for exploring biological processes alongside research papers, podcasts, and other content types.
43
+
44
+ ## 🎯 What is GLMP?
45
+
46
+ The Genome Logic Modeling Project is the first specialized application of the [Programming Framework](https://huggingface.co/spaces/garywelz/programming_framework) to the domain of biology. It transforms complex biochemical processes into clear, visual flowcharts that reveal the step-by-step logic underlying life's molecular machinery.
47
+
48
+ ### Key Features
49
+
50
+ - **50+ Biological Processes** mapped as interactive flowcharts
51
+ - **JSON-based storage** in Google Cloud Storage
52
+ - **LLM-powered analysis** using Google Gemini 2.0 Flash
53
+ - **Mermaid visualization** for clear, interactive diagrams
54
+ - **Integration with CopernicusAI** for enhanced learning
55
+
56
+ ## 📚 Process Categories
57
+
58
+ ### 🧬 Central Dogma
59
+ - DNA Replication
60
+ - Transcription
61
+ - Translation
62
+ - RNA Processing
63
+ - Post-translational Modifications
64
+
65
+ ### Metabolic Pathways
66
+ - Glycolysis
67
+ - Krebs Cycle (TCA)
68
+ - Oxidative Phosphorylation
69
+ - Gluconeogenesis
70
+ - Pentose Phosphate Pathway
71
+
72
+ ### 📡 Cell Signaling
73
+ - MAPK Pathway
74
+ - PI3K/AKT Pathway
75
+ - Wnt Signaling
76
+ - Notch Pathway
77
+ - JAK-STAT Pathway
78
+
79
+ ### 🔄 Protein Processes
80
+ - Protein Folding
81
+ - Ubiquitination
82
+ - Autophagy
83
+ - Proteasome Degradation
84
+ - Chaperone Systems
85
+
86
+ ### 🌱 Photosynthesis
87
+ - Light Reactions
88
+ - Calvin Cycle
89
+ - C4 Pathway
90
+ - CAM Photosynthesis
91
+ - Photorespiration
92
+
93
+ ### 🔧 DNA Repair
94
+ - Base Excision Repair
95
+ - Nucleotide Excision Repair
96
+ - Mismatch Repair
97
+ - Double-strand Break Repair
98
+ - Direct Repair
99
+
100
+ ## 🗄️ Database
101
+
102
+ All GLMP flowcharts are stored as JSON files in Google Cloud Storage:
103
+
104
  ```
105
+ gs://regal-scholar-453620-r7-podcast-storage/glmp-v2/
 
106
  ```
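
The `gs://` prefix above can be mapped to the `https://storage.googleapis.com/<bucket>/<object>` form that the project already uses for its hosted assets. The sketch below is a minimal illustration using only the standard library. The file name `glycolysis.json` is hypothetical, and whether objects under this prefix are publicly readable is an assumption, not something the README states.

```python
from urllib.parse import quote

# Prefix quoted in the README; object names below it are not documented here.
GLMP_PREFIX = "gs://regal-scholar-453620-r7-podcast-storage/glmp-v2/"


def split_gs_uri(uri: str) -> tuple[str, str]:
    """Split a gs:// URI into (bucket, object_path)."""
    if not uri.startswith("gs://"):
        raise ValueError(f"not a gs:// URI: {uri!r}")
    bucket, _, path = uri[len("gs://"):].partition("/")
    if not bucket:
        raise ValueError(f"missing bucket name: {uri!r}")
    return bucket, path


def public_url(prefix_uri: str, filename: str = "") -> str:
    """Build the storage.googleapis.com URL for an object under a gs:// prefix.

    Assumes the object is publicly readable; private objects need
    authenticated access instead.
    """
    bucket, path = split_gs_uri(prefix_uri)
    # quote() keeps "/" intact and percent-encodes characters unsafe in URLs.
    return f"https://storage.googleapis.com/{bucket}/{quote(path + filename)}"


if __name__ == "__main__":
    # "glycolysis.json" is a hypothetical object name used for illustration.
    print(public_url(GLMP_PREFIX, "glycolysis.json"))
```

Keeping the URI parsing separate from the URL construction makes it straightforward to swap in an authenticated client later without touching callers.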
107
 
108
+ Each file contains:
109
+ - Process name and description
110
+ - Mermaid flowchart syntax
111
+ - Metadata (category, version, references)
112
+ - Links to source papers
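
The four items listed above suggest a record shape along the following lines. This is a sketch under stated assumptions: the key names (`name`, `description`, `mermaid`, `category`, `version`, `references`, `source_papers`) and the flat layout are hypothetical, since the README does not give the actual schema, and the sample record is illustrative rather than a copy of a stored file.

```python
import json
from dataclasses import dataclass, field


@dataclass
class FlowchartRecord:
    """One GLMP flowchart file. All field names are assumed, not documented."""
    name: str
    description: str
    mermaid: str          # Mermaid flowchart syntax
    category: str         # e.g. "Metabolic Pathways"
    version: str
    references: list[str] = field(default_factory=list)
    source_papers: list[str] = field(default_factory=list)


REQUIRED_KEYS = ("name", "description", "mermaid", "category", "version")


def parse_record(text: str) -> FlowchartRecord:
    """Parse a JSON document and reject records with missing required keys."""
    data = json.loads(text)
    missing = [key for key in REQUIRED_KEYS if not data.get(key)]
    if missing:
        raise ValueError(f"missing required fields: {', '.join(missing)}")
    return FlowchartRecord(
        name=data["name"],
        description=data["description"],
        mermaid=data["mermaid"],
        category=data["category"],
        version=data["version"],
        references=list(data.get("references", [])),
        source_papers=list(data.get("source_papers", [])),
    )


# Illustrative record; the values are placeholders, not stored GLMP data.
SAMPLE = json.dumps({
    "name": "Glycolysis",
    "description": "First step of glucose breakdown (illustrative excerpt).",
    "mermaid": "flowchart TD\n  A[Glucose] --> B[Glucose-6-phosphate]",
    "category": "Metabolic Pathways",
    "version": "1.0",
    "references": [],
    "source_papers": [],
})

if __name__ == "__main__":
    record = parse_record(SAMPLE)
    print(record.name, "/", record.category)
```

Validating required keys at load time lets a viewer skip or report an incomplete file instead of failing while rendering.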
113
 
114
+ ## 🚀 How to Use
 
 
 
 
115
 
116
+ 1. **Browse the interactive viewer** on the main page
117
+ 2. **Select a biological process** from the dropdown
118
+ 3. **Explore the flowchart** showing each step and decision point
119
+ 4. **Follow links to source papers** for deeper understanding
120
+ 5. **Integrate with CopernicusAI podcasts** for audio learning
121
 
122
+ ## 🔗 Related Projects
 
 
 
 
123
 
124
+ - [Programming Framework](https://huggingface.co/spaces/garywelz/programming_framework) - The meta-tool powering GLMP
125
+ - [CopernicusAI](https://huggingface.co/spaces/garywelz/copernicusai) - Knowledge engine integrating GLMP with AI podcasts
126
 
127
+ ## How to Cite This Work
128
 
129
+ Welz, G. (2024–2025). *Genome Logic Modeling Project (GLMP)*.
130
+ Hugging Face Spaces. https://huggingface.co/spaces/garywelz/glmp
131
 
132
+ This project serves as a testbed for integrating AI systems into scientific reasoning pipelines, enabling both human and AI agents to analyze, compare, and extend biological knowledge structures.
133
 
134
+ GLMP is designed as infrastructure for AI-assisted science, not as a static visualization collection.
135
 
136
+ ## 💻 Technology Stack
137
 
138
+ - **LLM**: Google Gemini 2.0 Flash
139
+ - **Visualization**: Mermaid.js
140
+ - **Storage**: Google Cloud Storage
141
+ - **Format**: JSON
142
+ - **Frontend**: Static HTML + Tailwind CSS
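
To show how the pieces of this stack could fit together, the sketch below wraps a Mermaid source string in a static HTML page styled with the Tailwind CDN script. It is not the project's actual viewer. The function name `render_page` is hypothetical, and the Mermaid CDN URL and pinned major version are assumptions that should be checked against the Mermaid documentation before use.

```python
import html

# Assumed CDN location for the Mermaid ES module; verify before relying on it.
MERMAID_CDN = "https://cdn.jsdelivr.net/npm/mermaid@11/dist/mermaid.esm.min.mjs"


def render_page(title: str, mermaid_source: str) -> str:
    """Return a self-contained HTML page that renders one Mermaid flowchart.

    The title and diagram source are HTML-escaped so that characters such
    as "<", ">" and "&" cannot break the surrounding markup.
    """
    safe_title = html.escape(title)
    safe_source = html.escape(mermaid_source)
    return f"""<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>{safe_title}</title>
  <script src="https://cdn.tailwindcss.com"></script>
</head>
<body class="p-8">
  <h1 class="text-2xl font-bold mb-4">{safe_title}</h1>
  <pre class="mermaid">{safe_source}</pre>
  <script type="module">
    import mermaid from "{MERMAID_CDN}";
    mermaid.initialize({{ startOnLoad: true }});
  </script>
</body>
</html>"""


if __name__ == "__main__":
    page = render_page("Glycolysis", "flowchart TD\n  A[Glucose] --> B[Glucose-6-phosphate]")
    with open("glycolysis.html", "w", encoding="utf-8") as handle:
        handle.write(page)
```

Generating the page as a plain string keeps the output compatible with a static Space (`sdk: static`), where no server-side code runs.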
143
 
144
  ---
145
 
146
+ **Part of the CopernicusAI Knowledge Engine**
 
 
 
 
 
 
147
 
148
+ © 2025 Gary Welz. All rights reserved.
149
 
 
index.html CHANGED
@@ -3,11 +3,12 @@
3
  <head>
4
  <meta charset="UTF-8">
5
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
6
- <title>CopernicusAI - Research-Driven Podcast Generation Platform</title>
7
  <script src="https://cdn.tailwindcss.com"></script>
 
8
  <style>
9
  .gradient-bg {
10
- background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
11
  }
12
  .card-hover {
13
  transition: transform 0.3s ease, box-shadow 0.3s ease;
@@ -16,12 +17,8 @@
16
  transform: translateY(-4px);
17
  box-shadow: 0 20px 40px rgba(0,0,0,0.15);
18
  }
19
- .stat-number {
20
- font-size: 2.5rem;
21
- font-weight: bold;
22
- background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
23
- -webkit-background-clip: text;
24
- -webkit-text-fill-color: transparent;
25
  }
26
  </style>
27
  </head>
@@ -30,710 +27,295 @@
30
  <header class="gradient-bg text-white">
31
  <div class="max-w-7xl mx-auto px-4 sm:px-6 lg:px-8 py-16">
32
  <div class="text-center">
33
- <div class="text-6xl mb-4">🔬</div>
34
- <h1 class="text-5xl font-bold mb-4">CopernicusAI</h1>
35
- <p class="text-xl opacity-90 mb-6">Knowledge Engine for Scientific Discovery</p>
36
- <p class="text-lg opacity-75 max-w-4xl mx-auto">
37
- A collaborative research platform that transforms cutting-edge scientific research into accessible,
38
- multi-format tools for collective knowledge exploration. These are research instruments—like microscopes
39
- for observing the collective knowledge of humanity—enabling hypothesis formation, testing, and discovery
40
- across scientific disciplines.
41
  </p>
42
  </div>
43
  </div>
44
  </header>
45
 
46
- <!-- Abstract/Summary -->
47
  <section class="max-w-7xl mx-auto px-4 sm:px-6 lg:px-8 py-12">
48
- <div class="bg-white rounded-xl shadow-lg p-8 mb-8 border-l-4 border-purple-600">
49
- <h2 class="text-2xl font-bold text-gray-900 mb-4">📋 Summary</h2>
50
- <p class="text-lg text-gray-700 leading-relaxed mb-3">
51
- <strong>CopernicusAI</strong> is an operational research platform that synthesizes scientific literature from 250+ million papers into AI-generated podcasts, integrates with a knowledge graph of 23,246 indexed papers, and provides collaborative tools for research discovery. The system demonstrates production-ready multi-source research synthesis with full citation tracking and evidence-based content generation requiring minimum 3 research sources per episode.
52
- </p>
53
- <p class="text-gray-600">
54
- The platform includes a fully operational Research Tools Dashboard (deployed December 2025) with interactive knowledge graph visualization, vector search, and RAG capabilities, enabling researchers to explore, query, and synthesize scientific knowledge across disciplines.
55
- </p>
56
- </div>
57
- </section>
58
-
59
- <!-- System Architecture -->
60
- <section class="max-w-7xl mx-auto px-4 sm:px-6 lg:px-8 py-8">
61
- <div class="bg-white rounded-xl shadow-lg p-8 mb-8">
62
- <h2 class="text-3xl font-bold text-gray-900 mb-6">🏗️ Knowledge Engine Architecture</h2>
63
- <p class="text-lg text-gray-700 leading-relaxed mb-6">
64
- The CopernicusAI Knowledge Engine systematically transforms information into knowledge through integrated capabilities. At its core, a knowledge engine is any system—biological or artificial—that systematically transforms information into knowledge, performing work by converting raw materials (information) into useful outputs (knowledge, understanding, insights).
65
- </p>
66
- <p class="text-gray-700 mb-6">
67
- The system architecture demonstrates the integration of data ingestion, processing, storage, and query capabilities across multiple modalities—research papers, process descriptions, and media content—enabling comprehensive knowledge discovery and synthesis.
68
- </p>
69
 
70
- <div class="bg-gray-50 rounded-lg p-6 mb-6">
71
- <img src="https://storage.googleapis.com/regal-scholar-453620-r7-podcast-storage/images/copernicusai_architecture.png"
72
- alt="Knowledge Engine Architecture Diagram showing data ingestion, processing, storage, and query layers"
73
- class="w-full rounded-lg shadow-md"
74
- style="max-width: 100%; height: auto;">
75
- <p class="text-sm text-gray-600 mt-4 italic text-center">
76
- Figure: Knowledge Engine Architecture - Data flow from ingestion through processing and storage to query interfaces
77
  </p>
78
  </div>
79
 
80
- <div class="grid md:grid-cols-3 gap-6 mt-6">
81
- <div class="bg-blue-50 rounded-lg p-6">
82
- <h3 class="text-lg font-semibold text-gray-900 mb-3">📥 Data Ingestion</h3>
83
- <p class="text-sm text-gray-700">
84
- Multi-source acquisition from academic databases (PubMed, arXiv, NASA ADS), literature sources (textbooks, reviews), and educational content (videos, transcripts), with quality assessment and type classification.
85
- </p>
86
- </div>
87
-
88
- <div class="bg-green-50 rounded-lg p-6">
89
- <h3 class="text-lg font-semibold text-gray-900 mb-3">⚙️ Processing & Storage</h3>
90
- <p class="text-sm text-gray-700">
91
- LLM-powered entity extraction and process logic extraction, structured data storage (JSON metadata, Mermaid flowcharts, transcripts), and specialized databases for papers, processes, and media.
92
- </p>
93
  </div>
94
 
95
- <div class="bg-purple-50 rounded-lg p-6">
96
- <h3 class="text-lg font-semibold text-gray-900 mb-3">🔍 Query & Output</h3>
97
- <p class="text-sm text-gray-700">
98
- Multiple access interfaces including RAG queries, vector search, knowledge graph visualization, API endpoints, and web interfaces, converging to unified knowledge output.
99
- </p>
 
 
 
100
  </div>
101
  </div>
102
- </div>
103
- </section>
104
 
105
- <!-- Prior Work: CopernicusAI Research Interface -->
106
- <section class="max-w-7xl mx-auto px-4 sm:px-6 lg:px-8 py-12">
107
- <div class="bg-gradient-to-r from-purple-50 to-blue-50 rounded-xl shadow-lg p-8 mb-8">
108
- <h2 class="text-3xl font-bold text-gray-900 mb-4">Prior Work & Current Status</h2>
109
-
110
- <div class="bg-white rounded-lg p-6 mb-6">
111
- <h3 class="text-xl font-semibold text-gray-900 mb-3">Prior Work (2024-2025)</h3>
112
- <p class="text-lg text-gray-700 leading-relaxed mb-4">
113
- CopernicusAI is an active research prototype exploring AI-generated audio briefings as an interface for assisted scientific research.
114
- </p>
115
- <p class="text-gray-700 mb-4">
116
- The system allows any user to generate, refine, and share AI-generated science podcasts based on structured prompts, enabling rapid orientation to a topic, iterative deepening, and personalized research briefings.
117
- </p>
118
- <p class="text-gray-700 mb-4">
119
- Rather than functioning as a static content platform, CopernicusAI supports collectively generated and shared research artifacts, analogous to community-driven knowledge platforms (e.g., discussion forums), but grounded in scientific sources and metadata-aware workflows.
120
  </p>
121
- <div class="bg-blue-50 rounded-lg p-4 mt-4">
122
- <h3 class="font-semibold text-gray-900 mb-2">This work demonstrates technical feasibility for:</h3>
123
  <ul class="text-gray-700 space-y-1">
124
- <li>• AI-assisted research briefing and orientation</li>
125
- <li>• Iterative question refinement via conversational interfaces</li>
126
- <li>• Integration of text, audio, and metadata in research workflows</li>
 
 
 
 
127
  </ul>
128
  </div>
129
- </div>
130
-
131
- <div class="bg-green-50 border-2 border-green-200 rounded-lg p-6">
132
- <h3 class="text-xl font-semibold text-gray-900 mb-3">Current Implementation (December 2025)</h3>
133
- <p class="text-gray-700 mb-3">
134
- The Research Tools Dashboard is <strong>fully operational</strong> and deployed to Google Cloud Run, providing unified access to all components with interactive knowledge graph visualization, vector search, RAG queries, and content browsing.
135
- </p>
136
- <p class="text-sm text-gray-600">
137
- See the "Knowledge Engine Ecosystem" section below for details.
138
  </p>
139
  </div>
140
  </div>
141
  </section>
142
 
143
- <!-- Mission & Vision -->
144
- <section class="max-w-7xl mx-auto px-4 sm:px-6 lg:px-8 py-12">
145
- <div class="bg-white rounded-xl shadow-lg p-8 mb-8">
146
- <h2 class="text-3xl font-bold text-gray-900 mb-4">🎯 Mission & Vision</h2>
147
- <p class="text-lg text-gray-700 leading-relaxed mb-4">
148
- Inspired by Nicolaus Copernicus who challenged accepted knowledge with evidence and rigorous analysis,
149
- <strong>CopernicusAI</strong> creates collaborative research tools that enable collective participation in
150
- scientific discovery. These platforms are instruments for exploring humanity's collective knowledge—tools for
151
- hypothesis formation, testing, and collaborative research, not just educational content.
152
- </p>
153
- <p class="text-gray-600">
154
- Just as a microscope enables observation of the microscopic world, CopernicusAI tools enable observation and
155
- exploration of humanity's collective knowledge. Subscribers collaborate to prompt, generate, and refine research
156
- content—sharing discoveries publicly or keeping them private. As large language models (LLMs) and AI systems
157
- gain unprecedented knowledge, CopernicusAI provides the infrastructure for human-AI collaborative knowledge
158
- exploration, with evidence-based truth-seeking as our guiding principle.
159
- </p>
 
 
160
  </div>
161
  </section>
162
 
163
- <!-- Knowledge Engine Ecosystem -->
164
- <section class="max-w-7xl mx-auto px-4 sm:px-6 lg:px-8 py-8">
165
- <div class="bg-gradient-to-r from-purple-50 to-indigo-50 rounded-xl shadow-lg p-8 mb-8">
166
- <h2 class="text-3xl font-bold text-gray-900 mb-6 text-center">🧩 CopernicusAI Knowledge Engine</h2>
167
- <p class="text-lg text-gray-700 mb-6 text-center max-w-4xl mx-auto">
168
- An integrated ecosystem of research and collaboration tools designed to assist scientists in their workflow,
169
- from research discovery through knowledge synthesis to multi-format content generation.
170
- <a href="https://storage.googleapis.com/regal-scholar-453620-r7-podcast-storage/copernicusai-public-reviewer.html"
171
- target="_blank" rel="noopener noreferrer"
172
- class="text-blue-600 hover:underline font-medium">
173
- View Public Project Interface →
174
- </a>
175
- </p>
176
-
177
- <div class="grid md:grid-cols-2 lg:grid-cols-3 gap-6 mb-6">
178
- <div class="bg-white rounded-lg p-6 card-hover border-2 border-purple-300">
179
- <div class="text-3xl mb-3">🎙️</div>
180
- <h3 class="text-lg font-semibold text-gray-900 mb-2">CopernicusAI Podcast Generation</h3>
181
- <p class="text-sm text-gray-600 mb-3">Synthesis & distribution platform for AI-powered research briefing podcast generation</p>
182
- <a href="https://www.copernicusai.fyi" target="_blank" rel="noopener noreferrer" class="text-xs text-blue-600 hover:underline">Visit Website →</a>
183
- </div>
184
-
185
- <div class="bg-white rounded-lg p-6 card-hover">
186
- <div class="text-3xl mb-3">🛠️</div>
187
- <h3 class="text-lg font-semibold text-gray-900 mb-2">Programming Framework</h3>
188
- <p class="text-sm text-gray-600 mb-3">Foundational meta-tool for universal process analysis across disciplines</p>
189
- <a href="https://huggingface.co/spaces/garywelz/programming_framework" target="_blank" rel="noopener noreferrer" class="text-xs text-blue-600 hover:underline">Explore →</a>
190
- </div>
191
-
192
- <div class="bg-white rounded-lg p-6 card-hover">
193
- <div class="text-3xl mb-3">🧬</div>
194
- <h3 class="text-lg font-semibold text-gray-900 mb-2">Genome Logic Modeling Project</h3>
195
- <p class="text-sm text-gray-600 mb-3">Mermaid markdown format flowcharts modeling 100+ biochemical processes in Yeast and E. Coli</p>
196
- <a href="https://huggingface.co/spaces/garywelz/glmp" target="_blank" rel="noopener noreferrer" class="text-xs text-blue-600 hover:underline">Explore →</a>
197
- </div>
198
-
199
- <div class="bg-white rounded-lg p-6 card-hover">
200
- <div class="text-3xl mb-3">📚</div>
201
- <h3 class="text-lg font-semibold text-gray-900 mb-2">Research Paper Database</h3>
202
- <p class="text-sm text-gray-600 mb-3">Core data infrastructure for research paper metadata and citation networks</p>
203
- <a href="https://huggingface.co/spaces/garywelz/metadata_database" target="_blank" rel="noopener noreferrer" class="text-xs text-blue-600 hover:underline">Explore →</a>
204
- </div>
205
-
206
- <div class="bg-white rounded-lg p-6 card-hover">
207
- <div class="text-3xl mb-3">🎬</div>
208
- <h3 class="text-lg font-semibold text-gray-900 mb-2">Science Video Database</h3>
209
- <p class="text-sm text-gray-600 mb-3">Multi-modal content with transcript-based search for scientific videos</p>
210
- <a href="https://storage.googleapis.com/regal-scholar-453620-r7-podcast-storage/videos-database-table.html" target="_blank" rel="noopener noreferrer" class="text-xs text-blue-600 hover:underline">Explore →</a>
211
- </div>
212
-
213
- <div class="bg-white rounded-lg p-6 card-hover border-2 border-green-300">
214
- <div class="text-3xl mb-3">🗺️</div>
215
- <h3 class="text-lg font-semibold text-gray-900 mb-2">Research Tools Dashboard</h3>
216
- <p class="text-sm text-gray-600 mb-3">✅ Prototype web interface for testing knowledge graph, vector search, RAG queries, and content browsing</p>
217
- <a href="https://copernicus-frontend-phzp4ie2sq-uc.a.run.app/knowledge-engine" target="_blank" rel="noopener noreferrer" class="text-xs text-blue-600 hover:underline">Live System →</a>
218
  </div>
219
  </div>
220
  </div>
221
  </section>
222
 
223
- <!-- Key Statistics -->
224
  <section class="max-w-7xl mx-auto px-4 sm:px-6 lg:px-8 py-8">
225
- <div class="grid md:grid-cols-2 lg:grid-cols-4 gap-6 mb-12">
226
- <div class="bg-white rounded-lg shadow-md p-6 text-center">
227
- <div class="stat-number mb-2">23,246</div>
228
- <div class="text-gray-600 font-semibold">Research Papers</div>
229
- <div class="text-sm text-gray-500 mt-1">Indexed in Knowledge Engine (As of January 2025)</div>
230
- </div>
231
- <div class="bg-white rounded-lg shadow-md p-6 text-center">
232
- <div class="stat-number mb-2">314</div>
233
- <div class="text-gray-600 font-semibold">Processes</div>
234
- <div class="text-sm text-gray-500 mt-1">Visualized across 6 databases (As of January 2025)</div>
235
- </div>
236
- <div class="bg-white rounded-lg shadow-md p-6 text-center">
237
- <div class="stat-number mb-2">753</div>
238
- <div class="text-gray-600 font-semibold">Videos</div>
239
- <div class="text-sm text-gray-500 mt-1">Science videos indexed (As of January 2025)</div>
240
- </div>
241
  <div class="bg-white rounded-lg shadow-md p-6 text-center">
242
- <div class="stat-number mb-2">79</div>
243
- <div class="text-gray-600 font-semibold">Podcasts</div>
244
- <div class="text-sm text-gray-500 mt-1">Generated across 5 disciplines (As of January 2025)</div>
 
 
 
 
 
245
  </div>
246
  </div>
247
  </section>
248
 
249
- <!-- Core Platform Capabilities -->
250
  <section class="max-w-7xl mx-auto px-4 sm:px-6 lg:px-8 py-8">
251
- <h2 class="text-3xl font-bold text-gray-900 mb-8 text-center">🌟 Core Platform Capabilities</h2>
252
 
253
- <div class="space-y-8">
254
- <!-- AI Podcast Generation -->
255
- <div class="bg-white rounded-xl shadow-lg p-8">
256
- <div class="flex items-start mb-4">
257
- <span class="text-4xl mr-4">🎙️</span>
258
- <div class="flex-1">
259
- <h3 class="text-2xl font-bold text-gray-900 mb-3">AI-Powered Podcast Generation</h3>
260
- <p class="text-gray-600 mb-4">
261
- Collaborative research platform where subscribers prompt and generate multi-voice AI podcasts
262
- (5-10 minutes) synthesizing research from multiple academic sources. Subscribers can share their
263
- podcasts publicly or keep them private. Evidence-based content generation requiring minimum 3
264
- research sources per episode.
265
- </p>
266
- <div class="grid md:grid-cols-2 gap-4 mt-4">
267
- <div>
268
- <h4 class="font-semibold text-gray-800 mb-2">Key Features:</h4>
269
- <ul class="text-sm text-gray-600 space-y-1">
270
- <li>✓ Comprehensive research integration (8+ databases)</li>
271
- <li>✓ Professional multi-speaker dialogue</li>
272
- <li>✓ AI-generated scientific visualizations</li>
273
- <li>✓ RSS feed distribution</li>
274
- <li>✓ Quality scoring & relevance ranking</li>
275
- <li>✓ Paradigm shift identification</li>
276
- </ul>
277
- </div>
278
- <div>
279
- <h4 class="font-semibold text-gray-800 mb-2">Research Integration:</h4>
280
- <ul class="text-sm text-gray-600 space-y-1">
281
- <li>✓ Real-time discovery from 8+ APIs</li>
282
- <li>✓ Parallel search across databases</li>
283
- <li>✓ Automatic citation extraction</li>
284
- <li>✓ Source validation & verification</li>
285
- <li>✓ Interdisciplinary connection analysis</li>
286
- </ul>
287
- </div>
288
- </div>
289
- </div>
290
- </div>
291
- </div>
292
-
293
- <!-- LLM Integration -->
294
- <div class="bg-white rounded-xl shadow-lg p-8">
295
- <div class="flex items-start mb-4">
296
- <span class="text-4xl mr-4">🤖</span>
297
- <div class="flex-1">
298
- <h3 class="text-2xl font-bold text-gray-900 mb-3">Advanced LLM Integration</h3>
299
- <p class="text-gray-600 mb-4">Multi-model architecture with intelligent model selection:</p>
300
- <div class="grid md:grid-cols-2 gap-4">
301
- <div>
302
- <h4 class="font-semibold text-gray-800 mb-2">Primary Models:</h4>
303
- <ul class="text-sm text-gray-600 space-y-1">
304
- <li>• <strong>Google Gemini 3</strong> - Latest research analysis and content generation</li>
305
- <li>• <strong>OpenAI GPT-4/GPT-3.5</strong> - Content synthesis and quality validation</li>
306
- <li>• <strong>Anthropic Claude 3</strong> (Sonnet, Haiku) - Alternative reasoning paths</li>
307
- <li>• <strong>ElevenLabs TTS</strong> - Multi-voice text-to-speech synthesis</li>
308
- </ul>
309
- </div>
310
- <div>
311
- <h4 class="font-semibold text-gray-800 mb-2">Capabilities:</h4>
312
- <ul class="text-sm text-gray-600 space-y-1">
313
- <li>• Multi-paper analysis & synthesis</li>
314
- <li>• Paradigm shift detection</li>
315
- <li>• Entity extraction (genes, proteins, compounds)</li>
316
- <li>• Citation tracking & cross-references</li>
317
- <li>• Content quality scoring</li>
318
- </ul>
319
- </div>
320
- </div>
321
- </div>
322
- </div>
323
  </div>
324
 
325
- <!-- Research Resources -->
326
- <div class="bg-white rounded-xl shadow-lg p-8">
327
- <div class="flex items-start mb-4">
328
- <span class="text-4xl mr-4">📊</span>
329
- <div class="flex-1">
330
- <h3 class="text-2xl font-bold text-gray-900 mb-3">Research Resource Access</h3>
331
- <p class="text-gray-600 mb-4">
332
- Comprehensive academic database coverage with <strong>250+ million research papers</strong> accessible
333
- through integrated APIs.
334
- </p>
335
- <div class="grid md:grid-cols-2 gap-4">
336
- <div>
337
- <h4 class="font-semibold text-gray-800 mb-2">Academic Databases:</h4>
338
- <ul class="text-sm text-gray-600 space-y-1">
339
- <li>• PubMed/NCBI (~30+ million papers)</li>
340
- <li>• arXiv (~2+ million preprints)</li>
341
- <li>• NASA ADS (~15+ million papers)</li>
342
- <li>• Zenodo (100K+ datasets)</li>
343
- <li>• bioRxiv/medRxiv (preprints)</li>
344
- <li>• CORE (~200+ million papers)</li>
345
- <li>• Google Scholar (comprehensive)</li>
346
- <li>• News API (current events)</li>
347
- <li>• YouTube Data API (academic videos)</li>
348
- </ul>
349
- </div>
350
- </div>
351
- </div>
352
- </div>
353
  </div>
354
 
355
- <!-- Audio and Video Podcast Production -->
356
- <div class="bg-white rounded-xl shadow-lg p-8">
357
- <div class="flex items-start mb-4">
358
- <span class="text-4xl mr-4">🎙️</span>
359
- <div class="flex-1">
360
- <h3 class="text-2xl font-bold text-gray-900 mb-3">Audio and Video Podcast Production</h3>
361
- <p class="text-gray-600 mb-4">
362
- <strong>Operating Audio Podcast System:</strong> Full production and distribution platform for subscriber-generated
363
- podcasts. Users can prompt, generate, publish, and distribute audio podcasts with RSS feed support for
364
- Spotify, Apple Podcasts, and Google Podcasts.
365
- </p>
366
- <div class="bg-green-50 rounded-lg p-4 mb-4">
367
- <h4 class="font-semibold text-gray-800 mb-2">Current Audio Capabilities (Operational):</h4>
368
- <ul class="text-sm text-gray-700 space-y-1">
369
- <li>✓ Multi-voice AI podcast generation</li>
370
- <li>✓ Research-driven content creation</li>
371
- <li>✓ RSS feed distribution</li>
372
- <li>✓ Public and private podcast options</li>
373
- <li>✓ Professional audio quality</li>
374
- </ul>
375
- </div>
376
- <div class="bg-blue-50 rounded-lg p-4 mt-4">
377
- <h4 class="font-semibold text-gray-800 mb-2">Video Production (Future - Phase 2+):</h4>
378
- <p class="text-sm text-gray-700 mb-2">Advanced video features planned for future development:</p>
379
- <ul class="text-sm text-gray-700 space-y-2">
380
- <li>• <strong>Visual Content Integration:</strong> Automated extraction from papers, web scraping, JSON database integration</li>
381
- <li>• <strong>Dynamic Visualizations:</strong> Scientific animations, real-time charts, LaTeX rendering</li>
382
- <li>• <strong>External Video Quoting:</strong> YouTube segment extraction with attribution & fair use compliance</li>
383
- <li>• <strong>Advanced Composition:</strong> Multi-layer video, auto subtitles, text overlays, professional transitions</li>
384
- </ul>
385
- <p class="text-xs text-gray-600 mt-2">
386
- See: <a href="https://huggingface.co/spaces/garywelz/sciencevideodb" class="text-blue-600 hover:underline">Science Video Database</a> - Companion project for research video content management.
387
- </p>
388
- </div>
389
- </div>
390
- </div>
391
  </div>
392
 
393
- <!-- Research Papers Metadata Database -->
394
- <div class="bg-white rounded-xl shadow-lg p-8">
395
- <div class="flex items-start mb-4">
396
- <span class="text-4xl mr-4">📚</span>
397
- <div class="flex-1">
398
- <h3 class="text-2xl font-bold text-gray-900 mb-3">Research Papers Metadata Database (Phase 2)</h3>
399
- <p class="text-gray-600 mb-4">
400
- A centralized <strong>metadata repository</strong> (not a file archive) providing structured JSON objects
401
- with AI-powered preprocessing.
402
- </p>
403
- <div class="grid md:grid-cols-2 gap-4">
404
- <div>
405
- <h4 class="font-semibold text-gray-800 mb-2">Structured JSON Objects:</h4>
406
- <ul class="text-sm text-gray-600 space-y-1">
407
- <li>• DOI, arXiv ID, publication info</li>
408
- <li>• Abstracts & key findings</li>
409
- <li>• Extracted entities (genes, proteins, compounds, equations)</li>
410
- <li>• Citation networks & cross-references</li>
411
- <li>• Paradigm shift indicators</li>
412
- <li>• Quality scores & relevance metrics</li>
413
- </ul>
414
- </div>
415
- <div>
416
- <h4 class="font-semibold text-gray-800 mb-2">AI-Powered Preprocessing:</h4>
417
- <ul class="text-sm text-gray-600 space-y-1">
418
- <li>• LLM-based entity extraction</li>
419
- <li>• Automatic categorization</li>
420
- <li>• Keyword extraction & semantic tagging</li>
421
- <li>• Citation tracking & mapping</li>
422
- <li>• Quality assessment</li>
423
- <li>• RESTful API access</li>
424
- </ul>
425
- </div>
426
- </div>
427
- </div>
428
- </div>
429
  </div>
430
- </div>
431
- </section>
432
 
433
- <!-- Methodological Details -->
434
- <section class="max-w-7xl mx-auto px-4 sm:px-6 lg:px-8 py-8">
435
- <div class="bg-white rounded-xl shadow-lg p-8 mb-8">
436
- <h2 class="text-3xl font-bold text-gray-900 mb-6">🔬 Methodology & System Design</h2>
437
-
438
- <div class="grid md:grid-cols-2 gap-6 mb-6">
439
- <div class="bg-blue-50 rounded-lg p-6">
440
- <h3 class="text-xl font-semibold text-gray-900 mb-4">Multi-Source Validation Process</h3>
441
- <p class="text-gray-700 mb-3">
442
- The system requires a <strong>minimum of 3 research sources</strong> per podcast episode. Each source is:
443
- </p>
444
- <ul class="text-sm text-gray-700 space-y-2">
445
- <li>• Retrieved from authoritative academic databases (PubMed, arXiv, NASA ADS, etc.)</li>
446
- <li>• Validated for authenticity and publication status</li>
447
- <li>• Scored for quality and relevance to the research topic</li>
448
- <li>• Cross-referenced to verify consistency and eliminate conflicting information</li>
449
- <li>• Processed through parallel API queries for comprehensive coverage</li>
450
- </ul>
451
- </div>
452
-
453
- <div class="bg-green-50 rounded-lg p-6">
454
- <h3 class="text-xl font-semibold text-gray-900 mb-4">Quality Assurance Mechanisms</h3>
455
- <ul class="text-sm text-gray-700 space-y-2">
456
- <li>• <strong>Source Verification:</strong> Automated checking of DOI, arXiv IDs, and publication metadata</li>
457
- <li>• <strong>Relevance Scoring:</strong> LLM-based assessment of paper relevance to query</li>
458
- <li>• <strong>Paradigm Shift Detection:</strong> Identification of revolutionary vs. incremental research</li>
459
- <li>• <strong>Citation Extraction:</strong> Automatic extraction and formatting of citations</li>
460
- <li>• <strong>Content Validation:</strong> Multi-model verification (Gemini, GPT-4, Claude) for accuracy</li>
461
- </ul>
462
- </div>
463
- </div>
464
-
465
- <div class="bg-purple-50 rounded-lg p-6 mb-6">
466
- <h3 class="text-xl font-semibold text-gray-900 mb-4">Citation Extraction & Verification</h3>
467
- <p class="text-gray-700 mb-3">
468
- The system automatically extracts and formats citations from research papers:
469
- </p>
470
- <ul class="text-sm text-gray-700 space-y-2">
471
- <li>• DOI resolution and metadata enrichment</li>
472
- <li>• arXiv ID parsing and preprint identification</li>
473
- <li>• Author, title, and publication information extraction</li>
474
- <li>• Cross-reference linking between related papers</li>
475
- <li>• Citation network analysis for relationship mapping</li>
476
  </ul>
477
  </div>
478
-
479
- <div class="bg-orange-50 rounded-lg p-6">
480
- <h3 class="text-xl font-semibold text-gray-900 mb-4">Paradigm Shift Detection Implementation</h3>
481
- <p class="text-gray-700 mb-3">
482
- The system uses LLM analysis to identify paradigm-shifting research by:
483
- </p>
484
- <ul class="text-sm text-gray-700 space-y-2">
485
- <li>• Analyzing citation patterns and impact metrics</li>
486
- <li>• Detecting novel methodologies or breakthrough discoveries</li>
487
- <li>• Comparing against established knowledge frameworks</li>
488
- <li>• Identifying interdisciplinary connections and cross-domain insights</li>
489
- <li>• Flagging research that challenges existing paradigms</li>
490
  </ul>
491
  </div>
492
  </div>
493
- </section>
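The removed Methodology section describes a minimum of 3 research sources per episode, with identifier checks (DOI, arXiv ID), relevance scoring, and cross-referencing. The sketch below illustrates that filtering idea only; it is not the platform's code. The `Source` type, the `identifier` and `select_sources` helpers, the two regular expressions, and the 0.5 relevance threshold are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

MIN_SOURCES = 3  # stated on the page: minimum of 3 research sources per episode

# Assumed, deliberately loose patterns: a generic DOI shape and the
# post-2007 arXiv identifier shape (e.g. 2301.12345v2).
DOI_RE = re.compile(r"^10\.\d{4,9}/\S+$")
ARXIV_RE = re.compile(r"^\d{4}\.\d{4,5}(v\d+)?$")


@dataclass
class Source:
    """Hypothetical record for one candidate paper."""
    title: str
    doi: Optional[str] = None
    arxiv_id: Optional[str] = None
    relevance: float = 0.0  # assumed score in [0, 1]


def identifier(src: Source) -> Optional[str]:
    """Return a normalized key for the first well-formed ID, else None."""
    if src.doi and DOI_RE.match(src.doi):
        return "doi:" + src.doi.lower()
    if src.arxiv_id and ARXIV_RE.match(src.arxiv_id):
        return "arxiv:" + src.arxiv_id.lower()
    return None


def select_sources(candidates: List[Source], min_relevance: float = 0.5) -> List[Source]:
    """Keep well-identified, relevant, de-duplicated sources, best first."""
    seen = set()
    kept = []
    for src in sorted(candidates, key=lambda s: s.relevance, reverse=True):
        key = identifier(src)
        if key is None or key in seen or src.relevance < min_relevance:
            continue
        seen.add(key)
        kept.append(src)
    if len(kept) < MIN_SOURCES:
        raise ValueError(f"need at least {MIN_SOURCES} sources, got {len(kept)}")
    return kept
```

A pattern match only shows that an identifier is well formed. Confirming publication status, as the section describes, would still require resolving the DOI or arXiv ID against the relevant registry.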
494
 
495
- <!-- Technology Stack -->
496
- <section class="max-w-7xl mx-auto px-4 sm:px-6 lg:px-8 py-8">
497
- <div class="bg-white rounded-xl shadow-lg p-8">
498
- <h2 class="text-3xl font-bold text-gray-900 mb-6">⚙️ Technology Stack</h2>
499
-
500
- <div class="grid md:grid-cols-3 gap-6 mb-6">
501
- <div>
502
- <h3 class="text-lg font-semibold text-gray-800 mb-3">AI & Machine Learning</h3>
503
- <ul class="text-sm text-gray-600 space-y-1">
504
- <li>• Google Gemini 3</li>
505
- <li>• Google Vertex AI (model orchestration)</li>
506
- <li>• OpenAI GPT-4/GPT-3.5</li>
507
- <li>• Anthropic Claude 3</li>
508
- <li>• ElevenLabs TTS</li>
509
- <li>• DALL-E 3</li>
510
- <li>• Cloud Vision API</li>
511
- <li>• Video Intelligence API</li>
512
- </ul>
513
- </div>
514
-
515
- <div>
516
- <h3 class="text-lg font-semibold text-gray-800 mb-3">Backend Infrastructure</h3>
517
- <ul class="text-sm text-gray-600 space-y-1">
518
- <li>• FastAPI (Python)</li>
519
- <li>• Google Cloud Run</li>
520
- <li>• Firestore (NoSQL)</li>
521
- <li>• Cloud Storage</li>
522
- <li>• Cloud Functions</li>
523
- <li>• Cloud Tasks</li>
524
- <li>• Secret Manager</li>
525
- </ul>
526
- </div>
527
-
528
- <div>
529
- <h3 class="text-lg font-semibold text-gray-800 mb-3">Frontend</h3>
530
- <ul class="text-sm text-gray-600 space-y-1">
531
- <li>• Next.js 15.5.7</li>
532
- <li>• Alpine.js</li>
533
- <li>• Tailwind CSS</li>
534
- <li>• Vercel</li>
535
- </ul>
536
- </div>
537
- </div>
538
  </div>
539
  </section>
540
 
541
- <!-- Limitations & Future Work -->
542
  <section class="max-w-7xl mx-auto px-4 sm:px-6 lg:px-8 py-8">
543
- <div class="bg-white rounded-xl shadow-lg p-8 mb-8">
544
- <h2 class="text-3xl font-bold text-gray-900 mb-6">🔍 Limitations & Future Directions</h2>
545
 
546
- <div class="grid md:grid-cols-2 gap-6">
547
- <div class="bg-yellow-50 rounded-lg p-6 border-l-4 border-yellow-400">
548
- <h3 class="text-xl font-semibold text-gray-900 mb-4">Current Limitations</h3>
549
- <ul class="text-sm text-gray-700 space-y-2">
550
- <li>• <strong>Discipline Coverage:</strong> Currently indexing 23,246 papers across multiple disciplines; expansion to additional disciplines in progress</li>
551
- <li>• <strong>Source Bias:</strong> Coverage depends on database API availability and open access policies</li>
552
- <li>• <strong>LLM Accuracy:</strong> Content generation relies on LLM accuracy; multi-source validation mitigates but doesn't eliminate errors</li>
553
- <li>• <strong>Real-Time Updates:</strong> Knowledge graph updates require manual or scheduled processing cycles</li>
554
- <li>• <strong>Language:</strong> Currently optimized for English-language research papers</li>
555
- </ul>
556
  </div>
557
-
558
- <div class="bg-blue-50 rounded-lg p-6 border-l-4 border-blue-400">
559
- <h3 class="text-xl font-semibold text-gray-900 mb-4">Future Development</h3>
560
- <ul class="text-sm text-gray-700 space-y-2">
561
- <li>• <strong>Multi-Discipline Expansion:</strong> Expanding knowledge graph to Biology, Chemistry, Physics, Computer Science</li>
562
- <li>• <strong>Process Databases:</strong> Creating comprehensive flowchart databases for all 5 disciplines (~50 processes each)</li>
563
- <li>• <strong>Advanced Video Features:</strong> Dynamic visualizations, animations, and multi-layer composition</li>
564
- <li>• <strong>Multi-Language Support:</strong> Extending to non-English research papers</li>
565
- <li>• <strong>Enhanced Validation:</strong> Peer review mechanisms and user feedback integration</li>
566
- <li>• <strong>Real-Time Updates:</strong> Automated continuous knowledge graph updates</li>
567
- </ul>
568
  </div>
569
  </div>
570
  </div>
571
  </section>
572
 
573
- <!-- Research & Collaborative Tools -->
574
- <section class="max-w-7xl mx-auto px-4 sm:px-6 lg:px-8 py-8">
575
- <div class="bg-gradient-to-r from-green-50 to-blue-50 rounded-xl p-8">
576
- <h2 class="text-3xl font-bold text-gray-900 mb-6">🔬 Collaborative Research Tools</h2>
577
-
578
- <div class="grid md:grid-cols-2 gap-6 mb-6">
579
- <div>
580
- <h3 class="text-xl font-semibold text-gray-800 mb-3">Collaborative Research Tools</h3>
581
- <p class="text-gray-700 mb-3">
582
- These platforms enable collective participation and collaboration across diverse user communities:
583
- </p>
584
- <ul class="text-gray-700 space-y-2">
585
- <li>• <strong>Researchers</strong> - Tools for hypothesis formation and testing, cross-disciplinary synthesis</li>
586
- <li>• <strong>Collaborators</strong> - Collective knowledge exploration and refinement</li>
587
- <li>• <strong>Subscribers</strong> - Prompt, generate, and share podcasts (public or private)</li>
588
- <li>• <strong>Community</strong> - User suggestions, comments, and collaborative flowchart improvement (GLMP)</li>
589
- </ul>
590
- <p class="text-gray-600 mt-4 italic">
591
- Like a microscope enables observation of the microscopic world, these tools enable observation and
592
- exploration of humanity's collective knowledge.
593
- </p>
594
- </div>
595
-
596
- <div>
597
- <h3 class="text-xl font-semibold text-gray-800 mb-3">Key Innovations</h3>
598
- <ul class="text-gray-700 space-y-2">
599
- <li>• Multi-source validation (min 3 sources)</li>
600
- <li>• Evidence-based generation</li>
601
- <li>• Paradigm shift detection</li>
602
- <li>• Interdisciplinary connections</li>
603
- <li>• Multiple expertise levels</li>
604
- <li>• Full citation tracking</li>
605
- </ul>
606
- </div>
607
- </div>
608
  </div>
609
  </section>
610
 
611
- <!-- Prior Work & Research Contributions -->
612
  <section class="max-w-7xl mx-auto px-4 sm:px-6 lg:px-8 py-8">
613
- <div class="bg-gradient-to-r from-purple-50 to-blue-50 rounded-xl shadow-lg p-8">
614
- <h2 class="text-3xl font-bold text-gray-900 mb-6">📚 Prior Work & Research Contributions</h2>
615
-
616
- <div class="bg-white rounded-lg p-6 mb-6">
617
- <h3 class="text-xl font-semibold text-gray-900 mb-4">Overview</h3>
618
- <p class="text-gray-700 mb-4">
619
- This platform represents <strong>prior work</strong> that demonstrates foundational research and development
620
- achievements in AI-powered scientific knowledge synthesis, collaborative research tools, and multi-modal content
621
- generation. These contributions establish the technical foundation and proof-of-concept for the broader
622
- <strong>CopernicusAI Knowledge Engine</strong> initiative.
623
- </p>
624
- </div>
625
-
626
- <div class="grid md:grid-cols-2 gap-6 mb-6">
627
- <div class="bg-white rounded-lg p-6">
628
- <h3 class="text-lg font-semibold text-gray-900 mb-3">🔬 Research Contributions</h3>
629
- <ul class="text-sm text-gray-700 space-y-2">
630
- <li>• <strong>AI-Powered Research Synthesis:</strong> Production system for multi-source research synthesis using LLMs</li>
631
- <li>• <strong>Multi-Model Architecture:</strong> Intelligent model selection with Gemini 3, GPT-4, Claude 3</li>
632
- <li>• <strong>Collaborative Platform:</strong> Subscriber-driven content generation with public/private sharing</li>
633
- <li>• <strong>Knowledge Engine Integration:</strong> Architecture for Research Papers DB, Video DB, GLMP, Framework</li>
634
- </ul>
635
- </div>
636
-
637
- <div class="bg-white rounded-lg p-6">
638
- <h3 class="text-lg font-semibold text-gray-900 mb-3">⚙️ Technical Achievements</h3>
639
- <ul class="text-sm text-gray-700 space-y-2">
640
- <li>• <strong>250+ Million Papers:</strong> Accessible via 8+ integrated academic databases</li>
641
- <li>• <strong>79 Episodes:</strong> Generated across 5 scientific disciplines</li>
642
- <li>• <strong>Production Deployment:</strong> Live platform with operational API and RSS distribution</li>
643
- <li>• <strong>Scalable Architecture:</strong> Serverless microservices on Google Cloud</li>
644
- </ul>
645
- </div>
646
- </div>
647
-
648
- <div class="bg-white rounded-lg p-6 mb-6">
649
- <h3 class="text-lg font-semibold text-gray-900 mb-3">🎯 Position Within CopernicusAI Knowledge Engine</h3>
650
- <p class="text-gray-700 mb-3">
651
- This platform serves as the <strong>core synthesis and distribution component</strong> of the CopernicusAI Knowledge Engine.
652
- The Knowledge Engine is an integrated ecosystem of research and collaboration tools that work together to assist scientists
653
- in their workflow, from research discovery through knowledge synthesis to multi-format content generation.
654
  </p>
655
- <div class="bg-blue-50 rounded-lg p-4 mb-3">
656
- <h4 class="font-semibold text-gray-900 mb-2">Current Components:</h4>
657
- <div class="grid md:grid-cols-2 gap-4 text-sm">
658
- <ul class="text-gray-700 space-y-1">
659
- <li>1. <strong>CopernicusAI</strong> (This platform) - Core synthesis & distribution</li>
660
- <li>2. <strong>Programming Framework</strong> - Foundational meta-tool</li>
661
- <li>3. <strong>GLMP</strong> - Biological process visualization</li>
662
- </ul>
663
- <ul class="text-gray-700 space-y-1">
664
- <li>4. <strong>Research Paper Metadata Database</strong> - Data infrastructure</li>
665
- <li>5. <strong>Science Video Database</strong> - Multi-modal content</li>
666
- </ul>
667
- </div>
668
- </div>
669
- <div class="bg-purple-50 rounded-lg p-4">
670
- <h4 class="font-semibold text-gray-900 mb-2">Future Development:</h4>
671
- <p class="text-gray-700 text-sm">
672
- The Knowledge Engine is designed to grow and evolve. Additional tools, databases, and collaboration components
673
- will be added as the project develops, expanding capabilities for AI-assisted scientific research and knowledge discovery.
674
- </p>
675
- </div>
676
  </div>
677
 
678
- <div class="bg-blue-50 rounded-lg p-6">
679
- <h3 class="text-lg font-semibold text-gray-900 mb-3">📖 Citation Information</h3>
680
- <p class="text-sm text-gray-700 mb-3">
681
- <strong>For Grant Proposals (NSF/DOE):</strong>
682
  </p>
683
- <div class="bg-white rounded p-4 font-mono text-sm text-gray-800 mb-4">
684
- <p class="mb-2">Welz, G. (2025). CopernicusAI: Knowledge Engine for Scientific Discovery.</p>
685
- <p class="mb-2">Hugging Face Space. https://huggingface.co/spaces/garywelz/copernicusai</p>
686
- <p>Live Platform: https://www.copernicusai.fyi</p>
687
- </div>
688
- <div class="bg-white rounded p-4 mb-4">
689
- <p class="text-sm font-semibold text-gray-700 mb-2">BibTeX Format:</p>
690
- <pre class="bg-gray-900 text-green-400 p-3 rounded text-xs overflow-x-auto"><code>@misc{welz2025copernicusai,
691
- title={CopernicusAI: Knowledge Engine for Scientific Discovery},
692
- author={Welz, Gary},
693
- year={2025},
694
- url={https://huggingface.co/spaces/garywelz/copernicusai},
695
- note={Hugging Face Space, Live Platform: https://www.copernicusai.fyi}
696
- }</code></pre>
697
- </div>
698
- </div>
699
- </div>
700
- </section>
701
-
702
- <!-- Data Availability Statement -->
703
- <section class="max-w-7xl mx-auto px-4 sm:px-6 lg:px-8 py-8">
704
- <div class="bg-white rounded-xl shadow-lg p-8 mb-8">
705
- <h2 class="text-3xl font-bold text-gray-900 mb-6">📊 Data Availability Statement</h2>
706
-
707
- <div class="bg-blue-50 rounded-lg p-6 mb-4">
708
- <h3 class="text-xl font-semibold text-gray-900 mb-4">Platform Access</h3>
709
- <ul class="text-gray-700 space-y-2">
710
- <li>• <strong>Live Platform:</strong> <a href="https://www.copernicusai.fyi" target="_blank" class="text-blue-600 hover:underline">https://www.copernicusai.fyi</a> (opens in new tab)</li>
711
- <li>• <strong>Public Project Interface:</strong> <a href="https://storage.googleapis.com/regal-scholar-453620-r7-podcast-storage/copernicusai-public-reviewer.html" target="_blank" class="text-blue-600 hover:underline">https://storage.googleapis.com/regal-scholar-453620-r7-podcast-storage/copernicusai-public-reviewer.html</a> (opens in new tab)</li>
712
- <li>• <strong>Research Tools Dashboard:</strong> <a href="https://copernicus-frontend-phzp4ie2sq-uc.a.run.app/knowledge-engine" target="_blank" class="text-blue-600 hover:underline">https://copernicus-frontend-phzp4ie2sq-uc.a.run.app/knowledge-engine</a> (opens in new tab)</li>
713
- <li>• <strong>API Base URL:</strong> <code class="bg-gray-100 px-2 py-1 rounded">https://copernicus-podcast-api-phzp4ie2sq-uc.a.run.app</code></li>
714
- <li>• <strong>RSS Feed:</strong> <a href="https://storage.googleapis.com/regal-scholar-453620-r7-podcast-storage/feeds/copernicus-mvp-rss-feed.xml" target="_blank" class="text-blue-600 hover:underline">Available for public access</a> (opens in new tab)</li>
715
- </ul>
716
- </div>
717
-
718
- <div class="bg-green-50 rounded-lg p-6 mb-4">
719
- <h3 class="text-xl font-semibold text-gray-900 mb-4">Data & Code Availability</h3>
720
- <ul class="text-gray-700 space-y-2">
721
- <li>• <strong>Hugging Face Spaces:</strong> All components accessible at <a href="https://huggingface.co/garywelz" target="_blank" class="text-blue-600 hover:underline">https://huggingface.co/garywelz</a> (opens in new tab)</li>
722
- <li>• <strong>Process Flowcharts (GLMP):</strong> JSON files stored in Google Cloud Storage, accessible via <a href="https://storage.googleapis.com/regal-scholar-453620-r7-podcast-storage/glmp-database-table.html" target="_blank" class="text-blue-600 hover:underline">GLMP Database Table</a> (opens in new tab)</li>
723
- <li>• <strong>Research Paper Metadata:</strong> 23,246 indexed papers with metadata accessible through Research Tools Dashboard</li>
724
- <li>• <strong>API Documentation:</strong> RESTful API endpoints available for programmatic access (see API Documentation section)</li>
725
- </ul>
726
- </div>
727
-
728
- <div class="bg-purple-50 rounded-lg p-6">
729
- <h3 class="text-xl font-semibold text-gray-900 mb-4">Reproducibility Information</h3>
730
- <ul class="text-gray-700 space-y-2">
731
- <li>• <strong>Technology Stack:</strong> All technologies and versions documented in Technology Stack section</li>
732
- <li>• <strong>LLM Models:</strong> Google Gemini 3, OpenAI GPT-4/GPT-3.5, Anthropic Claude 3 (versions specified in documentation)</li>
733
- <li>• <strong>Source Citations:</strong> All podcast episodes include full citations to source papers</li>
734
- <li>• <strong>Metadata:</strong> Complete metadata for all generated content available through API</li>
735
- <li>• <strong>License:</strong> MIT License - see license information in space metadata</li>
736
- </ul>
737
  </div>
738
  </div>
739
  </section>
@@ -744,243 +326,17 @@
744
  <h2 class="text-3xl font-bold text-gray-900 mb-6">How to Cite This Work</h2>
745
  <div class="bg-gray-50 rounded-lg p-6 mb-4">
746
  <p class="text-gray-800 font-mono text-lg leading-relaxed mb-4">
747
- Welz, G. (2024–2025). <em>CopernicusAI: AI-Generated Audio Briefings as a Research Interface</em>.<br>
748
- Hugging Face Spaces. https://huggingface.co/spaces/garywelz/copernicusai
749
  </p>
750
-
751
- <div class="border-t border-gray-300 pt-4 mt-4">
752
- <p class="text-sm font-semibold text-gray-700 mb-2">BibTeX Format:</p>
753
- <pre class="bg-gray-800 text-green-400 p-4 rounded text-sm overflow-x-auto"><code>@misc{welz2025copernicusai,
754
- title={CopernicusAI: AI-Generated Audio Briefings as a Research Interface},
755
- author={Welz, Gary},
756
- year={2024--2025},
757
- url={https://huggingface.co/spaces/garywelz/copernicusai},
758
- note={Hugging Face Space}
759
- }</code></pre>
760
- </div>
761
  </div>
762
- </div>
763
- </section>
764
-
765
- <!-- Grant Support & Collaboration -->
766
- <section class="max-w-7xl mx-auto px-4 sm:px-6 lg:px-8 py-8">
767
- <div class="bg-white rounded-xl shadow-lg p-8">
768
- <h2 class="text-3xl font-bold text-gray-900 mb-6">🌐 Grant Support & Collaboration</h2>
769
-
770
- <div class="mb-6">
771
- <h3 class="text-xl font-semibold text-gray-800 mb-3">Grant Applications Supported</h3>
772
- <p class="text-gray-700 mb-4">
773
- This platform is designed to support grant applications to:
774
  </p>
775
- <div class="grid md:grid-cols-3 gap-4">
776
- <div class="bg-blue-50 rounded-lg p-4">
777
- <h4 class="font-semibold text-gray-800 mb-2">NSF</h4>
778
- <p class="text-sm text-gray-600">National Science Foundation - Science education and research infrastructure</p>
779
- </div>
780
- <div class="bg-green-50 rounded-lg p-4">
781
- <h4 class="font-semibold text-gray-800 mb-2">DOE</h4>
782
- <p class="text-sm text-gray-600">Department of Energy - Scientific computing and data science</p>
783
- </div>
784
- <div class="bg-purple-50 rounded-lg p-4">
785
- <h4 class="font-semibold text-gray-800 mb-2">SAIR Foundation</h4>
786
- <p class="text-sm text-gray-600">AI research and development initiatives</p>
787
- </div>
788
- </div>
789
- </div>
790
-
791
- <div>
792
- <h3 class="text-xl font-semibold text-gray-800 mb-3">Collaboration Opportunities</h3>
793
- <ul class="text-gray-700 space-y-2">
794
- <li>• Integration with academic institutions</li>
795
- <li>• Partnership with research organizations</li>
796
- <li>• Open data initiatives</li>
797
- <li>• Educational program development</li>
798
- </ul>
799
- </div>
800
- </div>
801
- </section>
802
-
803
- <!-- Links & Resources -->
804
- <section class="max-w-7xl mx-auto px-4 sm:px-6 lg:px-8 py-12">
805
- <div class="bg-gradient-to-r from-blue-50 to-purple-50 rounded-xl p-8">
806
- <h2 class="text-3xl font-bold text-gray-900 mb-6 text-center">🔗 Live Platform & Resources</h2>
807
-
808
- <div class="grid md:grid-cols-2 gap-6">
809
- <div class="bg-white rounded-lg p-6">
810
- <h3 class="text-xl font-semibold text-gray-800 mb-4">🌐 Production Deployment</h3>
811
- <ul class="space-y-2">
812
- <li>
813
- <a href="https://www.copernicusai.fyi" target="_blank" rel="noopener noreferrer"
814
- class="text-blue-600 hover:text-blue-800 font-medium">
815
- 🏠 Homepage - Browse Podcasts (opens in new tab)
816
- </a>
817
- </li>
818
- <li>
819
- <a href="https://www.copernicusai.fyi/subscriber-dashboard.html" target="_blank" rel="noopener noreferrer"
820
- class="text-blue-600 hover:text-blue-800 font-medium">
821
- 📊 Creator Dashboard (opens in new tab)
822
- </a>
823
- </li>
824
- <li>
825
- <a href="https://storage.googleapis.com/regal-scholar-453620-r7-podcast-storage/feeds/copernicus-mvp-rss-feed.xml" target="_blank" rel="noopener noreferrer"
826
- class="text-blue-600 hover:text-blue-800 font-medium">
827
- 📡 RSS Feed (opens in new tab)
828
- </a>
829
- </li>
830
- </ul>
831
- </div>
832
-
833
- <div class="bg-white rounded-lg p-6">
834
- <h3 class="text-xl font-semibold text-gray-800 mb-4">🧩 Knowledge Engine Components</h3>
835
- <p class="text-sm text-gray-600 mb-4">
836
- The CopernicusAI Knowledge Engine is an integrated ecosystem of research and collaboration tools.
837
- The <strong>Research Tools Dashboard is now fully operational</strong> (December 2025) with a working web interface providing unified access to all components.
838
- </p>
839
- <div class="bg-green-50 rounded-lg p-4 mb-4">
840
- <h4 class="font-semibold text-gray-800 mb-2">✅ Research Tools Dashboard (Implemented)</h4>
841
- <p class="text-sm text-gray-700 mb-2">
842
- Fully operational web interface with knowledge graph visualization (23,246 papers), vector search, RAG queries, and content browsing.
843
- </p>
844
- <a href="https://storage.googleapis.com/regal-scholar-453620-r7-podcast-storage/copernicusai-public-reviewer.html"
845
- target="_blank" rel="noopener noreferrer"
846
- class="text-blue-600 hover:underline text-sm font-medium">
847
- Public Project Interface → (opens in new tab)
848
- </a>
849
- <br>
850
- <a href="https://copernicus-frontend-phzp4ie2sq-uc.a.run.app/knowledge-engine"
851
- target="_blank" rel="noopener noreferrer"
852
- class="text-blue-600 hover:underline text-sm font-medium">
853
- Research Tools Dashboard → (opens in new tab)
854
- </a>
855
- </div>
856
- <ul class="space-y-3">
857
- <li>
858
- <a href="https://huggingface.co/spaces/garywelz/programming_framework" target="_blank" rel="noopener noreferrer"
859
- class="text-blue-600 hover:text-blue-800 font-medium">
860
- 🛠️ Programming Framework (opens in new tab)
861
- </a>
862
- <p class="text-sm text-gray-600 mt-1 ml-6">
863
- Foundational meta-tool for universal process analysis across any discipline
864
- </p>
865
- </li>
866
- <li>
867
- <a href="https://huggingface.co/spaces/garywelz/glmp" target="_blank" rel="noopener noreferrer"
868
- class="text-blue-600 hover:text-blue-800 font-medium">
869
- 🧬 GLMP - Genome Logic Modeling Project (opens in new tab)
870
- </a>
871
- <p class="text-sm text-gray-600 mt-1 ml-6">
872
- First application of Programming Framework to biology - 50+ biological processes visualized
873
- </p>
874
- </li>
875
- <li>
876
- <a href="https://huggingface.co/spaces/garywelz/metadata_database" target="_blank" rel="noopener noreferrer"
877
- class="text-blue-600 hover:text-blue-800 font-medium">
878
- 📚 Research Paper Metadata Database (opens in new tab)
879
- </a>
880
- <p class="text-sm text-gray-600 mt-1 ml-6">
881
- Core data infrastructure for structured research paper metadata and citation networks
882
- </p>
883
- </li>
884
- <li>
885
- <a href="https://huggingface.co/spaces/garywelz/sciencevideodb" target="_blank" rel="noopener noreferrer"
886
- class="text-blue-600 hover:text-blue-800 font-medium">
887
- 🎬 Science Video Database (opens in new tab)
888
- </a>
889
- <p class="text-sm text-gray-600 mt-1 ml-6">
890
- Multi-modal content component with transcript-based search for scientific videos
891
- </p>
892
- </li>
893
- </ul>
894
- </div>
895
- </div>
896
- </div>
897
- </section>
898
-
899
- <!-- API Endpoints -->
900
- <section class="max-w-7xl mx-auto px-4 sm:px-6 lg:px-8 py-8">
901
- <div class="bg-gray-900 text-white rounded-xl p-8">
902
- <h2 class="text-3xl font-bold mb-6">🔌 API Documentation</h2>
903
- <p class="text-gray-300 mb-6">Base URL: <code class="bg-gray-800 px-2 py-1 rounded">https://copernicus-podcast-api-phzp4ie2sq-uc.a.run.app</code></p>
904
-
905
- <div class="grid md:grid-cols-3 gap-4 text-sm mb-8">
906
- <div>
907
- <h4 class="font-semibold text-blue-300 mb-2">Podcast Generation</h4>
908
- <ul class="space-y-1 text-gray-400">
909
- <li>POST /generate-podcast-with-subscriber</li>
910
- <li>GET /api/subscribers/podcasts/{id}</li>
911
- <li>POST /api/subscribers/podcasts/submit-to-rss</li>
912
- </ul>
913
- </div>
914
-
915
- <div>
916
- <h4 class="font-semibold text-blue-300 mb-2">Research Endpoints</h4>
917
- <ul class="space-y-1 text-gray-400">
918
- <li>POST /api/papers/upload</li>
919
- <li>GET /api/papers/{paper_id}</li>
920
- <li>POST /api/papers/query</li>
921
- <li>POST /api/papers/{id}/link-podcast/{id}</li>
922
- </ul>
923
- </div>
924
-
925
- <div>
926
- <h4 class="font-semibold text-blue-300 mb-2">Admin Endpoints</h4>
927
- <ul class="space-y-1 text-gray-400">
928
- <li>GET /api/admin/subscribers</li>
929
- <li>POST /api/admin/podcasts/fix-missing-titles</li>
930
- <li>GET /api/admin/podcasts/catalog</li>
931
- </ul>
932
- </div>
933
- </div>
934
-
935
- <div class="border-t border-gray-700 pt-6 mt-6">
936
- <h3 class="text-xl font-semibold text-blue-300 mb-4">📝 Example Request</h3>
937
- <div class="bg-gray-800 rounded-lg p-4 mb-4">
938
- <p class="text-gray-400 text-xs mb-2">POST /api/papers/query</p>
939
- <pre class="text-green-400 text-xs overflow-x-auto"><code>{
940
- "discipline": "biology",
941
- "keywords": ["DNA replication", "cell cycle"],
942
- "date_range": {
943
- "start": "2020-01-01",
944
- "end": "2025-01-01"
945
- },
946
- "limit": 10
947
- }</code></pre>
948
- </div>
949
-
950
- <h3 class="text-xl font-semibold text-blue-300 mb-4 mt-6">📤 Example Response</h3>
951
- <div class="bg-gray-800 rounded-lg p-4 mb-4">
952
- <pre class="text-green-400 text-xs overflow-x-auto"><code>{
953
- "status": "success",
954
- "count": 10,
955
- "papers": [
956
- {
957
- "id": "pmid_12345678",
958
- "title": "Mechanisms of DNA Replication...",
959
- "authors": ["Smith, J.", "Doe, A."],
960
- "journal": "Nature",
961
- "year": 2023,
962
- "doi": "10.1038/s41586-023-01234",
963
- "abstract": "..."
964
- }
965
- ]
966
- }</code></pre>
967
- </div>
968
-
969
- <div class="bg-gray-800 rounded-lg p-4 mt-4">
970
- <h4 class="font-semibold text-blue-300 mb-2 text-sm">🔐 Authentication</h4>
971
- <p class="text-gray-400 text-xs mb-2">API uses Bearer token authentication. Include in request headers:</p>
972
- <pre class="text-green-400 text-xs"><code>Authorization: Bearer YOUR_API_TOKEN</code></pre>
973
- </div>
974
-
975
- <div class="bg-gray-800 rounded-lg p-4 mt-4">
976
- <h4 class="font-semibold text-blue-300 mb-2 text-sm">⚡ Rate Limits</h4>
977
- <p class="text-gray-400 text-xs">Standard rate limits apply: 100 requests/minute per API key. Contact for higher limits.</p>
978
- </div>
979
-
980
- <div class="bg-gray-800 rounded-lg p-4 mt-4">
981
- <h4 class="font-semibold text-blue-300 mb-2 text-sm">📚 API Version</h4>
982
- <p class="text-gray-400 text-xs">Current version: v1.0. API is stable and backward-compatible.</p>
983
- </div>
984
  </div>
985
  </div>
986
  </section>
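The removed API Documentation section gives a base URL, a `POST /api/papers/query` example body, and Bearer token authentication. The following sketch shows how a client might assemble that request and read the example response shape. It builds the request without sending it. The helper names `build_paper_query` and `paper_titles` and the `Content-Type` header are assumptions, not part of the documented API.

```python
import json
import urllib.request

# Base URL as listed in the API Documentation section.
BASE_URL = "https://copernicus-podcast-api-phzp4ie2sq-uc.a.run.app"


def build_paper_query(token, discipline, keywords, start, end, limit=10):
    """Build (but do not send) a POST /api/papers/query request."""
    payload = {
        "discipline": discipline,
        "keywords": list(keywords),
        "date_range": {"start": start, "end": end},
        "limit": limit,
    }
    return urllib.request.Request(
        url=BASE_URL + "/api/papers/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            # Assumed: the service expects a JSON content type.
            "Content-Type": "application/json",
        },
        method="POST",
    )


def paper_titles(response_body):
    """Extract titles from a response shaped like the documented example."""
    doc = json.loads(response_body)
    if doc.get("status") != "success":
        raise RuntimeError(f"query failed: {doc.get('status')!r}")
    return [paper["title"] for paper in doc.get("papers", [])]
```

Sending the request would be a call such as `urllib.request.urlopen(req)`. Given the stated limit of 100 requests/minute per API key, a real client would likely also throttle its calls.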
@@ -988,10 +344,16 @@
988
  <!-- Footer -->
989
  <footer class="gradient-bg text-white py-8 mt-12">
990
  <div class="max-w-7xl mx-auto px-4 sm:px-6 lg:px-8 text-center">
991
- <p class="text-lg font-semibold mb-2">CopernicusAI - Advancing Scientific Knowledge</p>
992
- <p class="text-sm opacity-75">Built with Google Cloud, Gemini AI, OpenAI, Anthropic Claude, and ElevenLabs</p>
993
- <p class="text-xs opacity-50 mt-4">&copy; 2025 CopernicusAI. All rights reserved.</p>
994
  </div>
995
  </footer>
 
 
 
 
 
996
  </body>
997
  </html>
 
 
3
  <head>
4
  <meta charset="UTF-8">
5
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
6
+ <title>GLMP - Genome Logic Modeling Project</title>
7
  <script src="https://cdn.tailwindcss.com"></script>
8
+ <script src="https://cdn.jsdelivr.net/npm/mermaid/dist/mermaid.min.js"></script>
9
  <style>
10
  .gradient-bg {
11
+ background: linear-gradient(135deg, #10b981 0%, #059669 100%);
12
  }
13
  .card-hover {
14
  transition: transform 0.3s ease, box-shadow 0.3s ease;
 
17
  transform: translateY(-4px);
18
  box-shadow: 0 20px 40px rgba(0,0,0,0.15);
19
  }
20
+ .process-card {
21
+ border-left: 4px solid #10b981;
 
 
 
 
22
  }
23
  </style>
24
  </head>
 
27
  <header class="gradient-bg text-white">
28
  <div class="max-w-7xl mx-auto px-4 sm:px-6 lg:px-8 py-16">
29
  <div class="text-center">
30
+ <div class="text-6xl mb-4">🧬</div>
31
+ <h1 class="text-5xl font-bold mb-4">Genome Logic Modeling Project</h1>
32
+ <p class="text-xl opacity-90 mb-6">A Microscope for Biological Processes</p>
33
+ <p class="text-lg opacity-75 max-w-3xl mx-auto">
34
+ GLMP applies the Programming Framework to visualize complex biochemical processes as interactive
35
+ flowcharts, revealing the logic of life at the molecular level.
 
 
36
  </p>
37
  </div>
38
  </div>
39
  </header>
40
 
41
+ <!-- Prior Work & Research Contributions -->
42
  <section class="max-w-7xl mx-auto px-4 sm:px-6 lg:px-8 py-12">
43
+ <div class="bg-gradient-to-r from-green-50 to-blue-50 rounded-xl shadow-lg p-8 mb-8">
44
+ <h2 class="text-3xl font-bold text-gray-900 mb-6">📚 Prior Work & Research Contributions</h2>
45
 
46
+ <div class="bg-white rounded-lg p-6 mb-6">
47
+ <h3 class="text-xl font-semibold text-gray-900 mb-4">Overview</h3>
48
+ <p class="text-gray-700 mb-4">
49
+ The Genome Logic Modeling Project (GLMP) represents <strong>prior work</strong> that demonstrates the first
50
+ successful application of the Programming Framework to biological process visualization. This research establishes
51
+ a novel methodology for transforming complex biochemical processes into structured, interactive visual flowcharts
52
+ using LLM-powered analysis and Mermaid visualization technology.
53
  </p>
54
  </div>
55
 
56
+ <div class="grid md:grid-cols-2 gap-6 mb-6">
57
+ <div class="bg-white rounded-lg p-6">
58
+ <h3 class="text-lg font-semibold text-gray-900 mb-3">🔬 Research Contributions</h3>
59
+ <ul class="text-sm text-gray-700 space-y-2">
60
+ <li>• <strong>Biological Process Visualization:</strong> 50+ processes mapped across 6 major categories</li>
61
+ <li>• <strong>LLM-Powered Analysis:</strong> Automated extraction using Google Gemini 2.0 Flash</li>
62
+ <li>• <strong>Interactive Visualization:</strong> Mermaid.js-based dynamic flowchart system</li>
63
+ <li>• <strong>Knowledge Engine Integration:</strong> Links to CopernicusAI and Programming Framework</li>
64
+ </ul>
 
 
 
 
65
  </div>
66
 
67
+ <div class="bg-white rounded-lg p-6">
68
+ <h3 class="text-lg font-semibold text-gray-900 mb-3">⚙️ Technical Achievements</h3>
69
+ <ul class="text-sm text-gray-700 space-y-2">
70
+ <li>• <strong>Structured Database:</strong> JSON format in Google Cloud Storage</li>
71
+ <li>• <strong>Process Coverage:</strong> Central Dogma, Metabolism, Signaling, Proteins, Photosynthesis, DNA Repair</li>
72
+ <li>• <strong>Scalable Architecture:</strong> GCS-based storage with web viewer integration</li>
73
+ <li>• <strong>Metadata-Rich Format:</strong> Categories, versions, references, source papers</li>
74
+ </ul>
75
  </div>
76
  </div>
 
 
77
 
78
+ <div class="bg-white rounded-lg p-6">
79
+ <h3 class="text-lg font-semibold text-gray-900 mb-3">🎯 Position Within CopernicusAI Knowledge Engine</h3>
80
+ <p class="text-gray-700 mb-3">
81
+ GLMP serves as a <strong>specialized application component</strong> of the CopernicusAI Knowledge Engine,
82
+ demonstrating how the Programming Framework can be applied to domain-specific scientific visualization. It integrates with:
83
  </p>
84
+ <div class="grid md:grid-cols-2 gap-4 text-sm mb-3">
 
85
  <ul class="text-gray-700 space-y-1">
86
+ <li>• Programming Framework (meta-tool)</li>
87
+ <li>• CopernicusAI (main knowledge engine)</li>
88
+ <li>• Research Papers Metadata Database</li>
89
+ </ul>
90
+ <ul class="text-gray-700 space-y-1">
91
+ <li>• Science Video Database</li>
92
+ <li>• Multi-modal learning integration</li>
93
  </ul>
94
  </div>
95
+ <p class="text-gray-600 text-sm italic">
96
+ This work establishes a proof-of-concept for domain-specific applications of the Programming Framework,
97
+ demonstrating its utility in biological sciences and potential for extension to other scientific disciplines.
98
  </p>
99
  </div>
100
  </div>
101
  </section>
102
 
103
+ <!-- Quick Stats -->
104
+ <section class="max-w-7xl mx-auto px-4 sm:px-6 lg:px-8 -mt-8">
105
+ <div class="grid md:grid-cols-4 gap-4">
106
+ <div class="bg-white rounded-lg shadow-lg p-6 text-center">
107
+ <div class="text-3xl font-bold text-green-600">50+</div>
108
+ <div class="text-sm text-gray-600">Biological Processes</div>
109
+ </div>
110
+ <div class="bg-white rounded-lg shadow-lg p-6 text-center">
111
+ <div class="text-3xl font-bold text-blue-600">JSON</div>
112
+ <div class="text-sm text-gray-600">Flowchart Format</div>
113
+ </div>
114
+ <div class="bg-white rounded-lg shadow-lg p-6 text-center">
115
+ <div class="text-3xl font-bold text-purple-600">LLM</div>
116
+ <div class="text-sm text-gray-600">AI-Powered Analysis</div>
117
+ </div>
118
+ <div class="bg-white rounded-lg shadow-lg p-6 text-center">
119
+ <div class="text-3xl font-bold text-orange-600">Mermaid</div>
120
+ <div class="text-sm text-gray-600">Visualization Engine</div>
121
+ </div>
122
  </div>
123
  </section>
124
 
125
+ <!-- What is GLMP -->
126
+ <section class="max-w-7xl mx-auto px-4 sm:px-6 lg:px-8 py-12">
127
+ <div class="bg-white rounded-xl shadow-lg p-8">
128
+ <h2 class="text-3xl font-bold text-gray-900 mb-6">🔬 What is GLMP?</h2>
129
+ <div class="prose max-w-none text-gray-700">
130
+ <p class="text-lg mb-4">
131
+ The Genome Logic Modeling Project is the first specialized application of the
132
+ <a href="https://huggingface.co/spaces/garywelz/programming_framework" class="text-green-600 hover:text-green-700 font-semibold">Programming Framework</a>
133
+ to the domain of biology. It transforms complex biochemical processes into clear, visual flowcharts
134
+ that reveal the step-by-step logic underlying life's molecular machinery.
135
+ </p>
136
+ <div class="grid md:grid-cols-2 gap-6 mt-6">
137
+ <div class="bg-green-50 rounded-lg p-4">
138
+ <h3 class="font-semibold text-gray-900 mb-2">🎯 Purpose</h3>
139
+ <p class="text-sm">Break down biological complexity into understandable visual logic, making advanced
140
+ biochemistry accessible to researchers, students, and AI systems.</p>
141
+ </div>
142
+ <div class="bg-blue-50 rounded-lg p-4">
143
+ <h3 class="font-semibold text-gray-900 mb-2">⚙️ How It Works</h3>
144
+ <p class="text-sm">LLMs analyze scientific literature to extract process steps, decision points, and
145
+ molecular interactions, then encode them as Mermaid flowcharts stored in JSON.</p>
146
+ </div>
147
  </div>
148
  </div>
149
  </div>
150
  </section>
151
 
152
+ <!-- GLMP Database Table -->
153
  <section class="max-w-7xl mx-auto px-4 sm:px-6 lg:px-8 py-8">
154
+ <div class="bg-gradient-to-r from-green-50 to-blue-50 rounded-xl p-8">
155
+ <h2 class="text-3xl font-bold text-gray-900 mb-6">📊 GLMP Database</h2>
156
  <div class="bg-white rounded-lg shadow-md p-6 text-center">
157
+ <p class="text-gray-700 mb-6">
158
+ Access the interactive GLMP database table with all available biological processes, metadata, and analysis.
159
+ </p>
160
+ <a href="https://storage.googleapis.com/regal-scholar-453620-r7-podcast-storage/glmp-database-table.html"
161
+ target="_blank" rel="noopener noreferrer"
162
+ class="inline-flex items-center px-8 py-4 border border-transparent text-lg font-medium rounded-md text-white bg-green-600 hover:bg-green-700 transition-colors shadow-lg">
163
+ 🧬 Open GLMP Database Table
164
+ </a>
165
  </div>
166
  </div>
167
  </section>
168
 
169
+ <!-- Process Database -->
170
  <section class="max-w-7xl mx-auto px-4 sm:px-6 lg:px-8 py-8">
171
+ <h2 class="text-3xl font-bold text-gray-900 mb-6">📚 Process Database</h2>
172
 
173
+ <div class="grid md:grid-cols-2 lg:grid-cols-3 gap-6">
174
+ <!-- Central Dogma Processes -->
175
+ <div class="bg-white rounded-lg shadow-md p-6 process-card card-hover">
176
+ <h3 class="text-xl font-semibold text-gray-900 mb-3">🧬 Central Dogma</h3>
177
+ <ul class="space-y-2 text-sm text-gray-600">
178
+ <li>• DNA Replication</li>
179
+ <li>• Transcription</li>
180
+ <li>• Translation</li>
181
+ <li>• RNA Processing</li>
182
+ <li>• Post-translational Modifications</li>
183
+ </ul>
184
  </div>
185
 
186
+ <!-- Metabolic Pathways -->
187
+ <div class="bg-white rounded-lg shadow-md p-6 process-card card-hover">
188
+ <h3 class="text-xl font-semibold text-gray-900 mb-3">⚡ Metabolic Pathways</h3>
189
+ <ul class="space-y-2 text-sm text-gray-600">
190
+ <li>• Glycolysis</li>
191
+ <li>• Krebs Cycle (TCA)</li>
192
+ <li>• Oxidative Phosphorylation</li>
193
+ <li>• Gluconeogenesis</li>
194
+ <li>• Pentose Phosphate Pathway</li>
195
+ </ul>
196
  </div>
197
 
198
+ <!-- Cell Signaling -->
199
+ <div class="bg-white rounded-lg shadow-md p-6 process-card card-hover">
200
+ <h3 class="text-xl font-semibold text-gray-900 mb-3">📡 Cell Signaling</h3>
201
+ <ul class="space-y-2 text-sm text-gray-600">
202
+ <li>• MAPK Pathway</li>
203
+ <li>• PI3K/AKT Pathway</li>
204
+ <li>• Wnt Signaling</li>
205
+ <li>• Notch Pathway</li>
206
+ <li>• JAK-STAT Pathway</li>
207
+ </ul>
208
  </div>
209
 
210
+ <!-- Protein Processes -->
211
+ <div class="bg-white rounded-lg shadow-md p-6 process-card card-hover">
212
+ <h3 class="text-xl font-semibold text-gray-900 mb-3">🔄 Protein Processes</h3>
213
+ <ul class="space-y-2 text-sm text-gray-600">
214
+ <li>• Protein Folding</li>
215
+ <li>• Ubiquitination</li>
216
+ <li>• Autophagy</li>
217
+ <li>• Proteasome Degradation</li>
218
+ <li>• Chaperone Systems</li>
219
+ </ul>
220
  </div>
 
 
221
 
222
+ <!-- Photosynthesis -->
223
+ <div class="bg-white rounded-lg shadow-md p-6 process-card card-hover">
224
+ <h3 class="text-xl font-semibold text-gray-900 mb-3">🌱 Photosynthesis</h3>
225
+ <ul class="space-y-2 text-sm text-gray-600">
226
+ <li>• Light Reactions</li>
227
+ <li>• Calvin Cycle</li>
228
+ <li>• C4 Pathway</li>
229
+ <li>• CAM Photosynthesis</li>
230
+ <li>• Photorespiration</li>
231
  </ul>
232
  </div>
233
+
234
+ <!-- DNA Repair -->
235
+ <div class="bg-white rounded-lg shadow-md p-6 process-card card-hover">
236
+ <h3 class="text-xl font-semibold text-gray-900 mb-3">🔧 DNA Repair</h3>
237
+ <ul class="space-y-2 text-sm text-gray-600">
238
+ <li>• Base Excision Repair</li>
239
+ <li>• Nucleotide Excision Repair</li>
240
+ <li>• Mismatch Repair</li>
241
+ <li>• Double-strand Break Repair</li>
242
+ <li>• Direct Repair</li>
 
 
243
  </ul>
244
  </div>
245
  </div>
 
246
 
247
+ <div class="mt-6 text-center">
248
+ <a href="#archive" class="text-green-600 hover:text-green-700 font-semibold">
249
+ 📦 View Archived Processes (v1.0) →
250
+ </a>
251
  </div>
252
  </section>
253
 
254
+ <!-- How to Use -->
255
  <section class="max-w-7xl mx-auto px-4 sm:px-6 lg:px-8 py-8">
256
+ <div class="bg-gradient-to-r from-blue-50 to-purple-50 rounded-xl p-8">
257
+ <h2 class="text-3xl font-bold text-gray-900 mb-6">🚀 How to Use GLMP</h2>
258
 
259
+ <div class="grid md:grid-cols-3 gap-6">
260
+ <div class="bg-white rounded-lg p-6">
261
+ <div class="text-3xl mb-3">1️⃣</div>
262
+ <h3 class="font-semibold text-gray-900 mb-2">Select Process</h3>
263
+ <p class="text-sm text-gray-600">Choose a biological process from the viewer above or browse the database</p>
264
  </div>
265
+
266
+ <div class="bg-white rounded-lg p-6">
267
+ <div class="text-3xl mb-3">2️⃣</div>
268
+ <h3 class="font-semibold text-gray-900 mb-2">View Flowchart</h3>
269
+ <p class="text-sm text-gray-600">Explore the interactive Mermaid visualization showing each step and decision point</p>
270
+ </div>
271
+
272
+ <div class="bg-white rounded-lg p-6">
273
+ <div class="text-3xl mb-3">3️⃣</div>
274
+ <h3 class="font-semibold text-gray-900 mb-2">Learn &amp; Integrate</h3>
275
+ <p class="text-sm text-gray-600">Use for education, research, or integrate with CopernicusAI podcasts</p>
276
  </div>
277
  </div>
278
  </div>
279
  </section>
280
 
281
+ <!-- Archive Link -->
282
+ <section id="archive" class="max-w-7xl mx-auto px-4 sm:px-6 lg:px-8 py-8">
283
+ <div class="bg-gray-100 rounded-lg p-6 text-center">
284
+ <h3 class="text-xl font-semibold text-gray-900 mb-2">📦 Archived Versions</h3>
285
+ <p class="text-gray-600 mb-4">Earlier versions of GLMP processes and experimental visualizations</p>
286
+ <a href="/archive.html" class="inline-block bg-green-600 text-white px-6 py-2 rounded-lg hover:bg-green-700 transition">
287
+ View Archive
288
+ </a>
289
  </div>
290
  </section>
291
 
292
+ <!-- Related Projects -->
293
  <section class="max-w-7xl mx-auto px-4 sm:px-6 lg:px-8 py-8">
294
+ <h2 class="text-3xl font-bold text-gray-900 mb-6 text-center">🔗 Related Projects</h2>
295
+
296
+ <div class="grid md:grid-cols-2 gap-6">
297
+ <div class="bg-white rounded-lg shadow-md p-6 card-hover">
298
+ <h3 class="text-xl font-semibold text-gray-900 mb-3">🛠️ Programming Framework</h3>
299
+ <p class="text-gray-600 mb-4">
300
+ The meta-tool that powers GLMP. A universal method for process analysis across any discipline.
301
  </p>
302
+ <a href="https://huggingface.co/spaces/garywelz/programming_framework"
303
+ class="text-green-600 hover:text-green-700 font-semibold"
304
+ target="_blank" rel="noopener noreferrer">
305
+ Explore Framework
306
+ </a>
307
  </div>
308
 
309
+ <div class="bg-white rounded-lg shadow-md p-6 card-hover">
310
+ <h3 class="text-xl font-semibold text-gray-900 mb-3">🔬 CopernicusAI</h3>
311
+ <p class="text-gray-600 mb-4">
312
+ Knowledge engine that integrates GLMP visualizations with AI-generated scientific podcasts.
313
  </p>
314
+ <a href="https://www.copernicusai.fyi"
315
+ class="text-green-600 hover:text-green-700 font-semibold"
316
+ target="_blank" rel="noopener noreferrer">
317
+ Visit CopernicusAI
318
+ </a>
319
  </div>
320
  </div>
321
  </section>
 
326
  <h2 class="text-3xl font-bold text-gray-900 mb-6">How to Cite This Work</h2>
327
  <div class="bg-gray-50 rounded-lg p-6 mb-4">
328
  <p class="text-gray-800 font-mono text-lg leading-relaxed mb-4">
329
+ Welz, G. (2024–2025). <em>Genome Logic Modeling Project (GLMP)</em>.<br>
330
+ Hugging Face Spaces. https://huggingface.co/spaces/garywelz/glmp
331
  </p>
332
  </div>
333
+ <div class="bg-green-50 rounded-lg p-4">
334
+ <p class="text-gray-700 mb-2">
335
+ This project serves as a testbed for integrating AI systems into scientific reasoning pipelines, enabling both human and AI agents to analyze, compare, and extend biological knowledge structures.
336
+ </p>
337
+ <p class="text-gray-700 font-semibold">
338
+ GLMP is designed as infrastructure for AI-assisted science, not as a static visualization collection.
339
  </p>
340
  </div>
341
  </div>
342
  </section>
 
344
  <!-- Footer -->
345
  <footer class="gradient-bg text-white py-8 mt-12">
346
  <div class="max-w-7xl mx-auto px-4 sm:px-6 lg:px-8 text-center">
347
+ <p class="text-lg font-semibold mb-2">GLMP - Genome Logic Modeling Project</p>
348
+ <p class="text-sm opacity-75">Part of the CopernicusAI Knowledge Engine</p>
349
+ <p class="text-xs opacity-50 mt-4">&copy; 2025 Gary Welz. All rights reserved.</p>
350
  </div>
351
  </footer>
352
+
353
+ <script>
354
+ // Simple script for any interactive elements if needed
355
+ console.log('GLMP Space loaded - using direct GCS viewer integration');
356
+ </script>
357
  </body>
358
  </html>
359
+
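The "How It Works" card in the page above says that extracted process steps and decision points are encoded as Mermaid flowcharts stored in JSON, and the "Technical Achievements" list mentions categories, versions, and references as metadata. The following is a minimal sketch of what such a record could look like. The field names (`title`, `category`, `version`, `references`, `mermaid`), the helper functions, and the toy transcription steps are all assumptions made for illustration; the actual GLMP schema is not shown in this diff.

```javascript
// Hypothetical helpers: these are NOT part of the GLMP codebase.

// Turn a list of steps into Mermaid flowchart source.
// Each step is { id, label, next: [ids], decision?: boolean }.
function toMermaid(steps) {
  const lines = ["flowchart TD"];
  for (const s of steps) {
    // Mermaid draws {..} as a rhombus (decision) and [..] as a rectangle.
    lines.push(s.decision ? `  ${s.id}{"${s.label}"}` : `  ${s.id}["${s.label}"]`);
  }
  for (const s of steps) {
    for (const target of s.next || []) {
      lines.push(`  ${s.id} --> ${target}`);
    }
  }
  return lines.join("\n");
}

// Wrap the flowchart in a JSON-serializable record with assumed metadata fields.
function buildRecord(title, category, steps) {
  return {
    title,
    category,
    version: "example", // placeholder, not a real GLMP version
    references: [],     // left empty; no sources are invented here
    mermaid: toMermaid(steps),
  };
}

// Toy example only; a deliberately simplified view of transcription.
const record = buildRecord("Transcription (toy example)", "Central Dogma", [
  { id: "A", label: "RNA polymerase binds promoter", next: ["B"] },
  { id: "B", label: "Elongate transcript", next: ["C"] },
  { id: "C", label: "Terminator reached?", decision: true, next: ["D", "B"] },
  { id: "D", label: "Release transcript", next: [] },
]);

const json = JSON.stringify(record, null, 2);
console.log(json);
```

Keeping the Mermaid source as a plain string inside the JSON record means a viewer can pass it to a Mermaid renderer unchanged, while the surrounding metadata stays queryable without parsing the diagram.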