garywelz committed on
Commit
90b9cf3
·
verified ·
1 Parent(s): c0efcd4

Upload 2 files

Files changed (2)
  1. README.md +105 -7
  2. index.html +242 -18
README.md CHANGED
@@ -1,12 +1,110 @@
1
  ---
2
- title: Metadata Database
3
- emoji: 🌖
4
- colorFrom: pink
5
- colorTo: green
6
  sdk: static
7
  pinned: false
8
- license: cc-by-sa-4.0
9
- short_description: Research Metadata Database
10
  ---
11
 
12
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
1
  ---
2
+ title: Research Paper Metadata Database
3
+ emoji: 📚
4
+ colorFrom: blue
5
+ colorTo: indigo
6
  sdk: static
7
  pinned: false
8
+ license: apache-2.0
 
9
  ---
10
 
11
+ # 📚 Research Paper Metadata Database
12
+
13
+ A centralized metadata repository for scientific research papers, designed to enable AI-powered visualization and analysis of research structure with the goal of expanding research in interesting, useful, and practical ways.
14
+
15
+ ## 📚 Prior Work & Research Contributions
16
+
17
+ ### Overview
18
+ The Research Paper Metadata Database represents **prior work** that demonstrates the creation of a structured metadata repository for scientific research papers. This project establishes a foundation for using AI tools to visualize and analyze the structure of scientific research, enabling systematic exploration of research patterns, citation networks, and interdisciplinary connections.
19
+
20
+ ### 🔬 Research Contributions
21
+ - **Structured Metadata Repository:** Centralized database of research paper metadata (not a file archive)
22
+ - **AI-Powered Preprocessing:** LLM-based entity extraction and annotation for research papers
23
+ - **Citation Network Analysis:** Cross-reference linking and relationship mapping between papers
24
+ - **Integration Framework:** Designed for integration with CopernicusAI Knowledge Engine components
25
+
26
+ ### ⚙️ Technical Achievements
27
+ - **JSON-Based Storage:** Structured metadata format enabling programmatic access and analysis
28
+ - **Entity Extraction:** Automated extraction of genes, proteins, chemical compounds, equations, and key concepts
29
+ - **Quality Assessment:** Automated quality scoring and relevance metrics for research papers
30
+ - **API Architecture:** RESTful API design for external access and integration
31
+
32
+ ### 🎯 Position Within CopernicusAI Knowledge Engine
33
+ The Research Paper Metadata Database serves as a **core data infrastructure component** of the CopernicusAI Knowledge Engine, providing:
34
+
35
+ - **Foundation for Knowledge Graph Construction:** Structured metadata enables relationship mapping
36
+ - **Integration with AI Podcast Generation:** Links research papers to generated podcast content
37
+ - **Support for GLMP:** Provides source paper references for biological process visualizations
38
+ - **Science Video Database Integration:** Potential linking between papers and related video content
39
+ - **Programming Framework Support:** Supplies structured data for process analysis applications
40
+
41
+ This work establishes a proof-of-concept for AI-assisted research metadata management, demonstrating how structured data can enable systematic analysis and visualization of scientific research patterns.
42
+
43
+ ## 🎯 Project Goals
44
+
45
+ This project creates a database of scientific research paper metadata for the purpose of:
46
+ - Using AI tools to visualize and analyze the structure of scientific research
47
+ - Expanding research in interesting, useful, and practical ways
48
+ - Enabling systematic exploration of research patterns and connections
49
+ - Supporting knowledge graph construction and semantic search
50
+
51
+ ## 🔧 Technical Architecture
52
+
53
+ ### Metadata Structure
54
+ - **DOI, arXiv ID, Publication Information:** Standard identifiers and publication details
55
+ - **Abstracts and Key Findings:** Extracted summaries and main contributions
56
+ - **Extracted Entities:** Genes, proteins, chemical compounds, equations, mathematical concepts
57
+ - **Citation Networks:** Cross-references and relationship mapping
58
+ - **Paradigm Shift Indicators:** Flags for revolutionary vs. incremental research
59
+ - **Interdisciplinary Connections:** Links between different research domains
60
+ - **Quality Scores:** Relevance metrics and validation scores
61
+
62
+ ### AI-Powered Preprocessing
63
+ - LLM-based entity extraction and annotation
64
+ - Automatic categorization by discipline and subdomain
65
+ - Keyword extraction and semantic tagging
66
+ - Citation tracking and relationship mapping
67
+ - Quality assessment and validation
68
+
69
+ ### Integration Features
70
+ - DOI/arXiv ID resolution and metadata enrichment
71
+ - Cross-reference linking between papers
72
+ - Podcast-to-paper relationship tracking
73
+ - Search and query capabilities
74
+ - API access for programmatic retrieval
75
+
76
+ ## 🔗 Related Projects
77
+
78
+ - [Copernicus AI](https://huggingface.co/spaces/garywelz/copernicusai) - Main knowledge engine integrating metadata with AI podcasts
79
+ - [GLMP](https://huggingface.co/spaces/garywelz/glmp) - Genome Logic Modeling Project using metadata for source references
80
+ - [Programming Framework](https://huggingface.co/spaces/garywelz/programming_framework) - Universal process analysis tool that can utilize metadata
81
+ - [Science Video Database](https://huggingface.co/spaces/garywelz/sciencevideodb) - Video content management with potential metadata linking
82
+
83
+ ## 💻 Technology Stack
84
+
85
+ - **Database:** Firestore NoSQL for flexible JSON storage
86
+ - **Processing:** Google Cloud Functions for automated metadata processing
87
+ - **AI/ML:** Vertex AI for entity extraction and analysis
88
+ - **API:** RESTful API for external access
89
+ - **Storage:** Google Cloud Storage for associated assets
90
+
91
+ ## 🔗 Resources
92
+
93
+ - **GitHub Repository:** [garywelz/copernicusai-research-metadata](https://github.com/garywelz/copernicusai-research-metadata)
94
+ - **Hugging Face Space:** [garywelz/metadata_database](https://huggingface.co/spaces/garywelz/metadata_database)
95
+
96
+ ### How to Cite This Work
97
+
98
+ Welz, G. (2024–2025). *Research Paper Metadata Database*.
99
+ Hugging Face Spaces. https://huggingface.co/spaces/garywelz/metadata_database
100
+
101
+ This project serves as infrastructure for AI-assisted research analysis, enabling systematic visualization and exploration of scientific research patterns through structured metadata management.
102
+
103
+ The Research Paper Metadata Database is designed as infrastructure for AI-assisted science, providing the foundational data layer for knowledge graph construction and semantic search capabilities within the CopernicusAI Knowledge Engine.
104
+
105
+ ---
106
+
107
+ **Part of the Copernicus AI Knowledge Engine**
108
+
109
+ © 2025 Gary Welz. All rights reserved.
110
+
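Illustration only, not part of this commit: the "Metadata Structure" list in the README above names the fields a record would carry but does not show a concrete schema. The following Python sketch is one way such a JSON-serializable record could look. The class name `PaperMetadata`, every field name, the helper `cited_by`, and the placeholder DOIs are assumptions made for this example; the repository's actual schema is not shown in the diff.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List, Optional


@dataclass
class PaperMetadata:
    """Hypothetical record mirroring the README's 'Metadata Structure' list."""

    title: str
    doi: Optional[str] = None            # standard identifier, if known
    arxiv_id: Optional[str] = None       # arXiv identifier, if known
    abstract: str = ""
    key_findings: List[str] = field(default_factory=list)
    # Extracted entities grouped by kind, e.g. {"genes": [...], "proteins": [...]}
    entities: Dict[str, List[str]] = field(default_factory=dict)
    cites: List[str] = field(default_factory=list)   # identifiers of cited papers
    paradigm_shift: bool = False         # revolutionary vs. incremental flag
    disciplines: List[str] = field(default_factory=list)
    quality_score: Optional[float] = None

    def to_json(self) -> str:
        """Serialize the record to a JSON string with stable key order."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)


def cited_by(records: List[PaperMetadata], identifier: str) -> List[str]:
    """Return titles of records whose citation list contains `identifier`."""
    return [r.title for r in records if identifier in r.cites]


# Placeholder values, not real papers or DOIs.
example = PaperMetadata(
    title="Example paper (placeholder)",
    doi="10.0000/example.1",
    entities={"genes": ["lacZ"], "proteins": []},
    cites=["10.0000/example.0"],
    quality_score=0.8,
)
```

Calling `example.to_json()` yields a document of the kind a JSON store could hold, and `cited_by` shows how the cross-reference links could be queried in memory.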
index.html CHANGED
@@ -1,19 +1,243 @@
1
- <!doctype html>
2
- <html>
3
- <head>
4
- <meta charset="utf-8" />
5
- <meta name="viewport" content="width=device-width" />
6
- <title>My static Space</title>
7
- <link rel="stylesheet" href="style.css" />
8
- </head>
9
- <body>
10
- <div class="card">
11
- <h1>Welcome to your static Space!</h1>
12
- <p>You can modify this app directly by editing <i>index.html</i> in the Files and versions tab.</p>
13
- <p>
14
- Also don't forget to check the
15
- <a href="https://huggingface.co/docs/hub/spaces" target="_blank">Spaces documentation</a>.
16
- </p>
17
- </div>
18
- </body>
19
  </html>
1
+ <!DOCTYPE html>
2
+ <html lang="en">
3
+ <head>
4
+ <meta charset="UTF-8">
5
+ <meta name="viewport" content="width=device-width, initial-scale=1.0">
6
+ <title>Research Paper Metadata Database</title>
7
+ <script src="https://cdn.tailwindcss.com"></script>
8
+ <style>
9
+ .gradient-bg {
10
+ background: linear-gradient(135deg, #3b82f6 0%, #6366f1 100%);
11
+ }
12
+ .card-hover {
13
+ transition: transform 0.3s ease, box-shadow 0.3s ease;
14
+ }
15
+ .card-hover:hover {
16
+ transform: translateY(-4px);
17
+ box-shadow: 0 20px 40px rgba(0,0,0,0.15);
18
+ }
19
+ </style>
20
+ </head>
21
+ <body class="bg-gray-50">
22
+ <!-- Header -->
23
+ <header class="gradient-bg text-white">
24
+ <div class="max-w-7xl mx-auto px-4 sm:px-6 lg:px-8 py-16">
25
+ <div class="text-center">
26
+ <div class="text-6xl mb-4">📚</div>
27
+ <h1 class="text-5xl font-bold mb-4">Research Paper Metadata Database</h1>
28
+ <p class="text-xl opacity-90 mb-6">Centralized Metadata Repository for Scientific Research</p>
29
+ <p class="text-lg opacity-75 max-w-3xl mx-auto">
30
+ A structured metadata repository designed to enable AI-powered visualization and analysis
31
+ of research structure with the goal of expanding research in interesting, useful, and practical ways.
32
+ </p>
33
+ </div>
34
+ </div>
35
+ </header>
36
+
37
+ <!-- Prior Work & Research Contributions -->
38
+ <section class="max-w-7xl mx-auto px-4 sm:px-6 lg:px-8 py-12">
39
+ <div class="bg-gradient-to-r from-blue-50 to-indigo-50 rounded-xl shadow-lg p-8 mb-8">
40
+ <h2 class="text-3xl font-bold text-gray-900 mb-6">📚 Prior Work & Research Contributions</h2>
41
+
42
+ <div class="bg-white rounded-lg p-6 mb-6">
43
+ <h3 class="text-xl font-semibold text-gray-900 mb-4">Overview</h3>
44
+ <p class="text-gray-700 mb-4">
45
+ The Research Paper Metadata Database represents <strong>prior work</strong> that demonstrates the creation of a structured
46
+ metadata repository for scientific research papers. This project establishes a foundation for using AI tools to visualize
47
+ and analyze the structure of scientific research, enabling systematic exploration of research patterns, citation networks,
48
+ and interdisciplinary connections.
49
+ </p>
50
+ </div>
51
+
52
+ <div class="grid md:grid-cols-2 gap-6 mb-6">
53
+ <div class="bg-white rounded-lg p-6">
54
+ <h3 class="text-lg font-semibold text-gray-900 mb-3">🔬 Research Contributions</h3>
55
+ <ul class="text-sm text-gray-700 space-y-2">
56
+ <li>• <strong>Structured Metadata Repository:</strong> Centralized database of research paper metadata</li>
57
+ <li>• <strong>AI-Powered Preprocessing:</strong> LLM-based entity extraction and annotation</li>
58
+ <li>• <strong>Citation Network Analysis:</strong> Cross-reference linking and relationship mapping</li>
59
+ <li>• <strong>Integration Framework:</strong> Designed for CopernicusAI Knowledge Engine integration</li>
60
+ </ul>
61
+ </div>
62
+
63
+ <div class="bg-white rounded-lg p-6">
64
+ <h3 class="text-lg font-semibold text-gray-900 mb-3">⚙️ Technical Achievements</h3>
65
+ <ul class="text-sm text-gray-700 space-y-2">
66
+ <li>• <strong>JSON-Based Storage:</strong> Structured metadata format for programmatic access</li>
67
+ <li>• <strong>Entity Extraction:</strong> Automated extraction of genes, proteins, compounds, equations</li>
68
+ <li>• <strong>Quality Assessment:</strong> Automated quality scoring and relevance metrics</li>
69
+ <li>• <strong>API Architecture:</strong> RESTful API design for external access</li>
70
+ </ul>
71
+ </div>
72
+ </div>
73
+
74
+ <div class="bg-white rounded-lg p-6">
75
+ <h3 class="text-lg font-semibold text-gray-900 mb-3">🎯 Position Within CopernicusAI Knowledge Engine</h3>
76
+ <p class="text-gray-700 mb-3">
77
+ The Research Paper Metadata Database serves as a <strong>core data infrastructure component</strong> of the CopernicusAI Knowledge Engine, providing:
78
+ </p>
79
+ <div class="grid md:grid-cols-2 gap-4 text-sm mb-3">
80
+ <ul class="text-gray-700 space-y-1">
81
+ <li>• Foundation for knowledge graph construction</li>
82
+ <li>• Integration with AI podcast generation</li>
83
+ <li>• Support for GLMP source references</li>
84
+ </ul>
85
+ <ul class="text-gray-700 space-y-1">
86
+ <li>• Science Video Database integration</li>
87
+ <li>• Programming Framework support</li>
88
+ </ul>
89
+ </div>
90
+ <p class="text-gray-600 text-sm italic">
91
+ This work establishes a proof-of-concept for AI-assisted research metadata management, demonstrating how structured data
92
+ can enable systematic analysis and visualization of scientific research patterns.
93
+ </p>
94
+ </div>
95
+ </div>
96
+ </section>
97
+
98
+ <!-- Project Goals -->
99
+ <section class="max-w-7xl mx-auto px-4 sm:px-6 lg:px-8 py-8">
100
+ <div class="bg-white rounded-xl shadow-lg p-8">
101
+ <h2 class="text-3xl font-bold text-gray-900 mb-6">🎯 Project Goals</h2>
102
+ <p class="text-gray-700 mb-4">
103
+ This project creates a database of scientific research paper metadata for the purpose of:
104
+ </p>
105
+ <ul class="text-gray-700 space-y-2">
106
+ <li>• Using AI tools to visualize and analyze the structure of scientific research</li>
107
+ <li>• Expanding research in interesting, useful, and practical ways</li>
108
+ <li>• Enabling systematic exploration of research patterns and connections</li>
109
+ <li>• Supporting knowledge graph construction and semantic search</li>
110
+ </ul>
111
+ </div>
112
+ </section>
113
+
114
+ <!-- Technical Architecture -->
115
+ <section class="max-w-7xl mx-auto px-4 sm:px-6 lg:px-8 py-8">
116
+ <div class="bg-white rounded-xl shadow-lg p-8">
117
+ <h2 class="text-3xl font-bold text-gray-900 mb-6">🔧 Technical Architecture</h2>
118
+
119
+ <div class="grid md:grid-cols-3 gap-6">
120
+ <div>
121
+ <h3 class="text-lg font-semibold text-gray-800 mb-3">Metadata Structure</h3>
122
+ <ul class="text-sm text-gray-600 space-y-1">
123
+ <li>• DOI, arXiv ID, publication info</li>
124
+ <li>• Abstracts & key findings</li>
125
+ <li>• Extracted entities</li>
126
+ <li>• Citation networks</li>
127
+ <li>• Paradigm shift indicators</li>
128
+ <li>• Quality scores</li>
129
+ </ul>
130
+ </div>
131
+
132
+ <div>
133
+ <h3 class="text-lg font-semibold text-gray-800 mb-3">AI-Powered Preprocessing</h3>
134
+ <ul class="text-sm text-gray-600 space-y-1">
135
+ <li>• LLM-based entity extraction</li>
136
+ <li>• Automatic categorization</li>
137
+ <li>• Keyword extraction</li>
138
+ <li>• Citation tracking</li>
139
+ <li>• Quality assessment</li>
140
+ </ul>
141
+ </div>
142
+
143
+ <div>
144
+ <h3 class="text-lg font-semibold text-gray-800 mb-3">Integration Features</h3>
145
+ <ul class="text-sm text-gray-600 space-y-1">
146
+ <li>• DOI/arXiv ID resolution</li>
147
+ <li>• Cross-reference linking</li>
148
+ <li>• Podcast-to-paper tracking</li>
149
+ <li>• Search & query capabilities</li>
150
+ <li>• API access</li>
151
+ </ul>
152
+ </div>
153
+ </div>
154
+ </div>
155
+ </section>
156
+
157
+ <!-- Related Projects -->
158
+ <section class="max-w-7xl mx-auto px-4 sm:px-6 lg:px-8 py-8">
159
+ <h2 class="text-3xl font-bold text-gray-900 mb-6 text-center">🔗 Related Projects</h2>
160
+
161
+ <div class="grid md:grid-cols-2 gap-6">
162
+ <div class="bg-white rounded-lg shadow-md p-6 card-hover">
163
+ <h3 class="text-xl font-semibold text-gray-900 mb-3">🔬 Copernicus AI</h3>
164
+ <p class="text-gray-600 mb-4">
165
+ Main knowledge engine integrating metadata with AI podcasts and research synthesis.
166
+ </p>
167
+ <a href="https://huggingface.co/spaces/garywelz/copernicusai"
168
+ class="text-blue-600 hover:text-blue-700 font-semibold"
169
+ target="_blank" rel="noopener noreferrer">
170
+ Visit Copernicus AI →
171
+ </a>
172
+ </div>
173
+
174
+ <div class="bg-white rounded-lg shadow-md p-6 card-hover">
175
+ <h3 class="text-xl font-semibold text-gray-900 mb-3">🧬 GLMP</h3>
176
+ <p class="text-gray-600 mb-4">
177
+ Genome Logic Modeling Project using metadata for source paper references.
178
+ </p>
179
+ <a href="https://huggingface.co/spaces/garywelz/glmp"
180
+ class="text-blue-600 hover:text-blue-700 font-semibold"
181
+ target="_blank" rel="noopener noreferrer">
182
+ Explore GLMP →
183
+ </a>
184
+ </div>
185
+
186
+ <div class="bg-white rounded-lg shadow-md p-6 card-hover">
187
+ <h3 class="text-xl font-semibold text-gray-900 mb-3">🛠️ Programming Framework</h3>
188
+ <p class="text-gray-600 mb-4">
189
+ Universal process analysis tool that can utilize metadata for research analysis.
190
+ </p>
191
+ <a href="https://huggingface.co/spaces/garywelz/programming_framework"
192
+ class="text-blue-600 hover:text-blue-700 font-semibold"
193
+ target="_blank" rel="noopener noreferrer">
194
+ Explore Framework →
195
+ </a>
196
+ </div>
197
+
198
+ <div class="bg-white rounded-lg shadow-md p-6 card-hover">
199
+ <h3 class="text-xl font-semibold text-gray-900 mb-3">🎬 Science Video Database</h3>
200
+ <p class="text-gray-600 mb-4">
201
+ Video content management with potential metadata linking.
202
+ </p>
203
+ <a href="https://huggingface.co/spaces/garywelz/sciencevideodb"
204
+ class="text-blue-600 hover:text-blue-700 font-semibold"
205
+ target="_blank" rel="noopener noreferrer">
206
+ Visit Video Database →
207
+ </a>
208
+ </div>
209
+ </div>
210
+ </section>
211
+
212
+ <!-- How to Cite This Work -->
213
+ <section class="max-w-7xl mx-auto px-4 sm:px-6 lg:px-8 py-8">
214
+ <div class="bg-white rounded-xl shadow-lg p-8">
215
+ <h2 class="text-3xl font-bold text-gray-900 mb-6">How to Cite This Work</h2>
216
+ <div class="bg-gray-50 rounded-lg p-6 mb-4">
217
+ <p class="text-gray-800 font-mono text-lg leading-relaxed mb-4">
218
+ Welz, G. (2024–2025). <em>Research Paper Metadata Database</em>.<br>
219
+ Hugging Face Spaces. https://huggingface.co/spaces/garywelz/metadata_database
220
+ </p>
221
+ </div>
222
+ <div class="bg-blue-50 rounded-lg p-4">
223
+ <p class="text-gray-700 mb-2">
224
+ This project serves as infrastructure for AI-assisted research analysis, enabling systematic visualization and exploration of scientific research patterns through structured metadata management.
225
+ </p>
226
+ <p class="text-gray-700 font-semibold">
227
+ The Research Paper Metadata Database is designed as infrastructure for AI-assisted science, providing the foundational data layer for knowledge graph construction and semantic search capabilities within the CopernicusAI Knowledge Engine.
228
+ </p>
229
+ </div>
230
+ </div>
231
+ </section>
232
+
233
+ <!-- Footer -->
234
+ <footer class="gradient-bg text-white py-8 mt-12">
235
+ <div class="max-w-7xl mx-auto px-4 sm:px-6 lg:px-8 text-center">
236
+ <p class="text-lg font-semibold mb-2">Research Paper Metadata Database</p>
237
+ <p class="text-sm opacity-75">Part of the Copernicus AI Knowledge Engine</p>
238
+ <p class="text-xs opacity-50 mt-4">&copy; 2025 Gary Welz. All rights reserved.</p>
239
+ </div>
240
+ </footer>
241
+ </body>
242
  </html>
243
+