Upload folder using huggingface_hub
- .gitattributes +3 -0
- PROGRESS.md +16 -0
- __pycache__/app.cpython-312.pyc +2 -2
- app.py +216 -31
- database/collections/multi_seed_multi_seed_1758155685.json +3 -0
- database/collections/multi_seed_multi_seed_1758155809.json +3 -0
- database/collections/multi_seed_multi_seed_1758156063.json +3 -0
- database/collections/multi_seed_multi_seed_1758156344.json +0 -0
- database/collections/multi_seed_multi_seed_1758156664.json +0 -0
- database/filters/multi_seed_multi_seed_1758155809__filter__What_are_the_key_aspects_of_just_transit__20250918_122330.json +0 -0
- database/filters/multi_seed_multi_seed_1758155809__filter__What_are_the_key_aspects_of_just_transit__20250918_123000.json +0 -0
- database/filters/multi_seed_multi_seed_1758156664__filter__What_are_the_key_aspects_of_just_transit__20250918_025849.json +13 -0
- database/filters/multi_seed_multi_seed_1758156926__filter__What_are_the_key_aspects_of_just_transit__20250918_120128.json +13 -0
- database/filters/multi_seed_multi_seed_1758157170__filter__What_are_the_key_aspects_of_just_transit__20250918_115925.json +13 -0
- database/filters/multi_seed_multi_seed_1758157170__filter__What_are_the_key_aspects_of_just_transit__20250918_122404.json +0 -0
- database/filters/multi_seed_multi_seed_1758157170__filter__What_are_the_key_aspects_of_just_transit__20250918_122747.json +0 -0
- database/filters/multi_seed_multi_seed_1758157170__filter__What_are_the_key_aspects_of_just_transit__20250918_122824.json +0 -0
- requirements.txt +1 -0
- templates/index.html +459 -53
.gitattributes
CHANGED
|
@@ -37,3 +37,6 @@ ai_slr/__pycache__/app.cpython-312.pyc filter=lfs diff=lfs merge=lfs -text
|
|
| 37 |
ai_slr/__pycache__/app_backup.cpython-312.pyc filter=lfs diff=lfs merge=lfs -text
|
| 38 |
__pycache__/app.cpython-312.pyc filter=lfs diff=lfs merge=lfs -text
|
| 39 |
__pycache__/app_backup.cpython-312.pyc filter=lfs diff=lfs merge=lfs -text
|
|
| 37 |
ai_slr/__pycache__/app_backup.cpython-312.pyc filter=lfs diff=lfs merge=lfs -text
|
| 38 |
__pycache__/app.cpython-312.pyc filter=lfs diff=lfs merge=lfs -text
|
| 39 |
__pycache__/app_backup.cpython-312.pyc filter=lfs diff=lfs merge=lfs -text
|
| 40 |
+
database/collections/multi_seed_multi_seed_1758155685.json filter=lfs diff=lfs merge=lfs -text
|
| 41 |
+
database/collections/multi_seed_multi_seed_1758155809.json filter=lfs diff=lfs merge=lfs -text
|
| 42 |
+
database/collections/multi_seed_multi_seed_1758156063.json filter=lfs diff=lfs merge=lfs -text
|
PROGRESS.md
CHANGED
|
@@ -52,3 +52,19 @@
|
|
| 52 |
**Title:** Updated multi-seed collection results to show same detailed breakdown as regular collections
|
| 53 |
**Summary:** Enhanced progress display to show detailed seed-by-seed progress with current seed count and remaining seeds, updated completion messages to distinguish between "Collection" and "Multi-Seed Collection", added deduplication statistics display showing duplicates removed, and ensured multi-seed collections display the same comprehensive breakdown (cited + citing + related papers) as regular collections for consistency.
|
| 54 |
|
|
| 52 |
**Title:** Updated multi-seed collection results to show same detailed breakdown as regular collections
|
| 53 |
**Summary:** Enhanced progress display to show detailed seed-by-seed progress with current seed count and remaining seeds, updated completion messages to distinguish between "Collection" and "Multi-Seed Collection", added deduplication statistics display showing duplicates removed, and ensured multi-seed collections display the same comprehensive breakdown (cited + citing + related papers) as regular collections for consistency.
|
| 54 |
|
| 55 |
+
## 2025-01-09 15:43:00 - Pre-GPT Filtering and Local Development Environment
|
| 56 |
+
**Title:** Added pre-filtering functionality and complete local development setup
|
| 57 |
+
**Summary:** Implemented pre-GPT filtering by publication date range and keyword search, exposed as an optional Step 2 before GPT analysis. Created a complete local development environment: virtual environment setup, automated run scripts, environment configuration, and documentation. Users can now filter papers by date or keywords before GPT analysis, and developers can run the app locally for testing and development.
|
| 58 |
+
|
| 59 |
+
## 2025-01-09 15:44:00 - Collection Loading Enhancements and UI Improvements
|
| 60 |
+
**Title:** Enhanced collection loading with visual indicators and improved seed paper display
|
| 61 |
+
**Summary:** Added a glowing green dot loading indicator with a pulsing animation that shows "Loading collection..." and then "Collection loaded successfully!" while a collection loads. Seed papers from multi-seed collections now appear in the SELECTED SEED PAPERS box when a collection is opened. Removed the automatic "just transitions" search suggestions on page load.
|
| 62 |
+
|
| 63 |
+
## 2025-01-09 15:45:00 - Detailed Seed Paper Information and Collection Stats
|
| 64 |
+
**Title:** Enhanced seed paper display with real author details and paper counts
|
| 65 |
+
**Summary:** Added a get_paper_details function that fetches actual author names, publication years, and venues for seed papers. The backend now stores detailed seed information, including paper counts (cited, citing, related). The frontend displays real author details instead of placeholder text, shows a collection statistics box below the seed papers with the total and its breakdown, and shows "Papers found: X (Y cited, Z citing, W related)" for each seed.
|
| 66 |
+
|
| 67 |
+
## 2025-01-09 15:46:00 - Smart Collection Display with Seed Information
|
| 68 |
+
**Title:** Enhanced collection history display with intelligent seed paper information
|
| 69 |
+
**Summary:** Collection display now depends on collection type: single-seed collections show the actual paper title, multi-seed collections show a "Paper1, Paper2 + X others" format, and merged collections show a "Merged Collection" label. Added hover tooltips with full seed details for multi-seed collections, and asynchronous loading of collection details so that actual seed paper titles replace generic labels.
|
| 70 |
+
|
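The pre-GPT filtering step described in the 15:43:00 entry can be sketched as follows. This is a minimal illustration, not the code from app.py: the function name `prefilter_papers`, its parameters, and the `abstract` field are assumptions, and `publication_date` is assumed to be an ISO `YYYY-MM-DD` string.

```python
from datetime import date
from typing import Dict, Iterable, List, Optional


def prefilter_papers(
    papers: Iterable[Dict],
    start: Optional[date] = None,
    end: Optional[date] = None,
    keywords: Optional[List[str]] = None,
) -> List[Dict]:
    """Keep papers dated within [start, end] that mention at least one keyword."""
    wanted = [k.lower() for k in (keywords or []) if k.strip()]
    kept = []
    for paper in papers:
        raw = paper.get("publication_date") or ""
        try:
            published = date.fromisoformat(raw)
        except ValueError:
            published = None  # missing or malformed date
        if (start or end) and published is None:
            continue  # an undated paper cannot be placed inside a range
        if start and published < start:
            continue
        if end and published > end:
            continue
        text = f"{paper.get('title') or ''} {paper.get('abstract') or ''}".lower()
        if wanted and not any(k in text for k in wanted):
            continue
        kept.append(paper)
    return kept
```

Running a cheap filter like this first reduces the number of papers sent to the model; with no date range and no keywords it returns every paper unchanged.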
__pycache__/app.cpython-312.pyc
CHANGED
|
@@ -1,3 +1,3 @@
|
|
| 1 |
version https://git-lfs.github.com/spec/v1
|
| 2 |
-
oid sha256:
|
| 3 |
-
size
|
|
|
|
| 1 |
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:7ec346baa18a6de961142ca6f27630bca2b861f655858529f619fe957fcee7ec
|
| 3 |
+
size 125221
|
app.py
CHANGED
|
@@ -105,6 +105,86 @@ def get_all_pages(url, headers, upper_limit=None):
|
|
| 105 |
return all_results
|
| 106 |
|
| 107 |
|
| 108 |
def get_related_papers(work_id, upper_limit=None, progress_callback=None):
|
| 109 |
# Define base URL for OpenAlex API
|
| 110 |
base_url = "https://api.openalex.org/works"
|
|
@@ -264,7 +344,27 @@ import time
|
|
| 264 |
|
| 265 |
def analyze_paper_relevance(content: Dict[str, str], research_question: str, api_key: str) -> Optional[Dict]:
|
| 266 |
"""Analyze if a paper is relevant to the research question using GPT-5 mini."""
|
| 267 |
-
|
| 268 |
|
| 269 |
title = content.get('title', '')
|
| 270 |
abstract = content.get('abstract', '')
|
|
@@ -314,11 +414,11 @@ def analyze_paper_relevance(content: Dict[str, str], research_question: str, api
|
|
| 314 |
# Try GPT-5 mini first, fallback to gpt-4o-mini if it fails
|
| 315 |
try:
|
| 316 |
print("DEBUG: Trying GPT-5 nano...")
|
| 317 |
-
response = client.
|
| 318 |
model="gpt-5-nano",
|
| 319 |
-
|
| 320 |
-
|
| 321 |
-
|
| 322 |
)
|
| 323 |
print("DEBUG: GPT-5 nano response received")
|
| 324 |
except Exception as e:
|
|
@@ -338,18 +438,9 @@ def analyze_paper_relevance(content: Dict[str, str], research_question: str, api
|
|
| 338 |
print(f"DEBUG: Response attributes: {dir(response)}")
|
| 339 |
|
| 340 |
if hasattr(response, 'choices') and response.choices:
|
| 341 |
-
#
|
| 342 |
print("DEBUG: Using chat completions format")
|
| 343 |
result = response.choices[0].message.content
|
| 344 |
-
elif hasattr(response, 'output'):
|
| 345 |
-
# New format (responses) - extract text from output
|
| 346 |
-
print("DEBUG: Using responses format")
|
| 347 |
-
result = ""
|
| 348 |
-
for item in response.output:
|
| 349 |
-
if hasattr(item, "content") and item.content:
|
| 350 |
-
for content in item.content:
|
| 351 |
-
if hasattr(content, "text") and content.text:
|
| 352 |
-
result += content.text
|
| 353 |
else:
|
| 354 |
print("DEBUG: Unexpected response format")
|
| 355 |
print(f"DEBUG: Response: {response}")
|
|
@@ -403,13 +494,43 @@ def extract_abstract_from_inverted_index(inverted_index: Dict) -> str:
|
|
| 403 |
def analyze_single_paper(paper: Dict, research_question: str, api_key: str) -> Optional[Dict]:
|
| 404 |
"""Analyze a single paper with its own client."""
|
| 405 |
try:
|
| 406 |
-
|
| 407 |
|
| 408 |
# Extract title and abstract
|
| 409 |
title = paper.get('title', '')
|
| 410 |
abstract = extract_abstract_from_inverted_index(paper.get('abstract_inverted_index', {}))
|
| 411 |
|
|
|
|
|
|
|
|
|
|
| 412 |
if not title and not abstract:
|
|
|
|
| 413 |
return None
|
| 414 |
|
| 415 |
# Create content for analysis
|
|
@@ -419,30 +540,39 @@ def analyze_single_paper(paper: Dict, research_question: str, api_key: str) -> O
|
|
| 419 |
}
|
| 420 |
|
| 421 |
# Analyze with GPT
|
|
|
|
| 422 |
analysis = analyze_paper_relevance_with_client(content, research_question, client)
|
|
|
|
| 423 |
|
| 424 |
if analysis:
|
| 425 |
paper['gpt_analysis'] = analysis
|
| 426 |
paper['relevance_reason'] = analysis.get('relevance_reason', 'Analysis completed')
|
| 427 |
paper['relevance_score'] = analysis.get('relevant', False)
|
|
|
|
| 428 |
return paper
|
| 429 |
|
|
|
|
| 430 |
return None
|
| 431 |
|
| 432 |
except Exception as e:
|
|
|
|
| 433 |
return None
|
| 434 |
|
| 435 |
def analyze_paper_batch(papers_batch: List[Dict], research_question: str, api_key: str, batch_id: int) -> List[Dict]:
|
| 436 |
"""Analyze a batch of papers in parallel using ThreadPoolExecutor."""
|
|
|
|
|
|
|
| 437 |
results = []
|
| 438 |
|
| 439 |
# Use ThreadPoolExecutor to process papers in parallel within the batch
|
| 440 |
with concurrent.futures.ThreadPoolExecutor(max_workers=len(papers_batch)) as executor:
|
|
|
|
| 441 |
# Submit all papers for parallel processing
|
| 442 |
future_to_paper = {
|
| 443 |
executor.submit(analyze_single_paper, paper, research_question, api_key): paper
|
| 444 |
for paper in papers_batch
|
| 445 |
}
|
|
|
|
| 446 |
|
| 447 |
# Collect results as they complete
|
| 448 |
for future in concurrent.futures.as_completed(future_to_paper):
|
|
@@ -486,15 +616,19 @@ def analyze_paper_relevance_with_client(content: Dict[str, str], research_questi
|
|
| 486 |
"""
|
| 487 |
|
| 488 |
try:
|
|
|
|
| 489 |
# Try GPT-5 nano first, fallback to gpt-4o-mini if it fails
|
| 490 |
try:
|
| 491 |
-
|
|
|
|
| 492 |
model="gpt-5-nano",
|
| 493 |
-
|
| 494 |
-
|
| 495 |
-
|
| 496 |
)
|
|
|
|
| 497 |
except Exception as e:
|
|
|
|
| 498 |
response = client.chat.completions.create(
|
| 499 |
model="gpt-4o-mini",
|
| 500 |
messages=[{
|
|
@@ -503,37 +637,45 @@ def analyze_paper_relevance_with_client(content: Dict[str, str], research_questi
|
|
| 503 |
}],
|
| 504 |
max_completion_tokens=1000
|
| 505 |
)
|
|
|
|
| 506 |
|
| 507 |
# Handle different response formats
|
| 508 |
result = None
|
| 509 |
if hasattr(response, 'choices') and response.choices:
|
| 510 |
-
#
|
|
|
|
| 511 |
result = response.choices[0].message.content
|
| 512 |
-
|
| 513 |
-
# New format (responses) - extract text from output
|
| 514 |
-
result = ""
|
| 515 |
-
for item in response.output:
|
| 516 |
-
if hasattr(item, "content") and item.content:
|
| 517 |
-
for content in item.content:
|
| 518 |
-
if hasattr(content, "text") and content.text:
|
| 519 |
-
result += content.text
|
| 520 |
else:
|
|
|
|
| 521 |
return None
|
| 522 |
|
| 523 |
if not result:
|
|
|
|
| 524 |
return None
|
| 525 |
|
| 526 |
# Clean and parse the JSON response
|
| 527 |
result = result.strip()
|
|
|
|
|
|
|
| 528 |
if result.startswith("```json"):
|
| 529 |
result = result[7:]
|
| 530 |
if result.endswith("```"):
|
| 531 |
result = result[:-3]
|
| 532 |
|
|
|
|
|
|
|
| 533 |
# Try to parse JSON
|
| 534 |
try:
|
| 535 |
-
|
| 536 |
-
|
| 537 |
return None
|
| 538 |
|
| 539 |
except Exception as e:
|
|
@@ -541,7 +683,13 @@ def analyze_paper_relevance_with_client(content: Dict[str, str], research_questi
|
|
| 541 |
|
| 542 |
def filter_papers_for_research_question(papers: List[Dict], research_question: str, api_key: str, limit: int = 10) -> List[Dict]:
|
| 543 |
"""Analyze exactly 'limit' number of papers for relevance using parallel processing."""
|
| 544 |
if not papers or not research_question:
|
|
|
|
| 545 |
return []
|
| 546 |
|
| 547 |
if not api_key:
|
|
@@ -557,13 +705,16 @@ def filter_papers_for_research_question(papers: List[Dict], research_question: s
|
|
| 557 |
|
| 558 |
# Process all papers in parallel (no batching needed for small numbers)
|
| 559 |
all_results = []
|
|
|
|
| 560 |
|
| 561 |
with concurrent.futures.ThreadPoolExecutor(max_workers=min(limit, 20)) as executor:
|
|
|
|
| 562 |
# Submit all papers for parallel processing
|
| 563 |
future_to_paper = {
|
| 564 |
executor.submit(analyze_single_paper, paper, research_question, api_key): paper
|
| 565 |
for paper in papers_to_analyze
|
| 566 |
}
|
|
|
|
| 567 |
|
| 568 |
# Collect results as they complete
|
| 569 |
for future in concurrent.futures.as_completed(future_to_paper):
|
|
@@ -1525,9 +1676,15 @@ def collect_multiple_seeds_async(seed_papers, limit, task_id):
|
|
| 1525 |
existing_paper['contributing_seeds'].append(i + 1)
|
| 1526 |
break
|
| 1527 |
|
|
|
|
|
|
|
|
|
|
| 1528 |
seed_results.append({
|
| 1529 |
'work_id': work_id,
|
| 1530 |
'title': seed_title,
|
|
|
|
|
|
|
|
|
|
| 1531 |
'papers_found': new_papers,
|
| 1532 |
'total_papers_from_seed': len(papers),
|
| 1533 |
'cited_papers': cited_count,
|
|
@@ -1588,11 +1745,23 @@ def collect_multiple_seeds_async(seed_papers, limit, task_id):
|
|
| 1588 |
with open(temp_path, 'w', encoding='utf-8') as f:
|
| 1589 |
json.dump(all_papers, f, indent=2, ensure_ascii=False)
|
| 1590 |
|
| 1591 |
# Create combined collection data
|
| 1592 |
combined_title = f"Multi-Seed Collection ({total_seeds} seeds)"
|
| 1593 |
collection_data = {
|
| 1594 |
'work_id': f"multi_seed_{task_id}",
|
| 1595 |
'title': combined_title,
|
|
|
|
| 1596 |
'total_papers': len(all_papers),
|
| 1597 |
'cited_papers': final_cited_count,
|
| 1598 |
'citing_papers': final_citing_count,
|
|
@@ -1601,6 +1770,7 @@ def collect_multiple_seeds_async(seed_papers, limit, task_id):
|
|
| 1601 |
'papers': all_papers,
|
| 1602 |
'seed_results': seed_results,
|
| 1603 |
'total_seeds': total_seeds,
|
|
|
|
| 1604 |
'deduplication_stats': {
|
| 1605 |
'total_papers_before_dedup': total_papers_before_dedup,
|
| 1606 |
'duplicates_removed': duplicates_removed,
|
|
@@ -1765,6 +1935,7 @@ def collect_papers():
|
|
| 1765 |
def filter_papers():
|
| 1766 |
"""Filter papers based on research question."""
|
| 1767 |
try:
|
|
|
|
| 1768 |
data = request.get_json()
|
| 1769 |
research_question = data.get('research_question', '').strip()
|
| 1770 |
limit = data.get('limit', 10) # Default to 10 most recent relevant papers
|
|
@@ -1772,6 +1943,11 @@ def filter_papers():
|
|
| 1772 |
papers_data = data.get('papers') # Papers passed directly from frontend
|
| 1773 |
user_api_key = data.get('user_api_key') # User's own API key for large analyses
|
| 1774 |
|
| 1775 |
if not research_question:
|
| 1776 |
return jsonify({'error': 'Research question is required'}), 400
|
| 1777 |
|
|
@@ -1793,6 +1969,8 @@ def filter_papers():
|
|
| 1793 |
|
| 1794 |
# Use user's API key if provided, otherwise use default
|
| 1795 |
api_key_to_use = user_api_key if user_api_key else OPENAI_API_KEY
|
|
|
|
|
|
|
| 1796 |
|
| 1797 |
if not api_key_to_use:
|
| 1798 |
return jsonify({
|
|
@@ -1800,8 +1978,15 @@ def filter_papers():
|
|
| 1800 |
}), 400
|
| 1801 |
|
| 1802 |
# Filter papers using custom analyzer (returns top N most recent relevant papers)
|
| 1803 |
relevant_papers = filter_papers_for_research_question(papers, research_question, api_key_to_use, limit)
|
| 1804 |
|
|
|
|
|
|
|
| 1805 |
# Determine source collection id for linkage
|
| 1806 |
source_collection_id = None
|
| 1807 |
if provided_source_collection:
|
|
|
|
| 105 |
return all_results
|
| 106 |
|
| 107 |
|
| 108 |
+
def get_paper_details(work_id):
|
| 109 |
+
"""Get detailed information about a specific paper."""
|
| 110 |
+
try:
|
| 111 |
+
work_url = f"https://api.openalex.org/works/{work_id}"
|
| 112 |
+
work_response = requests.get(work_url, timeout=30)
|
| 113 |
+
work_data = work_response.json()
|
| 114 |
+
|
| 115 |
+
if not work_data or 'id' not in work_data:
|
| 116 |
+
return {}
|
| 117 |
+
|
| 118 |
+
# Extract authors
|
| 119 |
+
authors = []
|
| 120 |
+
if work_data.get('authorships'):
|
| 121 |
+
for authorship in work_data['authorships']:
|
| 122 |
+
author = authorship.get('author', {})
|
| 123 |
+
if author.get('display_name'):
|
| 124 |
+
authors.append(author['display_name'])
|
| 125 |
+
|
| 126 |
+
# Extract venue
|
| 127 |
+
venue = ''
|
| 128 |
+
if work_data.get('primary_location', {}).get('source', {}).get('display_name'):
|
| 129 |
+
venue = work_data['primary_location']['source']['display_name']
|
| 130 |
+
|
| 131 |
+
return {
|
| 132 |
+
'authors': authors,
|
| 133 |
+
'publication_date': work_data.get('publication_date', ''),
|
| 134 |
+
'venue': venue,
|
| 135 |
+
'title': work_data.get('title', ''),
|
| 136 |
+
'id': work_data.get('id', '')
|
| 137 |
+
}
|
| 138 |
+
|
| 139 |
+
except Exception as e:
|
| 140 |
+
print(f"Error getting paper details for {work_id}: {e}")
|
| 141 |
+
return {}
|
| 142 |
+
|
| 143 |
+
def find_existing_single_seed_collection(work_id):
|
| 144 |
+
"""Check if we already have a single-seed collection for this work_id."""
|
| 145 |
+
try:
|
| 146 |
+
collections_dir = os.path.join(os.getcwd(), 'collections')
|
| 147 |
+
if not os.path.exists(collections_dir):
|
| 148 |
+
return None
|
| 149 |
+
|
| 150 |
+
# Look for files that might contain this work_id
|
| 151 |
+
for filename in os.listdir(collections_dir):
|
| 152 |
+
if filename.endswith('.pkl'):
|
| 153 |
+
file_path = os.path.join(collections_dir, filename)
|
| 154 |
+
try:
|
| 155 |
+
with open(file_path, 'rb') as f:
|
| 156 |
+
data = pickle.load(f)
|
| 157 |
+
|
| 158 |
+
# Check if this is a single-seed collection for this work_id
|
| 159 |
+
if (data.get('work_identifier') == work_id and
|
| 160 |
+
data.get('collection_type') == 'single_seed'):
|
| 161 |
+
print(f"Found existing single-seed collection for {work_id}: {filename}")
|
| 162 |
+
return file_path
|
| 163 |
+
except Exception as e:
|
| 164 |
+
print(f"Error reading {filename}: {e}")
|
| 165 |
+
continue
|
| 166 |
+
except Exception as e:
|
| 167 |
+
print(f"Error searching for existing collections: {e}")
|
| 168 |
+
|
| 169 |
+
return None
|
| 170 |
+
|
| 171 |
+
def load_existing_single_seed_collection(file_path):
|
| 172 |
+
"""Load an existing single-seed collection."""
|
| 173 |
+
try:
|
| 174 |
+
with open(file_path, 'rb') as f:
|
| 175 |
+
data = pickle.load(f)
|
| 176 |
+
|
| 177 |
+
# Extract papers and add relationship info
|
| 178 |
+
papers = data.get('papers', [])
|
| 179 |
+
for paper in papers:
|
| 180 |
+
if 'relationship' not in paper:
|
| 181 |
+
paper['relationship'] = 'unknown' # Default relationship
|
| 182 |
+
|
| 183 |
+
return papers
|
| 184 |
+
except Exception as e:
|
| 185 |
+
print(f"Error loading existing collection from {file_path}: {e}")
|
| 186 |
+
return []
|
| 187 |
+
|
| 188 |
def get_related_papers(work_id, upper_limit=None, progress_callback=None):
|
| 189 |
# Define base URL for OpenAlex API
|
| 190 |
base_url = "https://api.openalex.org/works"
|
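In `get_paper_details`, a chain such as `work_data.get('primary_location', {}).get('source', {})` only falls back to `{}` when the key is absent. If the key is present with a JSON null, `.get` returns `None` and the next `.get` raises `AttributeError`, which the broad `except` then turns into an empty result. A null-tolerant extraction might look like the sketch below. The helper name `summarize_work` is hypothetical, and the record layout is assumed to match the fields read in the diff.

```python
from typing import Any, Dict


def summarize_work(work: Dict[str, Any]) -> Dict[str, Any]:
    """Extract display fields from a work record, tolerating null values."""
    # `or {}` / `or []` covers both a missing key and an explicit null.
    names = [
        (authorship.get("author") or {}).get("display_name")
        for authorship in (work.get("authorships") or [])
    ]
    location = work.get("primary_location") or {}
    source = location.get("source") or {}
    publication_date = work.get("publication_date") or ""
    return {
        "id": work.get("id") or "",
        "title": work.get("title") or "",
        "authors": [name for name in names if name],
        "venue": source.get("display_name") or "",
        "publication_date": publication_date,
        # "2023-04-05" -> "2023"; an empty date yields an empty year.
        "year": publication_date.split("-")[0],
    }
```

Keeping the extraction in a pure function separates it from the network call, so it can be tested without contacting the API.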
|
|
|
| 344 |
|
| 345 |
def analyze_paper_relevance(content: Dict[str, str], research_question: str, api_key: str) -> Optional[Dict]:
|
| 346 |
"""Analyze if a paper is relevant to the research question using GPT-5 mini."""
|
| 347 |
+
print(f"DEBUG: Starting analyze_paper_relevance with API key length: {len(api_key) if api_key else 0}")
|
| 348 |
+
|
| 349 |
+
try:
|
| 350 |
+
print("DEBUG: Attempting to create OpenAI client...")
|
| 351 |
+
# Try to create client with minimal parameters to avoid proxies issue
|
| 352 |
+
client = OpenAI(api_key=api_key)
|
| 353 |
+
print("DEBUG: OpenAI client created successfully")
|
| 354 |
+
except Exception as e:
|
| 355 |
+
print(f"DEBUG: Error creating OpenAI client: {e}")
|
| 356 |
+
print(f"DEBUG: Error type: {type(e)}")
|
| 357 |
+
print(f"DEBUG: Error args: {e.args}")
|
| 358 |
+
# If there's any error with client creation, try with explicit parameters
|
| 359 |
+
try:
|
| 360 |
+
print("DEBUG: Trying alternative client creation with timeout...")
|
| 361 |
+
client = OpenAI(api_key=api_key, timeout=30.0)
|
| 362 |
+
print("DEBUG: Alternative OpenAI client created successfully")
|
| 363 |
+
except Exception as e2:
|
| 364 |
+
print(f"DEBUG: Failed to create OpenAI client with alternative method: {e2}")
|
| 365 |
+
print(f"DEBUG: Alternative error type: {type(e2)}")
|
| 366 |
+
print(f"DEBUG: Alternative error args: {e2.args}")
|
| 367 |
+
return None
|
| 368 |
|
| 369 |
title = content.get('title', '')
|
| 370 |
abstract = content.get('abstract', '')
|
|
|
|
| 414 |
# Try GPT-5 mini first, fallback to gpt-4o-mini if it fails
|
| 415 |
try:
|
| 416 |
print("DEBUG: Trying GPT-5 nano...")
|
| 417 |
+
response = client.chat.completions.create(
|
| 418 |
model="gpt-5-nano",
|
| 419 |
+
messages=[{"role": "user", "content": prompt}],
|
| 420 |
+
max_tokens=1000,
|
| 421 |
+
temperature=0.1
|
| 422 |
)
|
| 423 |
print("DEBUG: GPT-5 nano response received")
|
| 424 |
except Exception as e:
|
|
|
|
| 438 |
print(f"DEBUG: Response attributes: {dir(response)}")
|
| 439 |
|
| 440 |
if hasattr(response, 'choices') and response.choices:
|
| 441 |
+
# Standard chat completions format
|
| 442 |
print("DEBUG: Using chat completions format")
|
| 443 |
result = response.choices[0].message.content
|
| 444 |
else:
|
| 445 |
print("DEBUG: Unexpected response format")
|
| 446 |
print(f"DEBUG: Response: {response}")
|
|
|
|
| 494 |
def analyze_single_paper(paper: Dict, research_question: str, api_key: str) -> Optional[Dict]:
|
| 495 |
"""Analyze a single paper with its own client."""
|
| 496 |
try:
|
| 497 |
+
print(f"DEBUG: Starting analysis for paper: {paper.get('title', 'No title')[:50]}...")
|
| 498 |
+
print(f"DEBUG: API key length: {len(api_key) if api_key else 0}")
|
| 499 |
+
|
| 500 |
+
try:
|
| 501 |
+
print("DEBUG: Attempting to create OpenAI client in analyze_single_paper...")
|
| 502 |
+
client = OpenAI(api_key=api_key)
|
| 503 |
+
print("DEBUG: OpenAI client created successfully in analyze_single_paper")
|
| 504 |
+
except TypeError as e:
|
| 505 |
+
print(f"DEBUG: TypeError in analyze_single_paper: {e}")
|
| 506 |
+
print(f"DEBUG: Error type: {type(e)}")
|
| 507 |
+
print(f"DEBUG: Error args: {e.args}")
|
| 508 |
+
if 'proxies' in str(e):
|
| 509 |
+
print(f"DEBUG: Caught proxies error, trying alternative client creation...")
|
| 510 |
+
# Remove any proxies parameter and try again
|
| 511 |
+
try:
|
| 512 |
+
client = OpenAI(api_key=api_key)
|
| 513 |
+
print("DEBUG: Alternative client creation successful")
|
| 514 |
+
except Exception as e2:
|
| 515 |
+
print(f"DEBUG: Alternative client creation failed: {e2}")
|
| 516 |
+
raise
|
| 517 |
+
else:
|
| 518 |
+
raise
|
| 519 |
+
except Exception as e:
|
| 520 |
+
print(f"DEBUG: Unexpected error in analyze_single_paper client creation: {e}")
|
| 521 |
+
print(f"DEBUG: Error type: {type(e)}")
|
| 522 |
+
print(f"DEBUG: Error args: {e.args}")
|
| 523 |
+
raise
|
| 524 |
|
| 525 |
# Extract title and abstract
|
| 526 |
title = paper.get('title', '')
|
| 527 |
abstract = extract_abstract_from_inverted_index(paper.get('abstract_inverted_index', {}))
|
| 528 |
|
| 529 |
+
print(f"DEBUG: Title: {title[:50]}...")
|
| 530 |
+
print(f"DEBUG: Abstract length: {len(abstract)}")
|
| 531 |
+
|
| 532 |
if not title and not abstract:
|
| 533 |
+
print("DEBUG: No title or abstract, skipping paper")
|
| 534 |
return None
|
| 535 |
|
| 536 |
# Create content for analysis
|
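The body of `extract_abstract_from_inverted_index` is not part of this diff. As a rough sketch of what such a helper has to do, assuming the index maps each word to the list of positions where it occurs, the text can be rebuilt by sorting on position. The function name below is hypothetical and this is not necessarily how app.py implements it.

```python
from typing import Dict, List, Optional


def abstract_from_inverted_index(index: Optional[Dict[str, List[int]]]) -> str:
    """Rebuild plain text from a {word: [positions]} inverted index."""
    if not index:
        return ""
    # Flatten to (position, word) pairs, then order them by position.
    positioned = [
        (position, word)
        for word, positions in index.items()
        for position in positions
    ]
    return " ".join(word for _, word in sorted(positioned))
```

An empty or missing index yields an empty string, which matches the `if not title and not abstract` skip check in `analyze_single_paper`.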
|
|
|
| 540 |
}
|
| 541 |
|
| 542 |
# Analyze with GPT
|
| 543 |
+
print(f"DEBUG: Calling analyze_paper_relevance_with_client...")
|
| 544 |
analysis = analyze_paper_relevance_with_client(content, research_question, client)
|
| 545 |
+
print(f"DEBUG: Analysis result: {analysis}")
|
| 546 |
|
| 547 |
if analysis:
|
| 548 |
paper['gpt_analysis'] = analysis
|
| 549 |
paper['relevance_reason'] = analysis.get('relevance_reason', 'Analysis completed')
|
| 550 |
paper['relevance_score'] = analysis.get('relevant', False)
|
| 551 |
+
print(f"DEBUG: Paper marked as relevant: {analysis.get('relevant', False)}")
|
| 552 |
return paper
|
| 553 |
|
| 554 |
+
print("DEBUG: No analysis returned, skipping paper")
|
| 555 |
return None
|
| 556 |
|
| 557 |
except Exception as e:
|
| 558 |
+
print(f"DEBUG: Exception in analyze_single_paper: {e}")
|
| 559 |
return None
|
| 560 |
|
| 561 |
def analyze_paper_batch(papers_batch: List[Dict], research_question: str, api_key: str, batch_id: int) -> List[Dict]:
|
| 562 |
"""Analyze a batch of papers in parallel using ThreadPoolExecutor."""
|
| 563 |
+
print(f"DEBUG: Starting analyze_paper_batch with {len(papers_batch)} papers, batch_id: {batch_id}")
|
| 564 |
+
print(f"DEBUG: API key length: {len(api_key) if api_key else 0}")
|
| 565 |
results = []
|
| 566 |
|
| 567 |
# Use ThreadPoolExecutor to process papers in parallel within the batch
|
| 568 |
with concurrent.futures.ThreadPoolExecutor(max_workers=len(papers_batch)) as executor:
|
| 569 |
+
print(f"DEBUG: Created ThreadPoolExecutor with {len(papers_batch)} workers")
|
| 570 |
# Submit all papers for parallel processing
|
| 571 |
future_to_paper = {
|
| 572 |
executor.submit(analyze_single_paper, paper, research_question, api_key): paper
|
| 573 |
for paper in papers_batch
|
| 574 |
}
|
| 575 |
+
print(f"DEBUG: Submitted {len(future_to_paper)} papers for parallel processing")
|
| 576 |
|
| 577 |
# Collect results as they complete
|
| 578 |
for future in concurrent.futures.as_completed(future_to_paper):
|
|
|
|
| 616 |
"""
|
| 617 |
|
| 618 |
try:
|
| 619 |
+
print(f"DEBUG: Making API call for paper: {title[:30]}...")
|
| 620 |
# Try GPT-5 nano first, fallback to gpt-4o-mini if it fails
|
| 621 |
try:
|
| 622 |
+
print("DEBUG: Trying GPT-5 nano...")
|
| 623 |
+
response = client.chat.completions.create(
|
| 624 |
model="gpt-5-nano",
|
| 625 |
+
messages=[{"role": "user", "content": prompt}],
|
| 626 |
+
max_tokens=1000,
|
| 627 |
+
temperature=0.1
|
| 628 |
)
|
| 629 |
+
print("DEBUG: GPT-5 nano response received")
|
| 630 |
except Exception as e:
|
| 631 |
+
print(f"DEBUG: GPT-5 nano failed: {e}, trying gpt-4o-mini...")
|
| 632 |
response = client.chat.completions.create(
|
| 633 |
model="gpt-4o-mini",
|
| 634 |
messages=[{
|
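The try-primary-then-fallback pattern used for the two model calls can be factored into a small helper. The sketch below is generic and deliberately does not call the OpenAI client; the name `first_successful` and the labelled-callable interface are assumptions made for illustration.

```python
from typing import Callable, Optional, Sequence, Tuple, TypeVar

T = TypeVar("T")


def first_successful(attempts: Sequence[Tuple[str, Callable[[], T]]]) -> Tuple[str, T]:
    """Run labelled attempts in order and return the first that does not raise."""
    last_error: Optional[Exception] = None
    for label, attempt in attempts:
        try:
            return label, attempt()
        except Exception as exc:
            # Log the failure so a silent fallback is still visible.
            print(f"DEBUG: {label} failed: {exc}")
            last_error = exc
    raise RuntimeError("all attempts failed") from last_error
```

In the app, each attempt would presumably wrap one `client.chat.completions.create(...)` call. Returning the label alongside the response records which model actually answered, which matters if the primary call is rejected on every request, for example because of an unsupported parameter.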
|
|
|
| 637 |
}],
|
| 638 |
max_completion_tokens=1000
|
| 639 |
)
|
| 640 |
+
print("DEBUG: gpt-4o-mini response received")
|
| 641 |
|
| 642 |
# Handle different response formats
|
| 643 |
result = None
|
| 644 |
+
print(f"DEBUG: Response type: {type(response)}")
|
| 645 |
+
print(f"DEBUG: Response attributes: {dir(response)}")
|
| 646 |
+
|
| 647 |
if hasattr(response, 'choices') and response.choices:
|
| 648 |
+
# Standard chat completions format
|
| 649 |
+
print("DEBUG: Using chat completions format")
|
| 650 |
result = response.choices[0].message.content
|
| 651 |
+
print(f"DEBUG: Chat completions result: {result[:100]}...")
|
| 652 |
else:
|
| 653 |
+
print("DEBUG: Unknown response format, returning None")
|
| 654 |
return None
|
| 655 |
|
| 656 |
if not result:
|
| 657 |
+
print("DEBUG: No result extracted, returning None")
|
| 658 |
return None
|
| 659 |
|
| 660 |
# Clean and parse the JSON response
|
| 661 |
result = result.strip()
|
| 662 |
+
print(f"DEBUG: Raw result: {result[:200]}...")
|
| 663 |
+
|
| 664 |
if result.startswith("```json"):
|
| 665 |
result = result[7:]
|
| 666 |
if result.endswith("```"):
|
| 667 |
result = result[:-3]
|
| 668 |
|
| 669 |
+
print(f"DEBUG: Cleaned result: {result[:200]}...")
|
| 670 |
+
|
| 671 |
# Try to parse JSON
|
| 672 |
try:
|
| 673 |
+
parsed = json.loads(result.strip())
|
| 674 |
+
print(f"DEBUG: Successfully parsed JSON: {parsed}")
|
| 675 |
+
return parsed
|
| 676 |
+
except json.JSONDecodeError as e:
|
| 677 |
+
print(f"DEBUG: JSON parsing failed: {e}")
|
| 678 |
+
print(f"DEBUG: Failed to parse: {result}")
|
| 679 |
return None
|
| 680 |
|
| 681 |
except Exception as e:
|
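The cleanup step above strips a leading "```json" marker and a trailing fence before parsing. A reply that opens with a bare fence, with no language tag, would still fail to parse. A slightly more tolerant sketch follows; the name `parse_model_json` is hypothetical.

```python
import json
from typing import Any, Optional

FENCE = "`" * 3  # a markdown code fence, built this way to avoid nesting fences


def parse_model_json(raw: Optional[str]) -> Optional[Any]:
    """Parse a model reply as JSON, with or without a surrounding code fence."""
    if not raw:
        return None
    text = raw.strip()
    if text.startswith(FENCE):
        text = text[len(FENCE):]
        # Drop an optional language tag right after the opening fence.
        if text.lower().startswith("json"):
            text = text[4:]
    if text.endswith(FENCE):
        text = text[:-len(FENCE)]
    try:
        return json.loads(text.strip())
    except json.JSONDecodeError:
        return None
```

Returning `None` on failure keeps the same contract as `analyze_paper_relevance_with_client`, so callers can keep treating a missing analysis as "skip this paper".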
|
|
|
| 683 |
|
| 684 |
def filter_papers_for_research_question(papers: List[Dict], research_question: str, api_key: str, limit: int = 10) -> List[Dict]:
|
| 685 |
"""Analyze exactly 'limit' number of papers for relevance using parallel processing."""
|
| 686 |
+
print(f"DEBUG: filter_papers_for_research_question called with {len(papers) if papers else 0} papers")
|
| 687 |
+
print(f"DEBUG: Research question: {research_question}")
|
| 688 |
+
print(f"DEBUG: Limit: {limit}")
|
| 689 |
+
print(f"DEBUG: API key length: {len(api_key) if api_key else 0}")
|
| 690 |
+
|
| 691 |
if not papers or not research_question:
|
| 692 |
+
print("DEBUG: No papers or research question provided")
|
| 693 |
return []
|
| 694 |
|
| 695 |
if not api_key:
|
|
|
|
| 705 |
|
| 706 |
# Process all papers in parallel (no batching needed for small numbers)
|
| 707 |
all_results = []
|
| 708 |
+
print(f"DEBUG: Processing {len(papers_to_analyze)} papers in parallel")
|
| 709 |
|
| 710 |
with concurrent.futures.ThreadPoolExecutor(max_workers=min(limit, 20)) as executor:
|
| 711 |
+
print(f"DEBUG: Created ThreadPoolExecutor with max_workers={min(limit, 20)}")
|
| 712 |
# Submit all papers for parallel processing
|
| 713 |
future_to_paper = {
|
| 714 |
executor.submit(analyze_single_paper, paper, research_question, api_key): paper
|
| 715 |
for paper in papers_to_analyze
|
| 716 |
}
|
| 717 |
+
print(f"DEBUG: Submitted {len(future_to_paper)} papers for parallel processing")
|
| 718 |
|
| 719 |
# Collect results as they complete
|
| 720 |
for future in concurrent.futures.as_completed(future_to_paper):
|
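`ThreadPoolExecutor` raises `ValueError` when `max_workers` is zero or negative, so `max_workers=len(papers_batch)` fails on an empty batch and `min(limit, 20)` fails if `limit` is 0. The submit-then-`as_completed` pattern can be wrapped with a guard, as in this sketch. The helper name `map_in_threads` and the default cap of 20 workers are assumptions.

```python
import concurrent.futures
from typing import Callable, Iterable, List, Optional, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def map_in_threads(
    fn: Callable[[T], Optional[R]],
    items: Iterable[T],
    max_workers: int = 20,
) -> List[R]:
    """Apply fn to each item in a thread pool and keep the non-None results."""
    items = list(items)
    if not items:
        return []  # nothing to do; also avoids max_workers=0
    workers = max(1, min(len(items), max_workers))
    results: List[R] = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as executor:
        futures = {executor.submit(fn, item): item for item in items}
        # Results arrive in completion order, not submission order.
        for future in concurrent.futures.as_completed(futures):
            try:
                value = future.result()
            except Exception as exc:
                print(f"DEBUG: worker failed for {futures[future]!r}: {exc}")
                continue
            if value is not None:
                results.append(value)
    return results
```

Because completion order is not deterministic, any "most recent first" ordering has to be applied to the collected results afterwards.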
|
|
|
| 1676 |
existing_paper['contributing_seeds'].append(i + 1)
|
| 1677 |
break
|
| 1678 |
|
| 1679 |
+
# Get detailed seed paper information
|
| 1680 |
+
seed_paper_details = get_paper_details(work_id)
|
| 1681 |
+
|
| 1682 |
seed_results.append({
|
| 1683 |
'work_id': work_id,
|
| 1684 |
'title': seed_title,
|
| 1685 |
+
'authors': seed_paper_details.get('authors', []),
|
| 1686 |
+
'year': seed_paper_details.get('publication_date', '').split('-')[0] if seed_paper_details.get('publication_date') else '',
|
| 1687 |
+
'venue': seed_paper_details.get('venue', ''),
|
| 1688 |
'papers_found': new_papers,
|
| 1689 |
'total_papers_from_seed': len(papers),
|
| 1690 |
'cited_papers': cited_count,
|
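Reviewer note: the added lines call `.get()` directly on the return value of `get_paper_details(work_id)`. The diff does not show whether that function can return `None`. If it can, the lookup would raise `AttributeError`. Here is a small defensive sketch; `extract_seed_metadata` is a hypothetical helper name and the field names are taken from the hunk above.

```python
from typing import Any, Dict, Optional


def extract_seed_metadata(details: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Return authors, year, and venue for a seed paper, tolerating missing data."""
    details = details or {}  # treat None (or an empty dict) as "no details available"
    publication_date = details.get("publication_date") or ""
    # Dates are assumed to look like YYYY-MM-DD, so the year is the first segment.
    year = publication_date.split("-")[0] if publication_date else ""
    return {
        "authors": details.get("authors") or [],
        "year": year,
        "venue": details.get("venue") or "",
    }
```

With this helper, the three new keys could be merged into the `seed_results` entry via `**extract_seed_metadata(seed_paper_details)`.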
|
|
|
| 1745 |
with open(temp_path, 'w', encoding='utf-8') as f:
|
| 1746 |
json.dump(all_papers, f, indent=2, ensure_ascii=False)
|
| 1747 |
|
| 1748 |
+
# Create display title based on seed count
|
| 1749 |
+
if total_seeds == 1:
|
| 1750 |
+
display_title = seed_results[0]['title'] if seed_results else "Single Seed Collection"
|
| 1751 |
+
elif total_seeds == 2:
|
| 1752 |
+
display_title = f"{seed_results[0]['title']} & {seed_results[1]['title']}"
|
| 1753 |
+
else:
|
| 1754 |
+
# Show first 2 + count for multiple seeds
|
| 1755 |
+
first_two = f"{seed_results[0]['title']}, {seed_results[1]['title']}"
|
| 1756 |
+
remaining = total_seeds - 2
|
| 1757 |
+
display_title = f"{first_two} + {remaining} others"
|
| 1758 |
+
|
| 1759 |
# Create combined collection data
|
| 1760 |
combined_title = f"Multi-Seed Collection ({total_seeds} seeds)"
|
| 1761 |
collection_data = {
|
| 1762 |
'work_id': f"multi_seed_{task_id}",
|
| 1763 |
'title': combined_title,
|
| 1764 |
+
'display_title': display_title, # Add display title for immediate use
|
| 1765 |
'total_papers': len(all_papers),
|
| 1766 |
'cited_papers': final_cited_count,
|
| 1767 |
'citing_papers': final_citing_count,
|
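Reviewer note: the display-title branch indexes `seed_results[0]` and `seed_results[1]` based on `total_seeds`. If a seed ever fails before its entry is appended (an assumption, since the surrounding loop is not shown), `seed_results` could be shorter than `total_seeds` and the lookup would raise `IndexError`. The sketch below derives the title from the titles actually present. `build_display_title` and its `fallback` value are illustrative names, and unlike the diff it writes "1 other" rather than "1 others".

```python
from typing import Iterable


def build_display_title(seed_titles: Iterable[str], fallback: str = "Multi-Seed Collection") -> str:
    """Build a short human-readable title from the seed paper titles that are available."""
    titles = [t for t in seed_titles if t]
    if not titles:
        return fallback
    if len(titles) == 1:
        return titles[0]
    if len(titles) == 2:
        return f"{titles[0]} & {titles[1]}"
    # Show the first two titles, then a count of the rest.
    remaining = len(titles) - 2
    noun = "other" if remaining == 1 else "others"
    return f"{titles[0]}, {titles[1]} + {remaining} {noun}"
```

A call site could pass `[s['title'] for s in seed_results]`, which keeps the title consistent with the seeds that were actually processed.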
|
|
|
| 1770 |
'papers': all_papers,
|
| 1771 |
'seed_results': seed_results,
|
| 1772 |
'total_seeds': total_seeds,
|
| 1773 |
+
'collection_type': 'multiseed',
|
| 1774 |
'deduplication_stats': {
|
| 1775 |
'total_papers_before_dedup': total_papers_before_dedup,
|
| 1776 |
'duplicates_removed': duplicates_removed,
|
|
|
|
| 1935 |
def filter_papers():
|
| 1936 |
"""Filter papers based on research question."""
|
| 1937 |
try:
|
| 1938 |
+
print("DEBUG: Starting filter_papers endpoint")
|
| 1939 |
data = request.get_json()
|
| 1940 |
research_question = data.get('research_question', '').strip()
|
| 1941 |
limit = data.get('limit', 10) # Default to 10 most recent relevant papers
|
|
|
|
| 1943 |
papers_data = data.get('papers') # Papers passed directly from frontend
|
| 1944 |
user_api_key = data.get('user_api_key') # User's own API key for large analyses
|
| 1945 |
|
| 1946 |
+
print(f"DEBUG: Research question: {research_question}")
|
| 1947 |
+
print(f"DEBUG: Limit: {limit}")
|
| 1948 |
+
print(f"DEBUG: User API key provided: {bool(user_api_key)}")
|
| 1949 |
+
print(f"DEBUG: Papers data provided: {bool(papers_data)}")
|
| 1950 |
+
|
| 1951 |
if not research_question:
|
| 1952 |
return jsonify({'error': 'Research question is required'}), 400
|
| 1953 |
|
|
|
|
| 1969 |
|
| 1970 |
# Use user's API key if provided, otherwise use default
|
| 1971 |
api_key_to_use = user_api_key if user_api_key else OPENAI_API_KEY
|
| 1972 |
+
print(f"DEBUG: Using API key length: {len(api_key_to_use) if api_key_to_use else 0}")
|
| 1973 |
+
print(f"DEBUG: API key source: {'user provided' if user_api_key else 'default'}")
|
| 1974 |
|
| 1975 |
if not api_key_to_use:
|
| 1976 |
return jsonify({
|
|
|
|
| 1978 |
}), 400
|
| 1979 |
|
| 1980 |
# Filter papers using custom analyzer (returns top N most recent relevant papers)
|
| 1981 |
+
print(f"DEBUG: About to call filter_papers_for_research_question with {len(papers)} papers")
|
| 1982 |
+
print(f"DEBUG: Research question: {research_question}")
|
| 1983 |
+
print(f"DEBUG: Limit: {limit}")
|
| 1984 |
+
print(f"DEBUG: API key length: {len(api_key_to_use) if api_key_to_use else 0}")
|
| 1985 |
+
|
| 1986 |
relevant_papers = filter_papers_for_research_question(papers, research_question, api_key_to_use, limit)
|
| 1987 |
|
| 1988 |
+
print(f"DEBUG: filter_papers_for_research_question returned {len(relevant_papers) if relevant_papers else 0} papers")
|
| 1989 |
+
|
| 1990 |
# Determine source collection id for linkage
|
| 1991 |
source_collection_id = None
|
| 1992 |
if provided_source_collection:
|
database/collections/multi_seed_multi_seed_1758155685.json
ADDED
|
@@ -0,0 +1,3 @@
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:e55986181ab7bec02034998c2da445afc5bd113632cf23b6dcd4a8511e1482b9
|
| 3 |
+
size 47616821
|
database/collections/multi_seed_multi_seed_1758155809.json
ADDED
|
@@ -0,0 +1,3 @@
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:5c532804aeb3b99f06b7cd3650330780905bcc46c4c1406881c1738a56b079ea
|
| 3 |
+
size 57878366
|
database/collections/multi_seed_multi_seed_1758156063.json
ADDED
|
@@ -0,0 +1,3 @@
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:7b2a24846b5af508f38a1937c914161729f4f6184cea41f836d804d9219c54a6
|
| 3 |
+
size 12273368
|
database/collections/multi_seed_multi_seed_1758156344.json
ADDED
|
The diff for this file is too large to render.
See raw diff
|
|
|
database/collections/multi_seed_multi_seed_1758156664.json
ADDED
|
The diff for this file is too large to render.
See raw diff
|
|
|
database/filters/multi_seed_multi_seed_1758155809__filter__What_are_the_key_aspects_of_just_transit__20250918_122330.json
ADDED
|
The diff for this file is too large to render.
See raw diff
|
|
|
database/filters/multi_seed_multi_seed_1758155809__filter__What_are_the_key_aspects_of_just_transit__20250918_123000.json
ADDED
|
The diff for this file is too large to render.
See raw diff
|
|
|
database/filters/multi_seed_multi_seed_1758156664__filter__What_are_the_key_aspects_of_just_transit__20250918_025849.json
ADDED
|
@@ -0,0 +1,13 @@
|
|
| 1 |
+
{
|
| 2 |
+
"research_question": "What are the key aspects of just transitions in climate policy and energy systems?",
|
| 3 |
+
"total_papers": 68,
|
| 4 |
+
"tested_papers": 10,
|
| 5 |
+
"relevant_papers": 0,
|
| 6 |
+
"oa_percentage": 43,
|
| 7 |
+
"abstract_percentage": 62,
|
| 8 |
+
"limit": 10,
|
| 9 |
+
"papers": [],
|
| 10 |
+
"source_collection": "multi_seed_multi_seed_1758156664",
|
| 11 |
+
"filter_identifier": "multi_seed_multi_seed_1758156664__filter__What_are_the_key_aspects_of_just_transit__20250918_025849",
|
| 12 |
+
"created": "2025-09-18T02:58:49.151898"
|
| 13 |
+
}
|
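Reviewer note: in the files added here, `filter_identifier` appears to follow the pattern `<source_collection>__filter__<question slug>__<YYYYMMDD_HHMMSS>`, with the slug cut at 40 characters. That layout is inferred from these examples, not from documented behavior. A minimal parser under that assumption (`FilterId` and `parse_filter_identifier` are hypothetical names):

```python
from datetime import datetime
from typing import NamedTuple


class FilterId(NamedTuple):
    source_collection: str
    question_slug: str
    created: datetime


def parse_filter_identifier(identifier: str) -> FilterId:
    """Split a filter identifier into its source collection, question slug, and timestamp."""
    source, marker, rest = identifier.partition("__filter__")
    if not marker:
        raise ValueError(f"missing '__filter__' marker: {identifier!r}")
    # The timestamp is the last '__'-separated field; the slug may itself contain underscores.
    slug, sep, stamp = rest.rpartition("__")
    if not sep:
        raise ValueError(f"missing timestamp suffix: {identifier!r}")
    return FilterId(source, slug, datetime.strptime(stamp, "%Y%m%d_%H%M%S"))
```

Parsing the identifier this way makes it possible to check that it agrees with the `source_collection` and `created` fields stored in the same file.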
database/filters/multi_seed_multi_seed_1758156926__filter__What_are_the_key_aspects_of_just_transit__20250918_120128.json
ADDED
|
@@ -0,0 +1,13 @@
|
|
| 1 |
+
{
|
| 2 |
+
"research_question": "What are the key aspects of just transitions in climate policy and energy systems?",
|
| 3 |
+
"total_papers": 14,
|
| 4 |
+
"tested_papers": 10,
|
| 5 |
+
"relevant_papers": 0,
|
| 6 |
+
"oa_percentage": 43,
|
| 7 |
+
"abstract_percentage": 71,
|
| 8 |
+
"limit": 10,
|
| 9 |
+
"papers": [],
|
| 10 |
+
"source_collection": "multi_seed_multi_seed_1758156926",
|
| 11 |
+
"filter_identifier": "multi_seed_multi_seed_1758156926__filter__What_are_the_key_aspects_of_just_transit__20250918_120128",
|
| 12 |
+
"created": "2025-09-18T12:01:28.210811"
|
| 13 |
+
}
|
database/filters/multi_seed_multi_seed_1758157170__filter__What_are_the_key_aspects_of_just_transit__20250918_115925.json
ADDED
|
@@ -0,0 +1,13 @@
|
|
| 1 |
+
{
|
| 2 |
+
"research_question": "What are the key aspects of just transitions in climate policy and energy systems?",
|
| 3 |
+
"total_papers": 346,
|
| 4 |
+
"tested_papers": 10,
|
| 5 |
+
"relevant_papers": 0,
|
| 6 |
+
"oa_percentage": 58,
|
| 7 |
+
"abstract_percentage": 64,
|
| 8 |
+
"limit": 10,
|
| 9 |
+
"papers": [],
|
| 10 |
+
"source_collection": "multi_seed_multi_seed_1758157170",
|
| 11 |
+
"filter_identifier": "multi_seed_multi_seed_1758157170__filter__What_are_the_key_aspects_of_just_transit__20250918_115925",
|
| 12 |
+
"created": "2025-09-18T11:59:25.053828"
|
| 13 |
+
}
|
database/filters/multi_seed_multi_seed_1758157170__filter__What_are_the_key_aspects_of_just_transit__20250918_122404.json
ADDED
|
The diff for this file is too large to render.
See raw diff
|
|
|
database/filters/multi_seed_multi_seed_1758157170__filter__What_are_the_key_aspects_of_just_transit__20250918_122747.json
ADDED
|
The diff for this file is too large to render.
See raw diff
|
|
|
database/filters/multi_seed_multi_seed_1758157170__filter__What_are_the_key_aspects_of_just_transit__20250918_122824.json
ADDED
|
The diff for this file is too large to render.
See raw diff
|
|
|
requirements.txt
CHANGED
|
@@ -3,6 +3,7 @@ Flask-CORS==4.0.1
|
|
| 3 |
gunicorn==21.2.0
|
| 4 |
requests==2.32.3
|
| 5 |
openai==1.54.4
|
|
|
|
| 6 |
pandas==2.2.3
|
| 7 |
tqdm==4.66.5
|
| 8 |
openpyxl==3.1.5
|
|
|
|
| 3 |
gunicorn==21.2.0
|
| 4 |
requests==2.32.3
|
| 5 |
openai==1.54.4
|
| 6 |
+
httpx==0.27.2
|
| 7 |
pandas==2.2.3
|
| 8 |
tqdm==4.66.5
|
| 9 |
openpyxl==3.1.5
|
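Reviewer note: the diff does not say why `httpx==0.27.2` was added. One known reason to hold `httpx` below 0.28 alongside an `openai` client of this vintage is that httpx 0.28 removed the `proxies` argument that older `openai` releases still passed. Whether that motivated this change is an assumption. The sketch below reads `name==version` pins and reports packages whose installed version differs. `parse_pins`, `version_tuple`, and `find_mismatches` are hypothetical helper names.

```python
from importlib import metadata
from typing import Dict, Optional, Tuple


def parse_pins(requirements_text: str) -> Dict[str, str]:
    """Collect exact pins (name==version) from requirements-style text."""
    pins: Dict[str, str] = {}
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def version_tuple(version: str) -> Tuple[int, ...]:
    """Convert a plain dotted version such as '0.27.2' into a comparable tuple."""
    return tuple(int(part) for part in version.split(".") if part.isdigit())


def find_mismatches(pins: Dict[str, str]) -> Dict[str, Tuple[str, Optional[str]]]:
    """Map each package whose installed version differs from its pin to (wanted, installed)."""
    mismatches: Dict[str, Tuple[str, Optional[str]]] = {}
    for name, wanted in pins.items():
        try:
            installed: Optional[str] = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches
```

Running `find_mismatches(parse_pins(open("requirements.txt").read()))` at startup would surface an environment that has drifted from the pinned versions.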
templates/index.html
CHANGED
|
@@ -24,12 +24,12 @@
|
|
| 24 |
|
| 25 |
.main-content {
|
| 26 |
flex: 1;
|
| 27 |
-
max-width:
|
| 28 |
margin-right: 15px;
|
| 29 |
}
|
| 30 |
|
| 31 |
.history-panel {
|
| 32 |
-
width:
|
| 33 |
border: 3px solid #ffffff;
|
| 34 |
padding: 15px;
|
| 35 |
background: #000000;
|
|
@@ -40,7 +40,7 @@
|
|
| 40 |
}
|
| 41 |
|
| 42 |
.merge-panel {
|
| 43 |
-
width:
|
| 44 |
border: 3px solid #ffffff;
|
| 45 |
padding: 15px;
|
| 46 |
background: #000000;
|
|
@@ -90,7 +90,7 @@
|
|
| 90 |
}
|
| 91 |
|
| 92 |
.filters-panel {
|
| 93 |
-
width:
|
| 94 |
border: 3px solid #ffffff;
|
| 95 |
padding: 15px;
|
| 96 |
background: #000000;
|
|
@@ -514,7 +514,7 @@
|
|
| 514 |
|
| 515 |
<div id="titleInput">
|
| 516 |
<p>Enter a paper title to search for and collect related papers.</p>
|
| 517 |
-
<input type="text" id="paperTitle" placeholder="Enter paper title..." value="
|
| 518 |
<button onclick="searchPapers()" id="searchBtn" style="margin-left: 10px;">Search Papers</button>
|
| 519 |
<div id="paperMatches" style="display: none; margin-top: 15px;"></div>
|
| 520 |
|
|
@@ -527,10 +527,18 @@
|
|
| 527 |
</h4>
|
| 528 |
<div id="selectedSeeds" style="min-height: 60px; border: 1px dashed #666666; padding: 10px; background: #000000;">
|
| 529 |
<div style="color: #666666; text-align: center; font-size: 0.8em;">No seed papers selected. Search and click papers to add them.</div>
|
| 530 |
-
|
| 531 |
<div style="margin-top: 10px;">
|
| 532 |
<button onclick="clearAllSeeds()" style="background: #333333; color: #ffffff; border: 1px solid #666666; padding: 5px 10px; font-size: 10px; margin-right: 10px;">Clear All</button>
|
| 533 |
<span style="font-size: 0.8em; color: #aaaaaa;">Click papers above to add/remove them from collection</span>
|
| 534 |
</div>
|
| 535 |
</div>
|
| 536 |
</div>
|
|
@@ -542,9 +550,36 @@
|
|
| 542 |
</div>
|
| 543 |
</div>
|
| 544 |
|
| 545 |
-
<!-- Step 2: Filter Papers -->
|
| 546 |
<div class="section">
|
| 547 |
-
<h2>Step 2: Filter
|
| 548 |
<p>Enter your research question to filter the collected papers for relevance.</p>
|
| 549 |
<textarea id="researchQuestion" rows="3" placeholder="What are the main impacts of climate change on ocean circulation patterns?">What are the key aspects of just transitions in climate policy and energy systems?</textarea>
|
| 550 |
<div style="margin: 10px 0;">
|
|
@@ -607,17 +642,15 @@
|
|
| 607 |
let collectedPapers = [];
|
| 608 |
let lastDisplayedPapers = [];
|
| 609 |
let selectedSeeds = []; // Array to store multiple selected seed papers
|
| 610 |
|
| 611 |
// Set default values when page loads
|
| 612 |
document.addEventListener('DOMContentLoaded', function() {
|
| 613 |
document.getElementById('researchQuestion').value = 'What are the key aspects of just transitions in climate policy and energy systems?';
|
| 614 |
loadHistory();
|
| 615 |
|
| 616 |
-
//
|
| 617 |
-
const paperTitle = document.getElementById('paperTitle').value.trim();
|
| 618 |
-
if (paperTitle) {
|
| 619 |
-
searchPapers();
|
| 620 |
-
}
|
| 621 |
});
|
| 622 |
|
| 623 |
let currentCollectionFile = null;
|
|
@@ -674,20 +707,20 @@
|
|
| 674 |
function displayPaperMatches(matches) {
|
| 675 |
const matchesDiv = document.getElementById('paperMatches');
|
| 676 |
matchesDiv.innerHTML = `
|
| 677 |
-
<h4 style="color: #ffffff; margin-bottom:
|
| 678 |
${matches.map((match, index) => `
|
| 679 |
<div class="paper-match" data-work-id="${match.work_id}" onclick="selectPaper('${match.work_id}', this)" style="
|
| 680 |
border: 2px solid #ffffff;
|
| 681 |
-
padding:
|
| 682 |
-
margin-bottom:
|
| 683 |
cursor: pointer;
|
| 684 |
background: #000000;
|
| 685 |
transition: all 0.2s ease;
|
| 686 |
">
|
| 687 |
-
<div style="font-weight: bold; color: #ffffff; margin-bottom:
|
| 688 |
-
<div style="font-size: 0.
|
| 689 |
-
<div style="font-size: 0.
|
| 690 |
-
<div style="font-size: 0.
|
| 691 |
</div>
|
| 692 |
`).join('')}
|
| 693 |
`;
|
|
@@ -718,9 +751,10 @@
|
|
| 718 |
venue: element.querySelectorAll('div')[2].textContent.split(' | ')[1].replace('Venue: ', '')
|
| 719 |
};
|
| 720 |
selectedSeeds.push(paperData);
|
| 721 |
-
element.style.background = '#
|
| 722 |
-
element.style.color = '#
|
| 723 |
element.style.borderColor = '#ffffff';
|
|
|
|
| 724 |
}
|
| 725 |
|
| 726 |
updateSeedCollectionDisplay();
|
|
@@ -736,15 +770,31 @@
|
|
| 736 |
if (selectedSeeds.length === 0) {
|
| 737 |
selectedSeedsDiv.innerHTML = '<div style="color: #666666; text-align: center; font-size: 0.8em;">No seed papers selected. Search and click papers to add them.</div>';
|
| 738 |
} else {
|
| 739 |
-
selectedSeedsDiv.innerHTML = selectedSeeds.map((seed, index) =>
|
| 740 |
-
|
| 741 |
-
|
| 742 |
-
|
| 743 |
-
|
| 744 |
</div>
|
| 745 |
-
|
| 746 |
-
|
| 747 |
-
`).join('');
|
| 748 |
}
|
| 749 |
}
|
| 750 |
|
|
@@ -757,8 +807,17 @@
|
|
| 757 |
document.querySelectorAll('.paper-match').forEach(match => {
|
| 758 |
const workId = match.getAttribute('data-work-id');
|
| 759 |
const isSelected = selectedSeeds.some(seed => seed.work_id === workId);
|
| 760 |
-
|
| 761 |
-
|
| 762 |
});
|
| 763 |
}
|
| 764 |
|
|
@@ -783,16 +842,146 @@
|
|
| 783 |
alert('SEED PAPERS INFO:\n\n• Search for papers by title\n• Click papers to add them to your seed collection\n• Each selected paper will be used to find related papers (cited, citing, and related works)\n• You can select multiple seed papers for a comprehensive collection\n• Click the × button to remove papers from your selection');
|
| 784 |
}
|
| 785 |
|
| 786 |
-
|
| 787 |
-
|
| 788 |
|
| 789 |
-
|
| 790 |
-
|
| 791 |
return;
|
| 792 |
}
|
| 793 |
if (selectedSeeds.length === 0) {
|
| 794 |
showStatus('collectStatus', 'Please search and select at least one paper first', 'error');
|
| 795 |
-
|
| 796 |
}
|
| 797 |
|
| 798 |
const collectBtn = document.getElementById('collectBtn');
|
|
@@ -871,7 +1060,7 @@
|
|
| 871 |
|
| 872 |
// Show appropriate completion message
|
| 873 |
if (result && result.total_seeds && result.total_seeds > 1) {
|
| 874 |
-
progressMessage.textContent = `Multi-Seed
|
| 875 |
} else {
|
| 876 |
progressMessage.textContent = 'Collection completed!';
|
| 877 |
}
|
|
@@ -886,7 +1075,7 @@
|
|
| 886 |
let breakdown = `${result.cited_papers} cited + ${result.citing_papers} citing + ${result.related_papers} related`;
|
| 887 |
|
| 888 |
if (result.total_seeds && result.total_seeds > 1) {
|
| 889 |
-
collectionType = 'Multi-Seed
|
| 890 |
const dedupStats = result.deduplication_stats;
|
| 891 |
if (dedupStats) {
|
| 892 |
breakdown += ` (${dedupStats.duplicates_removed} duplicates removed)`;
|
|
@@ -895,12 +1084,26 @@
|
|
| 895 |
|
| 896 |
showStatus('collectStatus', `Successfully completed ${collectionType} - ${result.total_papers} papers (${breakdown})`, 'success');
|
| 897 |
document.getElementById('filterBtn').disabled = false;
|
|
|
|
| 898 |
document.getElementById('resultsSection').style.display = 'block';
|
| 899 |
updateStats(result.total_papers, 0, result.cited_papers, result.citing_papers, result.related_papers);
|
| 900 |
currentCollectionFile = result.db_filename || null;
|
| 901 |
historyIndex.currentCollectionId = result.work_id ? (result.work_id.replace('https://api.openalex.org/works/','').replace('https://openalex.org/','')) : null;
|
| 902 |
document.getElementById('collectDownload').style.display = currentCollectionFile ? 'block' : 'none';
|
| 903 |
| 904 |
// Reset button
|
| 905 |
document.getElementById('collectBtn').disabled = false;
|
| 906 |
document.getElementById('collectBtn').textContent = 'Collect Papers';
|
|
@@ -929,14 +1132,14 @@
|
|
| 929 |
progressFill.style.width = `${progressPercent}%`;
|
| 930 |
progressText.textContent = `${Math.round(progressPercent)}%`;
|
| 931 |
|
| 932 |
-
// Show detailed progress for
|
| 933 |
if (progress.total_seeds && progress.total_seeds > 1) {
|
| 934 |
const currentSeed = progress.current_seed || 0;
|
| 935 |
const remainingSeeds = progress.remaining_seeds || 0;
|
| 936 |
const totalSeeds = progress.total_seeds || 0;
|
| 937 |
progressMessage.textContent = `${progress.message || 'Processing...'} (Seed ${currentSeed}/${totalSeeds}, ${remainingSeeds} remaining)`;
|
| 938 |
} else {
|
| 939 |
-
|
| 940 |
}
|
| 941 |
}
|
| 942 |
} catch (error) {
|
|
@@ -954,6 +1157,14 @@
|
|
| 954 |
return;
|
| 955 |
}
|
| 956 |
|
| 957 |
// Check if user wants to analyze more than 50 papers
|
| 958 |
if (paperLimit > 50) {
|
| 959 |
const userApiKey = prompt(`You want to analyze ${paperLimit} papers, which exceeds the limit of 50.\n\nPlease provide your own OpenAI API key to continue:\n\n(Your API key will be used only for this analysis and not stored)`);
|
|
@@ -995,7 +1206,7 @@
|
|
| 995 |
research_question: researchQuestion,
|
| 996 |
limit: paperLimit,
|
| 997 |
source_collection: historyIndex.currentCollectionId || null,
|
| 998 |
-
papers:
|
| 999 |
user_api_key: window.userApiKey || null
|
| 1000 |
})
|
| 1001 |
});
|
|
@@ -1003,6 +1214,10 @@
|
|
| 1003 |
const data = await response.json();
|
| 1004 |
|
| 1005 |
if (data.success) {
|
| 1006 |
// Simulate progress for filtering (since it's synchronous in backend)
|
| 1007 |
let progress = 0;
|
| 1008 |
const progressInterval = setInterval(() => {
|
|
@@ -1065,29 +1280,33 @@
|
|
| 1065 |
function updateStats(total, relevant, cited = 0, citing = 0, related = 0, relevantAbs = null, totalAbs = null, tested = null, oaPercentage = null, abstractPercentage = null) {
|
| 1066 |
const statsDiv = document.getElementById('stats');
|
| 1067 |
const rate = tested && tested > 0 ? Math.round((relevant / tested) * 100) : 0;
|
| 1068 |
statsDiv.innerHTML = `
|
| 1069 |
<div class="stat-item">
|
| 1070 |
<div class="stat-number">${total}</div>
|
| 1071 |
<div class="stat-label">Total Papers</div>
|
| 1072 |
</div>
|
| 1073 |
<div class="stat-item">
|
| 1074 |
-
<div class="stat-number">${tested || total}</div>
|
| 1075 |
<div class="stat-label">Tested Papers</div>
|
| 1076 |
</div>
|
| 1077 |
<div class="stat-item">
|
| 1078 |
-
<div class="stat-number">${relevant}</div>
|
| 1079 |
<div class="stat-label">Relevant Papers</div>
|
| 1080 |
</div>
|
| 1081 |
<div class="stat-item">
|
| 1082 |
-
<div class="stat-number">${rate
|
| 1083 |
<div class="stat-label">Rel. Rate</div>
|
| 1084 |
</div>
|
| 1085 |
<div class="stat-item">
|
| 1086 |
-
<div class="stat-number">${oaPercentage !== null ? oaPercentage + '%' : 'N/A'}</div>
|
| 1087 |
<div class="stat-label">Open Access</div>
|
| 1088 |
</div>
|
| 1089 |
<div class="stat-item">
|
| 1090 |
-
<div class="stat-number">${abstractPercentage !== null ? abstractPercentage + '%' : 'N/A'}</div>
|
| 1091 |
<div class="stat-label">With Abstract</div>
|
| 1092 |
</div>
|
| 1093 |
`;
|
|
@@ -1294,6 +1513,8 @@
|
|
| 1294 |
if (data.success) {
|
| 1295 |
buildHistoryIndex(data.files);
|
| 1296 |
displayHistory(data.files);
|
| 1297 |
}
|
| 1298 |
} catch (error) {
|
| 1299 |
console.error('Error loading history:', error);
|
|
@@ -1333,12 +1554,43 @@
|
|
| 1333 |
|
| 1334 |
// Display collections
|
| 1335 |
collectionsList.innerHTML = collections.map(collection => {
|
| 1336 |
-
| 1337 |
const linkedFilters = filters.filter(filter => filter.source_collection === collection.work_identifier);
|
| 1338 |
|
| 1339 |
return `
|
| 1340 |
-
<div class="history-item collection-item" data-collection="${collection.work_identifier || ''}" onclick="selectCollection('${collection.filename}', '${collection.work_identifier || ''}', '${
|
| 1341 |
-
<div class="history-title">${
|
| 1342 |
<div class="history-meta">${collection.created}</div>
|
| 1343 |
<div class="history-meta">${(collection.size / 1024).toFixed(1)} KB</div>
|
| 1344 |
<div class="history-meta">${collection.total_papers || 0} PAPER${(collection.total_papers || 0) !== 1 ? 'S' : ''}</div>
|
|
@@ -1354,6 +1606,70 @@
|
|
| 1354 |
}).join('');
|
| 1355 |
}
|
| 1356 |
|
| 1357 |
function selectCollection(filename, workIdentifier, title) {
|
| 1358 |
// Get filters for this collection
|
| 1359 |
const filters = historyIndex.filters[workIdentifier] || [];
|
|
@@ -1397,24 +1713,114 @@
|
|
| 1397 |
}
|
| 1398 |
|
| 1399 |
window.openCollection = async function(filename, workIdentifier) {
|
| 1400 |
try {
|
| 1401 |
const response = await fetch(`/api/load-database-file/${filename}`);
|
| 1402 |
const data = await response.json();
|
| 1403 |
if (data.success) {
|
| 1404 |
const fileData = data.data || {};
|
| 1405 |
const papers = fileData.papers || [];
|
| 1406 |
-
|
| 1407 |
-
|
| 1408 |
-
|
| 1409 |
currentCollectionFile = filename; currentFilterFile = null; historyIndex.currentCollectionId = workIdentifier || (fileData.work_identifier || '');
|
| 1410 |
document.getElementById('collectDownload').style.display = 'block';
|
| 1411 |
document.getElementById('filterDownload').style.display = 'none';
|
| 1412 |
// Enable filter button when opening a collection
|
| 1413 |
document.getElementById('filterBtn').disabled = false;
|
|
|
|
| 1414 |
// Save papers to temp file for filtering
|
| 1415 |
collectedPapers = papers;
|
| 1416 |
}
|
| 1417 |
} catch (error) {
|
| 1418 |
alert(`Error opening collection: ${error.message}`);
|
| 1419 |
}
|
| 1420 |
}
|
|
|
|
| 24 |
|
| 25 |
.main-content {
|
| 26 |
flex: 1;
|
| 27 |
+
max-width: 60%;
|
| 28 |
margin-right: 15px;
|
| 29 |
}
|
| 30 |
|
| 31 |
.history-panel {
|
| 32 |
+
width: 220px;
|
| 33 |
border: 3px solid #ffffff;
|
| 34 |
padding: 15px;
|
| 35 |
background: #000000;
|
|
|
|
| 40 |
}
|
| 41 |
|
| 42 |
.merge-panel {
|
| 43 |
+
width: 220px;
|
| 44 |
border: 3px solid #ffffff;
|
| 45 |
padding: 15px;
|
| 46 |
background: #000000;
|
|
|
|
| 90 |
}
|
| 91 |
|
| 92 |
.filters-panel {
|
| 93 |
+
width: 220px;
|
| 94 |
border: 3px solid #ffffff;
|
| 95 |
padding: 15px;
|
| 96 |
background: #000000;
|
|
|
|
| 514 |
|
| 515 |
<div id="titleInput">
|
| 516 |
<p>Enter a paper title to search for and collect related papers.</p>
|
| 517 |
+
<input type="text" id="paperTitle" placeholder="Enter paper title..." value="" />
|
| 518 |
<button onclick="searchPapers()" id="searchBtn" style="margin-left: 10px;">Search Papers</button>
|
| 519 |
<div id="paperMatches" style="display: none; margin-top: 15px;"></div>
|
| 520 |
|
|
|
|
| 527 |
</h4>
|
| 528 |
<div id="selectedSeeds" style="min-height: 60px; border: 1px dashed #666666; padding: 10px; background: #000000;">
|
| 529 |
<div style="color: #666666; text-align: center; font-size: 0.8em;">No seed papers selected. Search and click papers to add them.</div>
|
| 530 |
+
</div>
|
| 531 |
<div style="margin-top: 10px;">
|
| 532 |
<button onclick="clearAllSeeds()" style="background: #333333; color: #ffffff; border: 1px solid #666666; padding: 5px 10px; font-size: 10px; margin-right: 10px;">Clear All</button>
|
| 533 |
<span style="font-size: 0.8em; color: #aaaaaa;">Click papers above to add/remove them from collection</span>
|
| 534 |
+
</div>
|
| 535 |
+
</div>
|
| 536 |
+
|
| 537 |
+
<!-- Collection Stats Box -->
|
| 538 |
+
<div id="collectionStatsBox" style="display: none; margin-top: 15px; border: 1px solid #666666; padding: 12px; background: #2a2a2a; border-radius: 5px;">
|
| 539 |
+
<h5 style="color: #ffffff; margin: 0 0 8px 0; font-size: 0.85em; font-weight: bold;">COLLECTION STATISTICS</h5>
|
| 540 |
+
<div id="collectionStatsContent" style="display: flex; gap: 15px; flex-wrap: wrap; font-size: 0.8em; color: #aaaaaa;">
|
| 541 |
+
<!-- Stats will be populated here -->
|
| 542 |
</div>
|
| 543 |
</div>
|
| 544 |
</div>
|
|
|
|
| 550 |
</div>
|
| 551 |
</div>
|
| 552 |
|
| 553 |
+
<!-- Step 2: Pre-Filter Papers -->
|
| 554 |
<div class="section">
|
| 555 |
+
<h2>Step 2: Pre-Filter Papers (Optional)</h2>
|
| 556 |
+
<p>Optionally filter papers by publication date and keywords before GPT analysis. Leave blank to analyze all papers.</p>
|
| 557 |
+
|
| 558 |
+
<div style="display: flex; gap: 20px; margin: 15px 0; flex-wrap: wrap;">
|
| 559 |
+
<div style="flex: 1; min-width: 200px;">
|
| 560 |
+
<label style="display: block; margin-bottom: 5px; font-weight: bold;">Publication Date Range:</label>
|
| 561 |
+
<div style="display: flex; gap: 10px; align-items: center;">
|
| 562 |
+
<input type="number" id="startYear" placeholder="Start Year" min="1900" max="2025" style="width: 80px; padding: 5px;">
|
| 563 |
+
<span>to</span>
|
| 564 |
+
<input type="number" id="endYear" placeholder="End Year" min="1900" max="2025" style="width: 80px; padding: 5px;">
|
| 565 |
+
</div>
|
| 566 |
+
</div>
|
| 567 |
+
<div style="flex: 1; min-width: 200px;">
|
| 568 |
+
<label style="display: block; margin-bottom: 5px; font-weight: bold;">Keyword Search:</label>
|
| 569 |
+
<input type="text" id="keywordFilter" placeholder="Enter keywords (comma-separated)" style="width: 100%; padding: 5px;">
|
| 570 |
+
</div>
|
| 571 |
+
</div>
|
| 572 |
+
|
| 573 |
+
<div style="margin: 10px 0;">
|
| 574 |
+
<button onclick="applyPreFilter()" id="preFilterBtn" disabled style="background: #333333; color: #ffffff; border: 1px solid #666666; padding: 8px 15px; margin-right: 10px;">Apply Pre-Filter</button>
|
| 575 |
+
<button onclick="clearPreFilter()" id="clearPreFilterBtn" disabled style="background: #666666; color: #ffffff; border: 1px solid #888888; padding: 8px 15px;">Clear Filter</button>
|
| 576 |
+
<span id="preFilterStatus" style="margin-left: 15px; font-size: 0.9em; color: #aaaaaa;"></span>
|
| 577 |
+
</div>
|
| 578 |
+
</div>
|
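Reviewer note: the new "Pre-Filter Papers" step narrows the collection by publication year and comma-separated keywords before GPT analysis, and leaving the fields blank keeps every paper. The `applyPreFilter()` implementation is not shown in this part of the diff, so the Python sketch below only illustrates the described behavior. The paper fields (`year`, `title`, `abstract`), the any-keyword matching rule, and the decision to drop papers with no usable year when a range is set are all assumptions.

```python
from typing import Dict, List, Optional


def pre_filter(papers: List[Dict], start_year: Optional[int] = None,
               end_year: Optional[int] = None, keywords: str = "") -> List[Dict]:
    """Keep papers inside the year range that mention at least one keyword."""
    terms = [k.strip().lower() for k in keywords.split(",") if k.strip()]
    year_filter_active = start_year is not None or end_year is not None
    kept: List[Dict] = []
    for paper in papers:
        # Years may arrive as int, str, or be missing entirely.
        try:
            year = int(paper.get("year"))
        except (TypeError, ValueError):
            year = None
        if year_filter_active:
            if year is None:
                continue  # cannot place the paper in the range
            if start_year is not None and year < start_year:
                continue
            if end_year is not None and year > end_year:
                continue
        text = f"{paper.get('title') or ''} {paper.get('abstract') or ''}".lower()
        if terms and not any(term in text for term in terms):
            continue
        kept.append(paper)
    return kept
```

With no arguments beyond `papers`, the function returns the full list, matching the "Leave blank to analyze all papers" text in the form.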
| 579 |
+
|
| 580 |
+
<!-- Step 3: Filter Papers -->
|
| 581 |
+
<div class="section">
|
| 582 |
+
<h2>Step 3: Filter by Research Question</h2>
|
| 583 |
<p>Enter your research question to filter the collected papers for relevance.</p>
|
| 584 |
<textarea id="researchQuestion" rows="3" placeholder="What are the main impacts of climate change on ocean circulation patterns?">What are the key aspects of just transitions in climate policy and energy systems?</textarea>
|
| 585 |
<div style="margin: 10px 0;">
|
|
|
|
| 642 |
let collectedPapers = [];
|
| 643 |
let lastDisplayedPapers = [];
|
| 644 |
let selectedSeeds = []; // Array to store multiple selected seed papers
|
| 645 |
+
let preFilteredPapers = []; // Array to store pre-filtered papers
|
| 646 |
+
let isPreFiltered = false; // Flag to track if pre-filtering is active
|
| 647 |
|
| 648 |
// Set default values when page loads
|
| 649 |
document.addEventListener('DOMContentLoaded', function() {
|
| 650 |
document.getElementById('researchQuestion').value = 'What are the key aspects of just transitions in climate policy and energy systems?';
|
| 651 |
loadHistory();
|
| 652 |
|
| 653 |
+
// Don't auto-search on page load - let user search manually
|
| 654 |
});
|
| 655 |
|
| 656 |
let currentCollectionFile = null;
|
|
|
|
| 707 |
function displayPaperMatches(matches) {
|
| 708 |
const matchesDiv = document.getElementById('paperMatches');
|
| 709 |
matchesDiv.innerHTML = `
|
| 710 |
+
<h4 style="color: #ffffff; margin-bottom: 7px; font-size: 0.63em;">SELECT PAPER:</h4>
|
| 711 |
${matches.map((match, index) => `
|
| 712 |
<div class="paper-match" data-work-id="${match.work_id}" onclick="selectPaper('${match.work_id}', this)" style="
|
| 713 |
border: 2px solid #ffffff;
|
| 714 |
+
padding: 7px;
|
| 715 |
+
margin-bottom: 6px;
|
| 716 |
cursor: pointer;
|
| 717 |
background: #000000;
|
| 718 |
transition: all 0.2s ease;
|
| 719 |
">
|
| 720 |
+
<div style="font-weight: bold; color: #ffffff; margin-bottom: 3px; font-size: 0.7em;">${match.title}</div>
|
| 721 |
+
<div style="font-size: 0.56em; color: #aaaaaa; margin-bottom: 2px;">Authors: ${match.authors}</div>
|
| 722 |
+
<div style="font-size: 0.56em; color: #aaaaaa; margin-bottom: 2px;">Year: ${match.year} | Venue: ${match.venue}</div>
|
| 723 |
+
<div style="font-size: 0.49em; color: #666666;">Relevance: ${match.relevance_score}</div>
|
| 724 |
</div>
|
| 725 |
`).join('')}
|
| 726 |
`;
|
|
|
|
| 751 |
venue: element.querySelectorAll('div')[2].textContent.split(' | ')[1].replace('Venue: ', '')
|
| 752 |
};
|
| 753 |
selectedSeeds.push(paperData);
|
| 754 |
+
element.style.background = '#444444';
|
| 755 |
+
element.style.color = '#ffffff';
|
| 756 |
element.style.borderColor = '#ffffff';
|
| 757 |
+
element.style.boxShadow = '0 0 10px rgba(255, 255, 255, 0.3)';
|
| 758 |
}
|
| 759 |
|
| 760 |
updateSeedCollectionDisplay();
|
|
|
|
| 770 |
if (selectedSeeds.length === 0) {
|
| 771 |
selectedSeedsDiv.innerHTML = '<div style="color: #666666; text-align: center; font-size: 0.8em;">No seed papers selected. Search and click papers to add them.</div>';
|
| 772 |
} else {
|
| 773 |
+
selectedSeedsDiv.innerHTML = selectedSeeds.map((seed, index) => {
|
| 774 |
+
// Format authors
|
| 775 |
+
const authorsText = Array.isArray(seed.authors) ? seed.authors.join(', ') : (seed.authors || 'Unknown authors');
|
| 776 |
+
const yearText = seed.year || 'Unknown year';
|
| 777 |
+
const venueText = seed.venue || 'Unknown venue';
|
| 778 |
+
|
| 779 |
+
// Show paper counts if available
|
| 780 |
+
let countsText = '';
|
| 781 |
+
if (seed.papers_found !== undefined) {
|
| 782 |
+
countsText = ` | Papers found: ${seed.papers_found}`;
|
| 783 |
+
if (seed.cited_papers !== undefined) {
|
| 784 |
+
countsText += ` (${seed.cited_papers} cited, ${seed.citing_papers} citing, ${seed.related_papers} related)`;
|
| 785 |
+
}
|
| 786 |
+
}
|
| 787 |
+
|
| 788 |
+
return `
|
| 789 |
+
<div style="background: #444444; border: 1px solid #888888; padding: 8px; margin: 5px 0; display: flex; justify-content: space-between; align-items: center; box-shadow: 0 0 8px rgba(255, 255, 255, 0.2);">
|
| 790 |
+
<div style="flex: 1;">
|
| 791 |
+
<div style="font-weight: bold; color: #ffffff; font-size: 0.9em;">${seed.title}</div>
|
| 792 |
+
<div style="font-size: 0.7em; color: #cccccc;">${authorsText} | ${yearText} | ${venueText}${countsText}</div>
|
| 793 |
+
</div>
|
| 794 |
+
<button onclick="removeSeed(${index})" style="background: #ff4444; color: #ffffff; border: none; padding: 4px 8px; font-size: 10px; cursor: pointer; margin-left: 10px;">×</button>
|
| 795 |
</div>
|
| 796 |
+
`;
|
| 797 |
+
}).join('');
|
|
|
|
| 798 |
}
|
| 799 |
}
|
| 800 |
|
|
|
|
| 807 |
document.querySelectorAll('.paper-match').forEach(match => {
|
| 808 |
const workId = match.getAttribute('data-work-id');
|
| 809 |
const isSelected = selectedSeeds.some(seed => seed.work_id === workId);
|
| 810 |
+
if (isSelected) {
|
| 811 |
+
match.style.background = '#444444';
|
| 812 |
+
match.style.color = '#ffffff';
|
| 813 |
+
match.style.borderColor = '#ffffff';
|
| 814 |
+
match.style.boxShadow = '0 0 10px rgba(255, 255, 255, 0.3)';
|
| 815 |
+
} else {
|
| 816 |
+
match.style.background = '#000000';
|
| 817 |
+
match.style.color = '#ffffff';
|
| 818 |
+
match.style.borderColor = '#ffffff';
|
| 819 |
+
match.style.boxShadow = 'none';
|
| 820 |
+
}
|
| 821 |
});
|
| 822 |
}
|
| 823 |
|
|
|
|
| 842 |
alert('SEED PAPERS INFO:\n\n• Search for papers by title\n• Click papers to add them to your seed collection\n• Each selected paper will be used to find related papers (cited, citing, and related works)\n• You can select multiple seed papers for a comprehensive collection\n• Click the × button to remove papers from your selection');
|
| 843 |
}
|
| 844 |
|
| 845 |
+
function applyPreFilter() {
|
| 846 |
+
console.log('applyPreFilter called. collectedPapers length:', collectedPapers.length);
|
| 847 |
+
if (collectedPapers.length === 0) {
|
| 848 |
+
alert('Please collect papers first before applying pre-filters.');
|
| 849 |
+
return;
|
| 850 |
+
}
|
| 851 |
|
| 852 |
+
const startYear = document.getElementById('startYear').value;
|
| 853 |
+
const endYear = document.getElementById('endYear').value;
|
| 854 |
+
const keywords = document.getElementById('keywordFilter').value.trim();
|
| 855 |
+
|
| 856 |
+
// Check if any filter is applied
|
| 857 |
+
if (!startYear && !endYear && !keywords) {
|
| 858 |
+
alert('Please enter at least one filter criterion (year range or keywords).');
|
| 859 |
return;
|
| 860 |
}
|
| 861 |
+
|
| 862 |
+
// Validate year range
|
| 863 |
+
if (startYear && endYear && parseInt(startYear) > parseInt(endYear)) {
|
| 864 |
+
alert('Start year cannot be greater than end year.');
|
| 865 |
+
return;
|
| 866 |
+
}
|
| 867 |
+
|
| 868 |
+
preFilteredPapers = collectedPapers.filter(paper => {
|
| 869 |
+
let matchesDate = true;
|
| 870 |
+
let matchesKeywords = true;
|
| 871 |
+
|
| 872 |
+
// Check date filter
|
| 873 |
+
if (startYear || endYear) {
|
| 874 |
+
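// Date-only ISO strings parse as UTC, so read the year in UTC to avoid an off-by-one year in negative-offset time zones
const paperYear = paper.publication_date ? new Date(paper.publication_date).getUTCFullYear() :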
|
| 875 |
+
(paper.year ? parseInt(paper.year) : null);
|
| 876 |
+
|
| 877 |
+
if (paperYear) {
|
| 878 |
+
if (startYear && paperYear < parseInt(startYear)) matchesDate = false;
|
| 879 |
+
if (endYear && paperYear > parseInt(endYear)) matchesDate = false;
|
| 880 |
+
} else {
|
| 881 |
+
matchesDate = false; // Exclude papers without publication date
|
| 882 |
+
}
|
| 883 |
+
}
|
| 884 |
+
|
| 885 |
+
// Check keyword filter
|
| 886 |
+
if (keywords) {
|
| 887 |
+
const keywordList = keywords.split(',').map(k => k.trim().toLowerCase()).filter(k => k.length > 0);
|
| 888 |
+
if (keywordList.length > 0) {
|
| 889 |
+
const searchText = [
|
| 890 |
+
paper.title || '',
|
| 891 |
+
paper.abstract || '',
|
| 892 |
+
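(Array.isArray(paper.authors) ? paper.authors.join(' ') : (paper.authors || '')), // authors may arrive as an array or a plain string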
|
| 893 |
+
(paper.venue || '')
|
| 894 |
+
].join(' ').toLowerCase();
|
| 895 |
+
|
| 896 |
+
matchesKeywords = keywordList.some(keyword => searchText.includes(keyword));
|
| 897 |
+
}
|
| 898 |
+
}
|
| 899 |
+
|
| 900 |
+
return matchesDate && matchesKeywords;
|
| 901 |
+
});
|
| 902 |
+
|
| 903 |
+
isPreFiltered = true;
|
| 904 |
+
document.getElementById('preFilterBtn').disabled = true;
|
| 905 |
+
document.getElementById('clearPreFilterBtn').disabled = false;
|
| 906 |
+
document.getElementById('filterBtn').disabled = false;
|
| 907 |
+
|
| 908 |
+
const filterStatus = document.getElementById('preFilterStatus');
|
| 909 |
+
filterStatus.textContent = `Pre-filtered: ${preFilteredPapers.length} papers (from ${collectedPapers.length} total)`;
|
| 910 |
+
filterStatus.style.color = '#0a0';
|
| 911 |
+
|
| 912 |
+
// Update the paper limit to not exceed filtered papers
|
| 913 |
+
const paperLimit = document.getElementById('paperLimit');
|
| 914 |
+
const maxLimit = Math.min(50, preFilteredPapers.length);
|
| 915 |
+
if (parseInt(paperLimit.value) > maxLimit) {
|
| 916 |
+
paperLimit.value = maxLimit;
|
| 917 |
+
}
|
| 918 |
+
paperLimit.max = maxLimit;
|
| 919 |
+
}
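The pre-filter logic in `applyPreFilter` can also be sketched as a pure function that is testable without the DOM. This is a minimal sketch, not the code in this commit: the names `extractYear` and `preFilterPapers` and the options object are hypothetical, and it assumes `publication_date` is an ISO-style string starting with a four-digit year.

```javascript
// Hypothetical helpers (not part of the original file).

// Read the year from the leading four digits of an ISO-style date string,
// falling back to paper.year. Avoids Date parsing and time-zone effects.
function extractYear(paper) {
    const match = /^(\d{4})/.exec(String(paper.publication_date || ''));
    if (match) return parseInt(match[1], 10);
    const year = parseInt(paper.year, 10);
    return Number.isNaN(year) ? null : year;
}

// Return the papers that satisfy the year range and match at least one keyword.
function preFilterPapers(papers, { startYear = null, endYear = null, keywords = '' } = {}) {
    const keywordList = keywords
        .split(',')
        .map(k => k.trim().toLowerCase())
        .filter(k => k.length > 0);

    return papers.filter(paper => {
        if (startYear !== null || endYear !== null) {
            const year = extractYear(paper);
            if (year === null) return false; // no usable date: excluded, as in applyPreFilter
            if (startYear !== null && year < startYear) return false;
            if (endYear !== null && year > endYear) return false;
        }
        if (keywordList.length > 0) {
            // authors may be an array or a plain string
            const authors = Array.isArray(paper.authors)
                ? paper.authors.join(' ')
                : (paper.authors || '');
            const searchText = [paper.title || '', paper.abstract || '', authors, paper.venue || '']
                .join(' ')
                .toLowerCase();
            // OR semantics: any single keyword is enough
            return keywordList.some(k => searchText.includes(k));
        }
        return true;
    });
}
```

With this shape, `applyPreFilter` would only read the form fields, call the helper, and update the buttons and status text.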
|
| 920 |
+
|
| 921 |
+
function clearPreFilter() {
|
| 922 |
+
preFilteredPapers = [];
|
| 923 |
+
isPreFiltered = false;
|
| 924 |
+
|
| 925 |
+
document.getElementById('startYear').value = '';
|
| 926 |
+
document.getElementById('endYear').value = '';
|
| 927 |
+
document.getElementById('keywordFilter').value = '';
|
| 928 |
+
document.getElementById('preFilterBtn').disabled = false;
|
| 929 |
+
document.getElementById('clearPreFilterBtn').disabled = true;
|
| 930 |
+
document.getElementById('filterBtn').disabled = false;
|
| 931 |
+
|
| 932 |
+
const filterStatus = document.getElementById('preFilterStatus');
|
| 933 |
+
filterStatus.textContent = '';
|
| 934 |
+
|
| 935 |
+
// Reset paper limit
|
| 936 |
+
const paperLimit = document.getElementById('paperLimit');
|
| 937 |
+
paperLimit.max = 50;
|
| 938 |
+
}
|
| 939 |
+
|
| 940 |
+
function showCollectionStats(fileData) {
|
| 941 |
+
const statsBox = document.getElementById('collectionStatsBox');
|
| 942 |
+
const statsContent = document.getElementById('collectionStatsContent');
|
| 943 |
+
|
| 944 |
+
const totalPapers = fileData.total_papers || 0;
|
| 945 |
+
const citedPapers = fileData.cited_papers || 0;
|
| 946 |
+
const citingPapers = fileData.citing_papers || 0;
|
| 947 |
+
const relatedPapers = fileData.related_papers || 0;
|
| 948 |
+
const totalSeeds = fileData.total_seeds || 0;
|
| 949 |
+
|
| 950 |
+
let statsHTML = `
|
| 951 |
+
<div><strong>Total Papers:</strong> ${totalPapers}</div>
|
| 952 |
+
<div><strong>Cited:</strong> ${citedPapers}</div>
|
| 953 |
+
<div><strong>Citing:</strong> ${citingPapers}</div>
|
| 954 |
+
<div><strong>Related:</strong> ${relatedPapers}</div>
|
| 955 |
+
`;
|
| 956 |
+
|
| 957 |
+
if (totalSeeds > 0) {
|
| 958 |
+
statsHTML += `<div><strong>Seeds:</strong> ${totalSeeds}</div>`;
|
| 959 |
+
}
|
| 960 |
+
|
| 961 |
+
if (fileData.deduplication_stats) {
|
| 962 |
+
const dedupStats = fileData.deduplication_stats;
|
| 963 |
+
statsHTML += `<div><strong>Duplicates Removed:</strong> ${dedupStats.duplicates_removed || 0}</div>`;
|
| 964 |
+
}
|
| 965 |
+
|
| 966 |
+
statsContent.innerHTML = statsHTML;
|
| 967 |
+
statsBox.style.display = 'block';
|
| 968 |
+
}
|
| 969 |
+
|
| 970 |
+
function hideCollectionStats() {
|
| 971 |
+
const statsBox = document.getElementById('collectionStatsBox');
|
| 972 |
+
statsBox.style.display = 'none';
|
| 973 |
+
}
|
| 974 |
+
|
| 975 |
+
async function collectPapers() {
|
| 976 |
+
const paperTitle = document.getElementById('paperTitle').value.trim();
|
| 977 |
+
|
| 978 |
+
if (!paperTitle) {
|
| 979 |
+
showStatus('collectStatus', 'Please enter a paper title', 'error');
|
| 980 |
+
return;
|
| 981 |
+
}
|
| 982 |
if (selectedSeeds.length === 0) {
|
| 983 |
showStatus('collectStatus', 'Please search and select at least one paper first', 'error');
|
| 984 |
+
return;
|
| 985 |
}
|
| 986 |
|
| 987 |
const collectBtn = document.getElementById('collectBtn');
|
|
|
|
| 1060 |
|
| 1061 |
// Show appropriate completion message
|
| 1062 |
if (result && result.total_seeds && result.total_seeds > 1) {
|
| 1063 |
+
progressMessage.textContent = `Multi-Seed completed! (${result.total_seeds} seeds)`;
|
| 1064 |
} else {
|
| 1065 |
progressMessage.textContent = 'Collection completed!';
|
| 1066 |
}
|
|
|
|
| 1075 |
let breakdown = `${result.cited_papers} cited + ${result.citing_papers} citing + ${result.related_papers} related`;
|
| 1076 |
|
| 1077 |
if (result.total_seeds && result.total_seeds > 1) {
|
| 1078 |
+
collectionType = 'Multi-Seed ';
|
| 1079 |
const dedupStats = result.deduplication_stats;
|
| 1080 |
if (dedupStats) {
|
| 1081 |
breakdown += ` (${dedupStats.duplicates_removed} duplicates removed)`;
|
|
|
|
| 1084 |
|
| 1085 |
showStatus('collectStatus', `Successfully completed ${collectionType} - ${result.total_papers} papers (${breakdown})`, 'success');
|
| 1086 |
document.getElementById('filterBtn').disabled = false;
|
| 1087 |
+
document.getElementById('preFilterBtn').disabled = false;
|
| 1088 |
document.getElementById('resultsSection').style.display = 'block';
|
| 1089 |
updateStats(result.total_papers, 0, result.cited_papers, result.citing_papers, result.related_papers);
|
| 1090 |
currentCollectionFile = result.db_filename || null;
|
| 1091 |
historyIndex.currentCollectionId = result.work_id ? (result.work_id.replace('https://api.openalex.org/works/','').replace('https://openalex.org/','')) : null;
|
| 1092 |
document.getElementById('collectDownload').style.display = currentCollectionFile ? 'block' : 'none';
|
| 1093 |
|
| 1094 |
+
// Clear any existing pre-filter status when new collection is completed
|
| 1095 |
+
const preFilterStatus = document.getElementById('preFilterStatus');
|
| 1096 |
+
preFilterStatus.textContent = '';
|
| 1097 |
+
preFilterStatus.style.color = '#aaaaaa';
|
| 1098 |
+
|
| 1099 |
+
// Reset pre-filter state
|
| 1100 |
+
preFilteredPapers = [];
|
| 1101 |
+
isPreFiltered = false;
|
| 1102 |
+
document.getElementById('clearPreFilterBtn').disabled = true;
|
| 1103 |
+
|
| 1104 |
+
// Debug: Log the collectedPapers length
|
| 1105 |
+
console.log('Collection completed. collectedPapers length:', collectedPapers.length);
|
| 1106 |
+
|
| 1107 |
// Reset button
|
| 1108 |
document.getElementById('collectBtn').disabled = false;
|
| 1109 |
document.getElementById('collectBtn').textContent = 'Collect Papers';
|
|
|
|
| 1132 |
progressFill.style.width = `${progressPercent}%`;
|
| 1133 |
progressText.textContent = `${Math.round(progressPercent)}%`;
|
| 1134 |
|
| 1135 |
+
// Show detailed progress for multi-seed collections
|
| 1136 |
if (progress.total_seeds && progress.total_seeds > 1) {
|
| 1137 |
const currentSeed = progress.current_seed || 0;
|
| 1138 |
const remainingSeeds = progress.remaining_seeds || 0;
|
| 1139 |
const totalSeeds = progress.total_seeds || 0;
|
| 1140 |
progressMessage.textContent = `${progress.message || 'Processing...'} (Seed ${currentSeed}/${totalSeeds}, ${remainingSeeds} remaining)`;
|
| 1141 |
} else {
|
| 1142 |
+
progressMessage.textContent = progress.message || 'Processing...';
|
| 1143 |
}
|
| 1144 |
}
|
| 1145 |
} catch (error) {
|
|
|
|
| 1157 |
return;
|
| 1158 |
}
|
| 1159 |
|
| 1160 |
+
// Use pre-filtered papers if available, otherwise use all collected papers
|
| 1161 |
+
const papersToAnalyze = isPreFiltered ? preFilteredPapers : collectedPapers;
|
| 1162 |
+
|
| 1163 |
+
if (papersToAnalyze.length === 0) {
|
| 1164 |
+
showStatus('filterStatus', 'No papers match the pre-filter criteria', 'error');
|
| 1165 |
+
return;
|
| 1166 |
+
}
|
| 1167 |
+
|
| 1168 |
// Check if user wants to analyze more than 50 papers
|
| 1169 |
if (paperLimit > 50) {
|
| 1170 |
const userApiKey = prompt(`You want to analyze ${paperLimit} papers, which exceeds the limit of 50.\n\nPlease provide your own OpenAI API key to continue:\n\n(Your API key will be used only for this analysis and not stored)`);
|
|
|
|
| 1206 |
research_question: researchQuestion,
|
| 1207 |
limit: paperLimit,
|
| 1208 |
source_collection: historyIndex.currentCollectionId || null,
|
| 1209 |
+
papers: papersToAnalyze.length > 0 ? papersToAnalyze : null,
|
| 1210 |
user_api_key: window.userApiKey || null
|
| 1211 |
})
|
| 1212 |
});
|
|
|
|
| 1214 |
const data = await response.json();
|
| 1215 |
|
| 1216 |
if (data.success) {
|
| 1217 |
+
// Hide collection stats box and show results section when filtering
|
| 1218 |
+
hideCollectionStats();
|
| 1219 |
+
document.getElementById('resultsSection').style.display = 'block';
|
| 1220 |
+
|
| 1221 |
// Simulate progress for filtering (since it's synchronous in backend)
|
| 1222 |
let progress = 0;
|
| 1223 |
const progressInterval = setInterval(() => {
|
|
|
|
| 1280 |
function updateStats(total, relevant, cited = 0, citing = 0, related = 0, relevantAbs = null, totalAbs = null, tested = null, oaPercentage = null, abstractPercentage = null) {
|
| 1281 |
const statsDiv = document.getElementById('stats');
|
| 1282 |
const rate = tested && tested > 0 ? Math.round((relevant / tested) * 100) : 0;
|
| 1283 |
+
|
| 1284 |
+
// Show NA for unfiltered collections (when no filtering has been applied)
|
| 1285 |
+
const showNA = relevant === 0 && !tested; // tested defaults to null, so a strict === 0 check would never match
|
| 1286 |
+
|
| 1287 |
statsDiv.innerHTML = `
|
| 1288 |
<div class="stat-item">
|
| 1289 |
<div class="stat-number">${total}</div>
|
| 1290 |
<div class="stat-label">Total Papers</div>
|
| 1291 |
</div>
|
| 1292 |
<div class="stat-item">
|
| 1293 |
+
<div class="stat-number">${showNA ? 'N/A' : (tested || total)}</div>
|
| 1294 |
<div class="stat-label">Tested Papers</div>
|
| 1295 |
</div>
|
| 1296 |
<div class="stat-item">
|
| 1297 |
+
<div class="stat-number">${showNA ? 'N/A' : relevant}</div>
|
| 1298 |
<div class="stat-label">Relevant Papers</div>
|
| 1299 |
</div>
|
| 1300 |
<div class="stat-item">
|
| 1301 |
+
<div class="stat-number">${showNA ? 'N/A' : rate + '%'}</div>
|
| 1302 |
<div class="stat-label">Rel. Rate</div>
|
| 1303 |
</div>
|
| 1304 |
<div class="stat-item">
|
| 1305 |
+
<div class="stat-number">${showNA ? 'N/A' : (oaPercentage !== null ? oaPercentage + '%' : 'N/A')}</div>
|
| 1306 |
<div class="stat-label">Open Access</div>
|
| 1307 |
</div>
|
| 1308 |
<div class="stat-item">
|
| 1309 |
+
<div class="stat-number">${showNA ? 'N/A' : (abstractPercentage !== null ? abstractPercentage + '%' : 'N/A')}</div>
|
| 1310 |
<div class="stat-label">With Abstract</div>
|
| 1311 |
</div>
|
| 1312 |
`;
|
|
|
|
| 1513 |
if (data.success) {
|
| 1514 |
buildHistoryIndex(data.files);
|
| 1515 |
displayHistory(data.files);
|
| 1516 |
+
// Load detailed seed information for multi-seed collections
|
| 1517 |
+
setTimeout(() => updateCollectionDisplayWithSeeds(), 100);
|
| 1518 |
}
|
| 1519 |
} catch (error) {
|
| 1520 |
console.error('Error loading history:', error);
|
|
|
|
| 1554 |
|
| 1555 |
// Display collections
|
| 1556 |
collectionsList.innerHTML = collections.map(collection => {
|
| 1557 |
+
// Determine collection type and display format
|
| 1558 |
+
let displayTitle = '';
|
| 1559 |
+
let tooltipText = '';
|
| 1560 |
+
|
| 1561 |
+
if (collection.type === 'merged') {
|
| 1562 |
+
displayTitle = 'Merged Collection';
|
| 1563 |
+
tooltipText = 'Multiple collections merged together';
|
| 1564 |
+
} else if (collection.type === 'multiseed') {
|
| 1565 |
+
// Always use display_title if available, otherwise create from seed results
|
| 1566 |
+
if (collection.display_title) {
|
| 1567 |
+
displayTitle = collection.display_title;
|
| 1568 |
+
} else if (collection.seed_results && collection.seed_results.length > 0) {
|
| 1569 |
+
// Create display title from seed results
|
| 1570 |
+
const seedTitles = collection.seed_results.map(seed => seed.title);
|
| 1571 |
+
if (seedTitles.length === 1) {
|
| 1572 |
+
displayTitle = seedTitles[0];
|
| 1573 |
+
} else if (seedTitles.length === 2) {
|
| 1574 |
+
displayTitle = `${seedTitles[0]} & ${seedTitles[1]}`;
|
| 1575 |
+
} else {
|
| 1576 |
+
displayTitle = `${seedTitles[0]}, ${seedTitles[1]} + ${seedTitles.length - 2} others`;
|
| 1577 |
+
}
|
| 1578 |
+
} else {
|
| 1579 |
+
// Fallback: try to get title from collection title or work_identifier
|
| 1580 |
+
displayTitle = collection.title || collection.work_identifier || 'Collection';
|
| 1581 |
+
}
|
| 1582 |
+
tooltipText = 'Collection from multiple seed papers';
|
| 1583 |
+
} else {
|
| 1584 |
+
// Single seed - show the paper title
|
| 1585 |
+
displayTitle = collection.title || collection.work_identifier || 'UNTITLED COLLECTION';
|
| 1586 |
+
tooltipText = 'Single seed paper collection';
|
| 1587 |
+
}
|
| 1588 |
+
|
| 1589 |
const linkedFilters = filters.filter(filter => filter.source_collection === collection.work_identifier);
|
| 1590 |
|
| 1591 |
return `
|
| 1592 |
+
<div class="history-item collection-item" data-collection="${collection.work_identifier || ''}" onclick="selectCollection('${collection.filename}', '${collection.work_identifier || ''}', '${displayTitle}')" draggable="true" ondragstart="dragCollection(event, '${collection.filename}', '${displayTitle}', ${collection.total_papers || 0})" title="${tooltipText}">
|
| 1593 |
+
<div class="history-title">${displayTitle}</div>
|
| 1594 |
<div class="history-meta">${collection.created}</div>
|
| 1595 |
<div class="history-meta">${(collection.size / 1024).toFixed(1)} KB</div>
|
| 1596 |
<div class="history-meta">${collection.total_papers || 0} PAPER${(collection.total_papers || 0) !== 1 ? 'S' : ''}</div>
|
|
|
|
| 1606 |
}).join('');
|
| 1607 |
}
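`displayHistory` interpolates `displayTitle` and `collection.filename` straight into HTML and into an inline `onclick` string, so a title containing a quote or angle bracket can break the markup. One way to harden this, sketched below, is to escape interpolated values and carry the filename in a data attribute. The names `escapeHtml` and `renderCollectionItem` and the `data-filename` attribute are hypothetical and not part of this commit.

```javascript
// Hypothetical helpers (not part of the original file).

// Escape the five characters that are significant in HTML text and attributes.
function escapeHtml(value) {
    return String(value)
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;')
        .replace(/'/g, '&#39;');
}

// Build the collection item markup with escaped values. The filename is stored
// in a data attribute rather than inside an inline onclick string.
function renderCollectionItem(collection, displayTitle) {
    return `<div class="history-item collection-item"` +
        ` data-filename="${escapeHtml(collection.filename)}"` +
        ` data-collection="${escapeHtml(collection.work_identifier || '')}">` +
        `<div class="history-title">${escapeHtml(displayTitle)}</div>` +
        `</div>`;
}
```

A click handler attached with `addEventListener` could then read `item.dataset.filename`, which would also remove the need to parse the `onclick` attribute with a regular expression.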
|
| 1608 |
|
| 1609 |
+
// Function to load collection details and show seed information
|
| 1610 |
+
async function loadCollectionSeedDetails(filename, workIdentifier) {
|
| 1611 |
+
try {
|
| 1612 |
+
const response = await fetch(`/api/load-database-file/${filename}`);
|
| 1613 |
+
const data = await response.json();
|
| 1614 |
+
if (data.success) {
|
| 1615 |
+
const fileData = data.data || {};
|
| 1616 |
+
const seedResults = fileData.seed_results || [];
|
| 1617 |
+
|
| 1618 |
+
if (seedResults.length === 0) {
|
| 1619 |
+
return { title: 'No seed information available', tooltip: '' };
|
| 1620 |
+
}
|
| 1621 |
+
|
| 1622 |
+
if (seedResults.length === 1) {
|
| 1623 |
+
// Single seed
|
| 1624 |
+
const seed = seedResults[0];
|
| 1625 |
+
return {
|
| 1626 |
+
title: `${seed.title} (${seed.year || 'Unknown year'})`,
|
| 1627 |
+
tooltip: `Single seed: ${seed.title}`
|
| 1628 |
+
};
|
| 1629 |
+
} else if (seedResults.length === 2) {
|
| 1630 |
+
// Two seeds
|
| 1631 |
+
const seed1 = seedResults[0];
|
| 1632 |
+
const seed2 = seedResults[1];
|
| 1633 |
+
return {
|
| 1634 |
+
title: `${seed1.title} & ${seed2.title}`,
|
| 1635 |
+
tooltip: `Two seeds: ${seed1.title} & ${seed2.title}`
|
| 1636 |
+
};
|
| 1637 |
+
} else {
|
| 1638 |
+
// Multiple seeds - show first 2 + count
|
| 1639 |
+
const seed1 = seedResults[0];
|
| 1640 |
+
const seed2 = seedResults[1];
|
| 1641 |
+
const remaining = seedResults.length - 2;
|
| 1642 |
+
const allTitles = seedResults.map(s => s.title).join('\n• ');
|
| 1643 |
+
return {
|
| 1644 |
+
title: `${seed1.title}, ${seed2.title} + ${remaining} others`,
|
| 1645 |
+
tooltip: `Multi-Seed (${seedResults.length} seeds):\n• ${allTitles}`
|
| 1646 |
+
};
|
| 1647 |
+
}
|
| 1648 |
+
}
|
| 1649 |
+
} catch (error) {
|
| 1650 |
+
console.error('Error loading collection details:', error);
|
| 1651 |
+
}
|
| 1652 |
+
return { title: 'Multi-Seed ', tooltip: 'Collection from multiple seed papers' };
|
| 1653 |
+
}
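The one/two/many title formatting appears both here and in `displayHistory`. A small shared helper could keep the two in sync. This is a sketch under assumptions: the name `formatSeedTitle` is hypothetical, and unlike the original it uses the singular "1 other" when exactly one title is left over.

```javascript
// Hypothetical helper (not part of the original file).
// Build a short display title from a list of seed paper titles.
function formatSeedTitle(seedTitles) {
    const titles = (seedTitles || []).filter(Boolean); // drop missing or empty titles
    if (titles.length === 0) return 'Collection';
    if (titles.length === 1) return titles[0];
    if (titles.length === 2) return `${titles[0]} & ${titles[1]}`;
    const remaining = titles.length - 2;
    return `${titles[0]}, ${titles[1]} + ${remaining} other${remaining === 1 ? '' : 's'}`;
}
```

For example, `formatSeedTitle(seedResults.map(s => s.title))` would produce the `title` field for the multi-seed case.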
|
| 1654 |
+
|
| 1655 |
+
// Function to update collection display with detailed seed information
|
| 1656 |
+
async function updateCollectionDisplayWithSeeds() {
|
| 1657 |
+
const collectionItems = document.querySelectorAll('.collection-item[data-collection]');
|
| 1658 |
+
|
| 1659 |
+
for (const item of collectionItems) {
|
| 1660 |
+
const filename = item.getAttribute('onclick').match(/'([^']+)'/)[1];
|
| 1661 |
+
const workIdentifier = item.getAttribute('data-collection');
|
| 1662 |
+
|
| 1663 |
+
// Check if this is a multi-seed collection whose title is still the placeholder
|
| 1664 |
+
const titleElement = item.querySelector('.history-title');
|
| 1665 |
+
if (titleElement.textContent === 'Multi-Seed ') {
|
| 1666 |
+
const seedDetails = await loadCollectionSeedDetails(filename, workIdentifier);
|
| 1667 |
+
titleElement.textContent = seedDetails.title;
|
| 1668 |
+
titleElement.title = seedDetails.tooltip;
|
| 1669 |
+
}
|
| 1670 |
+
}
|
| 1671 |
+
}
|
| 1672 |
+
|
| 1673 |
function selectCollection(filename, workIdentifier, title) {
|
| 1674 |
// Get filters for this collection
|
| 1675 |
const filters = historyIndex.filters[workIdentifier] || [];
|
|
|
|
| 1713 |
}
|
| 1714 |
|
| 1715 |
window.openCollection = async function(filename, workIdentifier) {
|
| 1716 |
+
// Show loading indicator
|
| 1717 |
+
const loadingIndicator = document.createElement('div');
|
| 1718 |
+
loadingIndicator.id = 'collectionLoadingIndicator';
|
| 1719 |
+
loadingIndicator.innerHTML = `
|
| 1720 |
+
<div style="display: flex; align-items: center; justify-content: center; padding: 20px; background: #1a1a1a; border: 2px solid #00ff00; border-radius: 10px; margin: 20px 0;">
|
| 1721 |
+
<div style="width: 12px; height: 12px; background: #00ff00; border-radius: 50%; margin-right: 15px; animation: pulse 1.5s infinite;"></div>
|
| 1722 |
+
<span style="color: #00ff00; font-weight: bold; font-size: 16px;">Loading collection...</span>
|
| 1723 |
+
</div>
|
| 1724 |
+
`;
|
| 1725 |
+
|
| 1726 |
+
// Add CSS for pulsing animation
|
| 1727 |
+
if (!document.getElementById('loadingAnimationCSS')) {
|
| 1728 |
+
const style = document.createElement('style');
|
| 1729 |
+
style.id = 'loadingAnimationCSS';
|
| 1730 |
+
style.textContent = `
|
| 1731 |
+
@keyframes pulse {
|
| 1732 |
+
0% { opacity: 1; transform: scale(1); }
|
| 1733 |
+
50% { opacity: 0.5; transform: scale(1.2); }
|
| 1734 |
+
100% { opacity: 1; transform: scale(1); }
|
| 1735 |
+
}
|
| 1736 |
+
`;
|
| 1737 |
+
document.head.appendChild(style);
|
| 1738 |
+
}
|
| 1739 |
+
|
| 1740 |
+
// Insert loading indicator before results section
|
| 1741 |
+
const resultsSection = document.getElementById('resultsSection');
|
| 1742 |
+
resultsSection.parentNode.insertBefore(loadingIndicator, resultsSection);
|
| 1743 |
+
|
| 1744 |
try {
|
| 1745 |
const response = await fetch(`/api/load-database-file/${filename}`);
|
| 1746 |
const data = await response.json();
|
| 1747 |
if (data.success) {
|
| 1748 |
const fileData = data.data || {};
|
| 1749 |
const papers = fileData.papers || [];
|
| 1750 |
+
|
| 1751 |
+
// Clear any existing collection status messages
|
| 1752 |
+
const collectStatus = document.getElementById('collectStatus');
|
| 1753 |
+
collectStatus.textContent = '';
|
| 1754 |
+
collectStatus.style.color = '#aaaaaa';
|
| 1755 |
+
|
| 1756 |
+
// Show collection stats box instead of results section
|
| 1757 |
+
showCollectionStats(fileData);
|
| 1758 |
+
|
| 1759 |
+
// Hide results section for unfiltered collections
|
| 1760 |
+
document.getElementById('resultsSection').style.display = 'none';
|
| 1761 |
+
|
| 1762 |
currentCollectionFile = filename; currentFilterFile = null; historyIndex.currentCollectionId = workIdentifier || (fileData.work_identifier || '');
|
| 1763 |
document.getElementById('collectDownload').style.display = 'block';
|
| 1764 |
document.getElementById('filterDownload').style.display = 'none';
|
| 1765 |
// Enable filter button when opening a collection
|
| 1766 |
document.getElementById('filterBtn').disabled = false;
|
| 1767 |
+
document.getElementById('preFilterBtn').disabled = false;
|
| 1768 |
// Keep the loaded papers in memory for filtering
|
| 1769 |
collectedPapers = papers;
|
| 1770 |
+
|
| 1771 |
+
// Populate seed papers if this is a multi-seed collection
|
| 1772 |
+
if (fileData.seed_results && fileData.seed_results.length > 0) {
|
| 1773 |
+
selectedSeeds = fileData.seed_results.map(seed => ({
|
| 1774 |
+
work_id: seed.work_id,
|
| 1775 |
+
title: seed.title,
|
| 1776 |
+
authors: seed.authors || [],
|
| 1777 |
+
year: seed.year || '',
|
| 1778 |
+
venue: seed.venue || '',
|
| 1779 |
+
papers_found: seed.papers_found || 0,
|
| 1780 |
+
cited_papers: seed.cited_papers || 0,
|
| 1781 |
+
citing_papers: seed.citing_papers || 0,
|
| 1782 |
+
related_papers: seed.related_papers || 0
|
| 1783 |
+
}));
|
| 1784 |
+
updateSeedCollectionDisplay();
|
| 1785 |
+
updateCollectButton();
|
| 1786 |
+
} else {
|
| 1787 |
+
// Clear seed papers for single-seed collections
|
| 1788 |
+
selectedSeeds = [];
|
| 1789 |
+
updateSeedCollectionDisplay();
|
| 1790 |
+
updateCollectButton();
|
| 1791 |
+
}
|
| 1792 |
+
|
| 1793 |
+
// Update loading indicator to show loaded status
|
| 1794 |
+
loadingIndicator.innerHTML = `
|
| 1795 |
+
<div style="display: flex; align-items: center; justify-content: center; padding: 20px; background: #1a1a1a; border: 2px solid #00ff00; border-radius: 10px; margin: 20px 0;">
|
| 1796 |
+
<div style="width: 12px; height: 12px; background: #00ff00; border-radius: 50%; margin-right: 15px;"></div>
|
| 1797 |
+
<span style="color: #00ff00; font-weight: bold; font-size: 16px;">Collection loaded successfully!</span>
|
| 1798 |
+
</div>
|
| 1799 |
+
`;
|
| 1800 |
+
|
| 1801 |
+
// Remove loading indicator after 2 seconds
|
| 1802 |
+
setTimeout(() => {
|
| 1803 |
+
if (loadingIndicator.parentNode) {
|
| 1804 |
+
loadingIndicator.parentNode.removeChild(loadingIndicator);
|
| 1805 |
+
}
|
| 1806 |
+
}, 2000);
|
| 1807 |
}
|
| 1808 |
} catch (error) {
|
| 1809 |
+
// Update loading indicator to show error
|
| 1810 |
+
loadingIndicator.innerHTML = `
|
| 1811 |
+
<div style="display: flex; align-items: center; justify-content: center; padding: 20px; background: #1a1a1a; border: 2px solid #ff0000; border-radius: 10px; margin: 20px 0;">
|
| 1812 |
+
<div style="width: 12px; height: 12px; background: #ff0000; border-radius: 50%; margin-right: 15px;"></div>
|
| 1813 |
+
<span style="color: #ff0000; font-weight: bold; font-size: 16px;">Error loading collection</span>
|
| 1814 |
+
</div>
|
| 1815 |
+
`;
|
| 1816 |
+
|
| 1817 |
+
// Remove error indicator after 3 seconds
|
| 1818 |
+
setTimeout(() => {
|
| 1819 |
+
if (loadingIndicator.parentNode) {
|
| 1820 |
+
loadingIndicator.parentNode.removeChild(loadingIndicator);
|
| 1821 |
+
}
|
| 1822 |
+
}, 3000);
|
| 1823 |
+
|
| 1824 |
alert(`Error opening collection: ${error.message}`);
|
| 1825 |
}
|
| 1826 |
}
|