"""
Acknowledgement:
The Smart Context Citation (SCC) system was developed by Dr. Majed Abuseif. 
The text fragments scroll functionality in the HTML environment is based on Burris and Bokan (2023). 
Conceptualisation, primary coding, integration, project management, and finalisation were undertaken by Dr. Majed Abuseif. 
Project planning, execution plan, and review were conducted by Dr. Majed Abuseif in collaboration with ChatGPT. 
Formatting and design were contributed by Dr. Majed Abuseif and Manus. 
Programming contributions were provided by Dr. Majed Abuseif and Grok, with hosting on Hugging Face. 
The code is implemented in Python.
"""

import streamlit as st
import hashlib
import urllib.parse
from datetime import datetime
import pytz
import re
import pandas as pd
import base64
import io
import openpyxl
from openpyxl.utils.dataframe import dataframe_to_rows
from openpyxl.worksheet.hyperlink import Hyperlink

# --- Constants ---
MELBOURNE_TIMEZONE = 'Australia/Melbourne'

# --- Custom CSS for simplified UI ---
def load_css():
    st.markdown("""
    <style>
    .main-header {
        padding: 0.5rem 0;
        text-align: center;
        margin: 0.5rem 0;
    }
    
    .citation-output {
        background: #f8f8f8;
        border: 1px solid #e0e0e0;
        border-radius: 4px;
        padding: 1rem;
        margin: 1rem 0;
        font-family: 'Courier New', monospace;
    }
    
    .warning-box {
        background: #fff3f3;
        border: 1px solid #e0e0e0;
        border-radius: 4px;
        padding: 1rem;
        margin: 1rem 0;
        color: #d32f2f;
    }
    
    .success-box {
        background: #e8f5e9;
        border: 1px solid #e0e0e0;
        border-radius: 4px;
        padding: 1rem;
        margin: 1rem 0;
        color: #2e7d32;
    }
    
    .info-card {
        background: white;
        border-radius: 4px;
        padding: 1rem;
        margin: 0.5rem 0;
        border-left: 1px solid #e0e0e0;
    }
    
    .footer {
        text-align: center;
        padding: 1rem;
        margin-top: 1rem;
        border-top: 1px solid #e0e0e0;
        font-size: 0.9rem;
    }
    
    .hash-display {
        background: #f8f8f8;
        border: 1px solid #e0e0e0;
        border-radius: 4px;
        padding: 1rem;
        font-family: 'Courier New', monospace;
        font-size: 0.85rem;
        word-break: break-all;
        margin: 0.5rem 0;
    }
    
    .tab-content {
        padding: 0.5rem 0;
    }
    
    .datetime-display {
        background: #f8f8f8;
        border-radius: 4px;
        padding: 0.8rem;
        margin: 0.5rem 0;
        border-left: 1px solid #e0e0e0;
        font-family: 'Courier New', monospace;
        font-size: 1rem;
    }
    
    .rendered-citation {
        margin: 0.5rem 0;
        font-size: 1.4rem;
    }
    
    .citation-table {
        margin: 1rem 0;
        width: 100%;
        border-collapse: collapse;
    }
    
    .citation-table th, .citation-table td {
        border: 1px solid #e0e0e0;
        padding: 0.5rem;
        text-align: left;
    }
    
    .citation-table th {
        background: #f8f8f8;
        font-weight: bold;
    }
    
    .citation-table th:nth-child(7), .citation-table td:nth-child(7) { /* Annotated Text column */
        width: 30%; /* Match SCC Index width */
    }
    
    .citation-table th:nth-child(6), .citation-table td:nth-child(6) { /* SCC Index column */
        width: 30%;
    }
    </style>
    """, unsafe_allow_html=True)

# --- Helper Functions ---
def select_longest_segment(text):
    # Split text by various dashes (hyphen, non-breaking hyphen, en dash, em dash)
    dash_variants = ['\u002D', '\u2011', '\u2013', '\u2014']
    segments = [text]
    for dash in dash_variants:
        new_segments = []
        for segment in segments:
            new_segments.extend(segment.split(dash))
        segments = new_segments
    # Return the longest segment, or original text if no dashes
    return max(segments, key=len, default=text).strip()

def encode_text_fragment(text):
    # Encode text for W3C Text Fragments, preserving only regular hyphens (U+002D)
    # Non-breaking hyphens (U+2011) are encoded as %E2%80%91
    # En dashes (U+2013) are encoded as %E2%80%93
    # Em dashes (U+2014) are encoded as %E2%80%94
    return urllib.parse.quote(text, safe='-')

def generate_citation_hash(author, year, url, fragment_text, cited_text, username, task_name, current_date, current_time):
    # Normalize inputs by stripping whitespace
    fragment_text = select_longest_segment(fragment_text.strip())
    cited_text = select_longest_segment(cited_text.strip())
    task_name = task_name.strip()
    author = author.strip()
    url = url.strip()
    username = username.strip()
    data = f"{author}, {year} | {url} | {fragment_text} | {cited_text} | {username} | {task_name} | {current_date} | {current_time}"
    return hashlib.sha256(data.encode('utf-8')).hexdigest()

def format_citation_html(url, fragment_text, author, year, scc_hash):
    # Select the longest segment for the text fragment to avoid breaking the link
    selected_fragment = select_longest_segment(fragment_text)
    encoded_fragment = encode_text_fragment(selected_fragment)
    full_url = f"{url}#:~:text={encoded_fragment}"
    return f'<a href="{full_url}" data-hash="{scc_hash}">{author} ({year})</a>'

def format_citation_end_html(url, fragment_text, author, year, scc_hash):
    # Select the longest segment for the text fragment to avoid breaking the link
    selected_fragment = select_longest_segment(fragment_text)
    encoded_fragment = encode_text_fragment(selected_fragment)
    full_url = f"{url}#:~:text={encoded_fragment}"
    return f'<a href="{full_url}" data-hash="{scc_hash}">({author}, {year})</a>'

def format_metadata_html(url, author, year, scc_hash, username, task_name, current_date, current_time):
    # Join the metadata fields with em dashes (U+2014) so that parse_metadata() can split them back into four parts
    metadata = "\u2014".join([username, task_name, current_date, current_time])
    encoded_metadata = encode_text_fragment(metadata)
    full_url = f"{url}#:~:text={encoded_metadata}"
    return f'<a href="{full_url}" data-hash="{scc_hash}">{author} ({year}). {scc_hash}</a>'

def check_for_fragment(url):
    return '#:~:text=' in url or '?utm_source=' in url

def parse_citation_text(citation_text):
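    # Accept either "Author (Year)" or "(Author, Year)"; \w and \s already cover the letters and space in "et al"
    match = re.match(r'^(?:(\w[\w\s.,&]+)\s*\((\d{4})\)|\((\w[\w\s.,&]+),\s*(\d{4})\))$', citation_text.strip())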
    if match:
        author = match.group(1) or match.group(3)
        year = match.group(2) or match.group(4)
        author = author.strip()
        return author, year
    return None, None

def parse_url(url):
    if not url:
        return None, None
    try:
        match = re.search(r'#:~:text=([^&]+)', url)
        fragment_text = urllib.parse.unquote(match.group(1)) if match else None
        base_url = url.split('#')[0]
        return base_url, fragment_text
    except Exception:
        return None, None

def parse_hash_text(hash_text):
    match = re.match(r'.*?\(\d{4}\)\.\s*([0-9a-f]{64})', hash_text.strip())
    if match:
        return match.group(1)
    return None

def parse_metadata(fragment_text):
    if not fragment_text:
        return None, None, None, None
    try:
        metadata_parts = fragment_text.split('—')
        if len(metadata_parts) == 4:
            username, task_name, date, time = metadata_parts
            return username, task_name, date, time
        return None, None, None, None
    except Exception:
        return None, None, None, None

def get_table_download_link(df, filename="citation_data.csv"):
    csv = df.to_csv(index=False)
    b64 = base64.b64encode(csv.encode()).decode()
    href = f'<a href="data:text/csv;base64,{b64}" download="{filename}">Download Citation Data as CSV</a>'
    return href

def get_excel_download_link(df, filename="citation_data.xlsx"):
    output = io.BytesIO()
    wb = openpyxl.Workbook()
    ws = wb.active
    # Write headers
    headers = df.columns.tolist()
    ws.append(headers)

    # Write data rows
    for index, row in df.iterrows():
        row_data = []
        cell_positions = {}  # track cell positions for hyperlink assignment
        urls = {}            # store URLs per column

        for col_idx, col in enumerate(headers):
            value = row[col]
            if col in ["Citation (Start of Text)", "Citation (End of Text)", "SCC Index"]:
                # Extract URL and display text from HTML anchor tag
                match = re.search(r'<a href="([^"]+)"[^>]*>([^<]+)</a>', str(value))
                if match:
                    link_url, display_text = match.groups()
                    row_data.append(display_text)
                    # Position where this cell will be written (next row after append)
                    cell_positions[col] = (ws.max_row + 1, col_idx + 1)
                    urls[col] = link_url
                else:
                    row_data.append(value)
            else:
                row_data.append(value)

        ws.append(row_data)

        # Apply hyperlinks after appending row
        for col, (r, c) in cell_positions.items():
            cell = ws.cell(row=r, column=c)
            cell.hyperlink = urls[col]
            cell.hyperlink.tooltip = "Click to visit source"
            cell.style = "Hyperlink"

    wb.save(output)
    b64 = base64.b64encode(output.getvalue()).decode()
    href = f'<a href="data:application/vnd.openxmlformats-officedocument.spreadsheetml.sheet;base64,{b64}" download="{filename}">Download citation data as Excel</a>'
    return href

# --- Streamlit App ---
st.set_page_config(layout="wide", page_title="ISNAD")

# Load custom CSS
load_css()

# Main header
st.markdown("""
<div class="main-header">
    <h1>ISNAD</h1>
    <h3 style="font-style: italic; font-weight: normal; font-size: 1.2rem;">Integrated System for Networked Attribution & Documentation</h3>
    <h4>Piloting Stage 1: Smart Context Citation (SCC)</h4>
</div>
""", unsafe_allow_html=True)

# Expandable section for About ISNAD and Example Citation
with st.expander("About ISNAD and Example Citation"):
    st.markdown("""
    <div class="info-card">
        <p>The Integrated System for Networked Attribution & Documentation (ISNAD) is a five-layer framework designed to secure, contextualise, and verify knowledge in the era of Generative AI. It draws inspiration from the classical isnad (chains of transmission), modern referencing systems, and the Swiss Cheese Model of layered protection.</p>
        <p>The current app pilots Stage 1 of ISNAD, implemented as the Smart Context Citation (SCC) system. SCC is the first operational layer of ISNAD, focused on creating verifiable, transparent, and context-rich citations. It is intended to keep knowledge attribution trustworthy and resistant to AI tampering.</p>
        <h4>Key Features of SCC (Stage 1 of ISNAD):</h4>
        <ul>
            <li><strong>Inline Citations:</strong> Author-year citations hyperlinked to the exact text fragment in the source.</li>
            <li><strong>Authenticated Citation Identifier (ACI):</strong> A cryptographic hash that secures citation integrity.</li>
            <li><strong>SCC Index:</strong> A replacement for traditional reference lists, with verifiable links and metadata (user, task, date, time).</li>
        </ul>
        <h4>Future Layers of ISNAD (Under Development):</h4>
        <ul>
            <li>Automated citation authentication workflows.</li>
            <li>Integration with writing platforms (e.g., MS Office).</li>
            <li>Linkage with digital libraries and source databases.</li>
            <li>Printable document option with preferred traditional referencing style (e.g., APA, Harvard), allowing citations to be automatically converted for easier reading in printed documents.</li>
            <li>AI-resilient verification pipelines to protect academic outputs at scale.</li>
        </ul>
        <h4>Technical Legitimacy</h4>
        <p>The SCC style uses the W3C Text Fragments specification by <a href="https://wicg.github.io/scroll-to-text-fragment/#:~:text=Editors%3A" target="_blank">Burris and Bokan (2023)</a> to enable precise linking to specific sections of digital content. This ensures that citations are contextually accurate, verifiable, and aligned with modern digital standards.</p>
        <h4>Acknowledgement</h4>
        <p>Smart Context Citation (SCC) developed by Dr. Majed Abuseif, with contributions from ChatGPT, Manus-AI, and Grok, hosted on Hugging Face.</p>
        <h4>Example Citation</h4>
        <p><strong>Inputs:</strong></p>
        <ul>
            <li><strong>Username:</strong> Majed</li>
            <li><strong>Task Name:</strong> Design Strategies for Trees on Buildings</li>
            <li><strong>Author:</strong> Abuseif et al.</li>
            <li><strong>Year:</strong> 2023</li>
            <li><strong>URL:</strong> https://www.sciencedirect.com/science/article/pii/S2772411523000046</li>
            <li><strong>Annotated Text:</strong> Fig. 3. A proposed design framework for green roof settings in general and trees on buildings in particular. The framework consists of four categories that are required to be followed in order from outside to inside.</li>
        </ul>
        <p><strong>Outputs:</strong></p>
        <ul>
            <li><strong>Citation (Start of Text):</strong> <span style="font-size: 1.2rem;"><a href="https://www.sciencedirect.com/science/article/pii/S2772411523000046#:~:text=Fig.%203.%20A%20proposed%20design%20framework%20for%20green%20roof%20settings%20in%20general%20and%20trees%20on%20buildings%20in%20particular.%20The%20framework%20consists%20of%20four%20categories%20that%20are%20required%20to%20be%20followed%20in%20order%20from%20outside%20to%20inside.">Abuseif et al. (2023)</a></span></li>
            <li><strong>Citation (End of Text):</strong> <span style="font-size: 1.2rem;"><a href="https://www.sciencedirect.com/science/article/pii/S2772411523000046#:~:text=Fig.%203.%20A%20proposed%20design%20framework%20for%20green%20roof%20settings%20in%20general%20and%20trees%20on%20buildings%20in%20particular.%20The%20framework%20consists%20of%20four%20categories%20that%20are%20required%20to%20be%20followed%20in%20order%20from%20outside%20to%20inside.">(Abuseif et al., 2023)</a></span></li>
            <li><strong>SCC Index:</strong> <span style="font-size: 0.85rem;"><a href="https://www.sciencedirect.com/science/article/pii/S2772411523000046#:~:text=Majed%E2%80%94Design%20Strategies%20for%20Trees%20on%20Buildings%E2%80%942025-08-07%E2%80%9422%3A53%3A15">Abuseif et al. (2023). dd344cfa83eeaec090c1b957900af960a2ea9993eef6a3fc1aca2d9881a03ea4</a></span></li>
        </ul>
    </div>
    """, unsafe_allow_html=True)

# Expandable section for SCC Style Guidelines
with st.expander("SCC Style Guidelines"):
    st.markdown("""
    <div class="info-card">
        <p>The Smart Context Citation (SCC) style ensures accurate, transparent, and verifiable citations. Follow these steps to generate and verify citations using the SCC Tool.</p>
        <h4>Generating Citations</h4>
        <ol>
            <li><strong>Access the Tool:</strong> Open the &quot;Citation Generator&quot; tab.</li>
            <li><strong>Enter User Information:</strong>
                <ul>
                    <li><strong>Username:</strong> Your unique identifier (e.g., Majed).</li>
                    <li><strong>Task Name:</strong> The project or assignment name (e.g., Design Strategies for Trees on Buildings).</li>
                </ul>
            </li>
            <li><strong>Enter Citation Information:</strong>
                <ul>
                    <li><strong>Author(s) Name:</strong> The author(s) of the source (e.g., Abuseif et al.).</li>
                    <li><strong>Publication Year:</strong> The year of publication (e.g., 2023).</li>
                    <li><strong>Source URL:</strong> The full URL of the source, without text fragments (e.g., https://www.sciencedirect.com/science/article/pii/S2772411523000046).</li>
                    <li><strong>Annotated Text:</strong> The sentence or paragraph containing the information you are referencing from the source (e.g., A proposed design framework for green roof settings in general and trees on buildings in particular). Limited to 100 words to ensure reasonable processing.</li>
                </ul>
            </li>
            <li><strong>Generate Citation:</strong> Click the &quot;Generate Citation&quot; button.</li>
            <li><strong>Copy Outputs:</strong>
                <ul>
                    <li><strong>Citation (Start of Text):</strong> Use &quot;Author (Year)&quot; for the start of a sentence (e.g., Abuseif et al. (2023)).</li>
                    <li><strong>Citation (End of Text):</strong> Use &quot;(Author, Year)&quot; at the end of a sentence (e.g., (Abuseif et al., 2023)).</li>
                    <li><strong>SCC Index:</strong> Copy the index link (e.g., Abuseif et al. (2023). cda7ba19e51e430107e58696758fdf79b8f016d8f27e8f8691ad713e7c8bc668) for verification.</li>
                    <li>If you would like to test the reference before using it, click on it to check whether it is suitable and captures the information you need.</li>
                </ul>
            </li>
        </ol>
        <h4>Using References and the SCC Index in Your Document</h4>
        <ol>
            <li>Paste the reference directly in the appropriate place within your document.</li>
            <li>Create an SCC Index (instead of a traditional reference list), and paste the corresponding SCC Index entry for each reference you’ve used.</li>
            <li>You can download the citation table to facilitate reference tracking for your study and research. You may also submit it as an appendix with your work.</li>
        </ol>
        <h4>Verifying Citations (for Markers and Reviewers)</h4>
        <ol>
            <li><strong>Access the Tool:</strong> Open the &quot;Verify Citation&quot; tab, which provides two options for verification: Manual Verification or Excel Upload Verification (automated using the citation table).</li>
            <li><strong>Manual Verification:</strong>
                <ul>
                    <li><strong>Enter Citation Information:</strong>
                        <ul>
                            <li><strong>Citation Text:</strong> Paste the citation text (e.g., Abuseif et al. (2023) or (Abuseif et al., 2023)).</li>
                            <li><strong>Citation URL:</strong> Paste the hyperlink URL from the citation (right-click and select &quot;Copy Link Address&quot;).</li>
                        </ul>
                    </li>
                    <li><strong>Enter SCC Index Information:</strong>
                        <ul>
                            <li><strong>SCC Index Text:</strong> Paste the index text (e.g., Abuseif et al. (2023). cda7ba19e51e430107e58696758fdf79b8f016d8f27e8f8691ad713e7c8bc668).</li>
                            <li><strong>SCC Index URL:</strong> Paste the hyperlink URL from the index (right-click and select &quot;Copy Link Address&quot;).</li>
                        </ul>
                    </li>
                    <li><strong>Verify Citation:</strong> Click the &quot;Verify Citation&quot; button in the Manual Verification tab.</li>
                    <li><strong>Review Result:</strong>
                        <ul>
                            <li><strong>Authentic Citation:</strong> Displayed in green if the hash matches, confirming integrity.</li>
                            <li><strong>Unauthentic Citation:</strong> Displayed in red if the hash does not match, indicating potential tampering.</li>
                        </ul>
                    </li>
                </ul>
            </li>
            <li><strong>Automated Verification Using Citation Table:</strong>
                <ul>
                    <li><strong>Upload Excel File:</strong> In the Excel Upload Verification tab, upload the Excel file generated from the Citation Generator tab.</li>
                    <li><strong>Verify Citations:</strong> Click the &quot;Verify Citations from Excel&quot; button.</li>
                    <li><strong>Review Results:</strong> View the verification results in a table, with each citation marked as:
                        <ul>
                            <li><strong>Authenticated:</strong> If the hash matches, confirming integrity.</li>
                            <li><strong>Unauthenticated:</strong> If the hash does not match, indicating potential issues.</li>
                        </ul>
                    </li>
                </ul>
            </li>
        </ol>
    </div>
    """, unsafe_allow_html=True)

# Tabs for Citation Generator and Verify Citation
tabs = st.tabs(["Citation Generator", "Verify Citation"])

# --- Citation Generator Tab ---
with tabs[0]:
    st.markdown('<div class="tab-content">', unsafe_allow_html=True)
    
    # User Information
    st.subheader("User Information")
    col_user1, col_user2 = st.columns(2)
    with col_user1:
        username = st.text_input("Username", help="Enter your username", placeholder="e.g., Majed")
    with col_user2:
        task_name = st.text_input("Task Name", help="Enter the project or assignment name", placeholder="e.g., Design Strategies for Trees on Buildings")
    
    # Citation Information
    st.subheader("Citation Information")
    col_citation1, col_citation2 = st.columns(2)
    with col_citation1:
        author_name = st.text_input("Author(s) Name", help="Enter the author(s) name", placeholder="e.g., Abuseif et al.")
        publication_year = st.text_input("Publication Year", help="Enter the publication year", placeholder="e.g., 2023")
        source_url = st.text_input("Source URL", help="Enter the full URL of the source (without text fragments)", placeholder="e.g., https://www.sciencedirect.com/science/article/pii/S2772411523000046")
    with col_citation2:
        annotated_text = st.text_area("Annotated Text", help="Enter the sentence or paragraph containing the referenced information (maximum 100 words)", placeholder="e.g., A proposed design framework for green roof settings...", height=150)
    
    # Generate Citation Button
    generate_button = st.button("Generate Citation", type="primary", use_container_width=True)
    
    if generate_button:
        # Validate inputs
        if not all([username, task_name, author_name, publication_year, source_url, annotated_text]):
            st.error("Please fill in all fields before generating a citation.")
        elif not re.match(r'https?://[^\s]+', source_url):
            st.error("Please enter a valid URL starting with http:// or https://")
        elif not re.match(r'^\d{4}$', publication_year):
            st.error("Please enter a valid 4-digit publication year.")
        elif check_for_fragment(source_url):
            st.error("It appears you have accessed this link through an AI assistant, such as an AI overview or generative AI writing tool. To ensure academic integrity, please return to the original source and review it carefully before incorporating it into your work. Additionally, consider exploring other relevant sources to deepen your understanding of the topic.")
        else:
            # Check word count for Annotated Text
            word_count = len(annotated_text.split())
            if word_count > 100:
                st.error("Annotated Text exceeds the maximum limit of 100 words. Please reduce the text.")
            else:
                # Get current date and time in Melbourne timezone from a single
                # timestamp, so the date and time cannot straddle midnight
                melbourne_tz = pytz.timezone(MELBOURNE_TIMEZONE)
                now = datetime.now(melbourne_tz)
                current_time = now.strftime("%H:%M:%S")
                current_date = now.strftime("%Y-%m-%d")
                
                # Generate citation hash
                scc_hash = generate_citation_hash(
                    author_name, publication_year, source_url, annotated_text, annotated_text,
                    username, task_name, current_date, current_time
                )
                
                # Generate HTML citations
                citation_link_start = format_citation_html(source_url, annotated_text, author_name, publication_year, scc_hash)
                citation_link_end = format_citation_end_html(source_url, annotated_text, author_name, publication_year, scc_hash)
                metadata_link = format_metadata_html(source_url, author_name, publication_year, scc_hash, username, task_name, current_date, current_time)
                
                # --- Persistent Table with Clickable SCC Hash ---
                
                # First, ensure session state is initialized for the citation DataFrame
                if 'citation_df' not in st.session_state:
                    st.session_state.citation_df = pd.DataFrame(columns=[
                        "Username", "Task Name", "Time", "Date", "URL",
                        "Citation (Start of Text)", "Citation (End of Text)", "SCC Index", "Annotated Text"
                    ])
                
                # Create clickable HTML for SCC Index (full metadata link)
                clickable_index = metadata_link
                
                # Create new row data
                new_row = {
                    "Username": username,
                    "Task Name": task_name,
                    "Time": current_time,
                    "Date": current_date,
                    "URL": source_url,
                    "Citation (Start of Text)": citation_link_start,
                    "Citation (End of Text)": citation_link_end,
                    "SCC Index": clickable_index,
                    "Annotated Text": annotated_text
                }
                
                # Append the new row to the session state DataFrame
                new_df = pd.DataFrame([new_row])
                st.session_state.citation_df = pd.concat([st.session_state.citation_df, new_df], ignore_index=True)
                
                # Get the accumulated DataFrame for display and download
                df = st.session_state.citation_df
                
                col_html1, col_html2 = st.columns(2)
                
                # HTML Citation - Start of Text
                with col_html1:
                    st.markdown("### Citation (Start of Text)")
                    # Emit the wrapper and its content in one call: each st.markdown call renders
                    # as its own element, so a <div> opened in one call cannot wrap a later call.
                    st.markdown(f'<div class="rendered-citation">{citation_link_start}</div>', unsafe_allow_html=True)
                
                # HTML Citation - End of Text
                with col_html2:
                    st.markdown("### Citation (End of Text)")
                    # Emit the wrapper and its content in one call: each st.markdown call renders
                    # as its own element, so a <div> opened in one call cannot wrap a later call.
                    st.markdown(f'<div class="rendered-citation">{citation_link_end}</div>', unsafe_allow_html=True)
                
                # SCC Index
                st.markdown("### SCC Index")
                st.markdown(metadata_link, unsafe_allow_html=True)
                
                # Display table after SCC Index
                st.markdown("### Citation Table")
                st.markdown(
                    get_excel_download_link(df, "citation_data.xlsx"),
                    unsafe_allow_html=True,
                    help="Download this Excel file to track your citation data. When you open it in Excel, click 'Enable Editing' at the top of the file so that the hyperlinked citations can be clicked and copied correctly.",
                )
                # Display a subset of the columns; "URL" and "Citation (End of Text)" stay in the Excel download only
                display_df = df[["Username", "Task Name", "Time", "Date", "Citation (Start of Text)", "SCC Index", "Annotated Text"]]
                st.markdown(display_df.to_html(classes="citation-table", index=False, escape=False), unsafe_allow_html=True)
            
            st.markdown('</div>', unsafe_allow_html=True)
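
# Optional sketch (not part of the original app): the Excel verification tab
# below converts every cell with `str(value) if pd.notna(value) else ""`.
# A small helper like this could centralize that rule. The name
# `cell_to_text` is hypothetical, and this stdlib-only version treats only
# None and float NaN as missing; pandas' own `pd.notna` also covers NaT.
import math


def cell_to_text(value):
    """Return "" for a missing cell (None or NaN), otherwise the cell as a string."""
    if value is None:
        return ""
    if isinstance(value, float) and math.isnan(value):
        return ""
    return str(value)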

# --- Verify Citation Tab ---
with tabs[1]:
    st.markdown('<div class="tab-content">', unsafe_allow_html=True)
    verify_tabs = st.tabs(["Manual Verification", "Excel Upload Verification"])
    
    with verify_tabs[0]:
        st.subheader("Citation Information")
        citation_text = st.text_input("Citation Text", help="Paste the citation text, e.g., 'Abuseif et al. (2023)' or '(Abuseif et al., 2023)'", placeholder="e.g., Abuseif et al. (2023)", key="manual_citation_text")
        citation_url = st.text_input("Citation URL", help="Paste the hyperlink URL from the citation, e.g., 'https://www.sciencedirect.com/science/article/pii/S2772411523000046#:~:text=fragment'", placeholder="e.g., https://www.sciencedirect.com/science/article/pii/S2772411523000046#:~:text=fragment", key="manual_citation_url")
        
        st.subheader("SCC Index")
        hash_text = st.text_input("SCC Index Text", help="Paste the index text, e.g., 'Abuseif et al. (2023). <hash>'", placeholder="e.g., Abuseif et al. (2023). <hash>", key="manual_hash_text")
        hash_url = st.text_input("SCC Index URL", help="Paste the hyperlink URL from the index, e.g., 'https://www.sciencedirect.com/science/article/pii/S2772411523000046#:~:text=metadata'", placeholder="e.g., https://www.sciencedirect.com/science/article/pii/S2772411523000046#:~:text=metadata", key="manual_hash_url")
        
        verify_button = st.button("Verify Citation", type="primary", use_container_width=True, key="manual_verify_button")
        
        if verify_button:
            if not all([citation_text, citation_url, hash_text, hash_url]):
                st.error("Please provide all fields (citation text, citation URL, SCC index text, SCC index URL) before verifying.")
            else:
                # Parse citation text
                author, year = parse_citation_text(citation_text)
                # Parse citation URL
                citation_base_url, citation_fragment = parse_url(citation_url)
                # Parse hash text
                scc_hash = parse_hash_text(hash_text)
                # Parse hash URL
                hash_base_url, hash_fragment = parse_url(hash_url)
                # Parse metadata from hash URL fragment
                username, task_name, date, time = parse_metadata(hash_fragment)
                
                if not all([author, year, citation_base_url, citation_fragment, scc_hash, hash_base_url, username, task_name, date, time]):
                    st.error("Invalid input format. Ensure the citation text, URLs, and SCC index text are correctly pasted from the generated output.")
                elif citation_base_url != hash_base_url:
                    st.error("The citation URL and SCC index URL must point to the same base URL.")
                else:
                    # Normalize inputs by stripping whitespace
                    citation_fragment = citation_fragment.strip()
                    task_name = task_name.strip()
                    # Check for potential truncation
                    if len(citation_fragment) < 20:
                        st.markdown("""
                        <div class="warning-box">
                            <strong>Warning:</strong> The citation text fragment may be truncated, which could cause verification to fail.
                        </div>
                        """, unsafe_allow_html=True)
                    selected_citation_fragment = select_longest_segment(citation_fragment)
                    # Recompute hash
                    recomputed_hash = generate_citation_hash(
                        author, year, citation_base_url, selected_citation_fragment, selected_citation_fragment,
                        username, task_name, date, time
                    )
                    
                    if recomputed_hash == scc_hash:
                        st.markdown("""
                        <div class="success-box">
                            <strong>Authentic citation!</strong>
                        </div>
                        """, unsafe_allow_html=True)
                            
                        # Create DataFrame for citation details
                        citation_data = {
                            "Username": [username],
                            "Task Name": [task_name],
                            "Time": [time],
                            "Date": [date],
                            "URL": [citation_base_url],
                            "Author(s) Name": [author],
                            "Year": [year],
                            "Annotated Text": [citation_fragment]
                        }
                        df = pd.DataFrame(citation_data)
                            
                        # Display table
                        st.markdown("### Citation Details")
                        st.markdown(df.to_html(classes="citations-table", index=False), unsafe_allow_html=True)
                            
                        # Provide download link
                        st.markdown(get_table_download_link(df), unsafe_allow_html=True)
                    else:
                        st.markdown("""
                        <div class="warning-box">
                            <strong>Unauthentic citation</strong>
                        </div>
                        """, unsafe_allow_html=True)
    
    with verify_tabs[1]:
        st.subheader("Upload Excel File")
        uploaded_file = st.file_uploader("Choose an Excel file", type=["xlsx"], help="Upload the Excel file containing citation data (generated from the Citation Generator tab).")
        
        verify_excel_button = st.button("Verify Citations from Excel", type="primary", use_container_width=True, key="excel_verify_button")
        
        if verify_excel_button:
            if not uploaded_file:
                st.error("Please upload an Excel file before verifying.")
            else:
                try:
                    # Read Excel file with pandas
                    df = pd.read_excel(uploaded_file)
                    expected_columns = ["Username", "Task Name", "Time", "Date", "URL", "Citation (Start of Text)", "Citation (End of Text)", "SCC Index", "Annotated Text"]
                    if not all(col in df.columns for col in expected_columns):
                        st.error("The uploaded Excel file does not contain the required columns: " + ", ".join(expected_columns))
                    else:
                        results = []
                        # Iterate over rows with data
                        for row_idx in range(len(df)):
                            row = df.iloc[row_idx]
                            # Extract text data directly
                            username = str(row["Username"]) if pd.notna(row["Username"]) else ""
                            task_name = str(row["Task Name"]) if pd.notna(row["Task Name"]) else ""
                            time = str(row["Time"]) if pd.notna(row["Time"]) else ""
                            date = str(row["Date"]) if pd.notna(row["Date"]) else ""
                            base_url = str(row["URL"]) if pd.notna(row["URL"]) else ""
                            annotated_text = str(row["Annotated Text"]) if pd.notna(row["Annotated Text"]) else ""
                            citation_start_text = str(row["Citation (Start of Text)"]) if pd.notna(row["Citation (Start of Text)"]) else ""
                            citation_end_text = str(row["Citation (End of Text)"]) if pd.notna(row["Citation (End of Text)"]) else ""
                            hash_text = str(row["SCC Index"]) if pd.notna(row["SCC Index"]) else ""
                            
                            # Initialize variables for verification
                            status = "Unauthenticated"
                            author = year = scc_hash = None
                            
                            # Perform verification using either Citation (Start of Text) or Citation (End of Text)
                            citation_text = citation_start_text or citation_end_text
                            
                            if all([citation_text, hash_text, base_url, annotated_text, username, task_name, date, time]):
                                # Parse citation text for author and year
                                author, year = parse_citation_text(citation_text)
                                # Parse hash from SCC Index text
                                scc_hash = parse_hash_text(hash_text)
                                
                                if all([author, year, scc_hash]):
                                    # Use Annotated Text as citation fragment
                                    citation_fragment = annotated_text.strip()
                                    # Check for potential truncation; row_idx + 2 converts the 0-based
                                    # index to the Excel row number (1-based, plus the header row)
                                    if len(citation_fragment) < 20:
                                        st.markdown("""
                                        <div class="warning-box">
                                            <strong>Warning:</strong> The citation text fragment in row {} may be truncated, which could cause verification to fail.
                                        </div>
                                        """.format(row_idx + 2), unsafe_allow_html=True)
                                    selected_citation_fragment = select_longest_segment(citation_fragment)
                                    # Recompute hash using text data
                                    recomputed_hash = generate_citation_hash(
                                        author, year, base_url, selected_citation_fragment, selected_citation_fragment,
                                        username, task_name, date, time
                                    )
                                    if recomputed_hash == scc_hash:
                                        status = "Authenticated"
                            
                            # Store result for this row
                            results.append({
                                "Username": username,
                                "Task Name": task_name,
                                "Time": time,
                                "Date": date,
                                "URL": base_url if base_url else "N/A",
                                "Author(s) Name": author if author else "N/A",
                                "Year": year if year else "N/A",
                                "Annotated Text": annotated_text,
                                "Status": status
                            })
                        
                        if results:
                            # Create results DataFrame
                            results_df = pd.DataFrame(results)

                            # Display results
                            st.markdown("### Citation Verification Results")
                            st.markdown(results_df.to_html(classes="citations-table", index=False), unsafe_allow_html=True)
                            st.markdown(get_table_download_link(results_df, "verified_citation_data.csv"), unsafe_allow_html=True)

                            # Display success message
                            st.markdown("""
                            <div class="success-box">
                                <strong>Citations processed successfully!</strong>
                            </div>
                            """, unsafe_allow_html=True)
                        else:
                            # Skip the empty table and download link when the file has no rows
                            st.error("No valid data found in the Excel file.")
                
                except Exception as e:
                    st.error(f"Error processing Excel file: {e}")
    
    st.markdown('</div>', unsafe_allow_html=True)
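
# Optional sketch (not part of the original app): the Excel tab reports
# "Citations processed successfully!" regardless of how many rows passed.
# A tally like this could make the outcome explicit. `summarize_statuses` is
# a hypothetical name; it assumes each result dict carries the "Status" key
# with the values "Authenticated" or "Unauthenticated" used in the loop above.
from collections import Counter


def summarize_statuses(results):
    """Count authenticated and unauthenticated rows in a list of result dicts."""
    # Rows without a "Status" key are counted as unauthenticated (an assumption).
    counts = Counter(row.get("Status", "Unauthenticated") for row in results)
    return {
        "Authenticated": counts["Authenticated"],
        "Unauthenticated": counts["Unauthenticated"],
    }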

# Footer
st.markdown("""
<div class="footer">
    Developed by: Dr Majed Abuseif<br>
    School of Architecture and Built Environment<br>
    Deakin University<br>
    © 2025
</div>
""", unsafe_allow_html=True)