import logging
import os
import subprocess
import sys
from pathlib import Path

import torch

from scripts.quiet import install_quiet

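def setup_logging(
    level=logging.INFO,
    fmt="%(message)s",
    scope="legalrag",
):
    """
    Configure logging for the LegalRAG repository.

    Removes every handler attached to the `scope` logger and its
    descendants, then installs a single StreamHandler on the `scope`
    logger. Descendants propagate to that handler; the `scope` logger
    itself does not propagate to the root logger.
    """
    base = logging.getLogger(scope)
    base.setLevel(level)

    # Strip existing handlers from the scope logger and all of its children.
    # Iterate over a snapshot of the names, since getLogger() may touch loggerDict.
    for name in list(logging.root.manager.loggerDict):
        if name == scope or name.startswith(scope + "."):
            lg = logging.getLogger(name)
            for h in lg.handlers[:]:
                lg.removeHandler(h)
            # Children must propagate so their records reach the base handler;
            # only the base logger is cut off from the root logger.
            lg.propagate = name != scope

    # Attach a single handler to the base logger.
    handler = logging.StreamHandler()
    handler.setFormatter(logging.Formatter(fmt))
    base.addHandler(handler)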

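def run(cmd, cwd=None):
    """Run `cmd` and return its combined stdout/stderr; raise RuntimeError on a non-zero exit."""
    res = subprocess.run(cmd, cwd=cwd, text=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    if res.returncode != 0:
        # Keep only the last 40 lines of output so the error stays readable.
        tail = "\n".join(res.stdout.splitlines()[-40:])
        raise RuntimeError(f"Command failed: {cmd}\n--- output tail ---\n{tail}")
    return res.stdout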

install_quiet()
setup_logging()
run([sys.executable, "-m", "pip", "install", "--no-cache-dir", "-r", "requirements.txt"])

# Option 1: load preprocessed law data

run([sys.executable, "-m", "data.download_data"])

torch.cuda.empty_cache()

# Option 2: run the offline preprocessing, which converts the raw legal texts in
# data/raw/ into normalized data artifacts, retrieval indices, and a legal
# knowledge graph, through the following steps:
# run([sys.executable, "-m", "scripts.preprocess_law"])
# print("Preprocessing completed.")

# run([sys.executable, "-m", "scripts.build_index"])
# print("Index building completed.")

# run([sys.executable, "-m", "scripts.build_graph"])
# print("Law graph building completed.")