import logging
import shutil
import tempfile
from pathlib import Path
from typing import Any, Dict, List

from .file_detector import detect_file_type
from .parser_fit import parse_fit_file
from .parser_gpx import parse_gpx_file
from .parser_tcx import parse_tcx_file
from .validation import (
    ValidationError,
    is_xml_heuristic,
    safe_gzip_decompress,
    validate_uploads,
)

logger = logging.getLogger(__name__)


def load_runs_from_folder(folder: str) -> List[Dict[str, Any]]:
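    """
    Parse every supported file (TCX, GPX, FIT) in a local folder.

    Returns a list of run dicts sorted by start time. Returns an empty list
    if the folder does not exist or is not a directory.
    """
    runs = []
    p = Path(folder)
    if not p.is_dir():
        return runs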

    for f in sorted(p.iterdir()):
        file_type = detect_file_type(str(f))
        if file_type == "tcx":
            parsed = parse_tcx_file(str(f))
        elif file_type == "gpx":
            parsed = parse_gpx_file(str(f))
        elif file_type == "fit":
            parsed = parse_fit_file(str(f))
        else:
            parsed = None

        if parsed:
            runs.append(parsed)

    # Sort by start_time; runs without a start_time fall back to their id.
    runs.sort(key=lambda r: r.get("start_time") or r.get("id"))
    return runs


def load_runs_from_uploaded_files(uploaded_files) -> List[Dict[str, Any]]:
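    """
    Accept Gradio-style uploaded files (a list), copy them to a temporary
    folder, and parse them. Returns a list of run dicts sorted by start time.
    """
    # Validate before creating the temp folder so that a validation failure
    # does not leave an orphaned directory behind.
    validate_uploads(uploaded_files)
    tmpdir = Path(tempfile.mkdtemp(prefix="runner_ingest_"))
    runs = []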

    try:
        saved = []
        for f in uploaded_files or []:
            src_path = getattr(f, "name", None) or getattr(f, "filename", None)
            if not src_path:
                continue
            src_path = str(src_path)
            filename = Path(src_path).name
            dest = tmpdir / filename

            # Special handling for .tcx.gz and .fit.gz: decompress safely
            if filename.lower().endswith((".tcx.gz", ".fit.gz")):
                # Decompress to a .tcx or .fit file in the temp dir
                decompressed_dest = tmpdir / filename[: -len(".gz")]
                safe_gzip_decompress(src_path, str(decompressed_dest))
                saved.append(decompressed_dest)
            else:
                try:
                    shutil.copyfile(src_path, dest)
                except Exception:
                    # fallback: manual binary read/write
                    with open(src_path, "rb") as sf, open(dest, "wb") as df:
                        df.write(sf.read())
                saved.append(dest)

        for dest in saved:
            file_type = detect_file_type(str(dest))

            # XML heuristic check (only for TCX/GPX)
            if file_type in ("tcx", "gpx") and not is_xml_heuristic(str(dest)):
                logger.warning("File %s failed XML heuristic check, skipping.", dest.name)
                continue
            if file_type == "tcx":
                parsed = parse_tcx_file(str(dest))
            elif file_type == "gpx":
                parsed = parse_gpx_file(str(dest))
            elif file_type == "fit":
                parsed = parse_fit_file(str(dest))
            else:
                parsed = None
            if parsed:
                runs.append(parsed)
    finally:
        # Best-effort cleanup; errors while removing the temp folder are ignored.
        shutil.rmtree(tmpdir, ignore_errors=True)

    runs.sort(key=lambda r: r.get("start_time") or r.get("id"))
    return runs