XQ/Dokumentassistent

XQ committed on Apr 7

Commit 5ab78ea (1 parent: 62a41bb)

Update descriptions

Files changed (4)

.github/README.md +1 -1
README.md +1 -1
scripts/ingest.py +1 -1
src/ui/app.py +4 -4

.github/README.md

@@ -159,5 +159,5 @@ scripts/
   ingest.py
   e2e_test.py
 tests/
-docs/                      # example PDFs (KU AI public documents)
+docs/                      # example PDFs/texts (KU AI public documents)
 ```

README.md

@@ -159,5 +159,5 @@ scripts/
   ingest.py
   e2e_test.py
 tests/
-docs/                      # example PDFs (KU AI public documents)
+docs/                      # example PDFs or texts (KU AI public documents)
 ```

scripts/ingest.py

@@ -64,7 +64,7 @@ def main() -> None:
     strategy_value = args.strategy or "recursive"
     strategy = ChunkStrategy(strategy_value)
-    logger.info("=== KU Doc Assistant — Ingestion ===")
+    logger.info("=== Doc Assistant — Ingestion ===")
     logger.info("Docs directory : %s", docs_dir)
     logger.info("Chunk strategy : %s", strategy.value)
     logger.info("Chunk size     : %d", settings.chunk_size)
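Note on the context lines in this hunk: `args.strategy or "recursive"` falls back to the recursive strategy whenever no strategy is passed on the command line. The following is a minimal sketch of that pattern, not the repository's actual code. Only the `"recursive"` value appears in the diff; the other two enum members and the `resolve_strategy` helper are hypothetical, chosen to match the "three chunking strategies" mentioned in the UI text.

```python
from enum import Enum
from typing import Optional


class ChunkStrategy(Enum):
    RECURSIVE = "recursive"  # value seen in the diff
    FIXED = "fixed"          # hypothetical member, for illustration only
    SEMANTIC = "semantic"    # hypothetical member, for illustration only


def resolve_strategy(cli_value: Optional[str]) -> ChunkStrategy:
    """Map an optional CLI value to a ChunkStrategy (hypothetical helper).

    None and the empty string are both falsy, so either one falls back
    to "recursive". Any other unknown value raises ValueError from the
    Enum constructor.
    """
    return ChunkStrategy(cli_value or "recursive")
```

Because the lookup goes through the `Enum` constructor, a misspelled strategy fails immediately with a `ValueError` instead of silently selecting a default.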

src/ui/app.py

@@ -41,7 +41,7 @@ TEXTS: dict[str, dict[str, str]] = {
         "sidebar_heading": "Om systemet",
         "sidebar_body": (
             "- **Python + FastAPI** REST-backend\n"
-            "- **Ustruktureret data** — PDF-parsing, preprocessing, "
+            "- **Ustruktureret data** — File-parsing, preprocessing, "
             "tre chunking-strategier\n"
             "- **Embedding-modeller** — flersproget semantisk "
             "vektorrepræsentation\n"
@@ -63,7 +63,7 @@ TEXTS: dict[str, dict[str, str]] = {
         "title": "Dokumentassistent",
         "title_badge": "Demo",
         "subtitle": (
-            "Et dokumentintelligens-system bygget på en RAG-arkitektur, dækkende PDF-indlæsning, semantisk chunking, "
+            "Et dokumentintelligens-system bygget på en RAG-arkitektur, dækkende file-indlæsning, semantisk chunking, "
             "hybrid søgning med reranking "
             "og LLM-genererede svar med kildehenvisninger. LLM-laget er provider-agnostisk. "
             "To tilstande: en LangGraph ReAct-agent (standard) til forespørgsler der kræver flere søgetrin, "
@@ -119,7 +119,7 @@ TEXTS: dict[str, dict[str, str]] = {
         "sidebar_heading": "About the system",
         "sidebar_body": (
             "- **Python + FastAPI** REST backend\n"
-            "- **Unstructured data** — PDF parsing, preprocessing, "
+            "- **Unstructured data** — File parsing, preprocessing, "
             "three chunking strategies\n"
             "- **Embedding models** — multilingual semantic vector "
             "representations\n"
@@ -141,7 +141,7 @@ TEXTS: dict[str, dict[str, str]] = {
         "title": "Document Assistant",
         "title_badge": "Demo",
         "subtitle": (
-            "A document intelligence system built on a RAG architecture, covering PDF ingestion, semantic chunking, "
+            "A document intelligence system built on a RAG architecture, covering file ingestion, semantic chunking, "
             "hybrid retrieval with reranking, "
             "and LLM-generated answers with source citations. The LLM layer is provider-agnostic. "
             "Two modes: a LangGraph ReAct agent (default) for queries that need multiple retrieval steps, "
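
Note on the `TEXTS` table: the commit edits the same keys in two language blocks, which is why every wording change appears twice. A sketch of how such a `dict[str, dict[str, str]]` table might be consumed is shown here. The language codes `"da"` and `"en"`, the `get_text` helper, and the fallback behaviour are all assumptions for illustration; the diff shows only the table's type and a few of its keys.

```python
# Trimmed stand-in for the table in src/ui/app.py. The keys and strings are
# taken from the diff; the "da"/"en" language codes are assumed.
TEXTS: dict[str, dict[str, str]] = {
    "da": {"title": "Dokumentassistent", "sidebar_heading": "Om systemet"},
    "en": {"title": "Document Assistant", "sidebar_heading": "About the system"},
}


def get_text(lang: str, key: str, default_lang: str = "en") -> str:
    """Look up a UI string (hypothetical helper).

    Falls back to the default language when the language is unknown or
    the key is missing from that language's block.
    """
    table = TEXTS.get(lang, TEXTS[default_lang])
    return table.get(key, TEXTS[default_lang][key])
```

With a lookup of this shape, a key that is updated in only one language block still renders, but in the fallback language, so parallel edits like the ones in this commit keep both UIs consistent.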