Cybersecurity-Panel / multi_llm_chatbot_backend /app /utils /README.md

Sohan Kshirsagar

Backend Documentation Addition

9fabeb7 11 months ago

`app/utils` – Utility Modules for Summarization, Export, and Embeddings

This directory contains reusable utilities that support the backend application with:

- Chat summarization for display and export
- Document text extraction and cleanup
- File export to TXT, DOCX, and PDF formats
- File upload validation
- Persona-specific vector storage with ChromaDB

These modules are loosely coupled and used across core routes, RAG logic, and export endpoints.

`chat_summary.py` – Conversation Summarization

This module summarizes past conversations using the LLM client.

Key Functions

- `generate_summary_from_messages(messages, llm, max_tokens)` – Generates a formatted, bullet-style summary
- `format_summary_for_text_export(summary_text)` – Cleans the summary for export to PDF/DOCX/TXT
- `parse_summary_to_blocks(summary_text)` – Converts the summary to structured blocks (headings, lists, paragraphs)

Format Guidelines

Summaries follow a markdown-style format with:

- `**Section Name:**` for headings
- `* Bullet points` for insights and recommendations
- Auto-trimming and line breaks for export formatting
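
To illustrate how the format above could map to structured blocks, here is a minimal sketch of a parser in the spirit of `parse_summary_to_blocks`. The block dictionary shape (`type`, `text`, `items`) and the regular expressions are assumptions made for this example, not the module's actual implementation.

```python
import re
from typing import Dict, List

# Assumed patterns: "**Section Name:**" on its own line is a heading,
# and a line starting with "* " or "- " is a bullet.
HEADING_RE = re.compile(r"^\*\*(.+?):?\*\*:?\s*$")
BULLET_RE = re.compile(r"^[*-]\s+(.*)$")


def parse_summary_to_blocks(summary_text: str) -> List[Dict]:
    """Split a markdown-style summary into heading, list, and paragraph blocks."""
    blocks: List[Dict] = []
    for raw_line in summary_text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # blank lines only separate blocks
        heading = HEADING_RE.match(line)
        bullet = BULLET_RE.match(line)
        if heading:
            blocks.append({"type": "heading", "text": heading.group(1).strip()})
        elif bullet:
            item = bullet.group(1).strip()
            # Consecutive bullets are merged into a single list block.
            if blocks and blocks[-1]["type"] == "list":
                blocks[-1]["items"].append(item)
            else:
                blocks.append({"type": "list", "items": [item]})
        else:
            blocks.append({"type": "paragraph", "text": line})
    return blocks
```

A structure like this lets an exporter render headings, lists, and paragraphs with different styles without re-parsing the markdown.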

`chroma_client.py` – Persona-Specific Knowledge Store

A minimal ChromaDB wrapper used to store and query persona-specific documents or embeddings.

Functions

- `add_persona_doc(text, persona, doc_id)` – Adds a new chunk/document for a persona
- `query_persona_knowledge(query, persona)` – Queries ChromaDB for a persona-specific response

Notes

- Uses `./chroma_storage` as the default persistent path
- Uses the local embedding model via `get_embedding()` from `embedding_client.py`
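
The idea of persona-scoped retrieval can be sketched without ChromaDB at all. The example below is a pure-Python stand-in, not the wrapper itself: an in-memory dictionary replaces the ChromaDB collection, a bag-of-words `toy_embedding()` replaces `get_embedding()`, and the `n_results` parameter is an assumption added for illustration.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

# In-memory stand-in for the persistent collection: doc_id -> (persona, text, vector)
_STORE: Dict[str, Tuple[str, str, Counter]] = {}


def toy_embedding(text: str) -> Counter:
    """Hypothetical replacement for get_embedding(): word counts as a sparse vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def add_persona_doc(text: str, persona: str, doc_id: str) -> None:
    """Store a chunk for one persona, keyed by its document id."""
    _STORE[doc_id] = (persona, text, toy_embedding(text))


def query_persona_knowledge(query: str, persona: str, n_results: int = 3) -> List[str]:
    """Return the most similar chunks that belong to the given persona only."""
    query_vec = toy_embedding(query)
    scored = [
        (cosine(query_vec, vec), text)
        for doc_persona, text, vec in _STORE.values()
        if doc_persona == persona  # other personas' documents are never returned
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:n_results]]
```

In the real module, storage and similarity search are delegated to ChromaDB; the sketch only shows why each document carries a persona tag and how a query is restricted to it.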

`document_extractor.py` – File Text Extraction

Extracts raw text from uploaded documents.

Supported Formats

| Format | Content Type |
| --- | --- |
| PDF | `application/pdf` |
| DOCX | `application/vnd.openxmlformats-officedocument.wordprocessingml.document` |
| TXT | `text/plain` |

Key Function

`extract_text_from_file(file_bytes: bytes, content_type: str) -> str`

It uses:

- PyPDF2 for PDFs
- docx2txt for Word documents (via a temp file)
- UTF-8 decoding for plain text
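
A minimal sketch of the content-type dispatch described above might look like the following. The error handling (`errors="replace"`, raising `ValueError` for unknown types) is an assumption, and the third-party libraries are imported lazily so that the plain-text path works without them.

```python
import io
import os
import tempfile

PDF = "application/pdf"
DOCX = "application/vnd.openxmlformats-officedocument.wordprocessingml.document"
TXT = "text/plain"


def extract_text_from_file(file_bytes: bytes, content_type: str) -> str:
    """Return the raw text of an uploaded file, dispatching on its content type."""
    if content_type == TXT:
        # errors="replace" is an assumption; it avoids failing on stray bytes.
        return file_bytes.decode("utf-8", errors="replace")

    if content_type == PDF:
        from PyPDF2 import PdfReader  # PdfReader is the PyPDF2 2.x+ class name

        reader = PdfReader(io.BytesIO(file_bytes))
        # extract_text() may return nothing for image-only pages.
        return "\n".join(page.extract_text() or "" for page in reader.pages)

    if content_type == DOCX:
        import docx2txt

        # docx2txt is given a file path here, so the bytes go through a temp file.
        with tempfile.NamedTemporaryFile(suffix=".docx", delete=False) as tmp:
            tmp.write(file_bytes)
            tmp_path = tmp.name
        try:
            return docx2txt.process(tmp_path)
        finally:
            os.remove(tmp_path)  # always clean up the temp file

    raise ValueError(f"Unsupported content type: {content_type}")
```

Raising on unknown content types keeps unsupported uploads from silently producing empty text; the actual module may handle this case differently.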

`file_export.py` – Export Chat & Summaries

Exports content (chat logs or summaries) to the following formats:

- `.txt`
- `.docx` (Word)
- `.pdf` (ReportLab)

Key Functions

- `export_chat_as_file(content, format)` – Unified export method (calls the matching `generate_*` function)
- `prepare_export_response()` – Returns a `StreamingResponse` with the correct `Content-Disposition` header

Formatting Functions

- `generate_txt_file()` – Simple UTF-8 stream
- `generate_docx_file()` – Paragraph-based Word file using python-docx
- `generate_pdf_file()` – Uses ReportLab's Platypus for chat-style layout
- `generate_pdf_file_from_blocks()` – Used for structured summaries (headings, lists, etc.)

All formats apply automatic cleanup and styling via `_clean_text_for_pdf()` and `_render_rich_text()`.
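
The unified export step can be sketched as a lookup from format name to generator function. This is an illustration under assumptions, not the module's code: the generators take a single `content` string, `build_export_headers()` is a hypothetical helper standing in for the header logic of `prepare_export_response()`, and the cleanup helpers are omitted.

```python
import io
from xml.sax.saxutils import escape


def generate_txt_file(content: str) -> io.BytesIO:
    """Encode the content as a UTF-8 byte stream."""
    return io.BytesIO(content.encode("utf-8"))


def generate_docx_file(content: str) -> io.BytesIO:
    """Write one Word paragraph per line using python-docx."""
    from docx import Document  # third-party (python-docx), imported lazily

    document = Document()
    for line in content.splitlines():
        document.add_paragraph(line)
    buffer = io.BytesIO()
    document.save(buffer)
    buffer.seek(0)
    return buffer


def generate_pdf_file(content: str) -> io.BytesIO:
    """Lay out one Platypus paragraph per non-empty line using ReportLab."""
    from reportlab.lib.styles import getSampleStyleSheet
    from reportlab.platypus import Paragraph, SimpleDocTemplate

    styles = getSampleStyleSheet()
    # Paragraph parses inline markup, so raw text is escaped first.
    story = [
        Paragraph(escape(line), styles["Normal"])
        for line in content.splitlines()
        if line.strip()
    ]
    buffer = io.BytesIO()
    SimpleDocTemplate(buffer).build(story)
    buffer.seek(0)
    return buffer


_GENERATORS = {
    "txt": generate_txt_file,
    "docx": generate_docx_file,
    "pdf": generate_pdf_file,
}


def export_chat_as_file(content: str, format: str) -> io.BytesIO:
    """Dispatch to the generator for the requested format.

    The parameter name mirrors the README signature and shadows the builtin.
    """
    try:
        generator = _GENERATORS[format.lower()]
    except KeyError:
        raise ValueError(f"Unsupported export format: {format}") from None
    return generator(content)


def build_export_headers(filename: str, format: str) -> dict:
    """Hypothetical helper: headers that make a browser download the file."""
    return {"Content-Disposition": f'attachment; filename="{filename}.{format}"'}
```

In a FastAPI route, the returned buffer and headers would typically be passed to `StreamingResponse` together with a matching media type.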

`file_limits.py` – Upload Size Checks

Prevents users from uploading more than a fixed total amount of data in a single session.

Configurable Limit

`MAX_TOTAL_UPLOAD_MB = 10` (total megabytes per session)

Function

`is_within_upload_limit(session_id, new_file_bytes, session_context)` – Returns `True` if the upload is within the session cap

Used by routes handling document uploads.
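
As a rough sketch of the quota check, the function below compares the session's running total plus the new file against the cap. The shape of `session_context` is not documented here, so the `"uploaded_bytes"` key is hypothetical, as is the choice of 1 MB = 1024 × 1024 bytes.

```python
import logging

logger = logging.getLogger(__name__)

MAX_TOTAL_UPLOAD_MB = 10
_BYTES_PER_MB = 1024 * 1024  # assumption: binary megabytes


def is_within_upload_limit(
    session_id: str, new_file_bytes: bytes, session_context: dict
) -> bool:
    """Return True if adding the new file keeps the session at or under the cap."""
    # "uploaded_bytes" is a hypothetical key holding the session's running total.
    used = session_context.get("uploaded_bytes", 0)
    projected = used + len(new_file_bytes)
    allowed = projected <= MAX_TOTAL_UPLOAD_MB * _BYTES_PER_MB
    if not allowed:
        logger.warning(
            "Session %s would exceed the upload cap: %d bytes", session_id, projected
        )
    return allowed
```

A route would call this before extracting text and, on success, add `len(new_file_bytes)` to the session's running total.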

Dependencies

Dependencies between these utilities and the rest of the backend:

| Module | Depends On |
| --- | --- |
| `rag_manager.py` | `document_extractor`, `file_limits` |
| `chat_summary.py` | `llm_client` |
| `routes/documents.py` | `document_extractor`, `file_limits` |
| `routes/export.py` | `file_export`, `chat_summary` |

Example Workflow

Upload File → document_extractor.py → raw text
            ↓
      file_limits.py → check quota

Chat History → chat_summary.py → formatted summary
                          ↓
                  file_export.py → TXT, DOCX, PDF

Persona Notes → chroma_client.py → embedded in ChromaDB
