Spaces: Agents-MCP-Hackathon / talosav

gxenos committed on Jun 9, 2025

Commit

a7c80d4

1 Parent(s): 650f87b

first commit

Files changed (4)

README.md +114 -3
app.py +12 -0
interface.py +157 -0
mcp_server.py +353 -0

README.md CHANGED

@@ -1,14 +1,125 @@
 ---
 title: Talosav
-emoji: 🌍
 colorFrom: blue
 colorTo: yellow
 sdk: gradio
 sdk_version: 5.33.0
 app_file: app.py
-pinned: false
 license: apache-2.0
 short_description: Turn any LLM into a fully fledged Antivirus!
 ---
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

 ---
 title: Talosav
+emoji: 🛡️
 colorFrom: blue
 colorTo: yellow
 sdk: gradio
 sdk_version: 5.33.0
 app_file: app.py
+pinned: true
 license: apache-2.0
 short_description: Turn any LLM into a fully fledged Antivirus!
+tags:
+  - mcp-server-track
+  - agent-demo-track
 ---
+# Talos.AV: Antivirus Agent
+![Header Image](images/header.png)
+### This is an MCP server designed to turn any LLM into a fully fledged antivirus. By providing the tools necessary to analyze potentially malicious files, it can transform your chatbot into a master malware analyst!
+> 🔧 Track 1: MCP Server and 🤖 Track 3: Agentic Demo Showcase
+**The demo can be viewed on [YouTube](https://youtu.be/tYrIzMEOW2Y).**
+## Key Antivirus Features
+- Malware detection on your local files
+- Cyber Threat Intelligence retrieval
+- Static analysis performed on-device
+- Calculating malware capabilities
+- Detailed reporting
+- Email notifications
+## Screenshots
+**Analyze local files and detect malware through your LLM!**
+<p float="center">
+  <img src="images/analysis.png" width="45%" style="padding:5px" />
+  <img src="images/detection.png" width="45%" style="padding:5px" />
+</p>
+**Deep inspect files and send reports to your email!**
+<p float="center">
+  <img src="images/capa.png" width="45%" style="padding:5px" />
+  <img src="images/email.png" width="45%" style="padding:5px" />
+</p>
+## Target Audience
+- 🏠 **everyday users** who want to be protected from malware
+- 🔬 **malware analysts** who want to analyse and understand a malware file
+- 🏢 **organisations** who wish to provide an additional security layer
+## Tools
+There are 10 tools in total:
+1. Getting the **file hash** for further analysis
+2. Getting the date of the **first submission** to **VirusTotal**
+3. Getting a **Capa report** that shows the executable's capabilities
+4. Extracting **strings**
+5. Calculating **entropy**
+6. Matching **YARA rules**
+7. Getting results from **third-party antivirus** engines
+8. Getting detailed **antivirus reports**
+9. Getting **sandbox** results
+10. **Sending** analysis results to a specified **email** address
+> Most state-of-the-art LLMs with reasoning skills can understand the results from each tool and interpret them accordingly, which is a substantial help when performing the analysis.
+## Installation
+This project is designed to be used as an MCP server, and it is advisable to treat it as such.
+1. Create a `.env` file containing the VirusTotal key and email credentials in the form:
+   ```
+   VT_KEY=TEST-KEY
+   MAIL_PASS=TEST-PASSWORD
+   MAIL_ADD=TEST-EMAIL-ADDRESS
+   ```
+2. Install the Capa binary from the [official repository](https://github.com/mandiant/capa) and update `CAPA_PATH` in `mcp_server.py` to point to it
+3. Install YARA (community rules are available from the [Yara-Rules repository](https://github.com/Yara-Rules/rules))
+   ```
+   brew install yara  # on macos
+   apt install yara   # on linux
+   ```
+   and download compiled YARA rules or compile them locally, then update `YARA_RULES_PATH` in `mcp_server.py`
+4. Install requirements:
+   ```
+   pip3 install -r requirements.txt
+   ```
+5. Run it:
+   ```
+   sudo python3 app.py
+   ```
+6. Connect it to your choice of chatbot as an MCP server. To make your experience better you can also connect a filesystem MCP server (Claude has its own pre-configured).
+## ⚠️ Security Considerations
+Because of potential safety concerns we have NOT uploaded malware samples in this space and the UI is not fully usable (can't upload files).
+## Notes
+> The server was designed as an on-device MCP solution, as a safeguard against uploading potentially malicious software. As a result, it also has the innate ability to access the filesystem.
+> Capa can be time consuming, so if the analysis is urgent its tool can be deactivated.
+> If a file is not on VirusTotal (most likely the case for benign files), there will be no CTI reporting, so the analysis will rely mainly on the local modules.
+## Credits
+George Xenos, Cybersecurity Researcher, CTI Patras Greece
+Emmanouil Tzagakis, Software Engineer, CTI Patras Greece
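
Review note on installation step 1: the README expects `VT_KEY`, `MAIL_PASS`, and `MAIL_ADD` in `.env`, but nothing in this commit checks that they are set before the tools run. A minimal sketch of a start-up check follows, assuming the values have already been loaded into the process environment (for example by the `load_dotenv()` call in `mcp_server.py`). `missing_settings` and `REQUIRED_SETTINGS` are hypothetical names, not part of the commit.

```python
import os
from typing import List, Mapping

# Setting names taken from the README's .env example.
REQUIRED_SETTINGS = ("VT_KEY", "MAIL_PASS", "MAIL_ADD")


def missing_settings(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required setting names that are absent or blank (hypothetical helper)."""
    return [name for name in REQUIRED_SETTINGS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_settings()
    if missing:
        print("Missing settings: " + ", ".join(missing))
```

Failing early with the names of the missing settings is usually easier to diagnose than a later authentication error from VirusTotal or the SMTP server.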

app.py ADDED

	@@ -0,0 +1,12 @@

+from interface import run_interface
+from mcp_server import run_server
+from multiprocessing import Process
+if __name__ == "__main__":
+    p1 = Process(target=run_interface)
+    p2 = Process(target=run_server)
+    p1.start()
+    p2.start()
+    p1.join()
+    p2.join()

interface.py ADDED

	@@ -0,0 +1,157 @@

+import gradio as gr
+from mcp_server import *
+FANCY_OUTPUTS = [
+    "🔐 Calculate Hash", "🧵 Extract Strings", "📊 Calculate Entropy",
+    "🧬 Match Yara Rules", "🔍 Run Capa Analysis",
+    "🛡️ Get Results from 3rd party antivirus", "🧪 Get Sandbox Results"
+]
+OUTPUTS = [
+    "Calculate Hash", "Extract Strings", "Calculate Entropy",
+    "Match Yara Rules", "Run Capa Analysis",
+    "Get Results from 3rd party antivirus", "Get Sandbox Results"
+]
+def handle_file_upload(file, checked_features, email_address=None):
+    if file is None:
+        return "No file uploaded."
+    res = {}
+    file_hash = get_file_hash(file.name)
+    for i in OUTPUTS:
+        if i in checked_features:
+            if i == "Calculate Hash":
+                res[i] = get_file_hash(file.name)
+            elif i == "Extract Strings":
+                res[i] = extract_strings(file.name)
+            elif i == "Calculate Entropy":
+                res[i] = file_entropy(file.name)
+            elif i == "Match Yara Rules":
+                res[i] = run_compiled_yara(file.name)
+            elif i == "Get Results from 3rd party antivirus":
+                res[i] = get_antivirus_detailed_reports(file_hash)
+            elif i == "Get Sandbox Results":
+                res[i] = get_sandbox_detailed_reports(file_hash)
+            elif i == "Run Capa Analysis":
+                res[i] = capa_malware_analysis(file.name)
+        else:
+            res[i] = f"{i} not selected."
+    if email_address:
+        try:
+            send_email(email_address, str(res), f"Malware Analysis Results for {file.name}")
+            email_status = "Email sent successfully."
+        except Exception as e:
+            print(f"Error sending email: {str(e)}")
+            email_status = f"Failed to send email: {str(e)}"
+    else:
+        email_status = "Email not requested."
+    return [res[x] for x in OUTPUTS] + [email_status]
+def create_interface():
+    with gr.Blocks() as demo:
+        gr.HTML("""
+        <style>
+        .selected input.svelte-1e02hys{
+                background-color:#0e203f!important;
+                color: white !important;
+                fill: white !important;
+                accent-color: white !important;
+                border-color: white !important;
+        }
+        #feature_checkbox_group {
+        padding: 10px;
+        border-radius: 10px;
+        background-color:#0e203f;
+    }
+        #static_analysis_accordion {
+            background-color: #0e203f !important;
+            color: black;
+            border-radius: 8px;
+            padding: 8px;
+        }
+        #capa_analysis_accordion {
+            background-color: #0e203f !important;
+            color: black;
+            border-radius: 8px;
+            padding: 8px;
+        }
+        #cyber_threat_intelligence_accordion {
+            background-color: #0e203f !important;
+            color: black;
+            border-radius: 8px;
+            padding: 8px;
+        }
+        #email_status_box {
+            background-color: #1b3d77 !important;
+            color: black;
+            border-radius: 8px;
+            padding: 8px;
+        }
+        #submit_button {
+            background-color: #0e203f !important;
+            color: white;
+            border-radius: 8px;
+            padding: 8px;
+        }
+        #submit_button:hover {
+            background-color: #1b3d77 !important;
+            color: white;
+        }
+        </style>
+        """)
+        gr.Markdown("# Malware Analysis Toolkit")
+        gr.Image("header.png", height=150, show_label=False, show_download_button=False, container=False, elem_id="logo")
+        gr.Markdown("Analyze files using CAPA, YARA, entropy, string extraction, and VirusTotal integrations.")
+        gr.Markdown("This is created in order to be used as an MCP server for malware analysis. \
+                    As a result this UI is not fully functional and is meant to be installed locally in addition to \
+                    an LLM that supports MCP connections. In order to run the MCP server you need to install and run \
+                    it locally, for instructions please refer to the README")
+        with gr.Row():
+            with gr.Column(elem_id="input_column"):
+                input_file = gr.File(label="File to be analysed")
+                feature_checklist = gr.CheckboxGroup(
+                choices=OUTPUTS,
+                label="Select Analysis Features",
+                elem_id="feature_checkbox_group"
+                )
+                send_email_checkbox = gr.Checkbox(label="Send results by email?")
+                email_input = gr.Textbox(label="Email Address", visible=False)
+                submit_button = gr.Button("Submit", elem_id="submit_button")
+            with gr.Column():
+                with gr.Accordion("Static Analysis", open=True, elem_id="static_analysis_accordion"):
+                    output_hash = gr.Textbox(label=FANCY_OUTPUTS[0], interactive=False)
+                    output_strings = gr.Textbox(label=FANCY_OUTPUTS[1], interactive=False)
+                    output_entropy = gr.Textbox(label=FANCY_OUTPUTS[2], interactive=False)
+                    output_yara = gr.Textbox(label=FANCY_OUTPUTS[3], interactive=False)
+                with gr.Accordion("Capa Analysis", open=False, elem_id="capa_analysis_accordion"):
+                    output_capa = gr.Textbox(label=FANCY_OUTPUTS[4], interactive=False)
+                with gr.Accordion("Cyber Threat Intelligence", open=False, elem_id="cyber_threat_intelligence_accordion"):
+                    output_antivirus = gr.Textbox(label=FANCY_OUTPUTS[5], interactive=False)
+                    output_sandbox = gr.Textbox(label=FANCY_OUTPUTS[6], interactive=False)
+                output_boxes = [output_hash, output_strings, output_entropy, output_yara, output_capa, output_antivirus, output_sandbox]
+                email_status = gr.Textbox(label="Email Status", interactive=False, elem_id="email_status_box")
+        send_email_checkbox.change(
+            lambda checked: gr.update(visible=checked, interactive=checked),
+            inputs=send_email_checkbox,
+            outputs=email_input
+        )
+        submit_button.click(handle_file_upload,
+                            inputs=[input_file, feature_checklist, email_input],
+                            outputs=output_boxes + [email_status])
+    return demo
+def run_interface():
+    interface = create_interface()
+    interface.launch(server_port=7861)
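
Review note on `handle_file_upload`: the `if`/`elif` chain maps each checkbox label to one analysis function, which could also be expressed as a lookup table. The sketch below shows that pattern in isolation. `run_selected` is a hypothetical helper, and the analyzers in the usage example are stand-ins rather than the real tools from `mcp_server.py`. It also assumes every analyzer takes a single argument, whereas the VirusTotal tools in the commit take the file hash instead of the path and would need a small wrapper.

```python
from typing import Callable, Dict, Iterable


def run_selected(
    path: str,
    selected: Iterable[str],
    analyzers: Dict[str, Callable[[str], object]],
) -> Dict[str, object]:
    """Run only the analyzers whose label was ticked (hypothetical helper)."""
    chosen = set(selected)
    results: Dict[str, object] = {}
    for label, analyze in analyzers.items():
        if label in chosen:
            results[label] = analyze(path)
        else:
            # Same placeholder text the commit uses for unticked features.
            results[label] = f"{label} not selected."
    return results


if __name__ == "__main__":
    # Stand-in analyzers for illustration only.
    demo_analyzers = {"Upper": str.upper, "Length": len}
    print(run_selected("sample.bin", ["Length"], demo_analyzers))
```

With a table like this, adding a tool means adding one dictionary entry, and the order of the dictionary fixes the order of the output boxes.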

mcp_server.py ADDED

	@@ -0,0 +1,353 @@

+import gradio as gr
+import subprocess
+import string
+import math
+from collections import Counter
+import hashlib
+from dotenv import load_dotenv
+import vt
+import os
+import smtplib
+from email.mime.text import MIMEText
+from email.mime.multipart import MIMEMultipart
+load_dotenv()
+VT_KEY = os.getenv('VT_KEY')
+file = None
+CAPA_PATH = "../Desktop/capa"
+YARA_RULES_PATH = "../Desktop/compiled_yara_rules"
+SENDER_EMAIL = os.getenv("MAIL_ADD")
+SENDER_PASSWORD = os.getenv("MAIL_PASS")
+def get_file_hash(filepath):
+    """
+    Get the SHA256 hash of a file.
+    Args:
+        filepath: The file path to analyze
+    Returns:
+        The SHA256 hash of the file as a string
+    """
+    sha256_hash = hashlib.sha256()
+    with open(filepath, 'rb') as f:
+        # Read in 4 KiB blocks so large files are never loaded into memory at once.
+        for byte_block in iter(lambda: f.read(4096), b""):
+            sha256_hash.update(byte_block)
+    return sha256_hash.hexdigest()
+def setup_threat_information_request(hash):
+    """
+    Never run!!! this function directly, it is used to set up the VirusTotal client
+    """
+    with vt.Client(VT_KEY) as client:
+        global file
+        file = client.get_object(f"/files/{hash}")
+    return file
+def get_first_submission_date(hash):
+    """
+    Get the first submission date of a file from VirusTotal.
+    Args:
+        hash: The SHA256 hash of the file to analyze
+    Returns:
+        The first submission date of the file as a datetime object.
+    """
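+    # Always look up the requested hash: the module-level cache did not record
+    # which hash it held, so a second file would get the first file's data.
+    return setup_threat_information_request(hash).first_submission_date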
+def capa_malware_analysis(filepath):
+    """
+    Analyze a file for malware using Capa.
+    This can be time consuming: an analysis may take several minutes. Before running,
+    ask the user if they want to continue.
+    Only run this function if the user has a .exe file to analyze and specifically
+    asks for Capa.
+    Args:
+        filepath: The file path to a .exe file to analyze
+    Returns:
+        A string with information about the file's malware characteristics
+    """
+    try:
+        alveos = subprocess.check_output(["sudo", CAPA_PATH, filepath])
+    except subprocess.CalledProcessError as e:
+        alveos = e.output
+    return str(alveos.decode("utf-8"))
+def extract_strings(filepath, min_length=4):
+    """
+    Extract printable strings from a file. As there can be many of them, the
+    agent can search for specific strings like IP addresses, URLs, paths, etc.
+    If the strings return gibberish you should check the file's entropy.
+    Args:
+        filepath: The filepath to analyze
+        min_length: Minimum number of characters for a run to count as a string
+    Returns:
+        A list of at most 100 strings extracted from the file
+    """
+    with open(filepath, 'rb') as f:
+        result = []
+        current = ''
+        for byte in f.read():
+            char = chr(byte)
+            if char in string.printable and char not in '\t\n\r\x0b\x0c':
+                current += char
+                continue
+            if len(current) >= min_length:
+                result.append(current)
+            current = ''
+        if len(current) >= min_length:
+            result.append(current)
+    # Cap the output at 100 strings to keep the tool response small.
+    return result[:100]
+def file_entropy(filepath):
+    """
+    Calculates the Shannon entropy of the entire file content.
+    Low entropy indicates that the file is likely to be uncompressed or contains repetitive data,
+    while high entropy suggests that the file is more random and potentially encrypted or packed.
+    Args:
+        filepath: The path to the file to analyze
+    Returns:
+        The Shannon entropy of the file content as a float.
+    """
+    with open(filepath, 'rb') as f:
+        data = f.read()
+        if not data:
+            return 0.0
+        length = len(data)
+        counter = Counter(data)
+        entropy = -sum((count / length) * math.log2(count / length) for count in counter.values())
+        return entropy
+def run_compiled_yara(filepath):
+    """
+    Run compiled YARA rules against a file to detect patterns or signatures.
+    Args:
+        filepath: The path to the file to analyze
+    Returns:
+        The output of the YARA rule matching as a string.
+        If no matches are found, it returns an empty string.
+        If YARA is not installed or the rules are not compiled, it raises an error.
+        Try to explain the rules that matched the file.
+    """
+    result = subprocess.run(
+        ['yara', '-C', YARA_RULES_PATH, filepath],
+        capture_output=True,
+        text=True
+    )
+    return result.stdout.strip()
+def get_antivirus_statistics(hash):
+    """
+    Get the antivirus statistics for a file from VirusTotal.
+    This is the first step for malware analysis and you should always run it first.
+    If the user asks follow up questions or the sample is not found on virus total
+    you can continue calling the other functions.
+    Turn the output to histogram to present it to the user.
+    Args:
+        hash: The SHA256 hash of the file to analyze
+    Returns:
+        A dictionary containing the antivirus statistics, including the number of engines that detected the file as malicious.
+        Turn this into a histogram to present it to the user.
+    """
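+    # Always look up the requested hash: the module-level cache did not record
+    # which hash it held, so a second file would get the first file's data.
+    return setup_threat_information_request(hash).last_analysis_stats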
+def get_antivirus_detailed_reports(hash):
+    """
+    Get detailed antivirus reports for a file from VirusTotal.
+    Args:
+        hash: The SHA256 hash of the file to analyze
+    Returns:
+        A dictionary containing the detailed antivirus reports, including the results of each antivirus engine that analyzed the file.
+    """
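+    # Always look up the requested hash: the module-level cache did not record
+    # which hash it held, so a second file would get the first file's data.
+    return setup_threat_information_request(hash).last_analysis_results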
+def get_sandbox_detailed_reports(hash):
+    """
+    Get detailed sandbox reports for a file from VirusTotal.
+    Args:
+        hash: The SHA256 hash of the file to analyze
+    Returns:
+        A dictionary containing the detailed sandbox reports, including the results of each sandbox that analyzed the file.
+    """
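+    # Always look up the requested hash: the module-level cache did not record
+    # which hash it held, so a second file would get the first file's data.
+    return setup_threat_information_request(hash).sandbox_verdicts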
+def send_email(to_email, text, subject="File Analysis Results"):
+    """
+    Send an email with the analysis results if the user requests it.
+    Args:
+        to_email: The recipient's email address
+        text: The content of the email
+        subject: The subject of the email (default is "File Analysis Results")
+    """
+    msg = MIMEText(text)
+    msg['Subject'] = subject
+    msg['From'] = SENDER_EMAIL
+    msg['To'] = to_email
+    # The context manager sends QUIT and closes the connection on exit,
+    # so no explicit server.quit() is needed.
+    with smtplib.SMTP_SSL('smtp.gmail.com', 465) as server:
+        server.login(SENDER_EMAIL, SENDER_PASSWORD)
+        server.sendmail(SENDER_EMAIL, to_email, msg.as_string())
+def after_analysis_interactive_artifact():
+    """
+    This function is called after the analysis is complete.
+    You should create an INTERACTIVE ARTIFACT that showcases the results of the analysis.
+    This can include visualizations, charts, or any other relevant information.
+    Args:
+        None
+    Returns:
+        A string indicating that the analysis is complete and asking if the user wants to see the results in an interactive artifact.
+    """
+    return "Analysis complete. Do you want to see the results in an interactive artifact?"
+with gr.Blocks(title="Malware Analysis Toolkit") as demo:
+    gr.Markdown("# Malware Analysis Toolkit")
+    # gr.Image("header.png", height=150, show_label=False, show_download_button=False, container=False, elem_id="logo")
+    gr.Markdown("Analyze files using CAPA, YARA, entropy, string extraction, and VirusTotal integrations.")
+    gr.Markdown("This is created in order to be used as an MCP server for malware analysis. \
+                As a result this UI is not fully functional and is meant to be installed locally in addition to \
+                an LLM that supports MCP connections.")
+    with gr.Tabs():
+        with gr.Tab("CAPA Analysis"):
+            gr.Markdown("### Run [CAPA](https://github.com/mandiant/capa) on a file to identify potential capabilities.")
+            with gr.Row():
+                file_input = gr.Textbox(label="File Path", placeholder="Enter the path to the file")
+                analysis_output = gr.Textbox(label="CAPA Results", placeholder="Results will appear here", lines=10)
+            file_input.change(capa_malware_analysis, inputs=file_input, outputs=analysis_output)
+        with gr.Tab("String Extraction"):
+            gr.Markdown("### Extract readable strings from a file.")
+            with gr.Row():
+                string_file_input = gr.Textbox(label="File Path", placeholder="Enter the path to the file")
+                string_output = gr.Textbox(label="Extracted Strings", placeholder="Strings will appear here", lines=10)
+            string_file_input.change(extract_strings, inputs=string_file_input, outputs=string_output)
+        with gr.Tab("File Entropy"):
+            gr.Markdown("### Calculate Shannon entropy of a file to detect obfuscation.")
+            with gr.Row():
+                entropy_file_input = gr.Textbox(label="File Path", placeholder="Enter the path to the file")
+                entropy_output = gr.Number(label="Entropy Value")
+            entropy_file_input.change(file_entropy, inputs=entropy_file_input, outputs=entropy_output)
+        with gr.Tab("YARA Matching"):
+            gr.Markdown("### Match the file against precompiled YARA rules.")
+            with gr.Row():
+                yara_file_input = gr.Textbox(label="File Path", placeholder="Enter the path to the file")
+                yara_output = gr.Textbox(label="YARA Matches", placeholder="Matches will appear here", lines=10)
+            yara_file_input.change(run_compiled_yara, inputs=yara_file_input, outputs=yara_output)
+        with gr.Tab("VirusTotal - Threat Info"):
+            gr.Markdown("### Get first submission date from VirusTotal using SHA256.")
+            with gr.Row():
+                hash_input = gr.Textbox(label="SHA256 Hash")
+                submission_date_output = gr.Textbox(label="First Submission Date")
+            hash_input.change(get_first_submission_date, inputs=hash_input, outputs=submission_date_output)
+        with gr.Tab("Get File Hash"):
+            gr.Markdown("### Calculate the SHA256 hash of a given file.")
+            with gr.Row():
+                hash_file_input = gr.Textbox(label="File Path")
+                hash_output = gr.Textbox(label="SHA256 Hash", placeholder="Hash will appear here")
+            hash_file_input.change(get_file_hash, inputs=hash_file_input, outputs=hash_output)
+        with gr.Tab("VirusTotal - AV Statistics"):
+            gr.Markdown("### Get aggregated antivirus detection statistics from VirusTotal.")
+            with gr.Row():
+                vt_hash_input = gr.Textbox(label="SHA256 Hash")
+                vt_stats_output = gr.Textbox(label="Antivirus Statistics", placeholder="Statistics will appear here", lines=10)
+            vt_hash_input.change(get_antivirus_statistics, inputs=vt_hash_input, outputs=vt_stats_output)
+        with gr.Tab("VirusTotal - AV Reports"):
+            gr.Markdown("### Get detailed antivirus reports from VirusTotal.")
+            with gr.Row():
+                vt_detailed_input = gr.Textbox(label="SHA256 Hash")
+                vt_detailed_output = gr.Textbox(label="Detailed Reports", placeholder="Reports will appear here", lines=10)
+            vt_detailed_input.change(get_antivirus_detailed_reports, inputs=vt_detailed_input, outputs=vt_detailed_output)
+        with gr.Tab("VirusTotal - Sandbox Reports"):
+            gr.Markdown("### Get sandbox execution details from VirusTotal.")
+            with gr.Row():
+                vt_sandbox_input = gr.Textbox(label="SHA256 Hash")
+                vt_sandbox_output = gr.Textbox(label="Sandbox Reports", placeholder="Reports will appear here", lines=10)
+            vt_sandbox_input.change(get_sandbox_detailed_reports, inputs=vt_sandbox_input, outputs=vt_sandbox_output)
+        with gr.Tab("Send Email"):
+            gr.Markdown("### Send analysis results via email.")
+            with gr.Row():
+                email_input = gr.Textbox(label="Recipient Email", placeholder="Enter recipient's email")
+                email_text_input = gr.Textbox(label="Email Content", placeholder="Enter the content of the email", lines=5)
+                email_subject_input = gr.Textbox(label="Email Subject", placeholder="Enter the subject of the email")
+                send_email_button = gr.Button("Send Email")
+            send_email_button.click(send_email, inputs=[email_input, email_text_input, email_subject_input], outputs=None)
+        with gr.Tab("Interactive Artifact"):
+            gr.Markdown("### Create an interactive artifact after analysis.")
+            with gr.Row():
+                artifact_output = gr.Textbox(label="Interactive Artifact", placeholder="Results will appear here", lines=5)
+            artifact_output.value = after_analysis_interactive_artifact()
+def run_server():
+    demo.launch(server_port=7860, mcp_server=True, debug=True)
+if __name__ == "__main__":
+    run_server()
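
Review note on `file_entropy`: its docstring contrasts low entropy (repetitive or uncompressed data) with high entropy (random-looking, possibly packed or encrypted data). The sketch below applies the same Shannon formula to in-memory bytes so the two extremes can be seen without a file on disk. `shannon_entropy` is a hypothetical name used only for this illustration. It sums p * log2(1/p), which is algebraically the same as the commit's -sum(p * log2(p)).

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte, from 0.0 to 8.0 (illustrative helper)."""
    if not data:
        return 0.0
    length = len(data)
    counts = Counter(data)
    # p = count / length, so log2(1 / p) = log2(length / count).
    return sum((c / length) * math.log2(length / c) for c in counts.values())


if __name__ == "__main__":
    print(shannon_entropy(b"\x00" * 64))     # one repeated byte value: lowest possible
    print(shannon_entropy(bytes(range(256))))  # every byte value once: highest possible
```

A single repeated byte value scores 0 bits per byte and a uniform spread over all 256 byte values scores 8, so real files fall somewhere in between.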