Update README.md

#3
by himanshu17HF - opened
Files changed (1)
  1. README.md +95 -1
README.md CHANGED
@@ -12,4 +12,98 @@ short_description: An Open Source Cyber Security Agent
12
  license: apache-2.0
13
  ---
14
 
15
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
15
+ # openMythos 🌌
16
+
17
+ **Paste your codebase. Our AI security agent audits the repository**: it produces a multi-level vulnerability analysis, a visual dependency risk path, and a declared threat level, then generates an instant, verifiable hotfix patch before threat actors can exploit the flaw.
18
+
19
+ Built during the **Hugging Face Small Gradio Hackathon**, openMythos democratizes cutting-edge security auditing. It bridges an immersive retro terminal interface with the elite agentic reasoning and long-context preservation architecture of a fine-tuned dense model.
20
+
21
+ > ⚠️ **Proactive Defense.** This platform is engineered for defensive security intelligence. It aims to surface flaws, memory leaks, insecure configurations, and input-handling bugs quickly, so that software engineering teams can deploy hotfixes before a vulnerability is weaponized.
22
+
23
+ ---
24
+
25
+ ## ▶️ See it in action
26
+
27
+ - **Demo video:** TODO (link to the social media demo video and technical explainer post)
28
+ - **Social post:** TODO (paste the launch post link here)
29
+
30
+ ---
31
+
32
+ ## Why it's worth a look
33
+
34
+ - 🧠 **Deep Agentic Reasoning, Not a Basic RegEx Scanner.** Powered by a specialized Qwen3.6-27B foundation architecture, openMythos maps complex variable trails and dependency structures across entire software repositories during a single security sweep using its native long-context window.
35
+
36
+ - 🎨 **Immersive Retro UI.** No default Gradio look: a distraction-free retro terminal architecture optimized for low-latency code-auditing loops.
37
+
38
+ - 🔌 **100% Local & Privacy-First.** Designed as a fully open-source alternative to proprietary security intelligence layers (like Claude's Mythos model). It can be run entirely locally, requiring zero internet connectivity or external dependencies to operate.
39
+
40
+ ---
41
+
42
+ ## How it works
43
+
44
+ A multi-stage engineering pipeline built around aggregated, industry-standard security sources:
45
+
46
+ | Stage | Role | Source Data / Methodology |
47
+ |:-----:|------|---------------------------|
48
+ | **1** | **Data Prep & Aggregation** | Incident reports, GitHub Advisory, VulnHub, and papers. Rigorously trained on BigVul-Filtered and Arvix-Filtered sets. |
49
+ | **2** | **Initial Fine-Tuning (SFT)** | Supervised Fine-Tuning on cybersecurity tasks. Qwen3.6-27B Base (Up to 262k+ token context window). |
50
+ | **3** | **Reinforcement Learning (RLVR)** | Verifiable Reward via vulnerable vs. fixed repo branches. Verified by a separate evaluation model checking fixes. |
51
+ | **4** | **Rigorous Evaluation** | Benchmarked against CyberGYM and SWE Bench Verified. Evaluates historical vulnerabilities and code generation. |
52
+
53
+ The entire pipeline leverages highly specialized weights to ensure an elite vulnerability discovery rate. No massive API dependencies anywhere: a clever chain of targeted engineering (**prepare → SFT → RLVR → verify**) delivers the whole security suite.
54
+
55
+ ```
56
+ Raw Codebase Input
57
+ └─▶ Stage 1: Data Prep ─ BigVul & arXiv research paper data curation
58
+ └─▶ Stage 2: SFT Train ─ Supervised fine-tuning on targeted cybersecurity tasks
59
+ └─▶ Stage 3: RLVR Refinement ─ Reinforcement Learning via Verifiable Rewards (Vulnerable vs Fixed Code)
60
+ + CyberGYM & SWE Bench verification models
61
+ + Retro Terminal UI output
62
+ → Instantly remediated source-code patch
63
+ ```
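
The verifiable reward in Stage 3 can be illustrated with a minimal sketch. It assumes each training example pairs a vulnerable snippet with security checks that fail on the vulnerable branch and pass on the fixed branch, and it swaps the separate evaluation model mentioned in the table for plain executable checks. The names `RepoPair`, `run_checks`, `verifiable_reward`, and the toy path-traversal example are hypothetical and are not taken from the openMythos training code.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical type: a check receives the namespace of an executed candidate
# and returns True when the vulnerability is no longer reachable.
SecurityCheck = Callable[[Dict[str, object]], bool]


@dataclass
class RepoPair:
    """A vulnerable/fixed branch pair reduced to source strings (assumption)."""
    vulnerable_src: str
    fixed_src: str
    checks: List[SecurityCheck]


def run_checks(source: str, checks: List[SecurityCheck]) -> bool:
    """Execute the source in a fresh namespace and run every check against it."""
    namespace: Dict[str, object] = {}
    try:
        exec(source, namespace)  # toy only; real training would isolate this step
        return all(check(namespace) for check in checks)
    except Exception:
        return False  # code that fails to compile or crashes earns no reward


def verifiable_reward(pair: RepoPair, candidate_patch: str) -> float:
    """Binary reward: 1.0 only if the checks separate the branches and the patch passes."""
    if run_checks(pair.vulnerable_src, pair.checks):
        return 0.0  # checks do not detect the bug, so the pair is not verifiable
    if not run_checks(pair.fixed_src, pair.checks):
        return 0.0  # the reference fix fails its own checks
    return 1.0 if run_checks(candidate_patch, pair.checks) else 0.0


# Toy example: a filename helper that must not let path separators through.
VULNERABLE = "def safe_name(name):\n    return name\n"
FIXED = "import os\ndef safe_name(name):\n    return os.path.basename(name)\n"


def no_traversal(ns: Dict[str, object]) -> bool:
    return "/" not in ns["safe_name"]("../../etc/passwd")


pair = RepoPair(VULNERABLE, FIXED, [no_traversal])
good_patch = "def safe_name(name):\n    return name.split('/')[-1]\n"
```

Under these assumptions, a patch that strips the directory components earns the full reward, while resubmitting the vulnerable code or submitting code that does not compile earns nothing. A binary signal of this kind is what makes the reward verifiable: it depends only on observable behavior, not on how the patch is worded.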
64
+
65
+ ---
66
+
67
+ ## Tech
68
+
69
+ - **Frontend:** This Gradio 6 Space using an immersive terminal configuration.
70
+ - **Alternative base models:** The default is Qwen3.6-27B; the training framework also supports Devstral-Small-2-24B, Magistral-Small, gemma-4-12B-it, and gpt-oss-20b.
71
+ - **Data Integrations:** Hardwired to ingest top-tier vulnerability streams like BigVul-Filtered and ArvixImport-Filtered-Final.
72
+
73
+ ---
74
+
75
+ ## Run it locally
76
+
77
+ ```bash
78
+ # From the root of the cloned repository, launch the security agent interface locally
79
+ python app.py
80
+ ```
81
+
82
+ ---
83
+
84
+ ## 🤝 Project Contributors & Ecosystem Credits
85
+
86
+ Developed with ❤️ during the **Hugging Face Small Gradio Hackathon** by:
87
+
88
+ - **KingNish** – [HuggingFace Profile](https://huggingface.co/KingNish)
89
+ - **Himanshu** – [HuggingFace Profile](https://huggingface.co/Himanshu)
90
+
91
+ ---
92
+
93
+ ## 📜 Citations & Academic Attributions
94
+
95
+ ```bibtex
96
+ @misc{openmythos2026,
97
+ title = {openMythos: Defensive Security Code-Auditing Agent Interface via Qwen3.6 Context Preservation},
98
+ author = {KingNish and Himanshu},
99
+ year = {2026},
100
+ howpublished = {Hugging Face Small Gradio Hackathon Project Suite}
101
+ }
102
+
103
+ @misc{qwen3.6-27b,
104
+ title = {{Qwen3.6-27B}: Flagship-Level Coding in a {27B} Dense Model},
105
+ author = {{Qwen Team}},
106
+ month = {April},
107
+ year = {2026},
108
+ url = {https://qwen.ai/blog?id=qwen3.6-27b}
109
+ }