OpenMALx

community

https://huggingface.co/openmalx

Activity Feed

AI & ML interests

Open source AI cyber/infosec knowledge and research

tegridydev

posted an update 3 months ago

✨ Research-Papers (various topics across AI/LLM research areas)

tegridydev/research-papers

Currently building out the foundation topics and collecting the raw .pdf research paper files.

These will be processed, cleaned up, and converted into high-quality training datasets.

Check it out, give it a like, and leave a comment below or join the community discussion to suggest which fields and research topics you want to see included!

tegridydev

posted an update 3 months ago

Introducing OpenMALx

openmalx

Repository for Infosec and Machine Learning Resources

OpenMALx is an organization focused on the development of datasets and models for security analysis. The project objective is to provide structured data for training and evaluating large language models in a security context.

---

Technical Focus

**Dataset Formatting:** Processing raw security tool logs into instruction/response pairs for model training.
**Local Execution:** Optimizing models for local hardware to ensure data remains on-premises.
**Response Logic:** Developing structured formats for explaining security vulnerabilities and remediation steps.
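
As a rough illustration of the dataset formatting step, the sketch below wraps one tool run and a written summary as an instruction/response pair. The field names, the helper `to_training_pair`, and the sample scan output are all hypothetical and are not taken from the OpenMALx repositories.

```python
import json


def to_training_pair(tool_name: str, raw_output: str, summary: str) -> dict:
    """Wrap one tool run and its written summary as an instruction/response pair.

    The field names used here are an assumption for illustration only.
    """
    return {
        "instruction": f"Summarize the findings in this {tool_name} output.",
        "input": raw_output.strip(),
        "response": summary.strip(),
    }


# Hypothetical example data (192.0.2.0/24 is a documentation-only address range).
raw = """
Starting scan of 192.0.2.10
22/tcp open ssh
80/tcp open http
"""

pair = to_training_pair(
    "port scanner",
    raw,
    "Two open TCP ports: 22 (ssh) and 80 (http).",
)

# One JSON object per line (JSONL) is a common on-disk format for such pairs.
line = json.dumps(pair)
```

Keeping the raw output in its own field, separate from the instruction, makes it easier to reuse the same log with different task prompts.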

Active Projects

**infosec-tool-output:** A dataset mapping static and dynamic analysis tool outputs to technical summaries.
openmalx/infosec-tool-output

**open-malsec:** A collection of text-based security threats, including phishing and social engineering samples, for classification tasks.
openmalx/open-malsec

tegridydev

posted an update about 1 year ago

Open Source AI Agents | Github/Repo List | [2025]

https://huggingface.co/blog/tegridydev/open-source-ai-agents-directory

Check out the article and follow, bookmark, or save the tab, as I will be updating it <3
(I'm using it as my own notepad and decided I might keep it up to date if I post it here, instead of making a 15th version and saving it on my desktop under a name I can't remember lol)

tegridydev

posted an update about 1 year ago

WTF is Fine-Tuning? (intro4devs)

Fine-tuning your LLM is like min-maxing your ARPG hero so you can push high-level dungeons and get the most out of your build/gear... Makes sense, right? 😃

Here's a cheat sheet for devs (but open to anyone!)

---

TL;DR

- Full Fine-Tuning: Max performance, high resource needs, best reliability.
- PEFT (Parameter-Efficient Fine-Tuning): Efficient, cost-effective, mainstream; can be enhanced by AutoML.
- Instruction Fine-Tuning: Ideal for command-following AI; often combined with RLHF (Reinforcement Learning from Human Feedback) and CoT (chain-of-thought).
- RAFT (Retrieval-Augmented Fine-Tuning): Best for fact-grounded models with dynamic retrieval.
- RLHF: Aligns conversational AI with human preferences for higher-quality responses, but expensive.

Choose wisely and match your approach to your task, budget, and deployment constraints.
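
To make the resource gap between full fine-tuning and PEFT concrete, here is a small arithmetic sketch. It uses LoRA as the PEFT example, which is an assumption, since the post does not name a specific method. In LoRA, a weight matrix of shape `d_out x d_in` is frozen and two small matrices of rank `r` are trained instead, giving `r * (d_out + d_in)` trainable parameters. The layer size and rank below are illustrative values only.

```python
def full_params(d_out: int, d_in: int) -> int:
    """Trainable parameters when the whole weight matrix is updated."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters for a LoRA adapter: B (d_out x r) plus A (r x d_in)."""
    return rank * (d_out + d_in)


# Illustrative sizes only: one 4096 x 4096 projection with a rank-8 adapter.
full = full_params(4096, 4096)    # 16,777,216
lora = lora_params(4096, 4096, 8) # 65,536
ratio = lora / full

print(f"LoRA trains {ratio:.2%} of this layer's parameters")  # prints 0.39%
```

The saving applies per adapted layer; how much it helps in practice depends on which layers get adapters and on optimizer and activation memory, which this sketch ignores.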

I just posted the full extended article here if you want to continue reading >>>

https://huggingface.co/blog/tegridydev/fine-tuning-dev-intro-2025

tegridydev

posted an update over 1 year ago

Open-MalSec v0.1 – Open-Source Cybersecurity Dataset

Evening! 🫡

📂 Just uploaded an early-stage open-source cybersecurity dataset focused on phishing, scams, and malware-related text samples.

This is the base version (v0.1): a few structured sample files. Full dataset builds will come over the next few weeks.

🔗 Dataset link:

tegridydev/open-malsec

🔍 What’s in v0.1?
- A few structured scam examples (text-based)
- Covers DeFi, crypto, phishing, and social engineering
- Initial labelling format for scam classification

⚠️ This is not a full dataset yet (only samples are currently available). Just establishing the structure + getting feedback.

📂 Current Schema & Labelling Approach
- "instruction" → Task prompt (e.g., "Evaluate this message for scams")
- "input" → Source & message details (e.g., Telegram post, Tweet)
- "output" → Scam classification & risk indicators
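
A record following this schema might look like the sketch below, together with a small check that the three fields are present. This assumes each field is a plain string; the actual dataset files may nest structured objects instead. The sample message and the helper `validate_record` are made up for illustration.

```python
REQUIRED_FIELDS = ("instruction", "input", "output")


def validate_record(record: dict) -> list[str]:
    """Return a list of problems found in one record (empty list means OK).

    Assumes every field is a non-empty string, which may not match the
    real dataset layout.
    """
    problems = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing or empty field: {field}")
    return problems


# Hypothetical sample, not copied from the dataset.
sample = {
    "instruction": "Evaluate this message for scams",
    "input": "Source: Telegram post. Message: 'Send 1 ETH, get 2 ETH back. Limited time!'",
    "output": "Classification: scam. Risk indicators: guaranteed returns, urgency.",
}

issues = validate_record(sample)
```

Running a check like this before training catches truncated or partially labelled samples early.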

🗂️ Current v0.1 Sample Categories
- Crypto Scams → Meme token pump & dumps, fake DeFi projects
- Phishing → Suspicious finance/social media messages
- Social Engineering → Manipulative messages exploiting trust

🔜 Next Steps
- Expanding datasets with more phishing & malware examples
- Refining schema & annotation quality
- Open to feedback, contributions, and suggestions

If this is something you might find useful, bookmark/follow/like the dataset repo <3

💬 Thoughts, feedback, and ideas are always welcome! Drop a comment or DMs are open 🤙

tegridydev

posted an update over 1 year ago

So, what is #MechanisticInterpretability 🤔

Mechanistic Interpretability (MI) is the discipline of opening the black box of large language models (and other neural networks) to understand the underlying circuits, features, and mechanisms that give rise to specific behaviours.

Instead of treating a model as a monolithic function, we can:

1. Trace how input tokens propagate through attention heads & MLP layers
2. Identify localized “circuit motifs”
3. Develop methods to systematically break down or “edit” these circuits to confirm we understand the causal structure.
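
The third step can be illustrated with a toy ablation experiment: remove one component at a time and measure how much the output changes. The "model" below is a deliberately simplified stand-in whose output is a plain sum of named contributions; the component names and weights are invented. Real networks are not this cleanly additive, and zero-ablation is only one of several techniques (mean ablation and activation patching are others).

```python
COMPONENT_NAMES = ("head_a", "head_b", "mlp")


def run_model(x: float, ablate: frozenset = frozenset()) -> float:
    """Toy model: the output is the sum of per-component contributions.

    Components listed in `ablate` are skipped, i.e. zero-ablated.
    """
    components = {
        "head_a": lambda v: 2.0 * v,  # carries most of the signal
        "head_b": lambda v: 0.1 * v,  # contributes little
        "mlp": lambda v: 0.5,         # input-independent term
    }
    return sum(f(x) for name, f in components.items() if name not in ablate)


def ablation_effects(x: float) -> dict:
    """Drop in output caused by zero-ablating each component on input x."""
    baseline = run_model(x)
    return {
        name: baseline - run_model(x, ablate=frozenset({name}))
        for name in COMPONENT_NAMES
    }


effects = ablation_effects(3.0)
most_important = max(effects, key=effects.get)  # "head_a" for this toy model
```

A large effect from ablating a component is evidence that it is causally involved in the behaviour on that input, which is the kind of confirmation the step describes.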

Mechanistic Interpretability aims to yield human-understandable explanations of how advanced models represent and manipulate concepts, which will hopefully lead to:

1. Trust & Reliability
2. Safety & Alignment
3. Better Debugging / Development Insights

https://bsky.app/profile/mechanistics.bsky.social/post/3lgvvv72uls2x
