Open to Collab
Michael Anthony
PRO
MikeDoes
80 followers · 22 following
http://www.ai4privacy.com
MikeDoesDo
MikeDoes
AI & ML interests
Privacy, Large Language Model, Explainable
Recent Activity
Posted an update about 12 hours ago
What happens when an LLM "forgets" your data? A new paper reports that it might not be gone for good. The "Janus Interface" paper details a new attack that could recover forgotten PII through fine-tuning APIs. This is a solution-oriented paper because it highlights a problem that needs fixing.

Testing such a high-stakes attack requires equally high-stakes data. The Ai4Privacy 300k dataset was a key part of the evaluation, providing a testbed for extracting sensitive Social Security Numbers. Our dataset, with its synthetic structured SSN data, helped researchers at Indiana University, Stanford, CISPA, and other institutions demonstrate that their attack works on more than just emails: it could also affect highly sensitive personal identifiers.

We're excited to see our open-source dataset used in such cutting-edge security research. It's a win for the community when researchers can use our resources to stress-test the safety of modern AI systems. This work is a direct and explicit call for stronger protections on fine-tuning interfaces.

🔗 This is why open data for security research is so important. Check out the full paper: https://arxiv.org/pdf/2310.15469

🚀 Stay updated on the latest in privacy-preserving AI. Follow us on LinkedIn: https://www.linkedin.com/company/ai4privacy/posts/
Reacted to their post with 🔥 3 days ago
How do you prove your new, specialized AI model is a better solution? You test it against the best. That's why we were excited to see the new AdminBERT paper from researchers at Nantes Université and others. To show the strength of their new model for French administrative texts, they compared it to the state-of-the-art generalist model, NERmemBERT.

The direct connection to our work is clear: NERmemBERT was trained on a combination of datasets, including the Pii-masking-200k dataset by Ai4Privacy.

This is a win-win for the open-source community. Our foundational dataset helps create a strong, general-purpose benchmark, which in turn helps researchers prove the value of their specialized work. This is how we all get better.

🔗 Great work by Thomas Sebbag, Solen Quiniou, Nicolas Stucky, and Emmanuel Morin on tackling a challenging domain! Check out their paper: https://aclanthology.org/2025.coling-main.27.pdf

🚀 Stay updated on the latest in privacy-preserving AI. Follow us on LinkedIn: https://www.linkedin.com/company/ai4privacy/posts/

#OpenSource #DataPrivacy #LLM #Anonymization #AIsecurity #HuggingFace #Ai4Privacy #Worldslargestopensourceprivacymaskingdataset
Reacted to the same post with 🚀 3 days ago
Organizations
MikeDoes's Spaces (2)
Running
1
Terminal Visualiser
💻
Create and download styled terminal screenshots
Running
1
TKG Visualiser
🌍
Visualize workflows from TSV data