Open to Collab
Michael Anthony
PRO
MikeDoes
61 followers · 21 following
http://www.aisuisse.com
MikeDoesDo
MikeDoes
AI & ML interests
Privacy, Large Language Model, Explainable
Recent Activity
posted an update about 12 hours ago
Anonymizing a prompt is half the battle. Reliably de-anonymizing the response is the other. To build a truly reliable privacy pipeline, you have to test it. A new Master's thesis does just that, and our data was there for every step.

We're excited to showcase this work on handling confidential data in LLM prompts from Nedim Karavdic at Mälardalen University. To build their PII anonymization pipeline, they first trained a custom NER model. We're proud that the Ai4Privacy pii-masking-200k dataset was used as the foundational training data for this critical first step.

But it didn't stop there. The research also used our dataset to create the parallel data needed to train and test the generative "Seek" models for de-anonymization.

It's a win-win when our open-source data not only helps build the proposed "better solution" but also helps prove why it's better by enabling a rigorous, data-driven comparison.

🔗 Check out the full thesis for a great deep-dive into building a practical, end-to-end privacy solution: https://www.diva-portal.org/smash/get/diva2:1980696/FULLTEXT01.pdf

#OpenSource #DataPrivacy #LLM #Anonymization #AIsecurity #HuggingFace #Ai4Privacy #Worldslargestopensourceprivacymaskingdataset
reacted to their post with 👀 about 15 hours ago
What if an AI agent could be tricked into stealing your data, just by reading a tool's description? A new paper reports it's possible.

The "Attractive Metadata Attack" paper details this stealthy new threat. To measure the real-world impact of their attack, the researchers needed a source of sensitive data for the agent to leak. We're proud that the AI4Privacy corpus was used to create the synthetic user profiles containing standardized PII for their experiments.

This is a perfect win-win. Our open-source data helped researchers Kanghua Mo, 龙昱丞, and Zhihao Li from Guangzhou University and The Hong Kong Polytechnic University not just demonstrate a new attack, but also quantify its potential for harm. This data-driven evidence is what pushes the community to build better, execution-level defenses for AI agents.

🔗 Check out their paper to see how easily an agent's trust in tool metadata could be exploited: https://arxiv.org/pdf/2508.02110

#OpenSource #DataPrivacy #LLM #Anonymization #AIsecurity #HuggingFace #Ai4Privacy #Worldslargestopensourceprivacymaskingdataset
MikeDoes's models (22, sorted by recently updated)
MikeDoes/mmbert-multilingual-20250916-212219 • 0.3B • Updated Sep 16 • 7
MikeDoes/mmbert-multilingual-20250916-212213 • 0.1B • Updated Sep 16 • 13
MikeDoes/mmbert-multilingual-20250916-202535 • Updated Sep 16
MikeDoes/mmbert-multilingual-20250916-170430 • 0.1B • Updated Sep 16 • 10
MikeDoes/mmbert-multilingual-20250916-173350 • 0.3B • Updated Sep 16 • 6
MikeDoes/mmbert-multilingual-20250916-170450 • Updated Sep 16
MikeDoes/mmbert-multilingual-20250916-155621 • 0.3B • Updated Sep 16 • 4
MikeDoes/mmbert-multilingual-20250916-155528 • Fill-Mask • 0.1B • Updated Sep 16 • 7
MikeDoes/mmbert-multilingual-20250916-145114 • 0.3B • Updated Sep 16 • 2
MikeDoes/mmbert-multilingual-20250916-143043 • Updated Sep 16
MikeDoes/mmbert-multilingual-20250916-133611 • 0.3B • Updated Sep 16 • 5
MikeDoes/mmbert-multilingual-20250916-130537 • Fill-Mask • 0.3B • Updated Sep 16 • 7
MikeDoes/mmbert-multilingual-20250916-120850 • Fill-Mask • 0.3B • Updated Sep 16 • 6
MikeDoes/mmbert-multilingual-20250916-114740 • Fill-Mask • 0.3B • Updated Sep 16 • 6
MikeDoes/mmbert-multilingual-20250916-103748 • Fill-Mask • 0.3B • Updated Sep 16 • 6
MikeDoes/modernbert-english-ner-20250808-034913 • Token Classification • 0.1B • Updated Aug 8 • 4
MikeDoes/modernbert-english-ner-20250806-110517 • 0.1B • Updated Aug 6 • 3
MikeDoes/quick-ner-model-20250726-011948 • Token Classification • 0.1B • Updated Jul 27 • 5
MikeDoes/eurobert-ner-model-20250726-134739 • Token Classification • 0.2B • Updated Jul 27 • 4
MikeDoes/eurobert-ner-model-20250726-082438 • Updated Jul 26
MikeDoes/quick-ner-model-20250726-004735 • Updated Jul 25
MikeDoes/test_night • Updated Jul 25