Profession Categories for Deepfake Adapter Classification
Overview
The LLM annotation uses 9 specific profession categories to classify people found in deepfake adapter datasets. These categories cover the most common types of public figures targeted by deepfake technology.
The 9 Categories
1. actor
Film, TV, and theater performers who primarily work in scripted dramatic content.
Examples:
- Emma Watson (film actor)
- Scarlett Johansson (film actor)
- Bryan Cranston (TV actor)
Keywords: actor, actress, film, movie, cinema, theatrical, performer, drama
2. adult performer
People working in the adult entertainment industry.
Examples:
- Adult film actors
- OnlyFans creators
- Cam models
Keywords: adult entertainer, onlyfans, cam model, camgirl, webcam, pornographic actor, porn, pornstar
Note: This category is important for research on unauthorized deepfake usage, which disproportionately affects adult performers.
3. singer/musician
Vocalists, instrumentalists, and music performers across all genres.
Examples:
- IU (K-pop singer)
- Taylor Swift (singer/songwriter)
- DJ Khaled (DJ/producer)
Keywords: singer, musician, rapper, rap artist, band, vocalist, songwriter, composer, DJ, producer, kpop, jpop
4. model
Fashion, runway, and photoshoot models.
Examples:
- Gigi Hadid (fashion model)
- Tyra Banks (supermodel)
- Ashley Graham (plus-size model)
Keywords: model, fashion, runway, photoshoot, supermodel
5. online personality
Digital content creators, streamers, and influencers.
Includes:
- Streamers (Twitch, YouTube)
- Cosplayers
- YouTubers
- Instagram influencers
- Content creators
- E-girls/E-boys
- Gaming personalities
Examples:
- Belle Delphine (online personality, cosplayer)
- Pokimane (Twitch streamer)
- MrBeast (YouTuber)
Keywords: influencer, streamer, twitch, youtuber, youtube, content creator, instagrammer, instagram, e-girl, egirl, e-boy, eboy, cosplayer, gamer
6. public figure
Politicians, activists, journalists, authors, and other public-facing professionals not in entertainment.
Examples:
- Barack Obama (politician)
- Greta Thunberg (activist)
- Gordon Ramsay (chef)
- J.K. Rowling (author)
Keywords: politician, activist, public speaker, journalist, author, writer, chef, famous, celebrity (when not in entertainment)
7. voice actor/ASMR
Voice performers for animation, games, and ASMR content creators.
Examples:
- Tara Strong (voice actress)
- Troy Baker (voice actor)
- ASMR artists
Keywords: voice actor, voice actress, asmr creator, asmr
8. sports professional
Professional athletes and sports competitors.
Examples:
- Cristiano Ronaldo (soccer player)
- Serena Williams (tennis player)
- LeBron James (basketball player)
Keywords: athlete, sports, player, professional, competitor, olympian
9. tv personality
TV hosts, presenters, reality TV stars, and broadcast personalities.
Examples:
- Oprah Winfrey (talk show host)
- Kim Kardashian (reality TV)
- Jimmy Fallon (late night host)
Keywords: tv host, tv moderator, talk show host, talkshow, radio host, media personality, reality tv star, reality, comedian, presenter, broadcaster, anchor
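One way to use the keyword lists above is as a reverse lookup from a free-text profession description to candidate categories. The sketch below is an assumption about how the keywords might be applied, not the annotation pipeline itself: `CATEGORY_KEYWORDS` holds only a subset of the keywords listed above, and `candidate_categories` is a hypothetical helper.

```python
import re

# Subset of the keyword lists above (illustrative; extend with the full lists as needed).
CATEGORY_KEYWORDS = {
    "actor": ["actor", "actress", "film", "movie"],
    "adult performer": ["adult entertainer", "onlyfans", "cam model", "pornstar"],
    "singer/musician": ["singer", "musician", "rapper", "songwriter", "dj"],
    "model": ["model", "runway", "supermodel"],
    "online personality": ["influencer", "streamer", "youtuber", "cosplayer"],
    "public figure": ["politician", "activist", "journalist", "author"],
    "voice actor/ASMR": ["voice actor", "voice actress", "asmr"],
    "sports professional": ["athlete", "olympian"],
    "tv personality": ["tv host", "talk show host", "presenter", "comedian"],
}


def candidate_categories(description: str) -> list[str]:
    """Return every category with a keyword that appears as a whole word or phrase."""
    text = description.lower()
    hits = []
    for category, keywords in CATEGORY_KEYWORDS.items():
        # One matching keyword is enough to make the category a candidate.
        if any(re.search(rf"\b{re.escape(kw)}\b", text) for kw in keywords):
            hits.append(category)
    return hits


print(candidate_categories("Twitch streamer and cosplayer"))  # ['online personality']
print(candidate_categories("voice actor for animation"))      # ['actor', 'voice actor/ASMR']
```

The second call shows that keyword lists overlap ("actor" is contained in "voice actor"), so results like these are best treated as candidates to be narrowed down, not as final labels.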
Multi-Category Classification
Many public figures work across multiple categories. The LLM can assign up to 3 professions per person, ordered by relevance.
Examples:
| Person | Professions | Explanation |
|---|---|---|
| Emma Watson | actor, public figure | Primarily an actor, but also known for activism |
| IU (Lee Ji-eun) | singer/musician, actor | K-pop singer who also acts in TV dramas |
| Belle Delphine | online personality, adult performer | Internet personality with adult content |
| Jamie Foxx | actor, singer/musician | Actor who also has a music career |
| Dwayne Johnson | actor, sports professional | Former wrestler, now primarily an actor |
Category Selection Guidelines
For the LLM:
Choose most specific category first
- ✅ "actor" for film performers
- ❌ "public figure" for someone primarily known as an actor
Order by relevance
- Most important role first
- Secondary roles after
- Maximum 3 categories
Be inclusive for online personalities
- Streamers → "online personality"
- Cosplayers → "online personality"
- Influencers → "online personality"
Distinguish TV personalities from actors
- ✅ "tv personality" for talk show hosts
- ✅ "actor" for scripted TV drama performers
- Some can be both!
Why These 9 Categories?
These categories were chosen based on:
Common targets of deepfake technology
- Celebrities and public figures are most frequently deepfaked
- Adult performers are disproportionately affected
Clear distinctions
- Each category represents a distinct professional domain
- Minimal overlap between categories
Research relevance
- Important for analyzing demographic patterns in deepfake usage
- Helps understand which professions are most at risk
Comprehensive coverage
- Covers the vast majority of people in the deepfake adapter dataset
- Includes both traditional and digital-native celebrities
Output Format
The LLM returns professions as a comma-separated list:
"singer/musician, actor"
"online personality"
"actor, public figure, online personality"
"sports professional"
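A returned string in this format can be parsed and checked against the nine categories before analysis. The following is a minimal sketch, assuming the labels are spelled exactly as in this document; `parse_professions` is a hypothetical helper, and mapping `"Unknown"` to an empty list is one possible way to handle the fictional-character case described under Edge Cases.

```python
VALID_CATEGORIES = {
    "actor",
    "adult performer",
    "singer/musician",
    "model",
    "online personality",
    "public figure",
    "voice actor/ASMR",
    "sports professional",
    "tv personality",
}
MAX_PROFESSIONS = 3  # the LLM may assign up to 3 professions per person


def parse_professions(raw: str) -> list[str]:
    """Split a comma-separated label string and validate it.

    Returns the labels in their original order (most relevant first).
    Raises ValueError for unrecognized labels or more than MAX_PROFESSIONS labels.
    """
    if raw.strip() == "Unknown":
        return []  # assumption: "Unknown" means no profession could be assigned
    labels = [part.strip() for part in raw.split(",") if part.strip()]
    invalid = [label for label in labels if label not in VALID_CATEGORIES]
    if invalid:
        raise ValueError(f"Unrecognized categories: {invalid}")
    if len(labels) > MAX_PROFESSIONS:
        raise ValueError(f"Expected at most {MAX_PROFESSIONS} labels, got {len(labels)}")
    return labels


print(parse_professions("singer/musician, actor"))  # ['singer/musician', 'actor']
```

Because the list keeps the returned order, the first element can be read as the person's primary profession.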
Validation
After annotation, we can analyze:
- Distribution across categories
- Multi-category patterns
- Correlation with countries/regions
- Demographic patterns
Example analysis:
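# Assumes df is a pandas DataFrame with a comma-separated 'profession_llm' column
# Count by category: split multi-label strings so each label is counted separately
df['profession_llm'].dropna().str.split(',').explode().str.strip().value_counts()
# Most common combinations (full label strings)
df['profession_llm'].value_counts().head(20)
# Filter by specific category; match whole labels so 'actor' does not also hit 'voice actor/ASMR'
actors = df[df['profession_llm'].str.contains(r'(?:^|,\s*)actor(?:\s*,|$)', na=False)]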
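Multi-category patterns can also be examined as pairwise co-occurrence, independent of label order. The sketch below uses only the standard library; `cooccurrence_counts` is a hypothetical helper, and the sample rows are illustrative strings in the output format, not results from the dataset.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(label_strings):
    """Count how often each unordered pair of categories appears together."""
    pairs = Counter()
    for raw in label_strings:
        # Sort so that "a, b" and "b, a" map to the same pair key.
        labels = sorted({part.strip() for part in raw.split(",") if part.strip()})
        pairs.update(combinations(labels, 2))
    return pairs


# Illustrative input only.
sample = [
    "singer/musician, actor",
    "actor, singer/musician",
    "online personality",
    "actor, public figure",
]
counts = cooccurrence_counts(sample)
print(counts[("actor", "singer/musician")])  # 2
```

With the annotated DataFrame, the same function could be called on `df['profession_llm'].dropna()`.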
Edge Cases
Fictional Characters
Handling: Return "Unknown" for all fields
Example:
- Input: "Elsa from Frozen"
- Output: profession_llm: "Unknown"
Unclear/Ambiguous Cases
Handling: Use best judgment based on primary public recognition
Example:
- Input: "Elon Musk"
- Primary recognition: Business/technology
- Category: "public figure"
Multiple Equally Important Roles
Handling: List all relevant categories (up to 3)
Example:
- Input: "Donald Glover / Childish Gambino"
- Categories: "actor, singer/musician, tv personality"
Notes for Researchers
When analyzing the annotated data:
- Category frequency indicates which professions are most targeted
- Multi-category entries show crossover between industries
- Online personality prevalence indicates the rise of digital-native celebrities in deepfakes
- Adult performer numbers highlight ongoing issues with non-consensual deepfakes
Version History
- v1.0 (Current): Initial 9-category system
  - Refined from original detailed profession list
  - Focused on clear, distinct categories
  - Added comprehensive documentation