# Profession Categories for Deepfake Adapter Classification
## Overview
The LLM annotation uses **9 specific profession categories** to classify people found in deepfake adapter datasets. These categories cover the most common types of public figures targeted by deepfake technology.
## The 9 Categories
### 1. **actor**
Film, TV, and theater performers who primarily work in scripted dramatic content.
**Examples:**
- Emma Watson (film actor)
- Scarlett Johansson (film actor)
- Bryan Cranston (TV actor)
**Keywords:** actor, actress, film, movie, cinema, theatrical, performer, drama
---
### 2. **adult performer**
People working in the adult entertainment industry.
**Examples:**
- Adult film actors
- OnlyFans creators
- Cam models
**Keywords:** adult entertainer, onlyfans, cam model, camgirl, webcam, pornographic actor, porn, pornstar
**Note:** This category is important for research on unauthorized deepfake usage, which disproportionately affects adult performers.
---
### 3. **singer/musician**
Vocalists, instrumentalists, and music performers across all genres.
**Examples:**
- IU (K-pop singer)
- Taylor Swift (singer/songwriter)
- DJ Khaled (DJ/producer)
**Keywords:** singer, musician, rapper, rap artist, band, vocalist, songwriter, composer, DJ, producer, kpop, jpop
---
### 4. **model**
Fashion, runway, and photoshoot models.
**Examples:**
- Gigi Hadid (fashion model)
- Tyra Banks (supermodel)
- Ashley Graham (plus-size model)
**Keywords:** model, fashion, runway, photoshoot, supermodel
---
### 5. **online personality**
Digital content creators, streamers, and influencers.
**Includes:**
- Streamers (Twitch, YouTube)
- Cosplayers
- YouTubers
- Instagram influencers
- Content creators
- E-girls/E-boys
- Gaming personalities
**Examples:**
- Belle Delphine (online personality, cosplayer)
- Pokimane (Twitch streamer)
- MrBeast (YouTuber)
**Keywords:** influencer, streamer, twitch, youtuber, youtube, content creator, instagrammer, instagram, e-girl, egirl, e-boy, eboy, cosplayer, gamer
---
### 6. **public figure**
Politicians, activists, journalists, authors, and other public-facing professionals not in entertainment.
**Examples:**
- Barack Obama (politician)
- Greta Thunberg (activist)
- Gordon Ramsay (chef)
- J.K. Rowling (author)
**Keywords:** politician, activist, public speaker, journalist, author, writer, chef, famous, celebrity (when not in entertainment)
---
### 7. **voice actor/ASMR**
Voice performers for animation, games, and ASMR content creators.
**Examples:**
- Tara Strong (voice actress)
- Troy Baker (voice actor)
- ASMR artists
**Keywords:** voice actor, voice actress, asmr creator, asmr
---
### 8. **sports professional**
Professional athletes and sports competitors.
**Examples:**
- Cristiano Ronaldo (soccer player)
- Serena Williams (tennis player)
- LeBron James (basketball player)
**Keywords:** athlete, sports, player, professional, competitor, olympian
---
### 9. **tv personality**
TV hosts, presenters, reality TV stars, and broadcast personalities.
**Examples:**
- Oprah Winfrey (talk show host)
- Kim Kardashian (reality TV)
- Jimmy Fallon (late night host)
**Keywords:** tv host, tv moderator, talk show host, talkshow, radio host, media personality, reality tv star, reality, comedian, presenter, broadcaster, anchor
---
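The document does not say how the keyword lists are consumed. One plausible use, assumed here, is as hints for mapping a free-text profession term onto one of the nine categories. The sketch below uses a hypothetical `KEYWORD_TO_CATEGORY` table built from a subset of the keywords above and a hypothetical helper `category_for_keyword`. It matches whole terms only, because several keywords are substrings of keywords in other categories (`actor` inside `voice actor`, `model` inside `cam model`).

```python
from typing import Optional

# Hypothetical lookup table: a subset of the keyword lists above,
# each mapped to the category it is listed under.
KEYWORD_TO_CATEGORY = {
    "actor": "actor",
    "actress": "actor",
    "pornstar": "adult performer",
    "cam model": "adult performer",
    "rapper": "singer/musician",
    "kpop": "singer/musician",
    "model": "model",
    "supermodel": "model",
    "streamer": "online personality",
    "cosplayer": "online personality",
    "politician": "public figure",
    "journalist": "public figure",
    "voice actress": "voice actor/ASMR",
    "asmr": "voice actor/ASMR",
    "athlete": "sports professional",
    "olympian": "sports professional",
    "talk show host": "tv personality",
    "comedian": "tv personality",
}


def category_for_keyword(term: str) -> Optional[str]:
    """Return the category for an exact (case-insensitive) keyword match, else None."""
    return KEYWORD_TO_CATEGORY.get(term.strip().lower())


print(category_for_keyword("Voice Actress"))
print(category_for_keyword("cam model"))
print(category_for_keyword("plumber"))
```

Exact matching is deliberately conservative: an unlisted term returns `None` rather than a guess, which leaves the decision to the LLM or a human reviewer.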
## Multi-Category Classification
Many public figures work across multiple categories. The LLM can assign **up to 3 professions** per person, ordered by relevance.
### Examples:
| Person | Professions | Explanation |
|--------|-------------|-------------|
| **Emma Watson** | `actor, public figure` | Primarily an actor, but also known for activism |
| **IU (Lee Ji-eun)** | `singer/musician, actor` | K-pop singer who also acts in TV dramas |
| **Belle Delphine** | `online personality, adult performer` | Internet personality with adult content |
| **Jamie Foxx** | `actor, singer/musician` | Actor who also has a music career |
| **Dwayne Johnson** | `actor, sports professional` | Former wrestler, now primarily an actor |
## Category Selection Guidelines
### For the LLM:
1. **Choose the most specific category first**
- ✅ `"actor"` for film performers
- ❌ `"public figure"` for someone primarily known as an actor
2. **Order by relevance**
- Most important role first
- Secondary roles after
- Maximum 3 categories
3. **Be inclusive for online personalities**
- Streamers → `"online personality"`
- Cosplayers → `"online personality"`
- Influencers → `"online personality"`
4. **Distinguish TV personalities from actors**
- ✅ `"tv personality"` for talk show hosts
- ✅ `"actor"` for scripted TV drama performers
- Some can be both!
## Why These 9 Categories?
These categories were chosen based on:
1. **Common targets of deepfake technology**
- Celebrities and public figures are among the most frequent deepfake targets
- Adult performers are disproportionately affected
2. **Clear distinctions**
- Each category represents a distinct professional domain
- Minimal overlap between category definitions, even though one person may hold several roles
3. **Research relevance**
- Important for analyzing demographic patterns in deepfake usage
- Helps understand which professions are most at risk
4. **Comprehensive coverage**
- Covers the vast majority of people in the deepfake adapter dataset
- Includes both traditional and digital-native celebrities
## Output Format
The LLM returns professions as a **comma-separated list**:
```
"singer/musician, actor"
"online personality"
"actor, public figure, online personality"
"sports professional"
```
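A returned string can be checked against the rules stated in this document: nine allowed categories, at most 3 per person, order preserved, and `"Unknown"` for fictional characters. The following is a minimal sketch of such a check, not the project's actual parser. The names `parse_professions`, `CATEGORIES`, and `MAX_PROFESSIONS` are hypothetical, and stripping surrounding quotes and ignoring letter case are assumptions about how lenient the parser should be.

```python
# The nine category labels exactly as they appear in the section headings.
CATEGORIES = (
    "actor", "adult performer", "singer/musician", "model",
    "online personality", "public figure", "voice actor/ASMR",
    "sports professional", "tv personality",
)
MAX_PROFESSIONS = 3

# Lowercased label -> canonical spelling, so matching ignores case (assumption).
_CANONICAL = {name.lower(): name for name in CATEGORIES}


def parse_professions(raw: str) -> list:
    """Turn a comma-separated answer into an ordered list of canonical labels.

    Returns an empty list for "Unknown"; raises ValueError for anything that
    breaks the rules (no labels, more than 3 labels, or an unlisted label).
    """
    text = raw.strip().strip('"').strip()
    if text.lower() == "unknown":
        return []
    labels = [part.strip() for part in text.split(",") if part.strip()]
    if not labels:
        raise ValueError("no professions found")
    if len(labels) > MAX_PROFESSIONS:
        raise ValueError(f"more than {MAX_PROFESSIONS} professions: {labels}")
    unrecognized = [label for label in labels if label.lower() not in _CANONICAL]
    if unrecognized:
        raise ValueError(f"unrecognized categories: {unrecognized}")
    # Order is preserved, so the first element is the most relevant role.
    return [_CANONICAL[label.lower()] for label in labels]


print(parse_professions('"singer/musician, actor"'))
print(parse_professions("Unknown"))
```

Raising on unlisted labels, rather than silently dropping them, makes annotation drift visible early.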
## Validation
After annotation, we can analyze:
- Distribution across categories
- Multi-category patterns
- Correlation with countries/regions
- Demographic patterns
Example analysis:
```python
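# Split the comma-separated labels into one row per (person, category)
categories = df['profession_llm'].str.split(',').explode().str.strip()
# Count by individual category
categories.value_counts()
# Most common combinations
df['profession_llm'].value_counts().head(20)
# Filter by specific category; an exact match on the split labels avoids
# also selecting "voice actor/ASMR", which str.contains('actor') would match
actors = df[categories.eq('actor').groupby(level=0).any()]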
```
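The multi-category patterns listed above can be summarized as a co-occurrence matrix. The sketch below is one way to build it, assuming the `profession_llm` column holds the comma-separated strings described under Output Format. The sample rows are hypothetical stand-ins for the annotated dataset.

```python
import pandas as pd

# Hypothetical sample rows standing in for the annotated dataset.
df = pd.DataFrame({"profession_llm": [
    "singer/musician, actor",
    "actor",
    "online personality, adult performer",
    "actor, singer/musician",
    "Unknown",
]})

# One row per (person, category); explode() keeps the original row index.
labels = df["profession_llm"].str.split(",").explode().str.strip()
labels = labels[labels != "Unknown"]  # drop fictional/unknown entries

# Indicator matrix: one row per person, one 0/1 column per category.
indicators = pd.get_dummies(labels).groupby(level=0).max().astype(int)

# Entry (a, b) counts the people labeled with both a and b;
# the diagonal holds the per-category totals.
cooccurrence = indicators.T @ indicators
print(cooccurrence)
```

Because the matrix is symmetric, the order in which the LLM listed the roles is ignored here. To study which role comes first, work from the raw strings instead.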
## Edge Cases
### Fictional Characters
**Handling:** Return `"Unknown"` for all fields
**Example:**
- Input: `"Elsa from Frozen"`
- Output: `profession_llm: "Unknown"`
### Unclear/Ambiguous Cases
**Handling:** Use best judgment based on primary public recognition
**Example:**
- Input: `"Elon Musk"`
- Primary recognition: Business/technology
- Category: `"public figure"`
### Multiple Equally Important Roles
**Handling:** List all relevant categories (up to 3)
**Example:**
- Input: `"Donald Glover / Childish Gambino"`
- Categories: `"actor, singer/musician, tv personality"`
## Notes for Researchers
When analyzing the annotated data:
1. **Category frequency** indicates which professions are most targeted
2. **Multi-category entries** show crossover between industries
3. **Online personality prevalence** indicates the rise of digital-native celebrities in deepfakes
4. **Adult performer numbers** highlight ongoing issues with non-consensual deepfakes
## Version History
- **v1.0** (Current): Initial 9-category system
- Refined from original detailed profession list
- Focused on clear, distinct categories
- Added comprehensive documentation