# Profession Categories for Deepfake Adapter Classification

## Overview

The LLM annotation uses **9 specific profession categories** to classify people found in deepfake adapter datasets. These categories cover the most common types of public figures targeted by deepfake technology.

## The 9 Categories

### 1. **actor**
Film, TV, and theater performers who primarily work in scripted dramatic content.

**Examples:**
- Emma Watson (film actor)
- Scarlett Johansson (film actor)
- Bryan Cranston (TV actor)

**Keywords:** actor, actress, film, movie, cinema, theatrical, performer, drama

---

### 2. **adult performer**
People working in the adult entertainment industry.

**Examples:**
- Adult film actors
- OnlyFans creators
- Cam models

**Keywords:** adult entertainer, onlyfans, cam model, camgirl, webcam, pornographic actor, porn, pornstar

**Note:** This category is important for research on unauthorized deepfake usage, which disproportionately affects adult performers.

---

### 3. **singer/musician**
Vocalists, instrumentalists, and music performers across all genres.

**Examples:**
- IU (K-pop singer)
- Taylor Swift (singer/songwriter)
- DJ Khaled (DJ/producer)

**Keywords:** singer, musician, rapper, rap artist, band, vocalist, songwriter, composer, DJ, producer, kpop, jpop

---

### 4. **model**
Fashion, runway, and photoshoot models.

**Examples:**
- Gigi Hadid (fashion model)
- Tyra Banks (supermodel)
- Ashley Graham (plus-size model)

**Keywords:** model, fashion, runway, photoshoot, supermodel

---

### 5. **online personality**
Digital content creators, streamers, and influencers.

**Includes:**
- Streamers (Twitch, YouTube)
- Cosplayers
- YouTubers
- Instagram influencers
- Content creators
- E-girls/E-boys
- Gaming personalities

**Examples:**
- Belle Delphine (online personality, cosplayer)
- Pokimane (Twitch streamer)
- MrBeast (YouTuber)

**Keywords:** influencer, streamer, twitch, youtuber, youtube, content creator, instagrammer, instagram, e-girl, egirl, e-boy, eboy, cosplayer, gamer

---

### 6. **public figure**
Politicians, activists, journalists, authors, and other public-facing professionals not in entertainment.

**Examples:**
- Barack Obama (politician)
- Greta Thunberg (activist)
- Gordon Ramsay (chef)
- J.K. Rowling (author)

**Keywords:** politician, activist, public speaker, journalist, author, writer, chef, famous, celebrity (when not in entertainment)

---

### 7. **voice actor/ASMR**
Voice performers for animation, games, and ASMR content creators.

**Examples:**
- Tara Strong (voice actress)
- Troy Baker (voice actor)
- ASMR artists

**Keywords:** voice actor, voice actress, asmr creator, asmr

---

### 8. **sports professional**
Professional athletes and sports competitors.

**Examples:**
- Cristiano Ronaldo (soccer player)
- Serena Williams (tennis player)
- LeBron James (basketball player)

**Keywords:** athlete, sports, player, professional, competitor, olympian

---

### 9. **tv personality**
TV hosts, presenters, reality TV stars, and broadcast personalities.

**Examples:**
- Oprah Winfrey (talk show host)
- Kim Kardashian (reality TV)
- Jimmy Fallon (late night host)

**Keywords:** tv host, tv moderator, talk show host, talkshow, radio host, media personality, reality tv star, reality, comedian, presenter, broadcaster, anchor

---
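
One plausible use of the keyword lists above is as a lookup table that maps a free-text profession description to categories. The following is a minimal sketch under that assumption, not the project's actual implementation: `CATEGORY_KEYWORDS` and `match_categories` are hypothetical names, and only a subset of the keywords is included. Longer keywords are tried first so that "voice actor" is not also counted as "actor".

```python
import re

# Hypothetical subset of the keyword lists above, keyed by category name.
CATEGORY_KEYWORDS = {
    "actor": ["actor", "actress"],
    "adult performer": ["adult entertainer", "cam model", "pornographic actor"],
    "singer/musician": ["singer", "musician", "rapper", "songwriter"],
    "model": ["model", "supermodel"],
    "online personality": ["influencer", "streamer", "youtuber", "cosplayer"],
    "public figure": ["politician", "activist", "journalist", "author"],
    "voice actor/ASMR": ["voice actor", "voice actress", "asmr"],
    "sports professional": ["athlete", "olympian"],
    "tv personality": ["tv host", "talk show host", "presenter", "comedian"],
}


def match_categories(description, max_categories=3):
    """Return up to `max_categories` categories whose keywords occur in `description`."""
    text = description.lower()
    # Try longer keywords first so "voice actor" is consumed before "actor".
    pairs = sorted(
        ((kw, cat) for cat, kws in CATEGORY_KEYWORDS.items() for kw in kws),
        key=lambda pair: len(pair[0]),
        reverse=True,
    )
    hits = []  # (position in text, category)
    for keyword, category in pairs:
        match = re.search(r"\b" + re.escape(keyword) + r"\b", text)
        if match:
            hits.append((match.start(), category))
            # Blank out the matched span so shorter keywords cannot reuse it.
            text = text[:match.start()] + " " * len(keyword) + text[match.end():]
    ordered = []
    for _, category in sorted(hits):
        if category not in ordered:
            ordered.append(category)
    return ordered[:max_categories]


print(match_categories("American voice actress and singer"))
# ['voice actor/ASMR', 'singer/musician']
print(match_categories("Twitch streamer and cam model"))
# ['online personality', 'adult performer']
```

Ordering the result by position in the text is only a rough stand-in for "relevance". Keyword matching of this kind is best treated as a sanity check on the LLM output rather than a replacement for it.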

## Multi-Category Classification

Many public figures work across multiple categories. The LLM can assign **up to 3 professions** per person, ordered by relevance.

### Examples:

| Person | Professions | Explanation |
|--------|-------------|-------------|
| **Emma Watson** | `actor, public figure` | Primarily an actor, but also known for activism |
| **IU (Lee Ji-eun)** | `singer/musician, actor` | K-pop singer who also acts in TV dramas |
| **Belle Delphine** | `online personality, adult performer` | Internet personality with adult content |
| **Jamie Foxx** | `actor, singer/musician` | Actor who also has a music career |
| **Dwayne Johnson** | `actor, sports professional` | Former wrestler, now primarily an actor |

## Category Selection Guidelines

### For the LLM:

1. **Choose the most specific category first**
   - Use `"actor"` for film performers
   - Do not use `"public figure"` for someone primarily known as an actor

2. **Order by relevance**
   - Most important role first
   - Secondary roles after
   - Maximum 3 categories

3. **Be inclusive for online personalities**
   - Streamers → `"online personality"`
   - Cosplayers → `"online personality"`
   - Influencers → `"online personality"`

4. **Distinguish TV personalities from actors**
   - `"tv personality"` for talk show hosts
   - `"actor"` for scripted TV drama performers
   - Some can be both!
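
To illustrate how these guidelines might be passed to the model, here is a hedged sketch of a prompt builder. The actual prompt used for annotation is not shown in this document, so the wording, the `CATEGORIES` constant, and the `build_prompt` function are all assumptions made for illustration.

```python
# The nine category names as listed in this document.
CATEGORIES = [
    "actor",
    "adult performer",
    "singer/musician",
    "model",
    "online personality",
    "public figure",
    "voice actor/ASMR",
    "sports professional",
    "tv personality",
]


def build_prompt(name):
    """Assemble an illustrative annotation prompt for one person (hypothetical wording)."""
    category_list = "\n".join(f"- {c}" for c in CATEGORIES)
    return (
        f"Classify the profession of: {name}\n\n"
        f"Allowed categories:\n{category_list}\n\n"
        "Rules:\n"
        "1. Choose the most specific category first.\n"
        "2. Order categories by relevance, at most 3.\n"
        "3. Streamers, cosplayers, and influencers are \"online personality\".\n"
        "4. Talk show hosts are \"tv personality\"; "
        "scripted TV drama performers are \"actor\".\n"
        "Answer with a comma-separated list of category names only."
    )


print(build_prompt("IU (Lee Ji-eun)"))
```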

## Why These 9 Categories?

These categories were chosen based on:

1. **Common targets of deepfake technology**
   - Celebrities and public figures are most frequently deepfaked
   - Adult performers are disproportionately affected

2. **Clear distinctions**
   - Each category represents a distinct professional domain
   - Minimal overlap between categories

3. **Research relevance**
   - Important for analyzing demographic patterns in deepfake usage
   - Helps understand which professions are most at risk

4. **Comprehensive coverage**
   - Covers the vast majority of people in the deepfake adapter dataset
   - Includes both traditional and digital-native celebrities

## Output Format

The LLM returns professions as a **comma-separated list**:

```
"singer/musician, actor"
"online personality"
"actor, public figure, online personality"
"sports professional"
```
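
Because the output is free text, it may be worth validating it before analysis. The sketch below parses one returned string, checks each label against the nine category names, and enforces the limit of 3. The names `VALID_CATEGORIES` and `parse_professions` are hypothetical, and treating `"Unknown"` as an empty list is an assumption based on the fictional-character edge case in this document.

```python
# The nine category names as listed in this document.
VALID_CATEGORIES = {
    "actor",
    "adult performer",
    "singer/musician",
    "model",
    "online personality",
    "public figure",
    "voice actor/ASMR",
    "sports professional",
    "tv personality",
}


def parse_professions(raw, max_categories=3):
    """Split a comma-separated answer into validated, de-duplicated category names."""
    cleaned = raw.strip().strip('"')
    # Assumption: "Unknown" (or an empty answer) means no profession was assigned.
    if not cleaned or cleaned.lower() == "unknown":
        return []
    result = []
    for part in cleaned.split(","):
        label = part.strip()
        if label not in VALID_CATEGORIES:
            raise ValueError(f"unexpected category: {label!r}")
        if label not in result:  # keep the first occurrence, preserving order
            result.append(label)
    if len(result) > max_categories:
        raise ValueError(f"more than {max_categories} categories: {result}")
    return result


print(parse_professions('"singer/musician, actor"'))  # ['singer/musician', 'actor']
print(parse_professions("Unknown"))                   # []
```

Raising on an unexpected label, rather than silently dropping it, makes drift in the model's vocabulary visible early. A more lenient pipeline could log and skip instead.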

## Validation

After annotation, we can analyze:
- Distribution across categories
- Multi-category patterns
- Correlation with countries/regions
- Demographic patterns

Example analysis:
```python
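# Assumes `df` is the annotated pandas DataFrame with a `profession_llm` column

# Count by category (split multi-label entries so each category is counted separately)
categories = df['profession_llm'].str.split(',').explode().str.strip()
categories.value_counts()

# Most common combinations
df['profession_llm'].value_counts().head(20)

# Filter by specific category (exact label match, so "voice actor/ASMR" is not counted as "actor")
actors = df[df['profession_llm'].str.contains(r'(?:^|,\s*)actor(?:\s*,|$)', na=False)]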
```
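
The multi-category patterns mentioned above can be examined in two further ways: counting each person's primary role (the first label, since the list is ordered by relevance) and counting which categories co-occur. The following is a small standard-library sketch. The `annotations` list is toy data standing in for `df['profession_llm'].tolist()`, and `split_labels` is a hypothetical helper.

```python
from collections import Counter
from itertools import combinations

# Toy values standing in for df['profession_llm'].tolist().
annotations = [
    "actor, public figure",
    "singer/musician, actor",
    "online personality, adult performer",
    "actor, singer/musician",
    "Unknown",
    None,
]


def split_labels(value):
    """Turn one comma-separated cell into a list of labels, skipping empty and "Unknown"."""
    if not isinstance(value, str):
        return []
    parts = [p.strip() for p in value.split(",")]
    return [p for p in parts if p and p != "Unknown"]


rows = [split_labels(v) for v in annotations]

# Primary role: the first listed label, since labels are ordered by relevance.
primary_counts = Counter(row[0] for row in rows if row)

# Unordered co-occurrence pairs, so "actor, singer/musician" and
# "singer/musician, actor" are counted as the same combination.
pair_counts = Counter(
    pair for row in rows for pair in combinations(sorted(set(row)), 2)
)

print(primary_counts.most_common())
print(pair_counts.most_common(5))
```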

## Edge Cases

### Fictional Characters
**Handling:** Return `"Unknown"` for all fields

**Example:**
- Input: `"Elsa from Frozen"`
- Output: `profession_llm: "Unknown"`

### Unclear/Ambiguous Cases
**Handling:** Use best judgment based on primary public recognition

**Example:**
- Input: `"Elon Musk"`
- Primary recognition: Business/technology
- Category: `"public figure"`

### Multiple Equally Important Roles
**Handling:** List all relevant categories (up to 3)

**Example:**
- Input: `"Donald Glover / Childish Gambino"`
- Categories: `"actor, singer/musician, tv personality"`

## Notes for Researchers

When analyzing the annotated data:

1. **Category frequency** indicates which professions are most targeted
2. **Multi-category entries** show crossover between industries
3. **Online personality prevalence** indicates the rise of digital-native celebrities in deepfakes
4. **Adult performer numbers** highlight ongoing issues with non-consensual deepfakes

## Version History

- **v1.0** (Current): Initial 9-category system
  - Refined from original detailed profession list
  - Focused on clear, distinct categories
  - Added comprehensive documentation