# Profession Categories for Deepfake Adapter Classification

## Overview

The LLM annotation uses **9 specific profession categories** to classify people found in deepfake adapter datasets. These categories cover the most common types of public figures targeted by deepfake technology.

## The 9 Categories

### 1. **actor**

Film, TV, and theater performers who primarily work in scripted dramatic content.

**Examples:**
- Emma Watson (film actor)
- Scarlett Johansson (film actor)
- Bryan Cranston (TV actor)

**Keywords:** actor, actress, film, movie, cinema, theatrical, performer, drama

---

### 2. **adult performer**

People working in the adult entertainment industry.

**Examples:**
- Adult film actors
- OnlyFans creators
- Cam models

**Keywords:** adult entertainer, onlyfans, cam model, camgirl, webcam, pornographic actor, porn, pornstar

**Note:** This category is important for research on unauthorized deepfake usage, which disproportionately affects adult performers.

---

### 3. **singer/musician**

Vocalists, instrumentalists, and music performers across all genres.

**Examples:**
- IU (K-pop singer)
- Taylor Swift (singer/songwriter)
- DJ Khaled (DJ/producer)

**Keywords:** singer, musician, rapper, rap artist, band, vocalist, songwriter, composer, DJ, producer, kpop, jpop

---

### 4. **model**

Fashion, runway, and photoshoot models.

**Examples:**
- Gigi Hadid (fashion model)
- Tyra Banks (supermodel)
- Ashley Graham (plus-size model)

**Keywords:** model, fashion, runway, photoshoot, supermodel

---

### 5. **online personality**

Digital content creators, streamers, and influencers.

**Includes:**
- Streamers (Twitch, YouTube)
- Cosplayers
- YouTubers
- Instagram influencers
- Content creators
- E-girls/E-boys
- Gaming personalities

**Examples:**
- Belle Delphine (online personality, cosplayer)
- Pokimane (Twitch streamer)
- MrBeast (YouTuber)

**Keywords:** influencer, streamer, twitch, youtuber, youtube, content creator, instagrammer, instagram, e-girl, egirl, e-boy, eboy, cosplayer, gamer

---

### 6. **public figure**

Politicians, activists, journalists, authors, and other public-facing professionals not in entertainment.

**Examples:**
- Barack Obama (politician)
- Greta Thunberg (activist)
- Gordon Ramsay (chef)
- J.K. Rowling (author)

**Keywords:** politician, activist, public speaker, journalist, author, writer, chef, famous, celebrity (when not in entertainment)

---

### 7. **voice actor/ASMR**

Voice performers for animation and games, and ASMR content creators.

**Examples:**
- Tara Strong (voice actress)
- Troy Baker (voice actor)
- ASMR artists

**Keywords:** voice actor, voice actress, asmr creator, asmr

---

### 8. **sports professional**

Professional athletes and sports competitors.

**Examples:**
- Cristiano Ronaldo (soccer player)
- Serena Williams (tennis player)
- LeBron James (basketball player)

**Keywords:** athlete, sports, player, professional, competitor, olympian

---

### 9. **tv personality**

TV hosts, presenters, reality TV stars, and broadcast personalities.

**Examples:**
- Oprah Winfrey (talk show host)
- Kim Kardashian (reality TV)
- Jimmy Fallon (late night host)

**Keywords:** tv host, tv moderator, talk show host, talkshow, radio host, media personality, reality tv star, reality, comedian, presenter, broadcaster, anchor

---

## Multi-Category Classification

Many public figures work across multiple categories. The LLM can assign **up to 3 professions** per person, ordered by relevance.

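
The constraints above (labels drawn from the nine categories, at most three per person, order preserved) could be checked after annotation. The following is a minimal sketch, not the project's actual validation code: the helper name `parse_professions` and the constants are hypothetical, and it assumes the comma-separated output format and the `"Unknown"` fallback described elsewhere in this document.

```python
from typing import List

# The nine category labels, copied from the section headings of this document.
CATEGORIES = (
    "actor",
    "adult performer",
    "singer/musician",
    "model",
    "online personality",
    "public figure",
    "voice actor/ASMR",
    "sports professional",
    "tv personality",
)
MAX_PROFESSIONS = 3  # "up to 3 professions" per person


def parse_professions(raw: str) -> List[str]:
    """Split a comma-separated label string and validate it (hypothetical helper)."""
    # Assumption: the raw string may still carry surrounding double quotes.
    cleaned = raw.strip().strip('"')
    labels = [part.strip() for part in cleaned.split(",") if part.strip()]

    # "Unknown" is the documented fallback (e.g. for fictional characters).
    if labels == ["Unknown"]:
        return labels

    unrecognized = [label for label in labels if label not in CATEGORIES]
    if unrecognized:
        raise ValueError(f"unrecognized categories: {unrecognized}")
    if len(labels) > MAX_PROFESSIONS:
        raise ValueError(f"more than {MAX_PROFESSIONS} categories: {labels}")
    if len(set(labels)) != len(labels):
        raise ValueError(f"duplicate categories: {labels}")

    # Order is preserved, so the first label is the most relevant one.
    return labels


print(parse_professions("singer/musician, actor"))
```

Raising on unrecognized labels, rather than silently dropping them, is one possible design choice; it makes annotation drift visible early.
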
### Examples:

| Person | Professions | Explanation |
|--------|-------------|-------------|
| **Emma Watson** | `actor, public figure` | Primarily an actor, but also known for activism |
| **IU (Lee Ji-eun)** | `singer/musician, actor` | K-pop singer who also acts in TV dramas |
| **Belle Delphine** | `online personality, adult performer` | Internet personality with adult content |
| **Jamie Foxx** | `actor, singer/musician` | Actor who also has a music career |
| **Dwayne Johnson** | `actor, sports professional` | Former wrestler, now primarily an actor |

## Category Selection Guidelines

### For the LLM:

1. **Choose most specific category first**
   - ✅ `"actor"` for film performers
   - ❌ `"public figure"` for someone primarily known as an actor

2. **Order by relevance**
   - Most important role first
   - Secondary roles after
   - Maximum 3 categories

3. **Be inclusive for online personalities**
   - Streamers → `"online personality"`
   - Cosplayers → `"online personality"`
   - Influencers → `"online personality"`

4. **Distinguish TV personalities from actors**
   - ✅ `"tv personality"` for talk show hosts
   - ✅ `"actor"` for scripted TV drama performers
   - Some people are both; list both categories in that case

## Why These 9 Categories?

These categories were chosen based on:

1. **Common targets of deepfake technology**
   - Celebrities and public figures are most frequently deepfaked
   - Adult performers are disproportionately affected

2. **Clear distinctions**
   - Each category represents a distinct professional domain
   - Minimal overlap between category definitions (a person may still belong to several)

3. **Research relevance**
   - Important for analyzing demographic patterns in deepfake usage
   - Helps understand which professions are most at risk

4. **Comprehensive coverage**
   - Covers the vast majority of people in the deepfake adapter dataset
   - Includes both traditional and digital-native celebrities

## Output Format

The LLM returns professions as a **comma-separated list**:

```
"singer/musician, actor"
"online personality"
"actor, public figure, online personality"
"sports professional"
```

## Validation

After annotation, we can analyze:
- Distribution across categories
- Multi-category patterns
- Correlation with countries/regions
- Demographic patterns

Example analysis:

```python
# df: pandas DataFrame of annotations with a 'profession_llm' column

# Count by category: split multi-label entries so each category is counted once per person
df['profession_llm'].dropna().str.split(',').explode().str.strip().value_counts()

# Most common combinations
df['profession_llm'].value_counts().head(20)

# Filter by specific category; match whole labels so 'voice actor/ASMR' is not picked up as 'actor'
actors = df[df['profession_llm'].str.contains(r'(?:^|,\s*)actor(?:\s*,|$)', na=False)]
```

## Edge Cases

### Fictional Characters

**Handling:** Return `"Unknown"` for all fields

**Example:**
- Input: `"Elsa from Frozen"`
- Output: `profession_llm: "Unknown"`

### Unclear/Ambiguous Cases

**Handling:** Use best judgment based on primary public recognition

**Example:**
- Input: `"Elon Musk"`
- Primary recognition: Business/technology
- Category: `"public figure"`

### Multiple Equally Important Roles

**Handling:** List all relevant categories (up to 3)

**Example:**
- Input: `"Donald Glover / Childish Gambino"`
- Categories: `"actor, singer/musician, tv personality"`

## Notes for Researchers

When analyzing the annotated data:

1. **Category frequency** indicates which professions are most targeted
2. **Multi-category entries** show crossover between industries
3. **Online personality prevalence** indicates the rise of digital-native celebrities in deepfakes
4. **Adult performer numbers** highlight ongoing issues with non-consensual deepfakes

## Version History

- **v1.0** (Current): Initial 9-category system
  - Refined from original detailed profession list
  - Focused on clear, distinct categories
  - Added comprehensive documentation
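
## Example: Category and Crossover Counts

The first two points under Notes for Researchers (category frequency and multi-category crossover) can be computed without any third-party dependency. This is a minimal sketch using only the standard library; the function name `summarize_annotations` is hypothetical, and the sample rows are simply the label strings from the multi-category examples table in this document, not real dataset statistics.

```python
from collections import Counter
from itertools import combinations


def summarize_annotations(rows):
    """Return (category_counts, pair_counts) for an iterable of label strings."""
    category_counts = Counter()
    pair_counts = Counter()
    for raw in rows:
        labels = [part.strip() for part in raw.split(",") if part.strip()]
        # Skip empty entries and the documented "Unknown" fallback.
        if not labels or labels == ["Unknown"]:
            continue
        # Sort so that each pair is counted under one canonical key,
        # regardless of the relevance order in the annotation.
        unique = sorted(set(labels))
        category_counts.update(unique)
        pair_counts.update(combinations(unique, 2))
    return category_counts, pair_counts


# Illustrative rows taken from the examples table in this document.
sample = [
    "actor, public figure",
    "singer/musician, actor",
    "online personality, adult performer",
    "actor, singer/musician",
    "actor, sports professional",
    "Unknown",
]

category_counts, pair_counts = summarize_annotations(sample)
print(category_counts["actor"])                       # 4
print(pair_counts[("actor", "singer/musician")])      # 2
```

Sorting the labels discards the relevance order, which is acceptable for co-occurrence counts but not for analyses of primary profession; those would need the first label of each entry instead.
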