---
language: en
license: mit
tags:
- classification
- lobbying
- linkedin
datasets: custom
---

# Lobbyist classifier (English (US))

A binary sequence classifier fine-tuned to predict whether a LinkedIn-style job position (title, employer, and description concatenated into one string) is a **lobbyist** position (label 1) or not (label 0). It was trained for the project "Who Becomes a Lobbyist?" (MINISTERIALLOBBY) on Revelio/LinkedIn position text. Labels come from the German Bundestag lobby register (DE) or LobbyView (US), depending on the country.

- **Base model:** `distilbert-base-uncased`
- **Task:** Sequence classification (2 labels: non-lobbyist, lobbyist)
- **Max length:** 256 tokens

## Evaluation (5-fold CV)

- Mean F1: 0.8942 (± 0.0025)
- Fold F1 scores: [0.8954220915581689, 0.891170431211499, 0.8943089430894309, 0.8919135308246597, 0.8982282653481665]
- Training samples: 12,834 (6,417 positive, 6,417 negative)

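The summary line can be reproduced from the per-fold scores. The sketch below recomputes the mean and two candidate spreads using only the standard library. The card does not say how the ± value was computed; the reported 0.0025 matches the population standard deviation of the five folds, which is an inference from the numbers rather than a documented fact.

```python
from statistics import mean, pstdev, stdev

# Per-fold F1 scores exactly as reported in the evaluation list.
fold_f1 = [
    0.8954220915581689,
    0.891170431211499,
    0.8943089430894309,
    0.8919135308246597,
    0.8982282653481665,
]

mean_f1 = mean(fold_f1)
pop_sd = pstdev(fold_f1)    # divides by n (assumed to be what the card reports)
sample_sd = stdev(fold_f1)  # divides by n - 1, shown for comparison

print(f"mean F1       = {mean_f1:.4f}")    # 0.8942
print(f"population SD = {pop_sd:.4f}")     # 0.0025
print(f"sample SD     = {sample_sd:.4f}")  # 0.0028
```

With only five folds the two spreads differ noticeably, so it is worth stating which one is meant when comparing against other models.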
## Intended use

- Research: classifying past or current job positions as lobbying or non-lobbying for career-path and panel analyses.
- Not intended for commercial use without first checking compliance with the LinkedIn/Revelio terms of use.

## Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

repo_id = "cornelius/lobbyist-classifier-us"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSequenceClassification.from_pretrained(repo_id)

def predict(texts):
    """Return P(lobbyist) for each text as a NumPy array."""
    # Truncate to the 256-token training length; pad to the longest text in the batch.
    inp = tokenizer(texts, truncation=True, max_length=256, padding=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inp).logits
    probs = torch.softmax(logits, dim=1)
    return probs[:, 1].numpy()  # column 1 = lobbyist

# Single position: title + " " + company + " " + description
text = "Senior Public Affairs Manager Acme Corp Government relations and advocacy."
threshold = 0.95  # decision cutoff on P(lobbyist)
prob = predict([text])[0]
print(f"P(lobbyist) = {prob:.2f}, lobbyist = {prob >= threshold}")
```
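
The snippet above expects each position as a single string: title, company, and description joined by spaces. The following is a minimal sketch of how raw fields might be assembled into that string and how probabilities might be turned into 0/1 labels. `build_position_text` and `to_labels` are hypothetical helper names, not part of the model repository. Dropping empty fields, collapsing whitespace, and treating a probability equal to the cutoff as positive are assumptions; the training pipeline may have handled these cases differently. The 0.95 default is taken from the usage snippet.

```python
from typing import Iterable, List, Optional


def build_position_text(
    title: Optional[str], company: Optional[str], description: Optional[str]
) -> str:
    """Join title, company, and description with single spaces (hypothetical helper)."""
    parts = [title, company, description]
    # Assumption: skip missing or blank fields and collapse runs of whitespace.
    cleaned = [" ".join(p.split()) for p in parts if p and p.strip()]
    return " ".join(cleaned)


def to_labels(probs: Iterable[float], threshold: float = 0.95) -> List[int]:
    """Map P(lobbyist) values to 1 (lobbyist) or 0 (non-lobbyist)."""
    # Assumption: a probability exactly at the cutoff counts as positive.
    return [1 if p >= threshold else 0 for p in probs]


text = build_position_text(
    "Senior Public Affairs Manager",
    "Acme Corp",
    "Government relations and advocacy.",
)
print(text)
print(to_labels([0.99, 0.50, 0.95]))
```

The string produced here is identical to the example text in the usage snippet, so it can be passed to `predict` directly. A cutoff as high as 0.95 favours precision over recall, which suits analyses where false positives are more costly than missed lobbyists.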

## Citation

If you use this model, please cite the paper "Who Becomes a Lobbyist? Comparative Evidence from the US and Germany" (MINISTERIALLOBBY project, DFG).