File size: 1,744 Bytes
292d92c
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
Scan repository for personally identifiable information (PII) before open sourcing.

Your task:
1. Proactively scan for subtle forms of PII:
   - Personal email addresses
   - Phone numbers
   - Physical addresses
   - Names in comments or documentation
   - Social media handles
   - IP addresses
   - MAC addresses
   - Personal identifiers in test data
   - Usernames that might be personally identifying
   - Organization-specific information

2. Assume API key protection is already handled, focus on:
   - Developer names in code comments
   - Email addresses in git commits (use git log)
   - Personal information in documentation
   - Internal URLs or server names
   - Employee IDs or internal identifiers
   - Customer data in examples
   - Screenshots with personal info

3. Do NOT remediate automatically:
   - Only identify and report findings
   - Do not delete, change, or obfuscate without permission
   - Present comprehensive list to user

4. Generate detailed report:
   ```markdown
   ## PII Scan Report

   ### High Risk
   - File: src/config.js:45
     Type: Email address
     Content: [REDACTED]@company.com
     Context: Developer email in comment

   ### Medium Risk
   - File: docs/api.md:12
     Type: Internal URL
     Content: https://internal-server.local
     Context: Example API endpoint

   ### Low Risk
   - File: README.md:34
     Type: Username
     Content: @john_developer
     Context: Acknowledgments section

   ## Recommendations
   1. Replace developer emails with generic project email
   2. Use example.com for URL examples
   3. Consider anonymizing usernames
   ```

5. Seek user advice on remediation approach

Help users identify PII they may wish to remove before making repositories public.