docs: update README and docs with new scrapers and location filter fixes
- README: add scholarshipdb.net and nature.com/careers to Credits
- README: add new scraper files to project structure tree
- docs: fix stray 'si' characters at start of index.html
- docs: update multi-source card (3 → 5 sources)
- docs: update "Search job boards" step to list all 5 sources
- docs: add scholarshipdb and nature.com/careers source cards
- docs: update architecture tree with new scraper files
- docs: update test count 125 → 156
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- README.md +7 -3
- docs/index.html +22 -10
README.md
CHANGED
@@ -94,9 +94,11 @@ PhDScout/
 ├── searcher.py # JobSearcher (orchestrates scrapers)
 ├── scrapers/
 ├── base.py # BaseScraper ABC + shared helpers
-├── euraxess.py
-├── mlscientist.py
-
+├── euraxess.py # EU/worldwide research portal
+├── mlscientist.py # ML & AI academic positions
+├── jobs_ac_uk.py # UK academic jobs (UK/worldwide only)
+├── scholarshipdb.py # Worldwide aggregator (28k+ positions)
+└── nature_careers.py # Nature.com/careers - multidisciplinary
 ```
 
 ---
@@ -116,6 +118,8 @@ Job data sourced from:
 - [Euraxess](https://euraxess.ec.europa.eu) - European Commission portal for research careers
 - [mlscientist.com](https://mlscientist.com) - ML & AI academic job board
 - [jobs.ac.uk](https://www.jobs.ac.uk) - UK academic jobs portal
+- [scholarshipdb.net](https://scholarshipdb.net) - Worldwide academic jobs and scholarships aggregator
+- [nature.com/careers](https://www.nature.com/naturecareers) - Multidisciplinary global research job board
 
 LLM inference powered by [Groq](https://groq.com) free API.
 
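The tree comments mention a `BaseScraper` ABC and a "UK/worldwide only" rule for jobs.ac.uk. A minimal sketch of how those two ideas could fit together is shown here. This commit only touches documentation, so the class bodies, the `supported_locations` attribute, and the `select_scrapers` helper are all hypothetical and may not match the real code in `scrapers/base.py`.

```python
from abc import ABC, abstractmethod
from typing import Optional


class BaseScraper(ABC):
    """Hypothetical base class; the project's real BaseScraper may differ."""

    name = "base"
    # None means the source accepts any location filter (assumption).
    supported_locations: Optional[frozenset] = None

    def supports(self, location: str) -> bool:
        """Return True if this source should be queried for the location."""
        if self.supported_locations is None:
            return True
        return location.lower() in self.supported_locations

    @abstractmethod
    def search(self, query: str, location: str) -> list:
        """Return a list of job dicts for the query."""


class JobsAcUkScraper(BaseScraper):
    name = "jobs.ac.uk"
    # Mirrors the "UK/worldwide only" note in the tree.
    supported_locations = frozenset({"uk", "worldwide"})

    def search(self, query, location):
        # Stub result; a real scraper would fetch and parse listings.
        return [{"title": f"{query} (stub)", "source": self.name}]


class EuraxessScraper(BaseScraper):
    name = "euraxess"

    def search(self, query, location):
        # Stub result; a real scraper would fetch and parse listings.
        return [{"title": f"{query} (stub)", "source": self.name}]


def select_scrapers(scrapers, location):
    """Keep only the scrapers that apply to the requested location."""
    return [s for s in scrapers if s.supports(location)]
```

With this shape, a search for "Germany" would skip jobs.ac.uk entirely instead of fetching UK listings and filtering them out afterwards.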
docs/index.html
CHANGED
@@ -441,7 +441,7 @@
 <div class="card-sm">
 <span class="icon-big">π</span>
 <h4>Multi-source Search</h4>
-<p>
+<p>5 job boards searched simultaneously - Europe, worldwide, and country-specific</p>
 </div>
 <div class="card-sm">
 <span class="icon-big">π€</span>
@@ -474,7 +474,7 @@
 <div class="step-num"></div>
 <div class="step-body">
 <strong>Search job boards</strong>
-<p>PhdScout queries Euraxess, mlscientist.com,
+<p>PhdScout queries Euraxess, mlscientist.com, jobs.ac.uk, scholarshipdb.net, and nature.com/careers in parallel, then deduplicates and filters by recency (expired listings discarded).</p>
 </div>
 </div>
 <div class="step">
@@ -696,17 +696,27 @@ scored = agent.score_jobs(jobs, profile_text)
 <div class="card-sm">
 <span class="icon-big">🇪🇺</span>
 <h4>Euraxess</h4>
-<p>EU/worldwide research portal. Country-filtered.</p>
+<p>EU/worldwide research portal. Country-filtered via API parameters.</p>
 </div>
 <div class="card-sm">
 <span class="icon-big">π€</span>
 <h4>mlscientist.com</h4>
-<p>ML & AI academic positions
+<p>ML & AI academic positions. 14 country categories supported.</p>
 </div>
 <div class="card-sm">
 <span class="icon-big">🇬🇧</span>
 <h4>jobs.ac.uk</h4>
-<p>UK academic jobs. Queried only when UK
+<p>UK academic jobs. Queried only when UK or Worldwide is selected.</p>
+</div>
+<div class="card-sm">
+<span class="icon-big">π</span>
+<h4>scholarshipdb.net</h4>
+<p>Worldwide aggregator with 28k+ positions across all disciplines. Country-filtered via URL path.</p>
+</div>
+<div class="card-sm">
+<span class="icon-big">π¬</span>
+<h4>nature.com/careers</h4>
+<p>Multidisciplinary global board. Keyword search + ISO country code filtering.</p>
 </div>
 </div>
 
@@ -868,11 +878,13 @@ The letter should be <span class="st">250-350 words (2-3 paragraphs)</span>.</co
 │ ├── <span class="dir">search/</span> <span class="note"># Job search infrastructure</span>
 │ ├── <span class="file">searcher.py</span> <span class="note"># JobSearcher (orchestrates scrapers)</span>
 │ ├── <span class="dir">scrapers/</span>
-│ ├── <span class="file">base.py</span>
-│ ├── <span class="file">euraxess.py</span>
-│ ├── <span class="file">mlscientist.py</span>
-│
-
+│ ├── <span class="file">base.py</span> <span class="note"># BaseScraper ABC + shared helpers</span>
+│ ├── <span class="file">euraxess.py</span> <span class="note"># EU/worldwide research portal</span>
+│ ├── <span class="file">mlscientist.py</span> <span class="note"># ML & AI academic positions</span>
+│ ├── <span class="file">jobs_ac_uk.py</span> <span class="note"># UK academic jobs (UK/worldwide only)</span>
+│ ├── <span class="file">scholarshipdb.py</span> <span class="note"># Worldwide aggregator (28k+ positions)</span>
+│ └── <span class="file">nature_careers.py</span> <span class="note"># nature.com/careers - multidisciplinary</span>
+├── <span class="dir">tests/</span> <span class="note"># 156 unit tests (pytest)</span>
 </div>
 
 <h2>Pipeline flow</h2>
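The updated "Search job boards" step says the sources are queried in parallel, then deduplicated and filtered by recency. A rough sketch of that flow follows, under stated assumptions: each scraper is modelled as a plain callable, jobs are dicts with `url` and `deadline` keys, and duplicates are detected by URL. None of these details are confirmed by the diff; the function name `search_all` is illustrative, not the project's API.

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import date


def search_all(scrapers, query, today):
    """Query every scraper in parallel, drop expired jobs, and deduplicate.

    `scrapers` is a list of callables taking a query string and returning
    a list of job dicts (hypothetical shape, not the project's real API).
    """
    # pool.map preserves input order, so results are deterministic.
    with ThreadPoolExecutor(max_workers=max(1, len(scrapers))) as pool:
        batches = list(pool.map(lambda scrape: scrape(query), scrapers))

    seen_urls = set()
    merged = []
    for batch in batches:
        for job in batch:
            deadline = job.get("deadline")
            # Recency filter: discard listings whose deadline has passed.
            if deadline is not None and deadline < today:
                continue
            # Deduplicate on URL (assumption; a real key might use the title).
            if job["url"] in seen_urls:
                continue
            seen_urls.add(job["url"])
            merged.append(job)
    return merged


def board_a(query):
    # Stub source with one live and one expired listing.
    return [
        {"title": "PhD in ML", "url": "https://example.org/1", "deadline": date(2030, 1, 1)},
        {"title": "Old post", "url": "https://example.org/2", "deadline": date(2020, 1, 1)},
    ]


def board_b(query):
    # Stub source that repeats a listing already returned by board_a.
    return [{"title": "PhD in ML (repost)", "url": "https://example.org/1", "deadline": None}]
```

Threads suit this kind of work because each scraper spends most of its time waiting on network I/O; the merge step stays single-threaded so the first source to list a job wins.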