title: ISP Handbook Engine
emoji: π
colorFrom: blue
colorTo: indigo
sdk: docker
pinned: false
ISP Handbook Service: Python Migration
A Python/FastAPI service that generates the ISP (International Scholars Program) Handbook as PDF or HTML. This is a drop-in replacement for the PHP handbook generation pipeline, designed to be called over HTTP from the existing PHP application.
Architecture
python_service/
├── app/
│   ├── main.py                  # FastAPI entry point
│   ├── api/
│   │   └── routes.py            # REST endpoints
│   ├── core/
│   │   ├── config.py            # Environment-based settings
│   │   ├── database.py          # SQLAlchemy engine (MySQL)
│   │   ├── fonts.py             # Century Gothic font management
│   │   └── logging.py           # Logging setup
│   ├── models/                  # SQLAlchemy models (if needed)
│   ├── repositories/
│   │   └── handbook_repo.py     # Direct DB access (fallback)
│   ├── schemas/
│   │   └── handbook.py          # Pydantic request/response models
│   └── services/
│       ├── data_fetcher.py      # Fetch data from external JSON APIs
│       ├── html_builder.py      # Build full handbook HTML
│       ├── pdf_service.py       # HTML -> PDF via WeasyPrint
│       ├── renderers.py         # TOC, sections, university renderers
│       └── utils.py             # Shared helpers (h, money format, etc.)
├── tests/
│   ├── test_api.py
│   └── test_renderers.py
├── fonts/                       # Century Gothic TTF files
├── images/                      # Handbook images (cover, header, etc.)
├── css/                         # Base stylesheet
├── Dockerfile
├── requirements.txt
├── .env.example
└── README.md
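The tree lists `utils.py` as holding shared helpers such as `h` and a money formatter. As a rough illustration only, here is what such helpers might look like. The exact behaviour of the real `utils.py` is not shown in this README, so the `format_money` name, the `$` default, and the two-decimal style are assumptions.

```python
import html


def h(value: object) -> str:
    """Escape a value for safe use in HTML text or attribute values.

    Mirrors the role of the PHP h() helper; None becomes an empty string
    (an assumption about the desired behaviour).
    """
    if value is None:
        return ""
    return html.escape(str(value), quote=True)


def format_money(amount: float, currency: str = "$") -> str:
    """Format an amount with thousands separators and two decimals.

    Hypothetical helper: the name, default symbol, and style are assumptions.
    """
    return f"{currency}{amount:,.2f}"
```

Escaping every dynamic value through one helper keeps the renderers free of ad hoc string handling.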
API Endpoints
| Method | Path | Description |
|---|---|---|
| GET | /health | Health check |
| GET | /diagnostics/fonts | Font file diagnostics |
| GET | /api/v1/sections/global?catalog_id=0 | Fetch normalised global sections |
| GET | /api/v1/sections/universities | Fetch normalised university sections |
| GET | /api/v1/handbook/pdf?catalog_id=0 | Generate PDF (download) |
| POST | /api/v1/handbook/pdf | Generate PDF with JSON body |
| GET | /api/v1/handbook/html?catalog_id=0 | Generate HTML preview |
| POST | /api/v1/handbook/render | Generate PDF or HTML based on output_format |
| GET | /docs | Swagger UI |
| GET | /redoc | ReDoc UI |
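The JSON body accepted by `POST /api/v1/handbook/render` can be pictured with a small model. The sketch below uses a standard-library dataclass so it runs without dependencies, whereas the service itself defines its schemas with Pydantic in `schemas/handbook.py`. The field names and defaults are taken from the PHP client example in this README; the `RenderRequest` name and the assumption that `"html"` is the only other accepted `output_format` value are hypothetical.

```python
import json
from dataclasses import asdict, dataclass

# Assumption: "pdf" appears in the PHP example; "html" is inferred from the
# endpoint description and may not match the real schema exactly.
ALLOWED_FORMATS = ("pdf", "html")


@dataclass(frozen=True)
class RenderRequest:
    """Hypothetical stand-in for the request schema of /api/v1/handbook/render."""

    catalog_id: int = 0
    include_inactive_programs: bool = False
    debug: bool = False
    output_format: str = "pdf"

    def __post_init__(self) -> None:
        # Reject unknown formats early instead of failing deep in rendering.
        if self.output_format not in ALLOWED_FORMATS:
            raise ValueError(f"unsupported output_format: {self.output_format!r}")

    def to_json(self) -> str:
        """Serialise the request as the JSON body sent to the endpoint."""
        return json.dumps(asdict(self))
```

A caller would build the body with something like `RenderRequest(catalog_id=1, output_format="html").to_json()` and send it with a `Content-Type: application/json` header.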
Local Development
Prerequisites
- Python 3.11+
- MySQL database (existing schema, unchanged)
- Century Gothic font files in the `fonts/` directory
Setup
cd python_service
# Create virtualenv
python -m venv .venv
.venv\Scripts\activate # Windows
# source .venv/bin/activate # Linux/Mac
# Install dependencies
pip install -r requirements.txt
# Copy and configure environment
copy .env.example .env # Windows
# cp .env.example .env # Linux/Mac
# Edit .env with your database credentials and API URLs
Run
uvicorn app.main:app --reload --host 0.0.0.0 --port 7860
Visit http://localhost:7860/docs for the interactive API documentation.
Run Tests
pytest tests/ -v
Docker
Build
docker build -t isp-handbook-service .
Run
docker run -d \
--name handbook-service \
-p 7860:7860 \
-e DB_HOST=host.docker.internal \
-e DB_USER=root \
-e DB_PASSWORD=secret \
-e DB_NAME=handbook \
-e API_BASE_URL=https://finsapdev.qhtestingserver.com \
isp-handbook-service
Or with an env file:
docker run -d --name handbook-service -p 7860:7860 --env-file .env isp-handbook-service
Hugging Face Spaces Deployment
- Create a new Space on Hugging Face with the Docker SDK
- Upload/push the `python_service/` directory as the Space root
- Ensure the `fonts/`, `images/`, and `css/` directories are included
- Set environment variables (Secrets) in Space settings:
  - `DB_HOST`, `DB_USER`, `DB_PASSWORD`, `DB_NAME`
  - `API_BASE_URL`
  - `PORT=7860` (default for HF Spaces)
- The `Dockerfile` is already configured for HF Spaces (port 7860, `0.0.0.0`)
Important: Hugging Face Spaces may not allow outbound MySQL connections. If the database cannot be reached from the Space, rely on the external API approach, which is the primary path anyway: the service fetches data from the PHP JSON APIs over HTTP, not from the database directly.
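The environment variables named above could be read into a typed settings object along the following lines. This is a minimal sketch using only the standard library, not the contents of `core/config.py`; the `Settings` and `load_settings` names and every default value except port 7860 are assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the variables listed in this README."""

    db_host: str
    db_user: str
    db_password: str
    db_name: str
    api_base_url: str
    port: int


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from a mapping, defaulting to the process environment.

    Defaults other than PORT=7860 are placeholders chosen for illustration.
    """
    env = os.environ if env is None else env
    return Settings(
        db_host=env.get("DB_HOST", "localhost"),
        db_user=env.get("DB_USER", ""),
        db_password=env.get("DB_PASSWORD", ""),
        db_name=env.get("DB_NAME", ""),
        api_base_url=env.get("API_BASE_URL", "").rstrip("/"),
        port=int(env.get("PORT", "7860")),
    )
```

Passing the mapping explicitly makes the loader easy to test without touching the real environment.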
PHP Integration Example
The PHP application can call this Python service over HTTP using cURL:
<?php
/**
* PHP client for the ISP Handbook Python Service.
* Replace HANDBOOK_SERVICE_URL with your actual deployment URL.
*/
define('HANDBOOK_SERVICE_URL', 'http://localhost:7860');
/**
* Check service health.
*/
function handbook_health(): array {
$url = HANDBOOK_SERVICE_URL . '/health';
$ch = curl_init($url);
curl_setopt_array($ch, [
CURLOPT_RETURNTRANSFER => true,
CURLOPT_TIMEOUT => 5,
]);
$body = curl_exec($ch);
$code = (int)curl_getinfo($ch, CURLINFO_HTTP_CODE);
curl_close($ch);
if ($code !== 200) {
return ['ok' => false, 'error' => 'Service unreachable', 'http_code' => $code];
}
return json_decode($body, true) ?? ['ok' => false, 'error' => 'Invalid response'];
}
/**
* Generate and download the handbook PDF.
*/
function handbook_download_pdf(int $catalogId = 0, bool $debug = false): void {
$params = http_build_query([
'catalog_id' => $catalogId,
'debug' => $debug ? 'true' : 'false',
]);
$url = HANDBOOK_SERVICE_URL . '/api/v1/handbook/pdf?' . $params;
$ch = curl_init($url);
curl_setopt_array($ch, [
CURLOPT_RETURNTRANSFER => true,
CURLOPT_TIMEOUT => 120,
CURLOPT_FOLLOWLOCATION => true,
]);
$body = curl_exec($ch);
$code = (int)curl_getinfo($ch, CURLINFO_HTTP_CODE);
$contentType = curl_getinfo($ch, CURLINFO_CONTENT_TYPE);
curl_close($ch);
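// Cast to string: CURLINFO_CONTENT_TYPE is null when the response has no Content-Type header.
if ($code !== 200 || strpos((string)$contentType, 'application/pdf') === false) {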
http_response_code(502);
header('Content-Type: text/plain');
echo "PDF generation failed (HTTP $code)";
return;
}
header('Content-Type: application/pdf');
header('Content-Disposition: attachment; filename="ISP_Handbook.pdf"');
header('Content-Length: ' . strlen($body));
echo $body;
}
/**
* Fetch global sections via the Python service.
*/
function handbook_get_sections(int $catalogId = 0): array {
$url = HANDBOOK_SERVICE_URL . '/api/v1/sections/global?catalog_id=' . $catalogId;
$ch = curl_init($url);
curl_setopt_array($ch, [
CURLOPT_RETURNTRANSFER => true,
CURLOPT_TIMEOUT => 25,
]);
$body = curl_exec($ch);
curl_close($ch);
return json_decode($body, true) ?? [];
}
/**
* Generate handbook via POST with custom options.
*/
function handbook_generate(array $options = []): string {
$url = HANDBOOK_SERVICE_URL . '/api/v1/handbook/render';
$payload = json_encode(array_merge([
'catalog_id' => 0,
'include_inactive_programs' => false,
'debug' => false,
'output_format' => 'pdf',
], $options));
$ch = curl_init($url);
curl_setopt_array($ch, [
CURLOPT_RETURNTRANSFER => true,
CURLOPT_POST => true,
CURLOPT_POSTFIELDS => $payload,
CURLOPT_HTTPHEADER => ['Content-Type: application/json'],
CURLOPT_TIMEOUT => 120,
]);
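$body = curl_exec($ch);
curl_close($ch);
// curl_exec() returns false on transport failure; normalise to an empty string.
return $body === false ? '' : $body;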
}
Usage in PHP
// Health check
$status = handbook_health();
// handbook_health() returns no 'status' key on failure, so guard the lookup.
if (($status['status'] ?? null) === 'ok') {
echo "Service is running\n";
}
// Stream PDF to browser
handbook_download_pdf(catalogId: 1);
// Get sections data
$sections = handbook_get_sections(catalogId: 1);
print_r($sections);
Migration Notes & Assumptions
What was migrated
| PHP Component | Python Equivalent | Notes |
|---|---|---|
| common.php (URL builder, HTTP client) | data_fetcher.py | Uses httpx instead of cURL |
| cors.php | FastAPI CORS middleware | Same origins preserved |
| helpers.php (h(), respondJson()) | Built into FastAPI + utils.py | |
| fetchers.php (global/uni data fetch) | data_fetcher.py | Identical normalisation logic |
| renderers.php (TOC, blocks, university) | renderers.py | All block types preserved |
| html_builder.php (buildHandbookHtml) | html_builder.py | Same HTML structure |
| pdf.php (Dompdf render) | pdf_service.py | WeasyPrint replaces Dompdf |
| images.php (image config) | pdf_service.py _get_images_config() | |
| font_diagnostics.php | GET /diagnostics/fonts | |
| db.php (mysqli) | database.py (SQLAlchemy) | Available but not primary path |
Key differences
- PDF engine: WeasyPrint replaces Dompdf. Layout may differ slightly in edge cases (table widths, page breaks). Both support `@font-face` with base64 TTF and `@page` rules.
- TOC page numbers: The PHP code uses a two-pass Dompdf render to inject exact TOC page numbers via named destinations. WeasyPrint doesn't expose named destinations the same way, so TOC pages are assigned sequentially in the initial migration. Exact page numbers can be added via a post-processing PDF pass if needed.
- No auth: The PHP code has no authentication. The Python service also has none. Add API key middleware if this service is exposed publicly.
- Data source: The service fetches data from the same two PHP JSON APIs over HTTP (not directly from the database). The `repositories/handbook_repo.py` module provides a DB fallback if you want to bypass the PHP APIs entirely.
- SSL verification: Disabled for internal API calls (`verify=False` in httpx), matching the PHP behavior (`CURLOPT_SSL_VERIFYPEER => false`).
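For the "No auth" point, the core of an API key check can be sketched independently of any web framework. The function below is an illustration only: the `X-API-Key` header name and the `is_authorized` helper are hypothetical, and wiring it into FastAPI (for example as a dependency or middleware) is left out.

```python
import hmac
from typing import Mapping, Optional

# Hypothetical header name; pick whatever the PHP caller can send.
API_KEY_HEADER = "x-api-key"


def is_authorized(headers: Mapping[str, str], expected_key: Optional[str]) -> bool:
    """Return True only if the request carries the expected API key.

    Fails closed: with no key configured, every request is rejected.
    """
    if not expected_key:
        return False
    supplied = ""
    for name, value in headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() == API_KEY_HEADER:
            supplied = value
            break
    # compare_digest avoids leaking key contents through timing differences.
    return hmac.compare_digest(supplied.encode("utf-8"), expected_key.encode("utf-8"))
```

On the PHP side the key would travel as an extra entry in `CURLOPT_HTTPHEADER`. Whether to fail closed when no key is configured is a design choice; it is the safer default for a publicly exposed service.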
Risks
- Font rendering: Century Gothic rendering may differ slightly between Dompdf (PHP) and WeasyPrint (Python). Test with actual fonts.
- Page break behavior: Dompdf and WeasyPrint handle CSS `page-break-*` properties slightly differently.
- Image embedding: Remote campus images are fetched at generation time. Network issues will result in placeholder cells (same as PHP behavior).
- Memory: Large handbooks with many university images may require significant memory. The Dockerfile doesn't set memory limits; Hugging Face Spaces has its own limits.
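The image-embedding fallback described above can be illustrated with a small helper that degrades to a placeholder when a fetch fails. This is a sketch under assumptions: the `image_cell` name, the placeholder markup, and the use of base64 data URIs are hypothetical and may not match what `pdf_service.py` or `renderers.py` actually do. The fetch function is injected so the example runs without network access.

```python
import base64
from typing import Callable

# Hypothetical placeholder markup; the real service may render something else.
PLACEHOLDER_HTML = '<div class="img-placeholder"></div>'


def image_cell(url: str, fetch: Callable[[str], bytes], mime: str = "image/jpeg") -> str:
    """Return an <img> tag with inline data, or a placeholder on failure.

    `fetch` is any callable that downloads the bytes at `url`
    (for example a thin wrapper around an HTTP client).
    """
    try:
        data = fetch(url)
    except Exception:
        # Any fetch error (timeout, DNS, HTTP error) degrades to a placeholder
        # so one broken image does not abort handbook generation.
        return PLACEHOLDER_HTML
    if not data:
        return PLACEHOLDER_HTML
    encoded = base64.b64encode(data).decode("ascii")
    return f'<img src="data:{mime};base64,{encoded}" alt="">'
```

Keeping the fallback in one place also makes it easier to log which campus images failed during a run.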