Upload folder using huggingface_hub
- cloudflare/WASM-PROXY.md +104 -0
- cloudflare/cors-proxy-worker.js +351 -0
- cloudflare/wasm-proxy-worker.js +356 -0
- cloudflare/wasm-wrangler.toml +69 -0
- cloudflare/wrangler.toml +45 -0
cloudflare/WASM-PROXY.md
ADDED
@@ -0,0 +1,104 @@
# WASM Proxy Setup Guide

BentoPDF uses a Cloudflare Worker to proxy WASM library requests, bypassing CORS restrictions when loading AGPL-licensed components (PyMuPDF, Ghostscript, CoherentPDF) from external sources.

## Quick Start

### 1. Deploy the Worker

```bash
cd cloudflare
npx wrangler login
npx wrangler deploy -c wasm-wrangler.toml
```

### 2. Configure Source URLs

Set environment secrets with the base URLs for your WASM files:

```bash
# Option A: Interactive prompts
npx wrangler secret put PYMUPDF_SOURCE -c wasm-wrangler.toml
npx wrangler secret put GS_SOURCE -c wasm-wrangler.toml
npx wrangler secret put CPDF_SOURCE -c wasm-wrangler.toml

# Option B: Set via Cloudflare Dashboard
# Go to Workers & Pages > bentopdf-wasm-proxy > Settings > Variables
```

**Recommended Source URLs:**

- PYMUPDF_SOURCE: `https://cdn.jsdelivr.net/npm/@bentopdf/pymupdf-wasm@0.11.14/`
- GS_SOURCE: `https://cdn.jsdelivr.net/npm/@bentopdf/gs-wasm/assets/`
- CPDF_SOURCE: `https://cdn.jsdelivr.net/npm/coherentpdf/dist/`

> **Note:** You can use your own hosted WASM files instead of the recommended URLs. Just ensure your files match the directory structure and file names that BentoPDF expects for each module.

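To make the mapping from proxy path to source URL concrete, here is a minimal sketch of how a request path might be joined onto a configured base URL. It mirrors the slash normalization in `wasm-proxy-worker.js`; the helper name `resolveUpstreamUrl` and the file name `example.wasm` are assumptions for illustration, not part of BentoPDF.

```javascript
// Hypothetical helper: joins a source base URL and a request subpath
// the same way the worker normalizes them before fetching upstream.
function resolveUpstreamUrl(sourceBaseUrl, subpath) {
  const base = sourceBaseUrl.endsWith('/')
    ? sourceBaseUrl.slice(0, -1)
    : sourceBaseUrl;
  const path = subpath.startsWith('/') ? subpath : `/${subpath}`;
  return `${base}${path}`;
}

// The worker strips the module prefix (here "/gs") before proxying.
// "example.wasm" is a made-up file name.
const requestPath = '/gs/example.wasm';
const upstream = resolveUpstreamUrl(
  'https://cdn.jsdelivr.net/npm/@bentopdf/gs-wasm/assets/',
  requestPath.replace('/gs', '')
);
console.log(upstream);
```

Because the base URL is normalized, it should not matter whether the secret is stored with or without a trailing slash.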
### 3. Configure BentoPDF

**Option A: Environment variables (recommended: zero-config for users)**

Set these in `.env.production` or pass as Docker build args:

```bash
VITE_WASM_PYMUPDF_URL=https://bentopdf-wasm-proxy.<your-subdomain>.workers.dev/pymupdf/
VITE_WASM_GS_URL=https://bentopdf-wasm-proxy.<your-subdomain>.workers.dev/gs/
VITE_WASM_CPDF_URL=https://bentopdf-wasm-proxy.<your-subdomain>.workers.dev/cpdf/
```

**Option B: Manual per-user configuration**

In BentoPDF's Advanced Settings (wasm-settings.html), enter:

| Module      | URL                                                                 |
| ----------- | ------------------------------------------------------------------- |
| PyMuPDF     | `https://bentopdf-wasm-proxy.<your-subdomain>.workers.dev/pymupdf/` |
| Ghostscript | `https://bentopdf-wasm-proxy.<your-subdomain>.workers.dev/gs/`      |
| CoherentPDF | `https://bentopdf-wasm-proxy.<your-subdomain>.workers.dev/cpdf/`    |

## Custom Domain (Optional)

To use a custom domain like `wasm.bentopdf.com`:

1. Add route in `wasm-wrangler.toml`:

   ```toml
   routes = [
     { pattern = "wasm.bentopdf.com/*", zone_name = "bentopdf.com" }
   ]
   ```

2. Add DNS record in Cloudflare:
   - Type: AAAA
   - Name: wasm
   - Content: 100::
   - Proxied: Yes

3. Redeploy:

   ```bash
   npx wrangler deploy -c wasm-wrangler.toml
   ```

## Security Features

- **Origin validation**: Only allows requests from configured origins
- **Rate limiting**: 100 requests/minute per IP (requires KV namespace)
- **File type restrictions**: Only WASM-related files (.js, .wasm, .data, etc.)
- **Size limits**: Max 100MB per file
- **Caching**: Reduces origin requests and improves performance

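The rate limit works as a sliding window over recent request timestamps per client IP. The following is a minimal in-memory sketch of that idea, not the worker's actual code: the worker keeps the timestamps in a KV namespace, while the `Map`-based store and the name `allowRequest` are assumptions made for illustration.

```javascript
const RATE_LIMIT_MAX_REQUESTS = 100;
const RATE_LIMIT_WINDOW_MS = 60 * 1000;

// Hypothetical in-memory stand-in for the KV namespace.
const store = new Map();

// Returns true and records the request if the client is under the limit,
// false if the client already made the maximum number of requests in the window.
function allowRequest(clientIP, now = Date.now()) {
  // Keep only timestamps that still fall inside the window.
  const recent = (store.get(clientIP) || []).filter(
    (t) => now - t < RATE_LIMIT_WINDOW_MS
  );

  if (recent.length >= RATE_LIMIT_MAX_REQUESTS) {
    store.set(clientIP, recent);
    return false;
  }

  recent.push(now);
  store.set(clientIP, recent);
  return true;
}
```

A rejected request is not recorded, so a client that keeps retrying does not extend its own lockout; it is allowed again once its oldest timestamps age out of the window.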
## Self-Hosting Notes

1. Update `ALLOWED_ORIGINS` in `wasm-proxy-worker.js` to include your domain
2. Host your WASM files on any origin (R2, S3, or any CDN)
3. Set source URLs as secrets in your worker

## Endpoints

| Endpoint     | Description                            |
| ------------ | -------------------------------------- |
| `/`          | Health check, shows configured modules |
| `/pymupdf/*` | PyMuPDF WASM files                     |
| `/gs/*`      | Ghostscript WASM files                 |
| `/cpdf/*`    | CoherentPDF files                      |
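After deploying, the health endpoint can be used to confirm that every source URL is set. The sketch below assumes the JSON shape returned by `wasm-proxy-worker.js` (a `configured` object with one boolean per module); the helper name `findUnconfiguredModules` and the injectable `fetchImpl` parameter are hypothetical and exist only so the function can be tried without a deployed worker.

```javascript
// Hypothetical helper: returns the names of modules whose source URL is not set.
// `fetchImpl` defaults to the global fetch but can be swapped for a stub.
async function findUnconfiguredModules(proxyBaseUrl, fetchImpl = fetch) {
  const response = await fetchImpl(new URL('/', proxyBaseUrl));
  if (!response.ok) {
    throw new Error(`Health check failed with status ${response.status}`);
  }

  const health = await response.json();
  return Object.entries(health.configured || {})
    .filter(([, isSet]) => !isSet)
    .map(([name]) => name);
}
```

Calling it with the worker's base URL should yield an empty array once `PYMUPDF_SOURCE`, `GS_SOURCE`, and `CPDF_SOURCE` are all set.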
cloudflare/cors-proxy-worker.js
ADDED
@@ -0,0 +1,351 @@
/**
 * BentoPDF CORS Proxy Worker
 *
 * This Cloudflare Worker proxies certificate requests for the digital signing tool.
 * It fetches certificates from external CAs that don't have CORS headers enabled
 * and returns them with proper CORS headers.
 *
 * Deploy: npx wrangler deploy
 *
 * Environment Variables (set in wrangler.toml or Cloudflare dashboard):
 * - PROXY_SECRET: Shared secret for HMAC signature verification
 *   (signature checks are skipped when it is not set)
 */

const ALLOWED_PATTERNS = [
  /\.crt$/i,
  /\.cer$/i,
  /\.pem$/i,
  /\/certs\//i,
  /\/ocsp/i,
  /\/crl/i,
  /caIssuers/i,
];

const ALLOWED_ORIGINS = [
  'https://www.bentopdf.com',
  'https://bentopdf.com',
];

const BLOCKED_DOMAINS = [
  'localhost',
  '127.0.0.1',
  '0.0.0.0',
];

const MAX_TIMESTAMP_AGE_MS = 5 * 60 * 1000;

const RATE_LIMIT_MAX_REQUESTS = 60;
const RATE_LIMIT_WINDOW_MS = 60 * 1000;

const MAX_FILE_SIZE_BYTES = 10 * 1024 * 1024;

async function verifySignature(message, signature, secret) {
  try {
    const encoder = new TextEncoder();
    const key = await crypto.subtle.importKey(
      'raw',
      encoder.encode(secret),
      { name: 'HMAC', hash: 'SHA-256' },
      false,
      ['verify']
    );

    const signatureBytes = new Uint8Array(
      signature.match(/.{1,2}/g).map(byte => parseInt(byte, 16))
    );

    return await crypto.subtle.verify(
      'HMAC',
      key,
      signatureBytes,
      encoder.encode(message)
    );
  } catch (e) {
    console.error('Signature verification error:', e);
    return false;
  }
}

async function generateSignature(message, secret) {
  const encoder = new TextEncoder();
  const key = await crypto.subtle.importKey(
    'raw',
    encoder.encode(secret),
    { name: 'HMAC', hash: 'SHA-256' },
    false,
    ['sign']
  );

  const signature = await crypto.subtle.sign(
    'HMAC',
    key,
    encoder.encode(message)
  );

  return Array.from(new Uint8Array(signature))
    .map(b => b.toString(16).padStart(2, '0'))
    .join('');
}

function isAllowedOrigin(origin) {
  if (!origin) return false;
  // Compare exactly: a prefix match would also accept origins such as
  // "https://bentopdf.com.example.net".
  return ALLOWED_ORIGINS.some(allowed => origin === allowed.replace(/\/$/, ''));
}

function isValidCertificateUrl(urlString) {
  try {
    const url = new URL(urlString);

    if (!['http:', 'https:'].includes(url.protocol)) {
      return false;
    }

    if (BLOCKED_DOMAINS.some(domain => url.hostname.includes(domain))) {
      return false;
    }

    // Reject private IPv4 ranges (10/8, 172.16/12, 192.168/16).
    const hostname = url.hostname;
    if (/^10\./.test(hostname) ||
      /^172\.(1[6-9]|2[0-9]|3[0-1])\./.test(hostname) ||
      /^192\.168\./.test(hostname)) {
      return false;
    }

    return ALLOWED_PATTERNS.some(pattern => pattern.test(urlString));
  } catch {
    return false;
  }
}

function corsHeaders(origin) {
  return {
    'Access-Control-Allow-Origin': origin || '*',
    'Access-Control-Allow-Methods': 'GET, OPTIONS',
    'Access-Control-Allow-Headers': 'Content-Type',
    'Access-Control-Max-Age': '86400',
  };
}

function handleOptions(request) {
  const origin = request.headers.get('Origin');
  return new Response(null, {
    status: 204,
    headers: corsHeaders(origin),
  });
}

export default {
  async fetch(request, env, ctx) {
    const url = new URL(request.url);
    const origin = request.headers.get('Origin');

    if (request.method === 'OPTIONS') {
      return handleOptions(request);
    }

    // NOTE: If you are self-hosting this proxy, you can remove this check or restrict it to requests from your own domain
    if (!isAllowedOrigin(origin)) {
      return new Response(JSON.stringify({
        error: 'Forbidden',
        message: 'This proxy only accepts requests from bentopdf.com',
      }), {
        status: 403,
        headers: {
          'Content-Type': 'application/json',
        },
      });
    }

    if (request.method !== 'GET') {
      return new Response('Method not allowed', {
        status: 405,
        headers: corsHeaders(origin),
      });
    }

    const targetUrl = url.searchParams.get('url');
    const timestamp = url.searchParams.get('t');
    const signature = url.searchParams.get('sig');

    if (env.PROXY_SECRET) {
      if (!timestamp || !signature) {
        return new Response(JSON.stringify({
          error: 'Missing authentication parameters',
          message: 'Request must include timestamp (t) and signature (sig) parameters',
        }), {
          status: 401,
          headers: {
            ...corsHeaders(origin),
            'Content-Type': 'application/json',
          },
        });
      }

      const requestTime = parseInt(timestamp, 10);
      const now = Date.now();
      if (isNaN(requestTime) || Math.abs(now - requestTime) > MAX_TIMESTAMP_AGE_MS) {
        return new Response(JSON.stringify({
          error: 'Request expired or invalid timestamp',
          message: 'Timestamp must be within 5 minutes of current time',
        }), {
          status: 401,
          headers: {
            ...corsHeaders(origin),
            'Content-Type': 'application/json',
          },
        });
      }

      const message = `${targetUrl}${timestamp}`;
      const isValid = await verifySignature(message, signature, env.PROXY_SECRET);

      if (!isValid) {
        return new Response(JSON.stringify({
          error: 'Invalid signature',
          message: 'Request signature verification failed',
        }), {
          status: 401,
          headers: {
            ...corsHeaders(origin),
            'Content-Type': 'application/json',
          },
        });
      }
    }

    if (!targetUrl) {
      return new Response(JSON.stringify({
        error: 'Missing url parameter',
        usage: 'GET /?url=<certificate_url>',
      }), {
        status: 400,
        headers: {
          ...corsHeaders(origin),
          'Content-Type': 'application/json',
        },
      });
    }

    if (!isValidCertificateUrl(targetUrl)) {
      return new Response(JSON.stringify({
        error: 'Invalid or disallowed URL',
        message: 'Only certificate-related URLs are allowed (*.crt, *.cer, *.pem, /certs/, /ocsp, /crl)',
      }), {
        status: 403,
        headers: {
          ...corsHeaders(origin),
          'Content-Type': 'application/json',
        },
      });
    }

    const clientIP = request.headers.get('CF-Connecting-IP') || 'unknown';
    const rateLimitKey = `ratelimit:${clientIP}`;
    const now = Date.now();

    if (env.RATE_LIMIT_KV) {
      const rateLimitData = await env.RATE_LIMIT_KV.get(rateLimitKey, { type: 'json' });
      const requests = rateLimitData?.requests || [];

      const recentRequests = requests.filter(t => now - t < RATE_LIMIT_WINDOW_MS);

      if (recentRequests.length >= RATE_LIMIT_MAX_REQUESTS) {
        return new Response(JSON.stringify({
          error: 'Rate limit exceeded',
          message: `Maximum ${RATE_LIMIT_MAX_REQUESTS} requests per minute. Please try again later.`,
          retryAfter: Math.ceil((recentRequests[0] + RATE_LIMIT_WINDOW_MS - now) / 1000),
        }), {
          status: 429,
          headers: {
            ...corsHeaders(origin),
            'Content-Type': 'application/json',
            'Retry-After': Math.ceil((recentRequests[0] + RATE_LIMIT_WINDOW_MS - now) / 1000).toString(),
          },
        });
      }

      recentRequests.push(now);
      await env.RATE_LIMIT_KV.put(rateLimitKey, JSON.stringify({ requests: recentRequests }), {
        expirationTtl: 120,
      });
    }

    try {
      const response = await fetch(targetUrl, {
        headers: {
          'User-Agent': 'BentoPDF-CertProxy/1.0',
        },
      });

      if (!response.ok) {
        return new Response(JSON.stringify({
          error: 'Failed to fetch certificate',
          status: response.status,
          statusText: response.statusText,
        }), {
          status: response.status,
          headers: {
            ...corsHeaders(origin),
            'Content-Type': 'application/json',
          },
        });
      }

      const contentLength = parseInt(response.headers.get('Content-Length') || '0', 10);
      if (contentLength > MAX_FILE_SIZE_BYTES) {
        return new Response(JSON.stringify({
          error: 'File too large',
          message: `Certificate file exceeds maximum size of ${MAX_FILE_SIZE_BYTES / 1024}KB`,
          size: contentLength,
          maxSize: MAX_FILE_SIZE_BYTES,
        }), {
          status: 413,
          headers: {
            ...corsHeaders(origin),
            'Content-Type': 'application/json',
          },
        });
      }

      const certData = await response.arrayBuffer();

      if (certData.byteLength > MAX_FILE_SIZE_BYTES) {
        return new Response(JSON.stringify({
          error: 'File too large',
          message: `Certificate file exceeds maximum size of ${MAX_FILE_SIZE_BYTES / 1024}KB`,
          size: certData.byteLength,
          maxSize: MAX_FILE_SIZE_BYTES,
        }), {
          status: 413,
          headers: {
            ...corsHeaders(origin),
            'Content-Type': 'application/json',
          },
        });
      }

      return new Response(certData, {
        status: 200,
        headers: {
          ...corsHeaders(origin),
          'Content-Type': response.headers.get('Content-Type') || 'application/x-x509-ca-cert',
          'Content-Length': certData.byteLength.toString(),
          'Cache-Control': 'public, max-age=86400',
        },
      });
    } catch (error) {
      return new Response(JSON.stringify({
        error: 'Proxy error',
        message: error.message,
      }), {
        status: 500,
        headers: {
          ...corsHeaders(origin),
          'Content-Type': 'application/json',
        },
      });
    }
  },
};
cloudflare/wasm-proxy-worker.js
ADDED
@@ -0,0 +1,356 @@
/**
 * BentoPDF WASM Proxy Worker
 *
 * This Cloudflare Worker proxies WASM module requests to bypass CORS restrictions.
 * It fetches WASM libraries (PyMuPDF, Ghostscript, CoherentPDF) from configured sources
 * and serves them with proper CORS headers.
 *
 * Endpoints:
 * - /pymupdf/* - Proxies to PyMuPDF WASM source
 * - /gs/* - Proxies to Ghostscript WASM source
 * - /cpdf/* - Proxies to CoherentPDF WASM source
 *
 * Deploy: cd cloudflare && npx wrangler deploy -c wasm-wrangler.toml
 *
 * Required Environment Variables (set in Cloudflare dashboard):
 * - PYMUPDF_SOURCE: Base URL for PyMuPDF WASM files (e.g., https://cdn.example.com/pymupdf)
 * - GS_SOURCE: Base URL for Ghostscript WASM files (e.g., https://cdn.example.com/gs)
 * - CPDF_SOURCE: Base URL for CoherentPDF files (e.g., https://cdn.example.com/cpdf)
 */

const ALLOWED_ORIGINS = ['https://www.bentopdf.com', 'https://bentopdf.com'];

const MAX_FILE_SIZE_BYTES = 100 * 1024 * 1024;

const RATE_LIMIT_MAX_REQUESTS = 100;
const RATE_LIMIT_WINDOW_MS = 60 * 1000;

const CACHE_TTL_SECONDS = 604800;

const ALLOWED_EXTENSIONS = [
  '.js',
  '.mjs',
  '.wasm',
  '.data',
  '.py',
  '.so',
  '.zip',
  '.json',
  '.mem',
  '.asm.js',
  '.worker.js',
  '.html',
];

function isAllowedOrigin(origin) {
  if (!origin) return true; // Allow no-origin requests (e.g., direct browser navigation)
  // Compare exactly: a prefix match would also accept origins such as
  // "https://bentopdf.com.example.net".
  return ALLOWED_ORIGINS.some(
    (allowed) => origin === allowed.replace(/\/$/, '')
  );
}

function isAllowedFile(pathname) {
  const ext = pathname.substring(pathname.lastIndexOf('.')).toLowerCase();
  if (ALLOWED_EXTENSIONS.includes(ext)) return true;

  if (!pathname.includes('.') || pathname.endsWith('/')) return true;

  return false;
}

function corsHeaders(origin) {
  return {
    'Access-Control-Allow-Origin': origin || '*',
    'Access-Control-Allow-Methods': 'GET, HEAD, OPTIONS',
    'Access-Control-Allow-Headers': 'Content-Type, Range, Cache-Control',
    'Access-Control-Expose-Headers':
      'Content-Length, Content-Range, Content-Type',
    'Access-Control-Max-Age': '86400',
  };
}

function handleOptions(request) {
  const origin = request.headers.get('Origin');
  return new Response(null, {
    status: 204,
    headers: corsHeaders(origin),
  });
}

function getContentType(pathname) {
  const ext = pathname.substring(pathname.lastIndexOf('.')).toLowerCase();
  const contentTypes = {
    '.js': 'application/javascript',
    '.mjs': 'application/javascript',
    '.wasm': 'application/wasm',
    '.json': 'application/json',
    '.data': 'application/octet-stream',
    '.py': 'text/x-python',
    '.so': 'application/octet-stream',
    '.zip': 'application/zip',
    '.mem': 'application/octet-stream',
    '.html': 'text/html',
  };
  return contentTypes[ext] || 'application/octet-stream';
}

async function proxyRequest(request, env, sourceBaseUrl, subpath, origin) {
  if (!sourceBaseUrl) {
    return new Response(
      JSON.stringify({
        error: 'Source not configured',
        message: 'This WASM module source URL has not been configured.',
      }),
      {
        status: 503,
        headers: {
          ...corsHeaders(origin),
          'Content-Type': 'application/json',
        },
      }
    );
  }

  const normalizedBase = sourceBaseUrl.endsWith('/')
    ? sourceBaseUrl.slice(0, -1)
    : sourceBaseUrl;
  const normalizedPath = subpath.startsWith('/') ? subpath : `/${subpath}`;
  const targetUrl = `${normalizedBase}${normalizedPath}`;

  if (!isAllowedFile(normalizedPath)) {
    return new Response(
      JSON.stringify({
        error: 'Forbidden file type',
        message: 'Only WASM-related file types are allowed.',
      }),
      {
        status: 403,
        headers: {
          ...corsHeaders(origin),
          'Content-Type': 'application/json',
        },
      }
    );
  }

  try {
    const cacheKey = new Request(targetUrl, request);
    const cache = caches.default;
    let response = await cache.match(cacheKey);

    if (!response) {
      response = await fetch(targetUrl, {
        headers: {
          'User-Agent': 'BentoPDF-WASM-Proxy/1.0',
          Accept: '*/*',
        },
      });

      if (!response.ok) {
        return new Response(
          JSON.stringify({
            error: 'Failed to fetch resource',
            status: response.status,
            statusText: response.statusText,
            targetUrl: targetUrl,
          }),
          {
            status: response.status,
            headers: {
              ...corsHeaders(origin),
              'Content-Type': 'application/json',
            },
          }
        );
      }

      const contentLength = parseInt(
        response.headers.get('Content-Length') || '0',
        10
      );
      if (contentLength > MAX_FILE_SIZE_BYTES) {
        return new Response(
          JSON.stringify({
            error: 'File too large',
            message: `File exceeds maximum size of ${MAX_FILE_SIZE_BYTES / 1024 / 1024}MB`,
          }),
          {
            status: 413,
            headers: {
              ...corsHeaders(origin),
              'Content-Type': 'application/json',
            },
          }
        );
      }

      response = new Response(response.body, response);
      response.headers.set(
        'Cache-Control',
        `public, max-age=${CACHE_TTL_SECONDS}`
      );

      if (response.status === 200) {
        await cache.put(cacheKey, response.clone());
      }
    }

    const bodyData = await response.arrayBuffer();

    return new Response(bodyData, {
      status: 200,
      headers: {
        ...corsHeaders(origin),
        'Content-Type': getContentType(normalizedPath),
        'Content-Length': bodyData.byteLength.toString(),
        'Cache-Control': `public, max-age=${CACHE_TTL_SECONDS}`,
        'X-Proxied-From': new URL(targetUrl).hostname,
      },
    });
  } catch (error) {
    return new Response(
      JSON.stringify({
        error: 'Proxy error',
        message: error.message,
      }),
      {
        status: 500,
        headers: {
          ...corsHeaders(origin),
          'Content-Type': 'application/json',
        },
      }
    );
  }
}

| 227 |
+
export default {
|
| 228 |
+
async fetch(request, env, ctx) {
|
| 229 |
+
const url = new URL(request.url);
|
| 230 |
+
const pathname = url.pathname;
|
| 231 |
+
const origin = request.headers.get('Origin');
|
| 232 |
+
|
| 233 |
+
if (request.method === 'OPTIONS') {
|
| 234 |
+
return handleOptions(request);
|
| 235 |
+
}
|
| 236 |
+
|
| 237 |
+
if (!isAllowedOrigin(origin)) {
|
| 238 |
+
return new Response(
|
| 239 |
+
JSON.stringify({
|
| 240 |
+
error: 'Forbidden',
|
| 241 |
+
message:
|
| 242 |
+
'Origin not allowed. Add your domain to ALLOWED_ORIGINS if self-hosting.',
|
| 243 |
+
}),
|
| 244 |
+
{
|
| 245 |
+
status: 403,
|
+          headers: {
+            'Content-Type': 'application/json',
+            ...corsHeaders(origin),
+          },
+        }
+      );
+    }
+
+    if (request.method !== 'GET' && request.method !== 'HEAD') {
+      return new Response('Method not allowed', {
+        status: 405,
+        headers: corsHeaders(origin),
+      });
+    }
+
+    if (env.RATE_LIMIT_KV) {
+      const clientIP = request.headers.get('CF-Connecting-IP') || 'unknown';
+      const rateLimitKey = `wasm-ratelimit:${clientIP}`;
+      const now = Date.now();
+
+      const rateLimitData = await env.RATE_LIMIT_KV.get(rateLimitKey, {
+        type: 'json',
+      });
+      const requests = rateLimitData?.requests || [];
+      const recentRequests = requests.filter(
+        (t) => now - t < RATE_LIMIT_WINDOW_MS
+      );
+
+      if (recentRequests.length >= RATE_LIMIT_MAX_REQUESTS) {
+        return new Response(
+          JSON.stringify({
+            error: 'Rate limit exceeded',
+            message: `Maximum ${RATE_LIMIT_MAX_REQUESTS} requests per minute.`,
+          }),
+          {
+            status: 429,
+            headers: {
+              ...corsHeaders(origin),
+              'Content-Type': 'application/json',
+              'Retry-After': '60',
+            },
+          }
+        );
+      }
+
+      recentRequests.push(now);
+      await env.RATE_LIMIT_KV.put(
+        rateLimitKey,
+        JSON.stringify({ requests: recentRequests }),
+        {
+          expirationTtl: 120,
+        }
+      );
+    }
+
+    if (pathname.startsWith('/pymupdf/')) {
+      const subpath = pathname.replace('/pymupdf', '');
+      return proxyRequest(request, env, env.PYMUPDF_SOURCE, subpath, origin);
+    }
+
+    if (pathname.startsWith('/gs/')) {
+      const subpath = pathname.replace('/gs', '');
+      return proxyRequest(request, env, env.GS_SOURCE, subpath, origin);
+    }
+
+    if (pathname.startsWith('/cpdf/')) {
+      const subpath = pathname.replace('/cpdf', '');
+      return proxyRequest(request, env, env.CPDF_SOURCE, subpath, origin);
+    }
+
+    if (pathname === '/' || pathname === '/health') {
+      return new Response(
+        JSON.stringify({
+          service: 'BentoPDF WASM Proxy',
+          version: '1.0.0',
+          endpoints: {
+            pymupdf: '/pymupdf/*',
+            gs: '/gs/*',
+            cpdf: '/cpdf/*',
+          },
+          configured: {
+            pymupdf: !!env.PYMUPDF_SOURCE,
+            gs: !!env.GS_SOURCE,
+            cpdf: !!env.CPDF_SOURCE,
+          },
+        }),
+        {
+          status: 200,
+          headers: {
+            ...corsHeaders(origin),
+            'Content-Type': 'application/json',
+          },
+        }
+      );
+    }
+
+    return new Response(
+      JSON.stringify({
+        error: 'Not Found',
+        message: 'Use /pymupdf/*, /gs/*, or /cpdf/* endpoints',
+      }),
+      {
+        status: 404,
+        headers: {
+          ...corsHeaders(origin),
+          'Content-Type': 'application/json',
+        },
+      }
+    );
+  },
+};
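The rate-limit branch in the worker above keeps a list of request timestamps per client IP and rejects a request once the list holds too many entries inside the window. A minimal sketch of that sliding-window check as a pure function follows, which may make the logic easier to reason about and test without a KV binding. The names `checkRateLimit`, `WINDOW_MS`, and `MAX_REQUESTS`, and the limit of 3, are assumptions for illustration; the worker defines its own `RATE_LIMIT_WINDOW_MS` and `RATE_LIMIT_MAX_REQUESTS` in a part of the file not shown here.

```javascript
// Hypothetical constants for the sketch; not the worker's real values.
const WINDOW_MS = 60_000;
const MAX_REQUESTS = 3;

// Pure version of the check: returns the decision plus the timestamps to store.
function checkRateLimit(timestamps, now, windowMs = WINDOW_MS, max = MAX_REQUESTS) {
  // Keep only the requests that fall inside the current window.
  const recent = timestamps.filter((t) => now - t < windowMs);
  if (recent.length >= max) {
    // Over the limit: the rejected request is not recorded.
    return { allowed: false, timestamps: recent };
  }
  // Under the limit: record this request so it counts against later ones.
  return { allowed: true, timestamps: [...recent, now] };
}

// Simulate one client sending five requests at the given times (in ms).
let stored = [];
for (const now of [0, 1_000, 2_000, 3_000, 61_500]) {
  const result = checkRateLimit(stored, now);
  stored = result.timestamps;
  console.log(now, result.allowed ? 'allowed' : 'rate limited');
}
```

Because Workers KV is eventually consistent, a counter kept this way is likely to be approximate under concurrent requests, so it is probably best read as an abuse guard rather than a strict quota.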
cloudflare/wasm-wrangler.toml
ADDED
|
@@ -0,0 +1,69 @@
+name = "bentopdf-wasm-proxy"
+main = "wasm-proxy-worker.js"
+compatibility_date = "2024-01-01"
+
+# =============================================================================
+# DEPLOYMENT
+# =============================================================================
+# Deploy this worker:
+# cd cloudflare
+# npx wrangler deploy -c wasm-wrangler.toml
+#
+# Set environment secrets (one of the following methods):
+# Option A: Cloudflare Dashboard
+# Go to Workers & Pages > bentopdf-wasm-proxy > Settings > Variables
+# Add: PYMUPDF_SOURCE, GS_SOURCE, CPDF_SOURCE
+#
+# Option B: Wrangler CLI
+# npx wrangler secret put PYMUPDF_SOURCE -c wasm-wrangler.toml
+# npx wrangler secret put GS_SOURCE -c wasm-wrangler.toml
+# npx wrangler secret put CPDF_SOURCE -c wasm-wrangler.toml
+
+# =============================================================================
+# WASM SOURCE URLS
+# =============================================================================
+# Set these as secrets in the Cloudflare dashboard or via wrangler:
+#
+# PYMUPDF_SOURCE: Base URL to PyMuPDF WASM files
+# Example: https://cdn.jsdelivr.net/npm/@bentopdf/pymupdf-wasm/assets
+# https://your-bucket.r2.cloudflarestorage.com/pymupdf
+#
+# GS_SOURCE: Base URL to Ghostscript WASM files
+# Example: https://cdn.jsdelivr.net/npm/@bentopdf/gs-wasm/assets
+# https://your-bucket.r2.cloudflarestorage.com/gs
+#
+# CPDF_SOURCE: Base URL to CoherentPDF files
+# Example: https://cdn.jsdelivr.net/npm/coherentpdf/cpdf
+# https://your-bucket.r2.cloudflarestorage.com/cpdf
+
+# =============================================================================
+# USAGE FROM BENTOPDF
+# =============================================================================
+# In BentoPDF's WASM Settings page, configure URLs like:
+# PyMuPDF: https://wasm.bentopdf.com/pymupdf/
+# Ghostscript: https://wasm.bentopdf.com/gs/
+# CoherentPDF: https://wasm.bentopdf.com/cpdf/
+
+# =============================================================================
+# RATE LIMITING (Optional but recommended)
+# =============================================================================
+# Create KV namespace:
+# npx wrangler kv namespace create "RATE_LIMIT_KV"
+#
+# Then uncomment and update the ID below:
+# [[kv_namespaces]]
+# binding = "RATE_LIMIT_KV"
+# id = "<YOUR_KV_NAMESPACE_ID>"
+
+# Use the same KV namespace as the CORS proxy if you want shared rate limiting
+[[kv_namespaces]]
+binding = "RATE_LIMIT_KV"
+id = "b88e030b308941118cd484e3fcb3ae49"
+
+# =============================================================================
+# CUSTOM DOMAIN (Optional)
+# =============================================================================
+# If you want a custom domain like wasm.bentopdf.com:
+# routes = [
+# { pattern = "wasm.bentopdf.com/*", zone_name = "bentopdf.com" }
+# ]
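The config comments above pair each route prefix (`/pymupdf/`, `/gs/`, `/cpdf/`) with a source variable holding a base URL. A minimal sketch of how a request path might be resolved to an upstream URL follows. The helper `resolveUpstream`, the `ROUTES` table, and the file name `gs.wasm` are hypothetical, and the way the base URL and subpath are joined is an assumption: the worker's own `proxyRequest` is not shown in this part of the diff and may differ.

```javascript
// Prefix-to-variable table mirroring the three routes handled by the worker.
const ROUTES = {
  '/pymupdf/': 'PYMUPDF_SOURCE',
  '/gs/': 'GS_SOURCE',
  '/cpdf/': 'CPDF_SOURCE',
};

// Hypothetical helper: returns the upstream URL, or null when no route matches
// or the matching source variable is not configured.
function resolveUpstream(pathname, env) {
  for (const [prefix, variable] of Object.entries(ROUTES)) {
    if (!pathname.startsWith(prefix)) continue;
    const base = env[variable];
    if (!base) return null;
    // Drop the route name but keep the leading slash of the remainder.
    const subpath = pathname.slice(prefix.length - 1);
    // Assumption: base and subpath are joined with exactly one slash.
    return base.replace(/\/+$/, '') + subpath;
  }
  return null;
}

// Example with only the Ghostscript source configured.
const exampleEnv = {
  GS_SOURCE: 'https://cdn.jsdelivr.net/npm/@bentopdf/gs-wasm/assets/',
};
console.log(resolveUpstream('/gs/gs.wasm', exampleEnv));
console.log(resolveUpstream('/pymupdf/anything.wasm', exampleEnv));
```

Keeping the mapping in one table is a design choice of this sketch only; the worker itself uses three explicit `if` branches, which behave the same way for these prefixes.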
cloudflare/wrangler.toml
ADDED
|
@@ -0,0 +1,45 @@
+name = "bentopdf-cors-proxy"
+main = "cors-proxy-worker.js"
+compatibility_date = "2024-01-01"
+
+# Deploy to Cloudflare's global network
+# If you are self hosting change the name to your worker name
+# Run: npx wrangler deploy
+
+# =============================================================================
+# SECURITY FEATURES
+# =============================================================================
+#
+# 1. SIGNATURE VERIFICATION (Optional - for anti-spoofing)
+# - Generate secret: openssl rand -hex 32
+# - Set secret: npx wrangler secret put PROXY_SECRET
+# - Note: Secret is visible in frontend JS, so provides limited protection
+#
+# 2. RATE LIMITING (Recommended - requires KV)
+# - Create KV namespace: npx wrangler kv namespace create "RATE_LIMIT_KV"
+# - Uncomment the kv_namespaces section below with the returned ID
+# - Limits: 60 requests per IP per minute
+#
+# 3. FILE SIZE LIMIT
+# - Automatic: Rejects files larger than 1MB
+# - Certificates are typically <10KB, so this prevents abuse
+#
+# 4. URL RESTRICTIONS
+# - Only certificate URLs allowed (*.crt, *.cer, *.pem, /certs/, etc.)
+# - Blocks private IPs (localhost, 10.x, 192.168.x, 172.16-31.x)
+
+# =============================================================================
+# KV NAMESPACE FOR RATE LIMITING
+# =============================================================================
+[[kv_namespaces]]
+binding = "RATE_LIMIT_KV"
+id = "b88e030b308941118cd484e3fcb3ae49"
+
+# Optional: Custom domain routing
+# routes = [
+# { pattern = "cors-proxy.bentopdf.com/*", zone_name = "bentopdf.com" }
+# ]
+
+# Optional: Environment variables (for non-secret config)
+# [vars]
+# ALLOWED_ORIGINS = "https://www.bentopdf.com,https://bentopdf.com"
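The URL restrictions listed in the config comments above mention blocking private addresses (localhost, 10.x, 192.168.x, 172.16-31.x). A minimal sketch of such a hostname check follows. The helper name `isPrivateHost` is hypothetical, treating 127.x as part of "localhost" is an assumption, and the actual check lives in `cors-proxy-worker.js`, which may be implemented differently.

```javascript
// Hypothetical helper: true for localhost or a private IPv4 address in the
// ranges named in the config comments.
function isPrivateHost(hostname) {
  if (hostname === 'localhost') return true;
  // Only dotted-quad IPv4 literals are inspected; other names pass through.
  const match = hostname.match(/^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/);
  if (!match) return false;
  const [a, b] = match.slice(1).map(Number);
  if (a === 127) return true; // assumption: loopback counts as localhost
  if (a === 10) return true; // 10.x
  if (a === 192 && b === 168) return true; // 192.168.x
  if (a === 172 && b >= 16 && b <= 31) return true; // 172.16-31.x
  return false;
}

// Example: check the host of a few candidate certificate URLs.
const candidates = [
  'https://10.0.0.5/ca.crt',
  'https://172.32.0.1/ca.crt',
  'https://example.com/certs/root.pem',
];
for (const url of candidates) {
  console.log(url, isPrivateHost(new URL(url).hostname) ? 'blocked' : 'allowed');
}
```

A literal check like this does not cover hostnames that resolve to private addresses or IPv6 literals, so it is best seen as a first filter only.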