A read-only data proxy for Source Cooperative, built as a Cloudflare Worker in Rust. It translates Source Cooperative URL paths into requests against cloud storage backends (S3, Azure Blob Storage, GCS) using the multistore S3 gateway.
The proxy supports GET, HEAD, and S3-compatible LIST operations with anonymous access, and resolves storage backends dynamically via the Source Cooperative API.
rustup target add wasm32-unknown-unknown
cargo install worker-build@0.7.5
npm install -g wrangler@3

wrangler dev

The proxy will be available at http://localhost:8787.
Unit tests for routing and pagination run on native targets:
cargo test

Integration tests run against a live instance:
wrangler dev &
python -m pytest tests/test_integration.py

# Version info
curl http://localhost:8787/
# List products for an account
curl "http://localhost:8787/cholmes?list-type=2&delimiter=/"
# List files in a product
curl "http://localhost:8787/cholmes/admin-boundaries?list-type=2&prefix=&max-keys=10"
# Get object metadata
curl -I http://localhost:8787/cholmes/admin-boundaries/countries.parquet
# Download with range request
curl -r 0-1023 http://localhost:8787/cholmes/admin-boundaries/countries.parquet -o chunk.bin

The proxy rewrites Source Cooperative URL paths (/{account}/{product}/{key}) into multistore's virtual bucket model, resolving storage backends dynamically via the Source Cooperative API.
Client Request: GET /{account}/{product}/{key}
│
├─ [routing] Parse path, rewrite to bucket={account}:{product}, key={key}
├─ [registry] Resolve backend via Source API (product metadata + data connections)
├─ [cache] Cache API responses (products: 5min, data connections: 30min, listings: 1min)
├─ [multistore ProxyGateway] Generate presigned URL for the resolved storage backend
└─ Stream response back to client
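The routing step in the diagram can be illustrated with a small sketch. It assumes the rewrite is a plain split on the first two path segments; `rewrite_path` and `Rewritten` are hypothetical names for illustration and are not taken from src/routing.rs.

```rust
/// Result of rewriting a Source Cooperative path into the virtual bucket model.
#[derive(Debug, PartialEq)]
pub struct Rewritten {
    /// Virtual bucket name in the form "{account}:{product}".
    pub bucket: String,
    /// Remainder of the path; empty when the request targets the product root.
    pub key: String,
}

/// Hypothetical helper: split "/{account}/{product}/{key}" into bucket and key.
/// Returns None when the path has fewer than two non-empty segments.
pub fn rewrite_path(path: &str) -> Option<Rewritten> {
    let trimmed = path.trim_start_matches('/');
    // At most three parts, so slashes inside the key are preserved.
    let mut parts = trimmed.splitn(3, '/');
    let account = parts.next().filter(|s| !s.is_empty())?;
    let product = parts.next().filter(|s| !s.is_empty())?;
    let key = parts.next().unwrap_or("");
    Some(Rewritten {
        bucket: format!("{}:{}", account, product),
        key: key.to_string(),
    })
}
```

Under these assumptions, /cholmes/admin-boundaries/countries.parquet maps to bucket cholmes:admin-boundaries and key countries.parquet, while an account-only path such as /cholmes yields None and would be handled by a different route.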
| Module | Purpose |
|---|---|
| src/lib.rs | Fetch handler, OIDC discovery endpoints, request routing, CORS |
| src/routing.rs | Request classification and path rewriting |
| src/registry.rs | Source Cooperative API client and backend resolution |
| src/cache.rs | Cloudflare Cache API wrapper with per-datatype TTLs |
| src/pagination.rs | S3-compatible pagination for prefix listings |
| src/analytics.rs | Cloudflare Analytics Engine request logging |
| src/handlers.rs | Custom route handlers (index, account listing) |
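As an illustration of the per-datatype TTLs, the sketch below maps each cached kind to the durations given in the request flow (products 5 minutes, data connections 30 minutes, listings 1 minute). The `CachedKind` enum and both function names are hypothetical, and expressing the TTL as a Cache-Control max-age value is an assumption about how the wrapper communicates it.

```rust
use std::time::Duration;

/// Hypothetical classification of cached Source API responses.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum CachedKind {
    Product,
    DataConnection,
    Listing,
}

/// TTLs as listed in the request flow diagram.
pub fn ttl_for(kind: CachedKind) -> Duration {
    match kind {
        CachedKind::Product => Duration::from_secs(5 * 60),
        CachedKind::DataConnection => Duration::from_secs(30 * 60),
        CachedKind::Listing => Duration::from_secs(60),
    }
}

/// Assumption: the TTL is applied through a Cache-Control header value.
pub fn cache_control(kind: CachedKind) -> String {
    format!("max-age={}", ttl_for(kind).as_secs())
}
```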
| Operation | Description |
|---|---|
| GET / | Version info |
| GET /{account}?list-type=2 | List products for an account |
| GET /{account}/{product}?list-type=2 | List objects in a product (S3-compatible, with pagination) |
| GET /{account}/{product}/{key} | Download an object (supports range requests) |
| HEAD /{account}/{product}/{key} | Get object metadata |
| OPTIONS * | CORS preflight |
| GET /.well-known/openid-configuration | OIDC discovery document |
| GET /.well-known/jwks.json | JSON Web Key Set for JWT verification |
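The paginated listing can be pictured with the following sketch of S3-style paging over a sorted key list. It is not the logic in src/pagination.rs: `paginate` and `Page` are hypothetical, `max_keys` is assumed to be at least 1, and the continuation token is assumed to be the last key of the previous page, whereas real tokens are opaque to clients.

```rust
/// One page of a prefix listing, mirroring the S3 ListObjectsV2 fields
/// IsTruncated and NextContinuationToken.
#[derive(Debug, PartialEq)]
pub struct Page {
    pub keys: Vec<String>,
    pub is_truncated: bool,
    pub next_token: Option<String>,
}

/// Hypothetical helper: return up to `max_keys` keys that start with `prefix`
/// and sort strictly after `token`. `sorted_keys` must be in ascending order.
pub fn paginate(
    sorted_keys: &[String],
    prefix: &str,
    token: Option<&str>,
    max_keys: usize,
) -> Page {
    let mut matching = sorted_keys
        .iter()
        .filter(|k| k.starts_with(prefix))
        .filter(|k| token.map_or(true, |t| k.as_str() > t));
    // Take one page, then peek whether anything is left for the next page.
    let keys: Vec<String> = matching.by_ref().take(max_keys).cloned().collect();
    let is_truncated = matching.next().is_some();
    let next_token = if is_truncated { keys.last().cloned() } else { None };
    Page { keys, is_truncated, next_token }
}
```

A client keeps requesting pages, passing each `next_token` back, until `is_truncated` is false.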
Write operations (PUT, POST, DELETE, PATCH) return 405 Method Not Allowed.
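A minimal sketch of that rule, using hypothetical helper names: it recognises only the four write methods named above, and a `None` result means "not a write method", not "allowed".

```rust
/// HTTP status returned for write operations.
pub const METHOD_NOT_ALLOWED: u16 = 405;

/// True for the write methods the proxy rejects.
pub fn is_write_method(method: &str) -> bool {
    matches!(method, "PUT" | "POST" | "DELETE" | "PATCH")
}

/// Hypothetical helper: Some(405) for write methods, None for anything else.
/// How methods outside this list are treated is not covered here.
pub fn rejection_status(method: &str) -> Option<u16> {
    if is_write_method(method) {
        Some(METHOD_NOT_ALLOWED)
    } else {
        None
    }
}
```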
Set in wrangler.toml or via the Cloudflare dashboard:
| Variable | Default | Description |
|---|---|---|
| SOURCE_API_URL | https://source.coop | Source Cooperative API base URL |
| LOG_LEVEL | WARN | Tracing level (TRACE, DEBUG, INFO, WARN, ERROR) |
| AUTH_ISSUER | https://auth.source.coop | OIDC issuer trusted for /.sts token exchange |
| AUTH_AUDIENCE | — | OAuth client ID that /.sts subject tokens must be issued to (aud claim). Unset = /.sts token exchange is disabled (returns 501) |
| OIDC_PROVIDER_ISSUER | https://data.source.coop | Issuer URL for minted JWTs and OIDC discovery |
| OIDC_PROVIDER_KID | data-proxy-1 | Key ID for the active signing key |
| OIDC_PROVIDER_KID_PREVIOUS | — | Key ID for the previous key (during rotation) |
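To make the defaults concrete, here is a sketch that resolves the variables above from a plain string map. The `Config` struct, `from_vars`, and `sts_enabled` are hypothetical names; the worker reads its variables from the Cloudflare environment rather than a `HashMap`, and whether empty strings count as unset is not specified, so this sketch treats only missing keys as unset.

```rust
use std::collections::HashMap;

/// Hypothetical configuration view over the documented variables.
#[derive(Debug, PartialEq)]
pub struct Config {
    pub source_api_url: String,
    pub log_level: String,
    pub auth_issuer: String,
    pub auth_audience: Option<String>,
    pub oidc_provider_issuer: String,
    pub oidc_provider_kid: String,
    pub oidc_provider_kid_previous: Option<String>,
}

impl Config {
    /// Apply the documented defaults for any variable that is missing.
    pub fn from_vars(vars: &HashMap<String, String>) -> Self {
        let get = |name: &str, default: &str| -> String {
            vars.get(name).cloned().unwrap_or_else(|| default.to_string())
        };
        Config {
            source_api_url: get("SOURCE_API_URL", "https://source.coop"),
            log_level: get("LOG_LEVEL", "WARN"),
            auth_issuer: get("AUTH_ISSUER", "https://auth.source.coop"),
            auth_audience: vars.get("AUTH_AUDIENCE").cloned(),
            oidc_provider_issuer: get("OIDC_PROVIDER_ISSUER", "https://data.source.coop"),
            oidc_provider_kid: get("OIDC_PROVIDER_KID", "data-proxy-1"),
            oidc_provider_kid_previous: vars.get("OIDC_PROVIDER_KID_PREVIOUS").cloned(),
        }
    }

    /// /.sts token exchange is disabled (501) when AUTH_AUDIENCE is unset.
    pub fn sts_enabled(&self) -> bool {
        self.auth_audience.is_some()
    }
}
```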
GitHub environment secrets are the source of truth. The deploy workflow
(.github/workflows/deploy.yml) re-uploads them via wrangler secret bulk on
every deploy, so a value set directly with wrangler secret put is silently
overwritten by the next deploy. Set them per GitHub environment (preview,
staging, production) with gh secret set <NAME> --env <environment> or in
the repository settings UI.
| Secret | Description |
|---|---|
| OIDC_PROVIDER_KEY | PEM-encoded PKCS#8 RSA private key for JWT signing |
| OIDC_PROVIDER_KEY_PREVIOUS | Previous RSA key (optional, during rotation) |
| SESSION_TOKEN_KEY | Base64-encoded 32-byte AES key sealing STS credentials |
The proxy authenticates to the Source Cooperative API by minting a short-lived JWT (60s TTL), signing it with its RSA key, and sending it as a Bearer token.
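The 60-second lifetime can be sketched as follows. RSA signing is deliberately left out, and the claim set shown (iss, iat, exp, the standard registered JWT claims) is an assumption about what the proxy includes; `Claims`, `claims_at`, `is_expired`, and `authorization_header` are hypothetical names.

```rust
/// Token lifetime stated above.
pub const TOKEN_TTL_SECS: u64 = 60;

/// Assumed minimal claim set; the real token may carry more claims.
#[derive(Debug, PartialEq)]
pub struct Claims {
    pub iss: String,
    pub iat: u64,
    pub exp: u64,
}

/// Build claims for a token issued at `now_unix` (seconds since the epoch).
pub fn claims_at(issuer: &str, now_unix: u64) -> Claims {
    Claims {
        iss: issuer.to_string(),
        iat: now_unix,
        exp: now_unix + TOKEN_TTL_SECS,
    }
}

/// A token is treated as expired once the current time reaches `exp`.
pub fn is_expired(claims: &Claims, now_unix: u64) -> bool {
    now_unix >= claims.exp
}

/// Format an already signed JWT as an Authorization header value.
pub fn authorization_header(signed_jwt: &str) -> String {
    format!("Bearer {}", signed_jwt)
}
```

Because the token lives for only 60 seconds, a fresh one would be minted per API call or per short batch of calls rather than cached for long.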
When configured with an RSA key, the proxy acts as its own OpenID Connect identity provider. It serves standard discovery endpoints so that relying parties (such as the Source Cooperative API) can verify its JWTs.
- GET /.well-known/openid-configuration: discovery document containing issuer and JWKS URI
- GET /.well-known/jwks.json: public key(s) for JWT signature verification
These endpoints are only active when OIDC_PROVIDER_KEY is configured.
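For orientation, a discovery document carrying just the two fields mentioned above might be rendered like this. `discovery_document` is a hypothetical helper, the issuer is assumed to contain no characters that need JSON escaping, and the real document may include further OpenID Connect metadata fields.

```rust
/// Hypothetical helper: render a minimal discovery document for `issuer`.
/// The JWKS URI is derived from the issuer, matching the endpoint paths above.
pub fn discovery_document(issuer: &str) -> String {
    // Avoid a double slash when the configured issuer ends with '/'.
    let issuer = issuer.trim_end_matches('/');
    format!(
        r#"{{"issuer":"{0}","jwks_uri":"{0}/.well-known/jwks.json"}}"#,
        issuer
    )
}
```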
Generate an RSA key pair, store it as a GitHub environment secret, and deploy:
openssl genpkey -algorithm RSA -pkeyopt rsa_keygen_bits:2048 -out oidc-key.pem
gh secret set OIDC_PROVIDER_KEY --env production < oidc-key.pem

Verify the endpoints:
curl https://data.source.coop/.well-known/openid-configuration
curl https://data.source.coop/.well-known/jwks.json

The proxy supports zero-downtime key rotation by serving both active and previous public keys in the JWKS endpoint. Only the active key signs new tokens.
Because CI re-uploads secrets from GitHub on every deploy, rotation happens
through GitHub environment secrets and wrangler.toml, not wrangler secret put
(which the next deploy would silently revert).
Rotation procedure (repeat per environment — preview, staging, production):
- Generate a new RSA key with a new key ID:

  openssl genpkey -algorithm RSA -pkeyopt rsa_keygen_bits:2048 -out new-key.pem

- Move the current key to the previous slot and stage the new key:

  # Copy the CURRENT key PEM into the previous slot
  gh secret set OIDC_PROVIDER_KEY_PREVIOUS --env production < current-key.pem
  gh secret set OIDC_PROVIDER_KEY --env production < new-key.pem

  In wrangler.toml, set OIDC_PROVIDER_KID_PREVIOUS to the current OIDC_PROVIDER_KID value and set OIDC_PROVIDER_KID to the new key ID. Merge and let CI deploy.

- The JWKS endpoint now serves both keys. Wait for relying parties to refresh their JWKS cache.

- Remove the previous key:

  gh secret delete OIDC_PROVIDER_KEY_PREVIOUS --env production
  # The worker-side copy isn't removed by CI, so delete it once manually:
  wrangler secret delete OIDC_PROVIDER_KEY_PREVIOUS

  Remove OIDC_PROVIDER_KID_PREVIOUS from wrangler.toml, merge, and let CI deploy.
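The effect of the rotation steps on key IDs can be modelled with a small state sketch. `KeyIds` and its methods are hypothetical, and the key ID data-proxy-2 in the usage note is an invented example; only the rule that the active key signs while both keys are published comes from the text above.

```rust
/// Hypothetical model of OIDC_PROVIDER_KID and OIDC_PROVIDER_KID_PREVIOUS.
#[derive(Debug, Clone, PartialEq)]
pub struct KeyIds {
    pub active: String,
    pub previous: Option<String>,
}

impl KeyIds {
    /// Only the active key signs new tokens.
    pub fn signing_kid(&self) -> &str {
        &self.active
    }

    /// Key IDs published in the JWKS: the active key, then the previous key if set.
    pub fn published_kids(&self) -> Vec<&str> {
        let mut kids = vec![self.active.as_str()];
        if let Some(prev) = &self.previous {
            kids.push(prev.as_str());
        }
        kids
    }

    /// Move the current key to the previous slot and make `new_kid` active.
    pub fn rotate(self, new_kid: &str) -> KeyIds {
        KeyIds {
            active: new_kid.to_string(),
            previous: Some(self.active),
        }
    }

    /// Drop the previous key once relying parties have refreshed their JWKS cache.
    pub fn retire_previous(self) -> KeyIds {
        KeyIds {
            active: self.active,
            previous: None,
        }
    }
}
```

Starting from data-proxy-1, `rotate("data-proxy-2")` publishes both IDs while signing with data-proxy-2, and `retire_previous()` leaves only data-proxy-2 in the JWKS.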
SESSION_TOKEN_KEY seals the temporary credentials issued by /.sts. There is
no dual-key window: rotating it immediately invalidates every outstanding sealed
credential (e.g. the frontend's sc_proxy_creds cookies), and affected clients
must redo the token exchange. To rotate:
openssl rand -base64 32 | gh secret set SESSION_TOKEN_KEY --env production

The next deploy picks it up.
See docs/plans/ for architecture and design documents.