GDP pygeoapi

Overview

The USGS Geo Data Portal (GDP) exposes gridded climate and meteorological data through a pygeoapi server at https://api.water.usgs.gov/gdp/pygeoapi. The service implements several complementary OGC API standards, each suited to different access patterns:
- OGC API – EDR — a convenience interface for extracting values at a point or within a bounding box from a curated subset of datasets, without requiring a full Zarr-capable environment
- STAC catalog — the primary interface for dataset discovery and production workflows; provides cloud-native access (Zarr, COG) for use with xarray, Dask, and similar tooling
- OGC API – Processes — server-side geoprocessing (area-weighted statistics, spatial interpolation) powered by gdptools
The EDR collections are a curated subset of datasets also available through the STAC catalog. If you need an EDR endpoint for a dataset that is in the STAC catalog but not currently exposed as EDR, contact the team.
If you are new to OGC API: it is a family of RESTful standards that replace older
OGC services (WMS, WFS, WCS) with modern JSON/HTTP patterns. Every endpoint
returns standard JSON by default and supports content negotiation via a ?f=
parameter — more on that below.
EDR — Small-Scale Data Access
OGC API – Environmental Data Retrieval (EDR) is a standard for querying environmental data by geographic position or spatial extent. Use it when you need to extract values at a point or within a bounding box and a full Zarr-capable data environment is not warranted.
The GDP EDR collections are gridded CONUS datasets covering precipitation,
temperature, wind speed, evapotranspiration, and hydrological model output.
Browse the current list at /collections.
This service supports two EDR query types:
- Position (/collections/{id}/position) — extract values at a single coordinate (longitude, latitude) over a time range
- Cube (/collections/{id}/cube) — extract values within a bounding box and time range
See the OGC API – EDR specification for the full standard.
Example: Position Query
Extract monthly precipitation at a point in southern Wisconsin for 2012 from the PRISM collection:
GET https://api.water.usgs.gov/gdp/pygeoapi/collections/PRISM/position?coords=POINT(-89.4%2043.07)&parameter-name=ppt&datetime=2012-01-01/2012-12-31&f=json
| Parameter | Value | Meaning |
|---|---|---|
| coords | POINT(-89.4 43.07) | WGS 84 longitude/latitude (longitude first) |
| parameter-name | ppt | Variable to extract — PRISM also offers tmn (min temp, °C) and tmx (max temp, °C) |
| datetime | 2012-01-01/2012-12-31 | ISO 8601 date range |
| f | json | Response format |
The response is a CoverageJSON document containing a time series of monthly values at the requested point.
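The CoverageJSON layout can be unpacked with a few dictionary lookups. The sketch below, a minimal example in plain Python, pairs each time step with its value; the field names (`domain.axes.t.values`, `ranges[param].values`) follow the CoverageJSON specification, and the inline sample document is illustrative rather than real PRISM output.

```python
# Minimal sketch: pair time steps with values from an EDR position
# response in CoverageJSON. The sample document below is illustrative,
# not actual PRISM output.

def coverage_to_series(cov: dict, parameter: str) -> list:
    """Return (timestamp, value) pairs for one parameter."""
    times = cov["domain"]["axes"]["t"]["values"]
    values = cov["ranges"][parameter]["values"]
    return list(zip(times, values))

sample = {
    "type": "Coverage",
    "domain": {
        "type": "Domain",
        "domainType": "PointSeries",
        "axes": {
            "x": {"values": [-89.4]},
            "y": {"values": [43.07]},
            "t": {"values": ["2012-01-01T00:00:00Z", "2012-02-01T00:00:00Z"]},
        },
    },
    "ranges": {"ppt": {"type": "NdArray", "values": [42.1, 30.5]}},
}

series = coverage_to_series(sample, "ppt")
# series[0] == ("2012-01-01T00:00:00Z", 42.1)
```

Against the live service you would fetch the position URL above (with f=json) and pass the decoded response body to the same function.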
To discover the available parameter names and temporal extent for any collection, inspect its metadata at /collections/{id}?f=json.
Example: Cube Query
The cube query extracts gridded values within a bounding box rather than at a
single point:
GET https://api.water.usgs.gov/gdp/pygeoapi/collections/PRISM/cube?bbox=-100,40,-98,42&parameter-name=ppt&datetime=2012-01-01/2012-06-01&f=json
The bbox parameter takes four values: west, south, east, north (in decimal
degrees, WGS 84).
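Building these URLs with a query-string encoder avoids hand-escaping mistakes (the WKT space in coords, the `/` in datetime ranges). A minimal sketch using only the standard library, with the endpoint layout and parameter names taken from the examples above:

```python
# Minimal sketch: assemble an EDR cube request URL. urlencode handles
# percent-encoding of commas, slashes, and spaces; the endpoint path and
# parameter names match the GDP pygeoapi examples in this document.
from urllib.parse import urlencode

BASE = "https://api.water.usgs.gov/gdp/pygeoapi"

def cube_url(collection, bbox, parameter, start, end, fmt="json"):
    query = urlencode({
        "bbox": ",".join(str(v) for v in bbox),  # west,south,east,north
        "parameter-name": parameter,
        "datetime": f"{start}/{end}",
        "f": fmt,
    })
    return f"{BASE}/collections/{collection}/cube?{query}"

url = cube_url("PRISM", (-100, 40, -98, 42), "ppt",
               "2012-01-01", "2012-06-01")
```

The resulting URL is equivalent to the literal GET example above, with the separators percent-encoded (the server decodes them transparently).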
STAC Catalog — Data Discovery and Larger Workflows
The SpatioTemporal Asset Catalog (STAC) interface provides searchable metadata for a broad collection of water-related datasets — climate models, downscaled projections, evapotranspiration, land cover, and more. Use it when you need to discover datasets and access data in bulk or via cloud-native formats (Zarr, COG) in environments with xarray, Dask, or similar tooling.
The JSON landing page does not link to the STAC catalog. Access it directly at /stac.
See the STAC specification for the full standard.
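A STAC catalog is navigated by following its rel="child" links. The sketch below shows that traversal in plain Python against an inline, purely illustrative catalog document; against the live service you would fetch /stac?f=json over HTTP and pass the decoded JSON to the same function.

```python
# Minimal sketch: list child catalogs/collections from a STAC catalog
# document by filtering its links. The inline catalog is illustrative;
# the "links"/"rel"/"href" structure follows the STAC specification.

def child_links(catalog: dict) -> list:
    """Return the hrefs of all rel='child' links."""
    return [link["href"] for link in catalog.get("links", [])
            if link.get("rel") == "child"]

sample_catalog = {
    "type": "Catalog",
    "stac_version": "1.0.0",
    "id": "example",
    "description": "illustrative catalog, not the live GDP catalog",
    "links": [
        {"rel": "self", "href": "https://example.org/stac"},
        {"rel": "child", "href": "https://example.org/stac/conus404"},
        {"rel": "child", "href": "https://example.org/stac/prism"},
    ],
}

children = child_links(sample_catalog)
```

In production workflows, libraries such as pystac wrap this traversal; the point here is only that the catalog is ordinary linked JSON.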
When to Use STAC vs. EDR
- EDR is best for quick extraction — pull values at a point or small bounding box without setting up a cloud-native data environment. No xarray or Dask required; just an HTTP client.
- STAC is best for dataset discovery, bulk access, and production workflows where you need cloud-native formats (Zarr, COG) and tools like xarray or Dask.
The EDR collections are a curated subset of the broader STAC catalog. If a dataset you need is available in the STAC catalog but not exposed as an EDR collection, let us know — additional EDR endpoints can be added on request.
OGC API – Processes and gdptools
The Processes endpoint exposes server-side geoprocessing via the OGC API – Processes standard.
These processes are underpinned by gdptools , a Python package for calculating area-weighted statistics and spatial interpolations between gridded datasets and vector geometries. Key capabilities include:
- Grid-to-polygon aggregation — aggregate raster data over polygon boundaries
- Grid-to-line interpolation — sample gridded data along vector paths
- Polygon-to-polygon transfer — convert data between spatial frameworks with proper handling of extensive vs. intensive variables
Available Processes
| Process | Description | Explore |
|---|---|---|
| GDP-CalcWeightsStac | Generate grid-to-polygon intersection weights using NHGF STAC metadata | JSON |
| GDP-CalcAggStac | Apply area-weighted aggregation using STAC metadata and pre-computed weights | JSON |
| GDP-CalcWeightsClimr | Generate grid-to-polygon intersection weights using ClimateR catalog (OpenDAP) | JSON |
| GDP-CalcAggClimr | Apply area-weighted aggregation using ClimateR metadata and pre-computed weights | JSON |
The current process list can always be discovered at /processes?f=json.
Two-Step Workflow
The processes follow a weights → aggregation pattern:
- CalcWeights computes the spatial intersection weights between a gridded dataset and user-supplied polygons (a GeoJSON FeatureCollection).
- CalcAgg applies those weights to aggregate time-series data from the grid onto the polygons.
Two catalog variants are available: the *Stac processes resolve datasets through the NHGF STAC catalog, while the *Climr processes use the ClimateR OpenDAP catalog. See the gdptools documentation for algorithmic details.
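The weights → aggregation idea can be illustrated with a toy calculation: each (cell, polygon) weight is the fraction of the polygon's area covered by that grid cell, and the aggregated value is the weighted mean of the cell values. The numbers below are made up for illustration; the real computation is performed server-side by gdptools.

```python
# Minimal sketch of the two-step pattern with hypothetical numbers.
# Step 1 (CalcWeights) output: area fractions for one polygon.
weights = {"cell_a": 0.25, "cell_b": 0.75}

# Gridded values for one time step (e.g. precipitation per cell).
values = {"cell_a": 10.0, "cell_b": 30.0}

def area_weighted_mean(weights, values):
    """Step 2 (CalcAgg): weighted mean of cell values over one polygon."""
    total = sum(weights.values())
    return sum(w * values[c] for c, w in weights.items()) / total

result = area_weighted_mean(weights, values)
# 0.25 * 10.0 + 0.75 * 30.0 = 25.0
```

Because the weights depend only on the grid geometry and the polygons, they can be computed once and reused across variables and time ranges, which is why the service splits the workflow into two processes.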
Example: Asynchronous Process Execution
Processes support both asynchronous and synchronous execution. For larger jobs use the async pattern — submit, poll, retrieve.
Step 1 — Submit the job. POST to the process execution endpoint with a
Prefer: respond-async header:
POST https://api.water.usgs.gov/gdp/pygeoapi/processes/GDP-CalcWeightsStac/execution?f=json
Content-Type: application/json
Prefer: respond-async
{
"inputs": {
"collection_id": "conus404_daily",
"variables": "[\"T2\"]",
"shape_file": "{ ... GeoJSON FeatureCollection ... }",
"shape_crs": "EPSG:4326",
"shape_poly_idx": "id",
"wght_gen_proj": "EPSG:6931",
"start_date": "1999-01-01",
"end_date": "1999-01-07"
}
}
The shape_file value is a GeoJSON FeatureCollection containing the polygons for aggregation. See the Asynchronous Processing use case for a complete working example.
The server returns 201 Created with a Location header pointing to the job.
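One detail worth noting in the request above: variables and shape_file are JSON-encoded strings nested inside the JSON body, not plain arrays/objects. A minimal sketch of assembling that payload, using a toy square as the FeatureCollection (not a real basin boundary):

```python
# Minimal sketch: build the CalcWeightsStac execution payload. Field
# names mirror the example request in this document; "variables" and
# "shape_file" are passed as JSON-encoded strings, and the polygon here
# is an illustrative square, not real input data.
import json

feature_collection = {
    "type": "FeatureCollection",
    "features": [{
        "type": "Feature",
        "properties": {"id": "poly-1"},
        "geometry": {
            "type": "Polygon",
            "coordinates": [[[-89.5, 43.0], [-89.3, 43.0],
                             [-89.3, 43.2], [-89.5, 43.2],
                             [-89.5, 43.0]]],
        },
    }],
}

payload = {
    "inputs": {
        "collection_id": "conus404_daily",
        "variables": json.dumps(["T2"]),            # JSON string, not a list
        "shape_file": json.dumps(feature_collection),  # JSON string, not an object
        "shape_crs": "EPSG:4326",
        "shape_poly_idx": "id",
        "wght_gen_proj": "EPSG:6931",
        "start_date": "1999-01-01",
        "end_date": "1999-01-07",
    }
}

body = json.dumps(payload)  # POST with Content-Type: application/json
```
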
Step 2 — Poll for status.
GET https://api.water.usgs.gov/gdp/pygeoapi/jobs/{jobId}?f=json
The status field progresses through accepted → running → successful
(or failed).
Step 3 — Retrieve results.
GET https://api.water.usgs.gov/gdp/pygeoapi/jobs/{jobId}/results?f=json
For smaller jobs, omit the Prefer header to execute synchronously — results
are returned directly in the response.
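The submit → poll → retrieve steps above can be sketched as a small polling loop. The HTTP layer is abstracted behind a fetch_status callable so the control flow can be shown without a live server; in practice that callable would GET /jobs/{jobId}?f=json, and the status values match those listed in Step 2.

```python
# Minimal sketch of the async polling pattern. fetch_status is any
# callable returning the decoded /jobs/{jobId} JSON; the terminal and
# in-progress status names follow the job states described above.
import time

def wait_for_job(fetch_status, job_id, poll_seconds=5.0, timeout=600.0):
    """Poll until the job leaves accepted/running; return its final status."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = fetch_status(job_id)["status"]
        if status not in ("accepted", "running"):
            return status  # "successful" or "failed"
        time.sleep(poll_seconds)
    raise TimeoutError(f"job {job_id} did not finish within {timeout}s")
```

Once wait_for_job returns "successful", a single GET on /jobs/{jobId}/results?f=json retrieves the output.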
The process descriptions at /processes/{id}?f=json provide the complete input schema and are the authoritative parameter reference. Browse available processes at /processes.
Using the Built-In Developer Tools
The HTML views that pygeoapi generates serve as browser-based API introspection tools for developers during integration and testing, not as a general-purpose user interface. They render the same data the JSON API returns, formatted for human reading.
Every endpoint supports a ?f= parameter for content negotiation:
- ?f=json — machine-readable JSON (default for programmatic access)
- ?f=html — human-readable HTML view (auto-generated by pygeoapi)
Useful developer entry points:
- Swagger UI — /openapi?f=html — interactive endpoint testing; try queries and inspect request/response shapes
- HTML collection browser — append ?f=html to any endpoint (e.g. /collections/PRISM?f=html) to browse metadata and query options
- STAC HTML view — /stac?f=html — browse the full dataset catalog
Key Endpoints
| Path | Purpose |
|---|---|
| / | Landing page (service metadata) |
| /collections | EDR dataset listing |
| /collections/{id}/position | Point query for a collection |
| /collections/{id}/cube | Bounding-box/time query |
| /stac | STAC catalog root |
| /processes | Available geoprocessing operations |
| /processes/{id}/execution | Submit a process job |
| /jobs | List all jobs |
| /jobs/{jobId} | Job status |
| /jobs/{jobId}/results | Job results |
| /conformance | Supported OGC standards |
| /openapi | OpenAPI specification (Swagger / ReDoc) |
All paths are relative to https://api.water.usgs.gov/gdp/pygeoapi.
Related Use Cases
Worked examples demonstrating GDP pygeoapi workflows are available at water.usgs.gov/catalog/usecases:
- Environmental Data Retrieval (EDR) — accessing spatio-temporal data via the EDR API
- WMA STAC Catalog — discovering and reading STAC assets into a Python workflow
- Asynchronous Processing via OGC API Processes — executing gdptools processes asynchronously via pygeoapi
- gdptools + CONUS404 — spatially aggregating CONUS404 data to HUC12 polygons
- Regridding gridMET to HUC12s — area-weighted climate forcing aggregation
- Dockerized pygeoapi Deployment — deploying custom data services through pygeoapi