GDROM v2: An Inventory of Operation Variables Time Series and Rules for 2,017 Large Reservoirs across the CONUS


Owners: This resource does not have an owner who is an active HydroShare user. Contact CUAHSI (help@cuahsi.org) for information on this resource.
Type: Resource
Storage: The size of this resource is 1.8 GB
Created: Jul 21, 2025 at 8:04 p.m. (UTC)
Last updated: Sep 29, 2025 at 2:41 p.m. (UTC)
Published date: Sep 29, 2025 at 2:43 p.m. (UTC)
DOI: 10.4211/hs.5293674cb83b4ec698db0eb4777467b8
Content types: CSV Content 
Sharing Status: Published

Abstract

Effective reservoir operation is critical for achieving multiple water management objectives, while also representing a major human intervention in hydrologic processes. Yet for most reservoirs, the lack of complete daily operation records and accurate operation rules constrains their representation in large-scale hydrological models. To address this gap, we present GDROM v2, a nationwide dataset covering daily operation variable time series and derived operation rules for 2,017 large reservoirs across the Contiguous United States (CONUS), building upon the original GDROM (Li et al., 2023. HydroShare. https://doi.org/10.4211/hs.63add4d5826a4b21a6546c571bdece10) of 452 reservoirs.
The dataset includes:
1. Time series of daily inflow, release, and storage (collected, cleaned, and normalized)
2. Operation rules for each reservoir expressed as “if–then–else” statements
3. Python scripts and instructions for using and training GDROMs
GDROM v2 offers the largest open collection of reservoir operation variable time series and realistic operation rules across the CONUS, providing a useful resource for hydrological modeling and water management studies.

Coverage

Spatial

Coordinate System/Geographic Projection:
WGS 84 EPSG:4326
Coordinate Units:
Decimal degrees
North Latitude
49.1586°
East Longitude
-59.9325°
South Latitude
24.3749°
West Longitude
-125.6747°

Content

readme.md

Overview of the dataset

Effective reservoir operation is critical for achieving multiple water management objectives, while also representing a major human intervention in hydrologic processes. Yet for most reservoirs, the lack of complete daily operation records and accurate operation rules constrains their representation in large-scale hydrological models. To address this gap, we present GDROM v2, a nationwide dataset covering daily operation data and derived operation rules for 2,017 reservoirs across the Contiguous United States (CONUS), building upon the original GDROM (Li et al., 2023) of 452 reservoirs.

GDROM v2 provides (1) daily time series of inflow, release, and storage that integrate historical reservoir operation records from GDROM (Li et al., 2023), ResOpsUS (Steyaert et al., 2022), and USACE (2025) with records reconstructed through a data-fusion framework that combines USGS streamflow observations (Hodson et al., 2023), SARAH-CONUS remotely sensed reservoir surface area (Yadav & Gao, 2025), and GRDL reservoir bathymetry (Hao et al., 2024); and (2) reservoir operation rules derived with the GDROM method (Chen et al., 2022) and transfer learning (Zheng et al., 2025). GDROM v2 offers the largest open collection of reservoir operation time series and rules across the CONUS, providing a robust empirical basis and scalable framework for hydrological modeling and water management studies.

(1) For daily time series variables, reservoirs are categorized into:

  • Data-rich (Res-R): 748 reservoirs with >5 years of high-quality daily inflow, release and storage data, directly used for GDROM training.
  • Data-limited (Res-L): 174 reservoirs with <5 years of daily inflow, release and storage data. GDROMs are transferred and fine-tuned from the most analogous Res-R.
  • Data-missing (Res-M): 1,095 reservoirs. Of these, 203 have none of the three variables (inflow, release, and storage) available, while the remaining reservoirs have either only one variable or two non-overlapping variables. GDROMs are transferred from the most analogous Res-R.
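
The three categories can be sketched as a small classification function. This is only an illustration of the thresholds listed above, not the authors' code: the `overlap_years` input is a hypothetical summary of how many years of inflow, release, and storage records overlap, and treating exactly 5 years as Res-L is an assumption, since the text only specifies ">5" and "<5". The sketch also checks that the stated category sizes add up to 2,017.

```python
# Category sizes as stated in the list above.
CATEGORY_COUNTS = {"Res-R": 748, "Res-L": 174, "Res-M": 1095}


def categorize(overlap_years: float) -> str:
    """Assign a category from the years of overlapping inflow, release and storage data.

    overlap_years is a hypothetical input; 0 means the three variables never overlap.
    Assumption: exactly 5 years falls into Res-L (the text gives only >5 and <5).
    """
    if overlap_years > 5:
        return "Res-R"
    if overlap_years > 0:
        return "Res-L"
    return "Res-M"


# Consistency check on the stated counts.
total = sum(CATEGORY_COUNTS.values())
# Res-M reservoirs that have one variable or two non-overlapping variables.
partial_res_m = CATEGORY_COUNTS["Res-M"] - 203

print(total, partial_res_m)
```

Under these counts, 892 of the Res-M reservoirs have some data (one variable or two non-overlapping variables), and 203 have none.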

(2) The operation rules for each reservoir consist of two components:

  • Modules: Simulate daily release based on daily inflow and storage. For Res-R, modules are trained with the hidden Markov-decision tree (HMDT) model introduced by Zhao and Cai (2020). For Res-L, the functional form of each module follows one of five predefined release types, including constant release and inflow- or storage-driven patterns, as introduced in Li et al. (2024). For Res-M, the modules are copied directly from the most analogous Res-R reservoirs.
  • Module Conditions: Define the conditions under which a specific module is used, based on daily inflow, initial storage, day of year (DOY), and Palmer Drought Severity Index (PDSI; NOAA, 2025). The mapping from state variables to module index is captured by a classification and regression tree (CART), as detailed in Chen et al. (2022). For reservoirs with a single module, a module condition file is neither required nor provided.
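
The interaction between the two components can be sketched as follows: a module condition picks a module ID from inflow, storage, DOY, and PDSI, and the selected module then computes release from inflow and storage. Everything specific here is hypothetical: the function names, thresholds, and release values are placeholders in arbitrary units and do not come from any rule file in the dataset.

```python
def select_module(inflow: float, storage: float, doy: int, pdsi: float) -> int:
    """Hypothetical module condition: returns a module ID (IDs start at 0)."""
    if pdsi <= -2.0:  # placeholder drought threshold
        return 0
    else:
        if 121 <= doy <= 273:  # placeholder seasonal window
            return 1
        else:
            return 0


def module_0(inflow: float, storage: float) -> float:
    """Hypothetical constant-release module."""
    return 0.2


def module_1(inflow: float, storage: float) -> float:
    """Hypothetical inflow- and storage-driven module."""
    if storage <= 0.6:  # placeholder storage threshold
        return 0.8 * inflow
    else:
        return inflow


MODULES = {0: module_0, 1: module_1}


def simulate_release(inflow: float, storage: float, doy: int, pdsi: float) -> float:
    """Select a module from the state variables, then compute the release."""
    module_id = select_module(inflow, storage, doy, pdsi)
    return MODULES[module_id](inflow, storage)


print(simulate_release(inflow=1.0, storage=0.5, doy=200, pdsi=0.0))
```

A reservoir with a single module would skip `select_module` entirely and call its one module directly, which matches the note that no condition file is provided in that case.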

Description of the dataset

readme.md

  • This document describes the structure, metadata, and usage of GDROM v2.

reservoir_metadata.csv

  • This metadata file documents key attributes for all 2,017 reservoirs. Each row represents a unique reservoir, and columns describe specific physical, geographic, operational, or modeling-related properties. Most physical, geographic, and operational properties are sourced from the Global Reservoir and Dam (GRanD) database (Lehner et al., 2011); the remainder are sourced from the National Inventory of Dams (NID) database (USACE, 2025). The metadata schema is detailed below:

  • GRAND_ID: The unique identifier assigned to each reservoir, adopted from the GRanD database. For reservoirs not listed in GRanD, an ID is assigned starting from 10000.

  • RES_NAME / DAM_NAME: The name of the reservoir and its associated dam.
  • ADMIN_UNIT / STATE: The U.S. state where the reservoir is located, given in full (ADMIN_UNIT) and abbreviated (STATE) forms.
  • STORAGE_MAX: The maximum observed storage during the available record period, in acre-feet. For reservoirs lacking observed storage records, estimates are obtained from the Global Reservoir Storage dataset (GRS; Li et al., 2023).
  • USE_IRRI, USE_ELEC, USE_SUPP, USE_FCON, USE_RECR, USE_NAVI, USE_FISH, USE_PCON, USE_LIVE, USE_OTHR: Operational use priority indicators across ten standard water use categories, where "Main" indicates the primary purpose, "Sec" indicates secondary purpose, and a blank cell means the reservoir is not used for that purpose.
  • MAIN_USE: The main use of the reservoir.
  • LONGITUDE / LATITUDE: Geographic coordinates (in decimal degrees) representing the location of the dam.
  • YEAR_RANGE: The range of years (e.g., 2005–2019) for which valid daily records are available for the reservoir.
  • INFLOW / RELEASE / STORAGE_LENGTH: The number of available daily inflow / release / storage records (the flow data themselves are given in acre-feet/day).
  • INFLOW / RELEASE / STORAGE_SOURCE: The source from which the daily inflow / release / storage data were collected.
  • TS_LENGTH: The total number of days in the complete time series for the reservoir.
  • CATEGORY: The classification of each reservoir based on data availability.
  • MODULE_NUMBER: The number of HMDT modules used for simulating release behavior.
  • NSE / PBIAS: Model evaluation metrics (Nash-Sutcliffe Efficiency and Percent Bias) used to assess model performance for Res-R and Res-L.
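
As a rough illustration of how this schema might be used, the sketch below reads a metadata table with Python's standard `csv` module and filters it by purpose and category. The three sample rows are invented for the example, and the exact spelling of the `CATEGORY` values in the real file ("Res-R", "Res-L", "Res-M") is an assumption based on the labels used in this readme.

```python
import csv
import io

# Invented sample rows that follow the documented column names.
SAMPLE = """GRAND_ID,RES_NAME,STATE,CATEGORY,USE_IRRI,USE_FCON
10001,Example A,IL,Res-R,,Main
10002,Example B,CA,Res-L,Main,Sec
10003,Example C,TX,Res-M,Main,
"""


def load_rows(handle):
    """Read the metadata table into a list of dicts keyed by column name."""
    return list(csv.DictReader(handle))


def with_purpose(rows, column, level="Main"):
    """Keep reservoirs whose USE_* column equals 'Main' or 'Sec'."""
    return [row for row in rows if row[column] == level]


rows = load_rows(io.StringIO(SAMPLE))

# Reservoirs whose primary purpose is irrigation.
irrigation_main = with_purpose(rows, "USE_IRRI")

# Reservoir IDs grouped by data-availability category.
by_category = {}
for row in rows:
    by_category.setdefault(row["CATEGORY"], []).append(row["GRAND_ID"])

print([row["GRAND_ID"] for row in irrigation_main])
print(by_category)
```

To read the actual file, the same `load_rows` function could be given `open("reservoir_metadata.csv", newline="")` instead of the in-memory sample.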

Time series of reservoir variables/

  • This folder contains all the time series of reservoir variables.

Time series of reservoir variables/collected data for all reservoirs

  • Raw historical data collected and merged.
  • The records of each reservoir are stored in a CSV file named reservoirID.csv.

Time series of reservoir variables/cleaned data for Res-R & Res-L

  • Historical operation data after data cleaning.
  • The records of each reservoir are stored in a CSV file named reservoirID.csv.

Time series of reservoir variables/normalized data for Res-R & Res-L

  • Normalized historical operation data used for model training.
  • The records of each reservoir are stored in a CSV file named reservoirID.csv.

operation_rule/

  • This folder contains the extracted operation rules for each reservoir.
  • Two sub-folders: modules/ and module_conditions/, storing the representative operation modules and module conditions, respectively.

operation_rule/modules/

  • Extracted operation modules for each reservoir.
  • Each reservoir may have one or more modules, each stored in a file named reservoirID_moduleID.txt, with module IDs starting from 0.
  • Each operation module is written as a set of "if-then-else" statements, with inflow and storage as inputs and release as outputs.

operation_rule/module_conditions/

  • Module conditions for reservoirs with multiple modules.
  • Each applicable reservoir has one condition file, named reservoirID.txt, describing the conditions under which each module is used.
  • If a reservoir has only one module, no condition file is needed.
  • Each condition file is written as a set of “if–then–else” statements, with inflow, storage, DOY, and PDSI as inputs, and the corresponding module IDs as outputs.
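
The naming conventions for the two sub-folders can be used to work out which reservoirs should have a condition file. The sketch below is an illustration only: the file paths are made up, and it assumes that module IDs are integers and that the last underscore in a file name separates the reservoir ID from the module ID.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_modules(module_paths):
    """Map reservoir ID -> sorted module IDs, from names like reservoirID_moduleID.txt."""
    groups = defaultdict(list)
    for path in module_paths:
        stem = PurePosixPath(path).stem  # e.g. "10001_0"
        reservoir_id, module_id = stem.rsplit("_", 1)
        groups[reservoir_id].append(int(module_id))
    return {rid: sorted(ids) for rid, ids in groups.items()}


def needs_condition_file(module_ids):
    """Only reservoirs with more than one module have a condition file."""
    return len(module_ids) > 1


# Made-up example paths following the documented naming pattern.
paths = [
    "operation_rule/modules/10001_0.txt",
    "operation_rule/modules/10001_1.txt",
    "operation_rule/modules/10002_0.txt",
]

groups = group_modules(paths)
expected_conditions = {
    rid: f"operation_rule/module_conditions/{rid}.txt"
    for rid, ids in groups.items()
    if needs_condition_file(ids)
}

print(groups)
print(expected_conditions)
```

In practice the `paths` list could be built from a directory listing of operation_rule/modules/.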

script/

  • This folder contains the main Python scripts for running or training GDROM models.

script/Environment

  • Contains instructions for setting up the Python environment required for training or running GDROM models.
  • Please refer to environment.md in this folder for detailed setup guidance.

script/Reference Res-R

  • Contains the data and operation rules needed for transfer learning from Res-R to Res-L and Res-M.

script/GDROM_Res_R.py

  • Script for training GDROM models on Res-R reservoirs.

script/GDROM_Res_L.py

  • Script for training GDROM models on Res-L reservoirs.

script/GDROM_Res_M.py

  • Script for training GDROM models on Res-M reservoirs.

script/rule2model.py

  • Script for applying existing rule-based models directly to reservoirs.
  • Suitable for users who only want to simulate using pretrained models.

script/other .py files

  • Contains auxiliary Python modules with shared functions used across all GDROM scripts.

Citations

  • Chen, Y., Li, D., Zhao, Q., & Cai, X. (2022). Developing a generic data-driven reservoir operation model. Advances in Water Resources, 167, 104274. https://doi.org/10.1016/j.advwatres.2022.104274
  • Hao, Z., Chen, F., Jia, X., Cai, X., Yang, C., Du, Y., & Ling, F. (2024). GRDL: A new global reservoir area-storage-depth data set derived through deep learning-based bathymetry reconstruction. Water Resources Research, 60, e2023WR035781. https://doi.org/10.1029/2023WR035781
  • Hodson, T. O., Hariharan, J. A., Black, S., & Horsburgh, J. S. (2023). dataretrieval (Python): a Python package for discovering and retrieving water data available from U.S. federal hydrologic web services. U.S. Geological Survey software release. https://doi.org/10.5066/P94I5TX3
  • Lehner, B., Reidy Liermann, C., Revenga, C., Vörösmarty, C., Fekete, B., Crouzet, P., Döll, P., Endejan, M., Frenken, K., Magome, J., Nilsson, C., Robertson, J. C., Rodel, R., Sindorf, N., & Wisser, D. (2011). High-resolution mapping of the world’s reservoirs and dams for sustainable river-flow management. Frontiers in Ecology and the Environment, 9(9), 494–502. https://www.globaldamwatch.org/grand
  • Li, D., Chen, Y., Cai, X., & Zhao, Q. (2023). Data-driven Reservoir Operation Rules for 450+ Reservoirs in Contiguous United States. HydroShare. https://doi.org/10.4211/hs.63add4d5826a4b21a6546c571bdece10
  • Li, D., Chen, Y., Lyu, L., & Cai, X. (2024). Uncovering historical reservoir operation rules and patterns: Insights from 452 large reservoirs in the contiguous United States. Water Resources Research, 60, e2023WR036686. https://doi.org/10.1029/2023WR036686
  • National Oceanic and Atmospheric Administration (NOAA). (2025). Climate at a Glance: Statewide Time Series. https://www.ncei.noaa.gov/access/monitoring/climate-at-a-glance/statewide/time-series
  • Steyaert, J. C., Condon, L. E., Turner, W. D., & others. (2022). ResOpsUS, a dataset of historical reservoir operations in the contiguous United States. Scientific Data, 9, 34. https://doi.org/10.1038/s41597-022-01134-7
  • U.S. Army Corps of Engineers (USACE). (2025). WM data dissemination [Dataset]. https://water.usace.army.mil/overview
  • U.S. Army Corps of Engineers (USACE). (2025). National Inventory of Dams [Dataset]. https://nid.sec.usace.army.mil/#/
  • Yadav, A., & Gao, H. (2025). SARAH-CONUS: Sub-weekly area of reservoirs from analysis of harmonized Landsat and Sentinel-2 data for the continental US [Dataset]. Texas Data Repository, V1. https://doi.org/10.18738/T8/4BMYBP
  • Zhao, Q., & Cai, X. (2020). Deriving representative reservoir operation rules using a hidden Markov-decision tree model. Advances in Water Resources, 146, 103753. https://doi.org/10.1016/j.advwatres.2020.103753
  • Zheng, Z., Cai, X., et al. (2025). GDROM v2: A nationwide inventory of operation variable time series and rules for 2,017 large reservoirs across the CONUS (in preparation).

Contact

For any questions, please contact:

Ximing Cai
Email: [xmcai@illinois.edu]

Zihan Zheng
Email: [zihanz10@illinois.edu]

Credits

Funding Agencies

This resource was created using funding from the following sources:
Agency Name: Cooperative Institute for Research to Operations in Hydrology (CIROH)
Award Title: NOAA Cooperative Institute Program
Award Number: NA22NWS4320003

How to Cite

Zheng, Z., X. Cai, Y. Chen (2025). GDROM v2: An Inventory of Operation Variables Time Series and Rules for 2,017 Large Reservoirs across the CONUS, HydroShare, https://doi.org/10.4211/hs.5293674cb83b4ec698db0eb4777467b8

This resource is shared under the Creative Commons Attribution CC BY.

http://creativecommons.org/licenses/by/4.0/
