Evaluating the use of Soil Moisture, January Baseflow, and Snow Water Equivalent storage indicators to enhance Colorado Basin River Forecast Center water supply forecasts


Authors: R. Morovati, D. Tarboton
Owners: This resource does not have an owner who is an active HydroShare user. Contact CUAHSI (help@cuahsi.org) for information on this resource.
Type: Resource
Storage: The size of this resource is 32.1 MB
Created: Jul 16, 2025 at 4:59 p.m. (UTC)
Last updated: Nov 17, 2025 at 5:55 a.m. (UTC)
Content types: Geographic Feature Content 
Sharing Status: Public

Abstract

This resource provides the dataset and Python workflows used to evaluate improved water supply forecasting for the Upper Colorado River Basin and the Great Salt Lake Basin areas served by the Colorado Basin River Forecast Center (CBRFC). The study focuses on enhancing April–July runoff volume predictions by explicitly incorporating three key hydrologic storage indicators—January baseflow, soil moisture, and snow water equivalent (SWE)—alongside the official CBRFC Most Probable (MP) water supply forecast. These indicators represent antecedent conditions that help explain variability in spring snowmelt-driven streamflow across snow-dominated watersheds.

Data and Python code used to implement the multiple linear regression (MLR) models, station data processing, and spatial analysis are included here. The research found that combining multiple storage indicators with the CBRFC forecast leads to gains in predictive skill, particularly in headwater basins where natural hydrologic processes are less influenced by regulation. Among the variables evaluated, soil moisture contributed the largest improvements when added to the model.

This resource holds the data and code used to compute the results reported in the MS thesis: Morovati, R. (2025), "Evaluating Use Of Multiple Hydrologic Storage Indicators To Enhance Streamflow Forecasting," MS Thesis, Civil and Environmental Engineering, Utah State University.

Coverage

Spatial

Coordinate System/Geographic Projection: WGS 84 EPSG:4326
Coordinate Units: Decimal degrees
North Latitude: 43.4524°
East Longitude: -105.6268°
South Latitude: 35.5582°
West Longitude: -111.9149°

Content

readme.md

Last Updated: 11.15.2025

Contact: reza.morovati@usu.edu

This resource contains data, spatial layers, and Python scripts (Jupyter Notebook) used to build and evaluate Multiple Linear Regression (MLR) models that enhance the NOAA Colorado Basin River Forecast Center (CBRFC) February–April water supply forecasts across watersheds in the Upper Colorado River Basin and Great Salt Lake Basin.

Overview

This project brings together watershed boundaries, predictor datasets (storage indicators), and model-evaluation tools to construct and analyze MLR-based water-supply forecasts. The notebook guides the user through:

  • Preparing and unzipping the input datasets
  • Loading spatial watershed files (boundary and buffer variants)
  • Building and evaluating multiple linear regression models for each watershed
  • Visualizing spatial and statistical model performance
  • Comparing model accuracy between buffer-based and boundary-based watershed definitions
  • Investigating predictor collinearity to support model refinement

Data Sources

This notebook uses prepared datasets contained in the ZIP files included with the repository. These include:

Watershed Spatial Data

  • Boundary shapefiles for each forecasting point and related watershed
  • 10 km buffer polygons generated around watershed boundaries

Predictor Data

  • Snow Water Equivalent (SWE) data from SNOTEL sites (NRCS)
  • Soil Moisture from both SCAN and SNOTEL sites (NRCS)
  • January Baseflow data from USGS NWIS

Model Result Files

  • MLR output summaries (NSE, KGE) for each watershed
  • Side-by-side comparisons for buffer vs boundary watershed definitions

Code Structure

PythonCodes / Jupyter Scripts

MLR_CBRFC_Water_Supply_Forecast_FEBAPR.ipynb
Jupyter Notebook script performing the full workflow:

  • Unzips input files to correct directories
  • Loads watershed polygons, buffer polygons, and station metadata
  • Imports predictor time series and joins them with station data
  • Builds MLR models to predict Feb–Apr water supply
  • Calculates performance metrics for each watershed
  • Generates spatial model-performance maps
  • Compares buffer vs boundary MLR results
  • Conducts predictor-collinearity analysis
  • Exports summary statistics and figures

Each code block in this notebook is described below.

Block-by-Block Notebook Guide

Block 1 — Unzipping Input Files

This block extracts all required data files from the provided ZIP archives. If this step is skipped, the rest of the notebook cannot run because the data directories will be empty.
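A minimal sketch of this extraction step, using only the Python standard library. The function name `extract_archives` and the one-folder-per-archive layout are assumptions made for illustration; the notebook may place files differently.

```python
import zipfile
from pathlib import Path


def extract_archives(archive_dir, output_dir):
    """Extract every ZIP archive in archive_dir into its own subfolder of output_dir."""
    archive_dir = Path(archive_dir)
    output_dir = Path(output_dir)
    extracted = []
    for archive in sorted(archive_dir.glob("*.zip")):
        # Assumed layout: one output folder named after each archive.
        target = output_dir / archive.stem
        target.mkdir(parents=True, exist_ok=True)
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target)
        extracted.append(target)
    return extracted
```

Calling `extract_archives(".", "data")` would unpack each archive in the working directory into `data/<archive name>/`.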

Block 2 — Project Description

Load Watersheds, Station Metadata, and Predictor Inputs

The script imports watershed polygons (both boundary and buffer variants) along with all station-specific datasets, such as SWE, soil moisture, baseflow, and CBRFC CMP data. It filters out stations that lack required data or fall outside the Upper Colorado River Basin and Great Salt Lake Basin study area.

Build Multiple Linear Regression (MLR) Models for Each Station

For every station, the code assembles the available predictors and runs eight MLR scenarios, each using a different combination of SWE, soil moisture, baseflow, and CMP. It applies a 70/30 train–test split by year, trains each model, and generates monthly predictions.
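The year-based split and model fit can be sketched as follows. This is an illustration under stated assumptions, not the notebook's code: the split here is chronological (the README only says it is based on years), the helper names `split_years`, `fit_mlr`, and `predict_mlr` are hypothetical, and the data are synthetic.

```python
import numpy as np


def split_years(years, train_fraction=0.7):
    """Split the unique years into train and test sets (assumed chronological)."""
    unique_years = np.unique(years)  # np.unique returns sorted values
    n_train = int(round(train_fraction * len(unique_years)))
    return unique_years[:n_train], unique_years[n_train:]


def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns [intercept, slopes...]."""
    design = np.column_stack([np.ones(len(X)), X])
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coeffs


def predict_mlr(coeffs, X):
    """Apply fitted coefficients to a predictor matrix."""
    design = np.column_stack([np.ones(len(X)), X])
    return design @ coeffs


# Synthetic demonstration data (not from the resource): 10 years, 2 predictors.
years = np.arange(2011, 2021)
rng = np.random.default_rng(0)
X = rng.normal(size=(10, 2))
y = 3.0 + 2.0 * X[:, 0] - 1.0 * X[:, 1]  # noise-free, so the fit is exact

train_years, test_years = split_years(years)
train_mask = np.isin(years, train_years)
coeffs = fit_mlr(X[train_mask], y[train_mask])
test_pred = predict_mlr(coeffs, X[~train_mask])
```

Splitting on years rather than on individual rows keeps all records from one water year on the same side of the split.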

Compute Performance Metrics (Train + Test)

For each model scenario, it calculates skill metrics, including NSE and KGE, both overall and month by month. Results are stored separately for the boundary and buffer watershed versions.
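For reference, both metrics can be computed in a few lines. NSE is 1 minus the ratio of the squared-error sum to the variance sum of the observations. The sketch below uses the original Kling-Gupta formulation, KGE = 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2), with alpha the ratio of standard deviations and beta the ratio of means. Which KGE variant the notebook uses is not stated in this README, so treat that choice as an assumption.

```python
import numpy as np


def nse(observed, simulated):
    """Nash-Sutcliffe Efficiency: 1 is perfect, 0 matches the observed mean."""
    observed = np.asarray(observed, dtype=float)
    simulated = np.asarray(simulated, dtype=float)
    residual = np.sum((observed - simulated) ** 2)
    variance = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - residual / variance


def kge(observed, simulated):
    """Kling-Gupta Efficiency (original three-component form, assumed here)."""
    observed = np.asarray(observed, dtype=float)
    simulated = np.asarray(simulated, dtype=float)
    r = np.corrcoef(observed, simulated)[0, 1]   # linear correlation
    alpha = simulated.std() / observed.std()     # variability ratio
    beta = simulated.mean() / observed.mean()    # bias ratio
    return 1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)
```

A perfect simulation scores 1 on both metrics; a simulation equal to the observed mean in every year scores an NSE of 0.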

Save Outputs and Summaries for All Stations

The script exports:

  • Model predictions for each station/scenario
  • A combined summary of all performance metrics
  • A monthly metrics summary
  • Lists of included and excluded stations
  • Counts of how many stations were successfully processed

This provides a complete dataset for comparing model skill and evaluating the effect of boundary vs buffer watershed definitions.

Block 3 — Loading Spatial Libraries and Watershed Data

  • The script loads all spatial datasets (states, rivers, lakes, Great Salt Lake Basin, Upper Colorado Basin) along with USGS station metadata and the monthly MLR performance results (NSE and KGE).
  • It filters the model results to include only the buffer-based watershed type, selects each scenario and month, and merges the station performance metrics with station coordinates.
  • For every scenario and month, it creates spatial maps that display Kling-Gupta Efficiency (KGE) and Nash-Sutcliffe Efficiency (NSE) as colored point layers over the basins, allowing the user to visually assess how well each model scenario performs at each station.
  • It saves each set of maps to a folder organized by watershed type, producing a complete spatial visualization suite that shows the performance of all model scenarios across all months.
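The filter-and-merge step described above can be sketched with pandas. All column names (`station_id`, `watershed_type`, `scenario`, `month`, `lat`, `lon`), the scenario labels, and the values are hypothetical; the actual result files may use different names.

```python
import pandas as pd

# Hypothetical monthly metrics table (values are illustrative only).
metrics = pd.DataFrame({
    "station_id": ["A", "A", "B", "B"],
    "watershed_type": ["buffer", "boundary", "buffer", "buffer"],
    "scenario": ["S1", "S1", "S1", "S2"],
    "month": ["Feb", "Feb", "Feb", "Feb"],
    "KGE": [0.8, 0.7, 0.5, 0.6],
    "NSE": [0.75, 0.6, 0.4, 0.5],
})

# Hypothetical station metadata with coordinates in decimal degrees.
stations = pd.DataFrame({
    "station_id": ["A", "B"],
    "lat": [40.0, 38.5],
    "lon": [-110.0, -108.0],
})


def select_for_map(metrics, stations, watershed_type, scenario, month):
    """Keep one watershed type, scenario, and month, then attach coordinates."""
    subset = metrics[
        (metrics["watershed_type"] == watershed_type)
        & (metrics["scenario"] == scenario)
        & (metrics["month"] == month)
    ]
    return subset.merge(stations, on="station_id", how="inner")


map_points = select_for_map(metrics, stations, "buffer", "S1", "Feb")
```

Each row of `map_points` then carries a metric value and a coordinate pair, which is the shape a colored point layer needs.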

Block 4 — Spatial Visualization of Model Performance

This portion reads in a CSV file containing performance results for each watershed (NSE, KGE). It then joins these numerical results with the watershed polygons and produces:

  • Colored maps of model skill
  • Basin-level comparison figures across months and model configurations

These maps allow the user to see where the forecasting model performs well and where it struggles.

Block 5 — Performance Visualization Across All Watersheds

  • The script loads the monthly performance metrics for all buffer-based stations and prepares them for visualization, including consistent scenario naming and month labeling.
  • A custom blue-green-red colormap is created to show model skill, where darker blues represent stronger performance and reds indicate weaker performance.
  • For each scenario and each spring month, the script builds grouped box-and-dot plots that display the full distribution of KGE and NSE values across stations, allowing direct comparison of how each predictor combination behaves through time.
  • The resulting figures summarize model stability and variability, helping highlight which scenarios consistently perform well and which are more sensitive to month-to-month changes.
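A colormap of the kind described above can be built with matplotlib's `LinearSegmentedColormap.from_list`. The specific color stops and the name `skill_cmap` are assumptions; the README only states that dark blue marks strong skill and red marks weak skill.

```python
from matplotlib.colors import LinearSegmentedColormap

# Assumed color stops, ordered from weakest skill (red) to strongest (dark blue).
skill_cmap = LinearSegmentedColormap.from_list(
    "skill_red_green_blue", ["red", "green", "darkblue"]
)

# Float inputs in [0, 1] map to RGBA tuples along the gradient.
weak_rgba = skill_cmap(0.0)    # red end
strong_rgba = skill_cmap(1.0)  # dark-blue end
```

Passing `cmap=skill_cmap` to a scatter or point-layer call, together with a fixed value range, would keep colors comparable across scenarios and months.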

Block 6 — Comparing Buffer vs Boundary Watersheds

This section loads two different MLR result files:

  1. Boundary-based results
  2. Buffer-based results
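One way to compare the two result sets is to join them on station and scenario and take the difference in skill. This is a sketch under assumptions: the column names, scenario labels, and values below are hypothetical, and the notebook may compare the files differently.

```python
import pandas as pd

# Hypothetical result tables (values are illustrative only).
boundary = pd.DataFrame({
    "station_id": ["A", "B"],
    "scenario": ["S1", "S1"],
    "KGE": [0.50, 0.50],
})
buffer_results = pd.DataFrame({
    "station_id": ["A", "B"],
    "scenario": ["S1", "S1"],
    "KGE": [0.75, 0.25],
})

# Join on the shared keys; suffixes label which file each metric came from.
comparison = boundary.merge(
    buffer_results,
    on=["station_id", "scenario"],
    suffixes=("_boundary", "_buffer"),
)

# Positive values mean the buffer-based definition scored higher.
comparison["KGE_diff"] = comparison["KGE_buffer"] - comparison["KGE_boundary"]
```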

Block 7 — Collinearity Analysis of Predictors

This block inspects whether the predictors used in the MLR models are highly correlated with each other.

Steps include:

  • Loading all predictor datasets
  • Merging predictor tables for each station into a single combined dataset
  • Calculating a correlation matrix
  • Plotting a heatmap showing predictor relationships
  • Identifying redundant variables that may degrade model performance

This helps clean and refine the predictor set to improve the regression model’s stability and accuracy.
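The correlation step can be sketched with pandas. The threshold of 0.8, the function name `highly_correlated_pairs`, the column names, and the data are all assumptions for illustration; the README does not state what cutoff the notebook uses.

```python
import pandas as pd


def highly_correlated_pairs(predictors, threshold=0.8):
    """Return the Pearson correlation matrix and pairs with |r| >= threshold."""
    corr = predictors.corr()
    cols = list(corr.columns)
    pairs = []
    for i in range(len(cols)):
        for j in range(i + 1, len(cols)):  # upper triangle only, no self-pairs
            r = float(corr.iloc[i, j])
            if abs(r) >= threshold:
                pairs.append((cols[i], cols[j], r))
    return corr, pairs


# Synthetic predictors: baseflow is built as a linear function of swe.
swe = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
predictors = pd.DataFrame({
    "swe": swe,
    "soil_moisture": [5.0, 1.0, 4.0, 2.0, 6.0, 3.0],
    "baseflow": [2.0 * v + 1.0 for v in swe],
})

corr, flagged = highly_correlated_pairs(predictors)
```

The `corr` matrix is what a heatmap would display, and `flagged` lists the candidate redundant pairs.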

Related Resources

This resource is described by Morovati, R. (2025), "Evaluating Use Of Multiple Hydrologic Storage Indicators To Enhance Streamflow Forecasting," MS Thesis, Civil and Environmental Engineering, Utah State University.

Credits

Funding Agencies

This resource was created using funding from the following sources:
Agency Name | Award Title | Award Number
National Science Foundation | HDR Institute: Geospatial Understanding through an Integrative Discovery Environment | 2118329
Utah Water Research Laboratory | Graduate Research Assistantship |

How to Cite

Morovati, R., D. Tarboton (2025). Evaluating the use of Soil Moisture, January Baseflow, and Snow Water Equivalent storage indicators to enhance Colorado Basin River Forecast Center water supply forecasts, HydroShare, http://www.hydroshare.org/resource/83c1d73697cc461c8de6283f65b57498

This resource is shared under the Creative Commons Attribution CC BY.

http://creativecommons.org/licenses/by/4.0/
