Evaluating the use of Soil Moisture, January Baseflow, and Snow Water Equivalent storage indicators to enhance Colorado Basin River Forecast Center water supply forecasts

Authors:
Owners:		This resource does not have an owner who is an active HydroShare user. Contact CUAHSI (help@cuahsi.org) for information on this resource.
Type:	Resource
Storage:	The size of this resource is 32.1 MB
Created:	Jul 16, 2025 at 4:59 p.m. (UTC)
Last updated:	Nov 17, 2025 at 5:55 a.m. (UTC)
Content types:	Geographic Feature Content

Sharing Status:	Public
Views:	473
Downloads:	133

Abstract

This resource provides the dataset and Python workflows used to evaluate improved water supply forecasting for the Upper Colorado River Basin and the Great Salt Lake Basin areas served by the Colorado Basin River Forecast Center (CBRFC). The study focuses on enhancing April–July runoff volume predictions by explicitly incorporating three key hydrologic storage indicators—January baseflow, soil moisture, and snow water equivalent (SWE)—alongside the official CBRFC Most Probable (MP) water supply forecast. These indicators represent antecedent conditions that help explain variability in spring snowmelt-driven streamflow across snow-dominated watersheds.

Data and Python code used to implement the multiple linear regression (MLR) models, station data processing, and spatial analysis are included here. The research found that combining multiple storage indicators with the CBRFC forecast leads to gains in predictive skill, particularly in headwater basins where natural hydrologic processes are less influenced by regulation. Among the variables evaluated, soil moisture contributed the largest improvements when added to the model.

This resource holds data and code used to compute the results reported in the MS thesis: Morovati, R. (2025), "Evaluating Use Of Multiple Hydrologic Storage Indicators To Enhance Streamflow Forecasting," MS Thesis, Civil and Environmental Engineering, Utah State University.

Coverage

Spatial

Coordinate System/Geographic Projection:

WGS 84 EPSG:4326

Coordinate Units:

Decimal degrees

North Latitude

43.4524°

East Longitude

-105.6268°

South Latitude

35.5582°

West Longitude

-111.9149°

Content

readme.md

Last Updated: 11.15.2025

Contact: reza.morovati@usu.edu

This resource contains data, spatial layers, and Python scripts (Jupyter Notebook) used to build and evaluate a Multiple Linear Regression (MLR) model that enhances the NOAA Colorado Basin River Forecast Center (CBRFC) February–April water supply forecasts across watersheds in the Upper Colorado River Basin and Great Salt Lake Basin.

Overview

This project brings together watershed boundaries, predictor datasets (storage indicators), and model-evaluation tools to construct and analyze MLR-based water-supply forecasts. The notebook guides the user through:

Preparing and unzipping the input datasets
Loading spatial watershed files (boundary and buffer variants)
Building and evaluating multiple linear regression models for each watershed
Visualizing spatial and statistical model performance
Comparing model accuracy between buffer-based and boundary-based watershed definitions
Investigating predictor collinearity to support model refinement

Data Sources

This notebook uses prepared datasets contained in the ZIP files included with the repository. These include:

Watershed Spatial Data

Boundary shapefiles for each forecasting point and related watershed
10 km buffer polygons generated around watershed boundaries

Predictor Data

Snow Water Equivalent (SWE) data from SNOTEL sites (NRCS)
Soil Moisture from both SCAN and SNOTEL sites (NRCS)
January Baseflow data from USGS NWIS

Model Result Files

MLR output summaries (NSE, KGE) for each watershed
Side-by-side comparisons for buffer vs boundary watershed definitions

Code Structure

PythonCodes / Jupyter Scripts

MLR_CBRFC_Water_Supply_Forecast_FEBAPR.ipynb
Jupyter Notebook script performing the full workflow:

Unzips input files to correct directories
Loads watershed polygons, buffer polygons, and station metadata
Imports predictor time series and joins them with station data
Builds MLR models to predict Feb–Apr water supply
Calculates performance metrics for each watershed
Generates spatial model-performance maps
Compares buffer vs boundary MLR results
Conducts predictor-collinearity analysis
Exports summary statistics and figures
Each code block in this notebook is described below.

Block-by-Block Notebook Guide

Block 1 — Unzipping Input Files

This block extracts all required data files from the provided ZIP archives. If this step is skipped, the rest of the notebook cannot run because the data directories will be empty.
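The extraction step can be sketched with the standard library alone. This is a minimal illustration, not the notebook's actual code: the function name `extract_archives`, the directory arguments, and the one-folder-per-archive layout are all assumptions.

```python
import zipfile
from pathlib import Path


def extract_archives(zip_dir, out_dir):
    """Extract every .zip file in zip_dir into its own folder under out_dir.

    Hypothetical helper: the folder layout is an assumption, not the
    layout used by the notebook.
    """
    out_dir = Path(out_dir)
    extracted = []
    for zip_path in sorted(Path(zip_dir).glob("*.zip")):
        target = out_dir / zip_path.stem  # one folder per archive
        target.mkdir(parents=True, exist_ok=True)
        with zipfile.ZipFile(zip_path) as archive:
            archive.extractall(target)
        extracted.append(target)
    return extracted
```

Calling `extract_archives("zips", "data")` returns the list of folders that were created, which makes it easy to check that no archive was missed before running later blocks.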

Block 2 — Project Description

Load Watersheds, Station Metadata, and Predictor Inputs

The script imports watershed polygons (both boundary and buffer versions) along with all station-specific datasets such as SWE, soil moisture, baseflow, and CBRFC CMP data. It filters out stations that lack required data or fall outside the Upper Colorado River Basin and Great Salt Lake Basin study area.

Build Multiple Linear Regression (MLR) Models for Each Station

For every station, the code assembles available predictors and runs eight different MLR scenarios, each using different combinations of SWE, soil moisture, baseflow, and CMP. It applies a 70/30 train–test split based on years, trains each model, and generates monthly predictions.
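The modeling step described above can be sketched as follows. This is a hedged illustration under stated assumptions, not the notebook's implementation: it assumes the eight scenarios are CMP alone plus CMP with every non-empty subset of the three storage indicators, that the year-based split is chronological (earliest 70% of years for training), and that the regression is ordinary least squares. All function and column names are hypothetical.

```python
from itertools import combinations

import numpy as np


def split_years(years, train_frac=0.7):
    """Return a boolean mask that is True for rows in the training years.

    Assumption: the earliest train_frac of distinct years are used for training.
    """
    unique = np.sort(np.unique(years))
    n_train = int(round(train_frac * len(unique)))
    return np.isin(years, unique[:n_train])


def fit_mlr(X, y):
    """Fit ordinary least squares with an intercept; returns [b0, b1, ...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_mlr(coef, X):
    """Apply fitted coefficients to a predictor matrix."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef


def scenarios(indicators=("swe", "soil_moisture", "baseflow"), base="cmp"):
    """Enumerate the base predictor alone plus every indicator subset (1 + 7 = 8)."""
    out = [(base,)]
    for k in range(1, len(indicators) + 1):
        for combo in combinations(indicators, k):
            out.append((base,) + combo)
    return out
```

Splitting on years rather than on individual rows keeps all months of a given water year on the same side of the split, so the test period is never partly seen during training.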

Compute Performance Metrics (Train + Test)

For each model scenario, it calculates skill metrics, including NSE and KGE, both overall and month by month. Results are stored separately for boundary and buffer watershed versions.
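For reference, the two skill metrics can be computed as in the sketch below. It assumes the standard Nash-Sutcliffe definition and the original (2009) form of the Kling-Gupta Efficiency; the notebook may use a different KGE variant, and the function names are illustrative.

```python
import numpy as np


def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 is perfect, 0 matches the observed mean."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)


def kge(obs, sim):
    """Kling-Gupta Efficiency (2009 form, assumed here): 1 is perfect."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    r = np.corrcoef(obs, sim)[0, 1]   # linear correlation
    alpha = sim.std() / obs.std()     # variability ratio
    beta = sim.mean() / obs.mean()    # bias ratio
    return 1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)
```

Both functions expect paired arrays of observed and simulated volumes with missing values already removed.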

Save Outputs and Summaries for All Stations

The script exports:

Model predictions for each station/scenario
A combined summary of all performance metrics
A monthly metrics summary
Lists of included and excluded stations
Counts of how many stations were successfully processed
This provides a complete dataset for comparing model skill and evaluating the effect of boundary vs buffer watershed definitions.

Block 3 — Loading Spatial Libraries and Watershed Data

The script loads all spatial datasets (states, rivers, lakes, Great Salt Lake Basin, Upper Colorado Basin) along with USGS station metadata and the monthly MLR performance results (NSE and KGE).
It filters the model results to include only the buffer-based watershed type, selects each scenario and month, and merges the station performance metrics with station coordinates.
For every scenario and month, it creates spatial maps that display Kling-Gupta Efficiency (KGE) and Nash-Sutcliffe Efficiency (NSE) as colored point layers over the basins, allowing the user to visually assess how well each model scenario performs at each station.
It saves each set of maps to a folder organized by watershed type, producing a complete spatial visualization suite that shows the performance of all model scenarios across all months.
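The filter-and-merge step that precedes mapping can be sketched with pandas. The column names (`watershed_type`, `scenario`, `month`, `station_id`, `lat`, `lon`) and the function name are assumptions for illustration; the actual result files may use different headers.

```python
import pandas as pd


def metrics_with_coords(metrics, stations, watershed_type, scenario, month):
    """Select one watershed type, scenario, and month, then attach coordinates.

    Column names are hypothetical and may differ from the resource's CSV files.
    """
    subset = metrics[
        (metrics["watershed_type"] == watershed_type)
        & (metrics["scenario"] == scenario)
        & (metrics["month"] == month)
    ]
    # Inner join drops stations that have metrics but no known location.
    return subset.merge(
        stations[["station_id", "lat", "lon"]], on="station_id", how="inner"
    )
```

The returned table has one row per station, so its `lon`, `lat`, and metric columns can be passed directly to a scatter plot drawn over the basin layers.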

Block 4 — Spatial Visualization of Model Performance

This portion reads in a CSV file containing performance results for each watershed (NSE, KGE). It then joins these numerical results with the watershed polygons and produces:

Colored maps of model skill
Basin-level comparison figures across months and model configurations
These maps allow the user to see where the forecasting model performs well and where it struggles.

Block 5 — Performance Visualization Across All Watersheds

The script loads the monthly performance metrics for all buffer-based stations and prepares them for visualization, including consistent scenario naming and month labeling.
A custom blue-green-red colormap is created to show model skill, where darker blues represent stronger performance and reds indicate weaker performance.
For each scenario and each spring month, the script builds grouped box-and-dot plots that display the full distribution of KGE and NSE values across stations, allowing direct comparison of how each predictor combination behaves through time.
The resulting figures summarize model stability and variability, helping highlight which scenarios consistently perform well and which are more sensitive to month-to-month changes.
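A colormap of the kind described above can be built with matplotlib's `LinearSegmentedColormap.from_list`. The sketch below is illustrative only: the exact colors, the colormap name, and the 0 to 1 display range are assumptions rather than the notebook's settings.

```python
from matplotlib.colors import LinearSegmentedColormap, Normalize

# Low skill maps to red, intermediate skill to green, high skill to dark blue.
skill_cmap = LinearSegmentedColormap.from_list(
    "skill_bgr", ["red", "green", "darkblue"]
)

# Assumed display range; values outside [0, 1] are clipped to the end colors.
skill_norm = Normalize(vmin=0.0, vmax=1.0, clip=True)


def skill_color(value):
    """Return the RGBA color used for a given KGE or NSE value."""
    return skill_cmap(float(skill_norm(value)))
```

Passing `cmap=skill_cmap` and `norm=skill_norm` to a plotting call keeps colors comparable across scenarios and months, because every figure then shares the same scale.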

Block 6 — Comparing Buffer vs Boundary Watersheds

This section loads two different MLR result files:

Boundary-based results
Buffer-based results
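One way to place the two result sets side by side is a keyed merge, sketched below. The key columns, the metric column name, and the function name are assumptions; the comparison actually performed in the notebook may differ.

```python
import pandas as pd


def compare_definitions(boundary, buffer,
                        keys=("station_id", "scenario", "month"),
                        metric="KGE"):
    """Join boundary and buffer results and compute buffer minus boundary.

    Column names are hypothetical and may differ from the result files.
    """
    merged = boundary.merge(
        buffer, on=list(keys), suffixes=("_boundary", "_buffer")
    )
    merged[f"{metric}_diff"] = (
        merged[f"{metric}_buffer"] - merged[f"{metric}_boundary"]
    )
    return merged
```

A positive difference means the buffer-based definition scored higher for that station, scenario, and month.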

Block 7 — Collinearity Analysis of Predictors

This block inspects whether the predictors used in the MLR models are highly correlated with each other.

Steps include:

Loading all predictor datasets
Merging predictor tables for each station into a single combined dataset
Calculating a correlation matrix
Plotting a heatmap showing predictor relationships
Identifying redundant variables that may degrade model performance

This helps clean and refine the predictor set to improve the regression model’s stability and accuracy.
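The correlation check can be sketched with pandas as follows. The 0.8 threshold and the function name are assumptions chosen for illustration, not values taken from the notebook.

```python
import pandas as pd


def correlated_pairs(predictors, threshold=0.8):
    """Return the Pearson correlation matrix and the strongly correlated pairs.

    The threshold is an illustrative assumption, not the notebook's setting.
    """
    corr = predictors.corr()
    cols = list(corr.columns)
    pairs = []
    for i, a in enumerate(cols):
        for b in cols[i + 1:]:  # upper triangle only, so each pair appears once
            r = corr.loc[a, b]
            if abs(r) >= threshold:
                pairs.append((a, b, float(r)))
    return corr, pairs
```

The returned matrix can be passed to a heatmap function, while the list of pairs gives a short record of which predictors may be redundant.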

Data Services

The following web services are available for data contained in this resource. Geospatial Feature and Raster data are made available via Open Geospatial Consortium Web Services. The provided links can be copied and pasted into GIS software to access these data. Multidimensional NetCDF data are made available via a THREDDS Data Server using remote data access protocols such as OPeNDAP. Other data services may be made available in the future to support additional data types.

Web Map Service

https://geoserver.hydroshare.org/geoserver/HS-83c1d73697cc461c8de6283f65b57498/wms?request=GetCapabilities

Web Feature Service

https://geoserver.hydroshare.org/geoserver/HS-83c1d73697cc461c8de6283f65b57498/wfs?request=GetCapabilities

Related Resources

This resource is described by

Morovati, R. (2025), "Evaluating Use Of Multiple Hydrologic Storage Indicators To Enhance Streamflow Forecasting," MS Thesis, Civil and Environmental Engineering, Utah State University.

Credits

Funding Agencies

This resource was created using funding from the following sources:

Agency Name	Award Title	Award Number
National Science Foundation	HDR Institute: Geospatial Understanding through an Integrative Discovery Environment	2118329
Utah Water Research Laboratory	Graduate Research Assistantship	None

How to Cite

Morovati, R., D. Tarboton (2025). Evaluating the use of Soil Moisture, January Baseflow, and Snow Water Equivalent storage indicators to enhance Colorado Basin River Forecast Center water supply forecasts, HydroShare, http://www.hydroshare.org/resource/83c1d73697cc461c8de6283f65b57498

This resource is shared under the Creative Commons Attribution CC BY.

http://creativecommons.org/licenses/by/4.0/
