Codes and Results to Prioritize Wells for Groundwater Monitoring in the Arkansas River Headwaters Basin, Colorado, USA


Owners: This resource does not have an owner who is an active HydroShare user. Contact CUAHSI (help@cuahsi.org) to determine if accessing this resource is possible.
Type: Resource
Storage: The size of this resource is 127.1 MB
Created: Jun 20, 2024 at 9:18 p.m. (UTC)
Last updated: Jun 30, 2025 at 7:52 p.m. (UTC)
Content types: Geographic Feature Content, CSV Content
Sharing Status: Discoverable (Accessible via direct link sharing)

Abstract

This HydroShare resource provides supporting information for Fahrney, E.E.; Mays, D.C.; Newman, C.P. (2025), Systematic approach to prioritize wells for groundwater monitoring in the Arkansas Headwaters Basin, Colorado, USA, Journal of Hydrology: Regional Studies.

Coverage

Spatial

Coordinate System/Geographic Projection:
WGS 84 EPSG:4326
Coordinate Units:
Decimal degrees
Place/Area Name:
Arkansas Headwaters Basin
North Latitude
39.3867°
East Longitude
-105.8879°
South Latitude
38.3314°
West Longitude
-106.5841°

Content

readme.txt

README.TXT
eef 6/30/2025

This HydroShare resource provides supporting information for Fahrney, E.E.; Mays, D.C.; Newman, C.P. (2025), Systematic approach to prioritize wells for groundwater monitoring in the Arkansas River Headwaters Basin, Colorado, USA, Journal of Hydrology: Regional Studies, https://doi.org/TBD.

CONTENTS

1. R Script Tutorial 
2. R Script Files
3. Results
4. References

SECTION 1. R SCRIPT TUTORIAL 

1.a. Tutorial

Software Download

1. Download R

We downloaded R for Windows from the Iowa State University (Ames, IA, USA) mirror: https://mirror.las.iastate.edu/CRAN/. The Comprehensive R Archive Network (CRAN) recommends using the mirror closest to you: https://cran.r-project.org/mirrors.html. We used R 4.4.0. If the script malfunctions with the current version of R, R 4.4.0 can be found here: https://cran.r-project.org/bin/windows/base/old/

2. Download RStudio: https://posit.co/products/open-source/rstudio/

Folder Set-Up

1. On your computer or in cloud storage where 91 GB are available, create a folder to save the files below. In this tutorial, we will call that folder "PrioritizeWells".

2. From the R Script Files folder, download "WellPrioritization_Rscript_ArkansasHeadwaters.R" and save it to the PrioritizeWells folder.

3. In the PrioritizeWells folder, create a subfolder called “inputs” (case sensitive).

4. Copy the files from the Inputs FOLDER in this HydroShare resource into your inputs folder. These files are described in the Inputs FOLDER section below.

5. Output folders are created by the script.
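
Before running the script, the folder set-up above can be checked from the R console. The following is only a minimal sketch and is not part of the published script: the helper name check_inputs is hypothetical, a temporary directory stands in for the PrioritizeWells folder, and the file names are the CSV files listed in the Inputs FOLDER section below.

```r
# Hypothetical helper (not part of the published script): return the names of
# expected input files that are missing from <project_dir>/inputs.
check_inputs <- function(project_dir) {
  expected <- c(
    "11020001_USGSgw_elevs_2024_0227.csv",
    "11020001_USGSgw_site_info_2024_0227.csv",
    "aquifer_codes.csv",
    "ChaffeeCounty_tritium_data.csv",
    "PCA_summary_results.csv",
    "USGSactiveSites.csv"
  )
  inputs_dir <- file.path(project_dir, "inputs")
  expected[!file.exists(file.path(inputs_dir, expected))]
}

# Demonstration: a temporary directory stands in for PrioritizeWells.
project_dir <- file.path(tempdir(), "PrioritizeWells")
dir.create(file.path(project_dir, "inputs"), recursive = TRUE, showWarnings = FALSE)
file.create(file.path(project_dir, "inputs", "aquifer_codes.csv"))

missing_files <- check_inputs(project_dir)
print(missing_files)  # files that still need to be copied from HydroShare
```

In practice, project_dir would be the path to your own PrioritizeWells folder. The sketch does not check the GIS shapefile subfolder.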

Run WellPrioritization_Rscript_ArkansasHeadwaters.R 

1. We recommend first creating a new project in RStudio, although this step is not required. Open RStudio. In the top right corner, click the dropdown arrow. Click New Project > Choose Folder. Choose your PrioritizeWells folder. If you created your PrioritizeWells folder in cloud storage but cannot reach it through the Choose Folder browser in RStudio, you will need to set up the PrioritizeWells folder in a different location.

2. In RStudio, you should see four windows: the script, the environment/history, the files/plots, and the console. In the files/plots window, choose the Files tab. If you did not start a new project, you can simply go to File > Open.

3. You should see the files you have added to the PrioritizeWells folder. Open WellPrioritization_Rscript_ArkansasHeadwaters.R.

4. In the script window, press Control-A to select all. Press Control-Enter to run WellPrioritization_Rscript_ArkansasHeadwaters.R.

5. Your PrioritizeWells output folder should populate with files as WellPrioritization_Rscript_ArkansasHeadwaters.R runs.

Your Input

6. WellPrioritization_Rscript_ArkansasHeadwaters.R will automatically ask for your input in several sections.

(Sections are designated by ####Section Description------------------####. They can be quickly accessed through the drop-down menu in the bottom left of the main scripting window.) 

Near the beginning of the script, user input prompts will ask you to do the following:
Set the output folder name. We typically used the format yyyy_mmdd_Trialx.

Set whether you want to pull current data from the USGS database (by typing “1”) or duplicate our research by using the records downloaded in February 2024 (by typing “0”). 

Near the end of the script, user prompts will ask you to do the following:
Set the total number of wells funded including those actively monitored. Our number was 42. 

Choose whether or not to plot regressions for additional pairs not saved to the RegressionPlots folder. If desired, type YES. You will be prompted to enter the two letters for each well. Note these are case sensitive, so they must be in all capital letters.
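
As an illustration of the two input conventions above (the yyyy_mmdd_Trialx folder name and the two capital letters for each well), here is a minimal sketch. The helper names make_output_name and is_valid_site_code are hypothetical and are not taken from the published script, which collects these values through its own prompts.

```r
# Hypothetical helper: build an output folder name in the yyyy_mmdd_Trialx format.
make_output_name <- function(run_date, trial) {
  sprintf("%s_Trial%d", format(run_date, "%Y_%m%d"), as.integer(trial))
}

# Hypothetical helper: TRUE only for exactly two capital letters (case sensitive).
is_valid_site_code <- function(code) {
  grepl("^[A-Z]{2}$", code)
}

example_name <- make_output_name(as.Date("2025-06-30"), 1)
print(example_name)               # "2025_0630_Trial1"
print(is_valid_site_code("AA"))   # TRUE
print(is_valid_site_code("aa"))   # FALSE
```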

After Your Initial Run of the Script

1. You can select the “MAKE ADDITIONAL PLOTS if desired” section and press Control-Enter to plot additional regressions.

2. If you run from the beginning again or change WellPrioritization_Rscript_ArkansasHeadwaters.R, consider creating a new folder for outputs with a new trial number.

SECTION 2. R SCRIPT FILES

Number of files: 2 (including one subfolder).

WellPrioritization_Rscript_ArkansasHeadwaters.R is the R script file described above.

Inputs FOLDER

Number of files: 7 (including one subfolder)

File name: 11020001_USGSgw_elevs_2024_0227.csv
File description: Table of groundwater elevations for hydrologic unit code 11020001 from 02/27/2024 data retrieval
Source: U.S. Geological Survey (USGS), 2024, USGS water data for the Nation: U.S. Geological Survey National Water Information System database at https://doi.org/10.5066/F7P55KJN. 
Records: 16296 records
Column details (more information can be found at https://help.waterdata.usgs.gov/codes-and-parameters):
1. agency_cd: agency data source (USGS for present study)
2. site_no: site number
3. site_tp_cd: code for type of site (GW for present study)
4. lev_dt: date of groundwater elevation measurement
5. lev_tm: time of groundwater elevation measurement
6. lev_tz_cd_reported: time zone reported for the time of groundwater elevation measurement
7. lev_va: depth to water measured
8. sl_lev_va: groundwater level elevation 
9. sl_datum_cd: groundwater elevation measurement datum
10. lev_status_cd: code for status of groundwater elevation measurement (1 for static levels for present study)
11. lev_agency_cd: agency that made the measurement
12. lev_dt_acy_cd: code for the degree of precision for date/time of the groundwater elevation measurement
13. lev_acy_cd: code for accuracy of groundwater elevation measurement
14. lev_src_cd: code for the source of the groundwater elevation measurement, such as owner of well, driller’s log, university associate, etc.
15. lev_meth_cd: method for groundwater elevation measurement (S=steel-tape, V=calibrated electric-tape measurement, T=electric-tape measurement, Z=other)
16. lev_age_cd: code for the water-level approval status
17. parameter_cd: USGS code for measured parameter (72019=depth to water; 62610=groundwater level above NGVD 1929, ft; 62611=groundwater level above NAVD 1988, ft)
18. lev_dateTime: date and time of groundwater elevation measurement
19. lev_tz_cd: time zone for date and time of groundwater elevation measurement
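
A table with these columns can be read into R as sketched below. This is a hedged example, not the published script's code: the two data rows and the site number are made up, only a few of the 19 columns are included, and the dates are assumed to be stored as yyyy-mm-dd. The conversion assumes sl_lev_va is in feet, as indicated by the parameter descriptions above.

```r
# Tiny stand-in for the elevations file; all values are made up for illustration.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "agency_cd,site_no,lev_dt,sl_lev_va,lev_status_cd",
  "USGS,384512106051201,2020-05-01,7500.0,1",
  "USGS,384512106051201,2020-06-01,7502.5,1"
), csv_path)

# Read site_no as text so the long identifier is not converted to a number.
elevs <- read.csv(csv_path, colClasses = c(site_no = "character"))
elevs$lev_dt <- as.Date(elevs$lev_dt)

# Convert feet to meters (1 ft = 0.3048 m exactly).
elevs$gwElev_m <- elevs$sl_lev_va * 0.3048
print(elevs)
```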

File name: 11020001_USGSgw_site_info_2024_0227.csv
File description: Table of sites in hydrologic unit code 11020001 with groundwater elevation measurements from 02/27/2024 data retrieval
Source: U.S. Geological Survey (USGS), 2024, USGS water data for the Nation: U.S. Geological Survey National Water Information System database at https://doi.org/10.5066/F7P55KJN. 
Records: 228 records
Column details:
1. site_no: site number 
2. agency_cd: agency data source (USGS for present study)
3. station_nm: additional station identifying number / code
4. site_tp_cd: code for type of site (GW for present study)
5. dec_lat_va: latitude in decimal degrees
6. dec_long_va: longitude in decimal degrees
7. coord_acy_cd: code for accuracy of latitude and longitude coordinates
8. dec_coord_datum_cd: Coordinate datum code
9. alt_va: ground surface elevation of site
10. alt_acy_va: accuracy of ground surface elevation of site
11. alt_datum_cd: datum for ground surface elevation
12. huc_cd: hydrologic unit code (11020001 for present study)
13. data_type_cd: code for type of data (GW for present study)
14. parameter_cd: USGS code for measured parameter (72019=depth to water; 62610=groundwater level above NGVD 1929, ft; 62611=groundwater level above NAVD 1988, ft)
15. stat_cd: immaterial
16. ts_id: immaterial
17. loc_web_ds: immaterial
18. medium_grp_cd: immaterial
19. parm_grp_cd: immaterial
20. srs_id: immaterial
21. access_cd: immaterial
22. begin_date: first date of groundwater elevation measurements at that site
23. end_date: last date of groundwater elevation measurements at that site
24. count_nu: immaterial
25. parameter_group_nm: parameter group name (Physical for present study)
26. parameter_nm: parameter name (Depth to water level, feet below land surface for present study)
27. casrn: immaterial
28. srsname: parameter name (Depth to water level below land surface for present study)
29. parameter_units: units for groundwater elevation measurement (ft for present study)
30. nat_aqfr_cd: Aquifer code based on national aquifer coding system
31. aqfr_cd: aquifer code of well completion
32. aqfr_type_cd: code for aquifer type
33. well_depth_va: depth of well completion
34. hole_depth_va: total depth of hole
35. num_records: number of groundwater elevation measurements at that site (added by R script)
36. record_length_years: number of years in period of record of groundwater elevations for that site (added by R script)
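
The last two columns are added by the R script. One way such columns could be derived from a table of measurements is sketched below. This is an assumption for illustration only: the published script may define the record length differently (here it is taken as the time between the first and last measurement, divided by 365.25 days), and the site labels and dates are made up.

```r
# Made-up measurement dates for two hypothetical sites.
gw_levels <- data.frame(
  site_no = c("A", "A", "A", "B", "B"),
  lev_dt = as.Date(c("2010-01-15", "2015-01-15", "2020-01-15",
                     "2018-06-01", "2019-06-01"))
)

# Group the measurement dates by site.
by_site <- split(gw_levels$lev_dt, gw_levels$site_no)

# Number of measurements per site.
num_records <- vapply(by_site, length, integer(1))

# Assumed definition: years between first and last measurement.
record_length_years <- vapply(
  by_site,
  function(d) as.numeric(difftime(max(d), min(d), units = "days")) / 365.25,
  numeric(1)
)

print(data.frame(num_records, record_length_years))
```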

File name: aquifer_codes.csv
File description: Table of aquifer codes and descriptions
Source: USGS, 2023, email with Connor Newman
Records: 189 records
Column details:
1. aqfr_cd: Code for aquifer
2. Aquifer: Aquifer name

File name: ChaffeeCounty_tritium_data.csv
File description: Table of tritium results for wells in Chaffee County
Source: USGS, 2023, email with Connor Newman
Records: 27 records
Column details:
1. site_no: USGS site number
2. SAMPLE_START_DT: date of sample
3. PARM_NM: USGS parameter name, “Tritium, wu” is listed for all, where “wu” indicates water, unfiltered
4. result_pCi/L: tritium activity in pico-Curies/Liter

File name: PCA_summary_results.csv
File description: Table of water quality PCA results for wells in Chaffee County
Source: USGS, 2023, email with Connor Newman
Records: 23 records
Column details:
1. site_ID: USGS site number
2. alias: abbreviated USGS ID
3. group: from principal component analysis
4. groupPair: from principal component analysis
5. PC1: principal component 1
6. PC2: principal component 2
7. PC3: principal component 3
8. abs_sum: absolute value of PC1+PC2+PC3

File name: USGSactiveSites.csv
File description: Table of actively monitored sites
Source: USGS, 2024, email with Connor Newman
Records: 12 records
Column details:
1. site_no: USGS site number
2. site_ID: expanded USGS site ID
3. alias: abbreviated USGS ID
4. station_nm: USGS station name
5. site_type: type of USGS monitoring site, GW for groundwater is listed for all
6. lat: latitude
7. long: longitude
8. ActivelyMonitored: YES indicates well is actively monitored

GIS_shape_files SUBFOLDER

This folder contains eight subfolders, each of which contains a variable number of files with various extensions that together constitute a shapefile (for example, AllArkRiversChaffeeProject has the eight extensions .cpg, .dbf, .prj, .sbn, .sbx, .shp, .xml, and .shx). The .shp file is the primary file opened in GIS products, but the accompanying files are also needed. The column headings are not visible from the HydroShare interface but will be visible when the shapefile is opened in an appropriate interface (e.g., in R or in an attribute table in a GIS interface). WellPrioritization_Rscript_ArkansasHeadwaters.R reads these files with the following syntax: st_read(dsn='inputs/GIS/folder_name', 'file_name'). European Petroleum Survey Group (EPSG) geodetic codes are included with the coordinate reference systems. Descriptions are not given for immaterial data columns.
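
Because a shapefile cannot be read without its accompanying files, it can help to confirm that the main components were copied before running the script. The sketch below uses base R only. The helper name missing_shapefile_parts is hypothetical. The .shp, .shx, and .dbf files are mandatory parts of the shapefile format; treating .prj (the coordinate reference system) as required is an assumption made here.

```r
# Hypothetical helper: list the shapefile components missing from a folder.
missing_shapefile_parts <- function(folder, root_name) {
  required <- c("shp", "shx", "dbf", "prj")  # .prj treated as required (assumption)
  paths <- file.path(folder, paste0(root_name, ".", required))
  required[!file.exists(paths)]
}

# Demonstration with empty placeholder files in a temporary folder.
demo_dir <- file.path(tempdir(), "demo_shapefile")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, paste0("demo.", c("shp", "dbf"))))

missing_parts <- missing_shapefile_parts(demo_dir, "demo")
print(missing_parts)  # "shx" "prj"
```

For the real data, folder would be one of the subfolders described below and root_name the corresponding root name of files.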

Folder name: AllArkRiversChaffee26913
Root name of files: AllArkRiversChaffeeProject
Folder description: Lines for Arkansas River and South Arkansas River
Source: U.S. Geological Survey (USGS), 2023b. National Hydrography Dataset [WWW Document]. The National Map, United States Geological Survey. URL https://apps.nationalmap.gov/downloader/#/ (accessed 1.20.24).
Coordinate Reference System: North American Datum 1983, Universal Transverse Mercator 13N, EPSG 26913
Column details:
1. permanent_: immaterial
2. fdate: immaterial
3. resolution: immaterial
4. gnis_id: immaterial
5. gnis_name: River name
6. lengthkm: Length of river stretch in kilometers
7. reachcode: immaterial
8. flowdir: immaterial
9. wbarea_per: immaterial
10. ftype: immaterial
11. fcode: immaterial
12. mainpatch: immaterial
13. innetwork: immaterial
14. visibility: immaterial
15. SHAPE_Leng: immaterial
16. Enables: immaterial
17. ObjectID: immaterial

Additional descriptions can be found in Moore et al., 2019, User's guide for the national hydrography dataset plus (NHD Plus) high resolution, Open-File Report 2019-1096, US Department of the Interior, US Geological Survey, https://doi.org/10.3133/ofr20191096.

Folder name: cb_2018_us_state_20m
Root name of files: cb_2018_us_state_20m
Folder description: Polygons of boundaries of states in the United States of America
Source: U.S. Census, 2024. States: cb_2018_us_state_20m.zip [WWW Document]. Cartographic Boundary Shapefiles, US Census. URL https://www.census.gov/geographies/mapping-files/2018/geo/carto-boundary-file.html (accessed 1.20.24).
Coordinate Reference System: North American Datum 1983, EPSG 4269
Column details:
1. STATESFP: immaterial
2. STATENS: immaterial
3. AFFGEOID: immaterial
4. GEOID: immaterial
5. STUSPS: immaterial
6. NAME: State name
7. LSAD: immaterial
8. ALAND: immaterial
9. AWATER: immaterial

Folder name: Colorado_City_Boundaries26913
Root name of files: Colorado_City_Boundaries_Project
Folder description: Polygons of municipalities in the state of Colorado
Source: Colorado Department of Public Health and Environment, 2023a. Colorado City Boundaries [WWW Document]. CDPHE Open Data, Colorado Department of Public Health and Environment. URL https://data-cdphe.opendata.arcgis.com/datasets/CDPHE::colorado-city-boundaries/about (accessed 10.22.23).
Coordinate reference system: North American Datum 1983, Universal Transverse Mercator 13N, EPSG 26913
Column details: 
1. OBJECTID: Numbers 1 through 458 for each row
2. GEOID10: Seven digit code specific to that polygon
3. NAME10: Name of municipality
4. NAMELSAD10: Name of municipality followed by designation as city or town

Folder name: Colorado_County_Boundaries
Root name of files: Colorado_County_Boundaries
Folder description: Polygons of county boundaries in the state of Colorado
Source: Colorado Department of Public Health and Environment, 2023b. Colorado County Boundaries [WWW Document]. CDPHE Open Data, Colorado Department of Public Health and Environment. URL https://data-cdphe.opendata.arcgis.com/datasets/CDPHE::colorado-county-boundaries/about (accessed 10.22.23).
Coordinate Reference System: North American Datum 1983, EPSG 4269
Column details:
1. OBJECTID: Numbers 1 through 64 for each row
2. COUNTY: County shortened name in all capital letters
3. FULL: Full name including the word "County"
4. LABEL: County shortened name
5. CNTY_FIPS: Federal Information Processing Standards county code
6. NUM_FIPS: Federal Information Processing Standards United States code
7. CENT_LAT: Latitude of polygon centroid
8. CENT_LONG: Longitude of polygon centroid
9. US_FIPS: Federal Information Processing Standards United States code

Folder name: Geology
Root name of files: ChaffeeGeology
Folder description: Polygons of boundaries of geologic units in Study Area formed by spatial join of geologic map with Colorado water district 11
Sources: Stoeser, D.B., Green, G.N., Morath, L.C., Heran, W.D., Wilson, A.B., Moore, D.W., Van Gosen, B.S., 2005. Preliminary integrated geologic map databases for the United States Central States: Montana, Wyoming, Colorado, New Mexico, Kansas, Oklahoma, Texas, Missouri, Arkansas, and Louisiana, - the State of Colorado. Open-File Report 2005-1351.
Tweto, O., 1979. Geologic Map of Colorado:  U.S. Geological Survey Special Geologic Map, scale 1:500,000.
Coordinate Reference System: North American Datum 1983, Universal Transverse Mercator 13N, EPSG 26913
Column details:
1. OBJECTID: immaterial
2. DISTRICT: All geologic units are in Colorado water district 11
3. DIVISION: All geologic units are in Colorado water division 2
4. BASIN: All geologic units are in the Arkansas basin
5. NAME: All geologic units are in the basin named Arkansas-Headwaters to Salida
6. SHAPE_Leng: Perimeter length of District 11 in meters
7. SHAPE_Area: Area of District 11 in square meters
8. AREA: Area of geologic unit
9. PERIMETER: Perimeter of geologic unit
10. COGEOL_DD_: immaterial
11. COGEOL_DD1: immaterial
12. ORIG_LABEL: Geologic unit abbreviation
13. SMGC_LABEL: immaterial
14. UNIT_LINK: immaterial
15. SOURCE: A coded reference citation indicating source material used (see co.txt).
16. UNIT_AGE: The geologic age from the source map used.
17. ROCKTYPE1: The predominant lithology found in the formation.
18. ROCKTYPE2: The second most predominant lithology in the formation.

Folder name: HUC_list
Root name of files: huc250k
Folder description: United States Geological Survey hydrologic unit polygons at 1:250,000 scale designated with 8 digits (HUC-8)
Source: U.S. Geological Survey (USGS), 2023a. Digital Spatial Data Sets: 1:250,000-scale Hydrologic Units (huc250k) [WWW Document]. Hydrologic Unit Maps, Water Resources of the United States, United States Geological Survey. URL https://water.usgs.gov/GIS/huc.html (accessed 8.14.23).
Coordinate Reference System: North American Datum 1927 Albers, EPSG 6267
Column details:
1. AREA: hydrologic unit polygon area in square meters
2. PERIMETER: hydrologic unit perimeter in meters
3. HUC250K_: internal identification numbers
4. HUC250K_ID: internal identification numbers
5. HUC_CODE: eight digit code assigned to hydrologic unit
6. HUC_NAME: name of hydrologic unit
7. REG: region 2-digit code for hydrologic unit
8. SUB: subregion 4-digit code for hydrologic unit
9. ACC: 6-digit code for hydrologic unit
10. CAT: 8-digit code for hydrologic unit

Folder name: TwetoGeology
Root name of shapefiles: cogeol_dd_polygon
Folder description: Polygons of boundaries of geologic units in Colorado and associated metadata
Source: USGS, 2024, email with Connor Newman (Tweto, O., 1979. Geologic Map of Colorado:  U.S. Geological Survey Special Geologic Map, scale 1:500,000.)
Coordinate Reference System: North American Datum 1927, EPSG 4267

File name: co.txt
File description: Text metadata for TwetoGeology folder

File name: cogeol_dd_polygon.shp
File description: Polygons of boundaries of geologic units in Colorado 
Records: 7335 records
Column details:
1. AREA:  area of geologic unit in square degrees
2. PERIMETER: perimeter of geologic unit in degrees
3. COGEOL_DD_: immaterial
4. COGEOL_DD1: immaterial
5. ORIG_LABEL: Geologic unit abbreviation  
6. SGMC_LABEL: immaterial  
7. UNIT_LINK: immaterial  
8. SOURCE: A coded reference citation indicating source material used (see co.txt)    
9. UNIT_AGE: geologic age from the source map used
10. ROCKTYPE1: predominant lithology found in geologic unit  
11. ROCKTYPE2: second most predominant lithology found in geologic unit  
12. geometry: vector of polygon UTM coordinates (Zone 13) for use in sf package in R 

File name: COunits.csv
File description: Description of geologic units in shapefile above
Records: 185 records
Column details:
1. STATE: two letter abbreviation for mapped state in United States (CO in present study) 
2. ORIG_LABEL: Geologic unit abbreviation  
3. MAP_SYM1: Geologic unit abbreviation  
4. MAP_SYMB2: Geologic unit abbreviation  
5. UNIT_LINK: Geologic unit abbreviation  
6. PROV_NO: immaterial  
7. PROVINCE: immaterial  
8. UNIT_NAME: name of geologic unit  
9. UNIT_AGE: geologic age from the source map used  
10. UNITDESC: description of geologic unit  
11. STRAT_UNIT: blank 
12. UNIT_COM: blank  
13. MAP_REF: immaterial  
14. ROCKTYPE1: predominant lithology found in geologic unit   
15. ROCKTYPE2: second most predominant lithology found in geologic unit   
16. ROCKTYPE3: third most predominant lithology found in geologic unit   
17. UNIT_REF: immaterial  

Folder name: Water_Districts
Root name of files: Water_Districts
Folder description: Polygons of Division of Water Resources water districts in the state of Colorado
Source: Colorado’s Decision Support Systems, 2023b. GIS Data by Category: District Boundaries [WWW Document]. Colorado’s Decision Support Systems, Colorado Water Conservation Board / Division of Water Resources. URL https://cdss.colorado.gov/gis-data/gis-data-by-category (accessed 8.14.23).
Coordinate Reference System: North American Datum 1983, Universal Transverse Mercator 13N, EPSG 26913
Column details:
1. OBJECTID: Numbers 1 through 83 for each row
2. DISTRICT: Colorado water district number (1 through 80)
3. DIVISION: Colorado water division number (1 through 7)
4. BASIN: Colorado watershed name 
5. NAME: water district name
6. SHAPE_Leng: perimeter length of polygon in meters
7. SHAPE_Area: area of polygon in square meters

Folder name: WaterDivisions
Root name of files: WtrDivs
Folder description: Polygons of Division of Water Resources water divisions in the state of Colorado
Source: Colorado’s Decision Support Systems, 2023. GIS Data by Category: Division Boundaries [WWW Document]. Colorado’s Decision Support Systems, Colorado Water Conservation Board / Division of Water Resources. URL https://cdss.colorado.gov/gis-data/gis-data-by-category (accessed 8.14.23).
Coordinate Reference System: North American Datum 1983, Universal Transverse Mercator 13N, EPSG 26913
Column details:
1. DIV: numbers 1 through 7 for each row
2. BASIN: Colorado watershed name 
3. Perimeter: polygon perimeter
4. Area: polygon area in square meters
5. Acres: polygon area in acres
6. Area_mi2: polygon area in square miles

SECTION 3. RESULTS

Number of files: 0 files, 8 subfolders

a.DataBySite FOLDER

Number of files: 126
Description: full groundwater elevation data for each well
File naming structure: "Data_AA.csv" where, for example, AA is the site code.
Records: Numbered 1 through length of record for that site.
Column details: 
1. Date: Year and month of groundwater level measurement
2. site_no: USGS site number
3. gwElev: groundwater elevation in meters
4. aqfr_cd: aquifer code if available
5. depth: well depth in meters
6. Aquifer: aquifer name, if available
7. Site: two-letter site code for present study
8. Mean: average groundwater elevation in meters
9. Deviation: groundwater elevation deviation from the mean in meters
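
The Mean and Deviation columns follow directly from their definitions above. A minimal sketch with made-up elevations (not data from the study):

```r
# Made-up monthly groundwater elevations (meters) for one hypothetical site.
site_data <- data.frame(
  Date = c("2020-01", "2020-02", "2020-03"),
  gwElev = c(2286.0, 2286.6, 2285.4)
)

# Mean: average groundwater elevation over the site's full record.
site_data$Mean <- mean(site_data$gwElev)

# Deviation: each elevation minus the mean, so deviations sum to zero.
site_data$Deviation <- site_data$gwElev - site_data$Mean
print(site_data)
```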

b.MapsAndWellCodes FOLDER

Number of files: 8
File name: AllSites.csv
Description: table of USGS site codes, present study aliases, and UTM coordinates for full network
Records: numbers 1-126
Column details: 
1. Site: two-letter code assigned to site for present study
2. site_no: USGS site code
3. x_coord: UTM X coordinate of site (Zone 13)
4. y_coord: UTM Y coordinate of site (Zone 13)
5. geometry: site UTM coordinates (Zone 13) for use in sf package in R 

File name: MapAllSites.kml
Description: GIS-ready map of study area with sites labeled per csv file above
Source: Map was created by the R script developed for the present study. See bibliography of R packages used and map data sources in section 4. REFERENCES below.
Coordinate Reference System: North American Datum 1983, Universal Transverse Mercator 13N, EPSG 26913
Records: 126
Column details:
1. Site: two-letter code assigned to site for present study
2. site_no: USGS site code
3. x_coord: UTM X coordinate of site (Zone 13)
4. y_coord: UTM Y coordinate of site (Zone 13)
5. geometry: site UTM coordinates (Zone 13) for use in sf package in R 

File name: MapAllSites.pdf
Description: map of study area with sites labeled per csv file above
Source: Map was created by the R script developed for the present study. See bibliography of R packages used and map data sources in section 4. REFERENCES below.

File name: MapBatches.kml
Description: GIS-ready map of study area with labeled well batch delineations (see text section 2.3.1)
Source: Map was created by the R script developed for the present study. See bibliography of R packages used and map data sources in section 4. REFERENCES below.
Coordinate Reference System: North American Datum 1983, Universal Transverse Mercator 13N, EPSG 26913
Records: 21
Column details:
1. id: batch identification number
2. geometry: vector of batch UTM coordinates (Zone 13) for use in sf package in R 
3. Area_m: area of batch outline in square meters
4. Area_km: area of batch outline in square kilometers

File name: MapBatches.pdf
Description: map of study area with labeled well batch delineations (see text section 2.3.1)
Source: Map was created by the R script developed for the present study. See bibliography of R packages used and map data sources in section 4. REFERENCES below.

File name: MapBatchesWithSites.pdf
Description: map of study area well batch delineations with labeled sites (see text section 2.3.1)
Source: Map was created by the R script developed for the present study. See bibliography of R packages used and map data sources in section 4. REFERENCES below.

File name: MapGeologyTweto.kml
Description: GIS-ready map of geologic formations in which sites lie including site labels
Source: Stoeser et al., 2005; Tweto, 1979. See bibliography of R packages used and map data sources in section 4. REFERENCES below.
Coordinate Reference System: North American Datum 1983, Universal Transverse Mercator 13N, EPSG 26913
Records: 287
Column details:
1. Unit: abbreviation for geologic unit (see geology table in text)
2. geometry: vector of polygon UTM coordinates (Zone 13) for use in sf package in R 

File name: MapSitesGeologywithLabels.pdf
Description: map of geologic formations in which sites lie including site labels
Source: Stoeser et al., 2005; Tweto, 1979. See bibliography of R packages used and map data sources in section 4. REFERENCES below.

c.DataPaired FOLDER

Number of files: 6098
Description: groundwater elevation measurements for pairs of wells with overlapping dates
File naming structure: "AA_AC_DataPaired.csv" where AA and AC are the site codes. Only one file exists for each pair. For instance, because "AA_AC_DataPaired.csv" exists, the complementary file "AC_AA_DataPaired.csv" does not exist.
Records: Row numbers were created during the pairing in the script and are immaterial.
Column details:
1. SiteX: Two-letter code for well with Subtotal (initial score) greater than or equal to Well Y
2. SiteY: Two-letter code for well with Subtotal (initial score) less than or equal to Well X
3. SiteXSubtotal: Subtotal for well X
4. SiteYSubtotal: Subtotal for well Y
5. Date: year and month of groundwater elevation measurement
6. Dist: distance between wells in kilometers
7. SiteXgwElev: groundwater elevations in meters for well X
8. MeanX: average groundwater elevation for well X in meters based on the full dataset
9. DeviationX: groundwater elevation deviation from the mean for well X in meters 
10. SiteYgwElev: groundwater elevations in meters for well Y
11. MeanY: average groundwater elevation for well Y in meters based on the full dataset
12. DeviationY: groundwater elevation deviation from the mean for well Y in meters 
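
The pairing of two wells on overlapping dates can be sketched with a merge on the Date column. This is an illustration with made-up values, not the published script's code. It follows the column descriptions above in computing each well's mean from its full dataset before pairing, so the mean is not recomputed from the overlapping months only.

```r
# Made-up monthly elevations (meters) for two hypothetical wells.
well_x <- data.frame(Date = c("2020-01", "2020-02", "2020-03"),
                     SiteXgwElev = c(10, 11, 12))
well_y <- data.frame(Date = c("2020-02", "2020-03", "2020-04"),
                     SiteYgwElev = c(20, 21, 22))

# Means and deviations are based on each well's full dataset.
well_x$MeanX <- mean(well_x$SiteXgwElev)
well_x$DeviationX <- well_x$SiteXgwElev - well_x$MeanX
well_y$MeanY <- mean(well_y$SiteYgwElev)
well_y$DeviationY <- well_y$SiteYgwElev - well_y$MeanY

# Keep only the months in which both wells were measured.
paired <- merge(well_x, well_y, by = "Date")
print(paired)
```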

d.RegressionTables FOLDER

Number of files: 8

File name: RsAllMatrix.csv
Description: matrix of R-squared values between wells
Records: All sites AA through FC
Column details: 
Columns are all sites AA through FC

File name: RsAllSites.csv
Description: table summarizing regression analysis and well variables to compare
Records: numbers 1 through 9858
Column details: 
1. SiteX: Two-letter code for well with Subtotal (initial score) greater than or equal to Well Y
2. SiteY: Two-letter code for well with Subtotal (initial score) less than or equal to Well X
3. SiteXSubtotal: Subtotal for well X
4. SiteXGeol: geologic unit for well X
5. SiteXdepth: depth in meters for well X
6. SiteYSubtotal: Subtotal for well Y
7. SiteYGeol: geologic unit for well Y
8. SiteYdepth: depth in meters for well Y
9. Dist: distance between well X and well Y in kilometers
10. n: number of overlapping measurements
11. SiteXSkew: skew of paired site X groundwater elevations
12. SiteYSkew: skew of paired site Y groundwater elevations
13. StartDate: first year and month of overlapping data
14. EndDate: last year and month of overlapping data
15. Yint: y-intercept of linear regression model in meters
16. slope: slope of linear regression model
17. StdErr: standard error of slope of linear regression model
18. p.val: p-value is calculated in the linear regression (lm) function in R using two-sided t-distribution and the t-statistic.
19. R2: r-squared (coefficient of determination) 
20. R: Pearson's correlation coefficient (equal to the square root of R-squared)
21. ResidualSkew: skewness of residuals in linear regression model
22. ResidualMean: average of residual values in linear regression model
23. SiteXTotal: Total for well X equal to Subtotal plus s(W) (score of weighted degree)
24. SiteYTotal: Total for well Y equal to Subtotal plus s(W) (score of weighted degree)
25. SiteX_SumR2: sum of the R-squared values for well X used to prioritize wells in cases of equal Total
26. SiteY_SumR2: sum of the R-squared values for well Y used to prioritize wells in cases of equal Total
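
Several of the columns above (Yint, slope, StdErr, p.val, R2) correspond to standard output of the lm function in R. The sketch below shows how such values can be extracted. It is an assumption that the regression uses well Y as the response and well X as the predictor, and that deviations rather than elevations are regressed; the published script may differ. The data are made up.

```r
# Made-up paired deviations (meters) for two hypothetical wells.
paired <- data.frame(
  DeviationX = c(1, 2, 3, 4, 5),
  DeviationY = c(2.1, 3.9, 6.2, 7.8, 10.1)
)

# Assumed model: well Y as response, well X as predictor.
fit <- lm(DeviationY ~ DeviationX, data = paired)
fit_summary <- summary(fit)
coefs <- fit_summary$coefficients

results <- data.frame(
  n = nrow(paired),                           # number of overlapping measurements
  Yint = coefs["(Intercept)", "Estimate"],    # y-intercept
  slope = coefs["DeviationX", "Estimate"],    # slope
  StdErr = coefs["DeviationX", "Std. Error"], # standard error of the slope
  p.val = coefs["DeviationX", "Pr(>|t|)"],    # two-sided t-test p-value for the slope
  R2 = fit_summary$r.squared                  # coefficient of determination
)
print(results)
```

For a simple linear regression with one predictor, R2 equals the square of Pearson's correlation coefficient between the two variables.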

File name: RsNegativelyCorrelated.csv 
Description: table summarizing regression analysis for pairs filtered as shown in Table 5. Where Well X Subtotal = Well Y Subtotal, the pair is represented twice. Plots of pairs are shown in HydroShare Folder HydrographsAndRegressionPlots/4.NegativelyCorrelated. 
Records: numbers 1 through 39
Column details: 
1. SiteX: Two-letter code for well with Subtotal (initial score) greater than or equal to Well Y
2. SiteY: Two-letter code for well with Subtotal (initial score) less than or equal to Well X
3. SiteXSubtotal: Subtotal for well X
4. SiteXGeol: geologic unit for well X
5. SiteXdepth: depth in meters for well X
6. SiteYSubtotal: Subtotal for well Y
7. SiteYGeol: geologic unit for well Y
8. SiteYdepth: depth in meters for well Y
9. Dist: distance between well X and well Y in kilometers
10. n: number of overlapping measurements
11. SiteXSkew: skew of paired site X groundwater elevations
12. SiteYSkew: skew of paired site Y groundwater elevations
13. StartDate: first year and month of overlapping data
14. EndDate: last year and month of overlapping data
15. Yint: y-intercept of linear regression model in meters
16. m: slope of linear regression model
17. StdErr: standard error of slope of linear regression model
18. p.val: p-value is calculated in the linear regression (lm) function in R using two-sided t-distribution and the t-statistic.
19. R2: r-squared (coefficient of determination) 
20. PearsonR: Pearson's correlation coefficient (equal to the square root of R-squared)
21. ResidualSkew: skewness of residuals in linear regression model
22. ResidualMean: average of residual values in linear regression model

File name: RsNonProximalCorrelatedContrasting.csv 
Description: table summarizing regression analysis for pairs filtered as shown in Table 5. Where Well X Subtotal = Well Y Subtotal, the pair is represented twice. Plots of pairs are shown in HydroShare Folder HydrographsAndRegressionPlots/5.NonProximalCorrelatedContrasting. 
Records: numbers 1 through 174
Column details: 
1. SiteX: Two-letter code for well with Subtotal (initial score) greater than or equal to Well Y
2. SiteY: Two-letter code for well with Subtotal (initial score) less than or equal to Well X
3. SiteXSubtotal: Subtotal for well X
4. SiteXGeol: geologic unit for well X
5. SiteXdepth: depth in meters for well X
6. SiteYSubtotal: Subtotal for well Y
7. SiteYGeol: geologic unit for well Y
8. SiteYdepth: depth in meters for well Y
9. Dist: distance between well X and well Y in kilometers
10. n: number of overlapping measurements
11. SiteXSkew: skew of paired site X groundwater elevations
12. SiteYSkew: skew of paired site Y groundwater elevations
13. StartDate: first year and month of overlapping data
14. EndDate: last year and month of overlapping data
15. Yint: y-intercept of linear regression model in meters
16. m: slope of linear regression model
17. StdErr: standard error of slope of linear regression model
18. p.val: p-value is calculated in the linear regression (lm) function in R using two-sided t-distribution and the t-statistic.
19. R2: r-squared (coefficient of determination) 
20. PearsonR: Pearson's correlation coefficient (magnitude equal to the square root of R-squared, with the sign of the slope)
21. ResidualSkew: skewness of residuals in linear regression model
22. ResidualMean: average of residual values in linear regression model

File name: RsProximalCorrelated.csv 
Description: table summarizing regression analysis for pairs filtered as shown in Table 5. Where Well X Subtotal = Well Y Subtotal, the pair is represented twice. Plots of pairs are shown in HydroShare Folder HydrographsAndRegressionPlots/1.ProximalCorrelated. 
Records: numbers 1 through 84
Column details: 
1. SiteX: Two-letter code for well with Subtotal (initial score) greater than or equal to Well Y
2. SiteY: Two-letter code for well with Subtotal (initial score) less than or equal to Well X
3. SiteXSubtotal: Subtotal for well X
4. SiteXGeol: geologic unit for well X
5. SiteXdepth: depth in meters for well X
6. SiteYSubtotal: Subtotal for well Y
7. SiteYGeol: geologic unit for well Y
8. SiteYdepth: depth in meters for well Y
9. Dist: distance between well X and well Y in kilometers
10. n: number of overlapping measurements
11. SiteXSkew: skew of paired site X groundwater elevations
12. SiteYSkew: skew of paired site Y groundwater elevations
13. StartDate: first year and month of overlapping data
14. EndDate: last year and month of overlapping data
15. Yint: y-intercept of linear regression model in meters
16. m: slope of linear regression model
17. StdErr: standard error of slope of linear regression model
18. p.val: p-value is calculated in the linear regression (lm) function in R using two-sided t-distribution and the t-statistic.
19. R2: r-squared (coefficient of determination) 
20. PearsonR: Pearson's correlation coefficient (magnitude equal to the square root of R-squared, with the sign of the slope)
21. ResidualSkew: skewness of residuals in linear regression model
22. ResidualMean: average of residual values in linear regression model

File name: RsProximalUncorrelated.csv 
Description: table summarizing regression analysis for pairs filtered as shown in Table 5. Where Well X Subtotal = Well Y Subtotal, the pair is represented twice. Plots of pairs are shown in HydroShare Folder HydrographsAndRegressionPlots/2.ProximalUncorrelated. 
Records: numbers 1 through 67
Column details: 
1. SiteX: Two-letter code for well with Subtotal (initial score) greater than or equal to Well Y
2. SiteY: Two-letter code for well with Subtotal (initial score) less than or equal to Well X
3. SiteXSubtotal: Subtotal for well X
4. SiteXGeol: geologic unit for well X
5. SiteXdepth: depth in meters for well X
6. SiteYSubtotal: Subtotal for well Y
7. SiteYGeol: geologic unit for well Y
8. SiteYdepth: depth in meters for well Y
9. Dist: distance between well X and well Y in kilometers
10. n: number of overlapping measurements
11. SiteXSkew: skew of paired site X groundwater elevations
12. SiteYSkew: skew of paired site Y groundwater elevations
13. StartDate: first year and month of overlapping data
14. EndDate: last year and month of overlapping data
15. Yint: y-intercept of linear regression model in meters
16. m: slope of linear regression model
17. StdErr: standard error of slope of linear regression model
18. p.val: p-value is calculated in the linear regression (lm) function in R using two-sided t-distribution and the t-statistic.
19. R2: r-squared (coefficient of determination) 
20. PearsonR: Pearson's correlation coefficient (magnitude equal to the square root of R-squared, with the sign of the slope)
21. ResidualSkew: skewness of residuals in linear regression model
22. ResidualMean: average of residual values in linear regression model

File name: RsProximalUncorrelatedSimilar.csv 
Description: table summarizing regression analysis for pairs filtered as shown in Table 5. Where Well X Subtotal = Well Y Subtotal, the pair is represented twice. Plots of pairs are shown in HydroShare Folder HydrographsAndRegressionPlots/3.ProximalUncorrelatedSimilar. 
Records: numbers 1 through 8
Column details: 
1. SiteX: Two-letter code for well with Subtotal (initial score) greater than or equal to Well Y
2. SiteY: Two-letter code for well with Subtotal (initial score) less than or equal to Well X
3. SiteXSubtotal: Subtotal for well X
4. SiteXGeol: geologic unit for well X
5. SiteXdepth: depth in meters for well X
6. SiteYSubtotal: Subtotal for well Y
7. SiteYGeol: geologic unit for well Y
8. SiteYdepth: depth in meters for well Y
9. Dist: distance between well X and well Y in kilometers
10. n: number of overlapping measurements
11. SiteXSkew: skew of paired site X groundwater elevations
12. SiteYSkew: skew of paired site Y groundwater elevations
13. StartDate: first year and month of overlapping data
14. EndDate: last year and month of overlapping data
15. Yint: y-intercept of linear regression model in meters
16. m: slope of linear regression model
17. StdErr: standard error of slope of linear regression model
18. p.val: p-value is calculated in the linear regression (lm) function in R using two-sided t-distribution and the t-statistic.
19. R2: r-squared (coefficient of determination) 
20. PearsonR: Pearson's correlation coefficient (magnitude equal to the square root of R-squared, with the sign of the slope)
21. ResidualSkew: skewness of residuals in linear regression model
22. ResidualMean: average of residual values in linear regression model

File name: RsSitesPredictive.csv
Description: table summarizing regression analysis for pairs in which the sum of the R-squared values for Well X qualified it as predictive
Records: numbers 1 through 477
Column details: 
1. SiteX: Two-letter code for well with Subtotal (initial score) greater than or equal to Well Y
2. SiteY: Two-letter code for well with Subtotal (initial score) less than or equal to Well X
3. SiteXSubtotal: Subtotal for well X
4. SiteXGeol: geologic unit for well X
5. SiteXdepth: depth in meters for well X
6. SiteYSubtotal: Subtotal for well Y
7. SiteYGeol: geologic unit for well Y
8. SiteYdepth: depth in meters for well Y
9. Dist: distance between well X and well Y in kilometers
10. n: number of overlapping measurements
11. SiteXSkew: skew of paired site X groundwater elevations
12. SiteYSkew: skew of paired site Y groundwater elevations
13. StartDate: first year and month of overlapping data
14. EndDate: last year and month of overlapping data
15. Yint: y-intercept of linear regression model in meters
16. slope: slope of linear regression model
17. StdErr: standard error of slope of linear regression model
18. p.val: p-value is calculated in the linear regression (lm) function in R using two-sided t-distribution and the t-statistic.
19. R2: r-squared (coefficient of determination) 
20. R: Pearson's correlation coefficient (magnitude equal to the square root of R-squared, with the sign of the slope)
21. ResidualSkew: skewness of residuals in linear regression model
22. ResidualMean: average of residual values in linear regression model
23. SiteXTotal: Total for well X equal to Subtotal plus s(SumR2) (score of sum of the R-squared values)
24. SiteYTotal: Total for well Y equal to Subtotal plus s(SumR2) (score of sum of the R-squared values)
25. SiteX_SumR2: sum of the R-squared values for well X, used to prioritize wells in cases of equal Total
26. SiteY_SumR2: sum of the R-squared values for well Y, used to prioritize wells in cases of equal Total
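
The SiteX_SumR2 column can be read as a grouped sum over the pair table. The sketch below is written in Python rather than the R used for the study, and it is not the authors' script. The pair rows are hypothetical, and the 0.6 cutoff is taken from the SumR2 definition given for AllGeometryWithData.csv. It counts the qualifying pairs for each Well X (Deg) and sums their R-squared values (SumR2).

```python
from collections import defaultdict

# Hypothetical pair rows (SiteX, SiteY, R2); values are illustrative only.
pairs = [
    ("AD", "AF", 0.82),
    ("AD", "BC", 0.64),
    ("AD", "BK", 0.41),  # below the cutoff, so it is ignored
    ("AF", "BC", 0.71),
]

R2_CUTOFF = 0.6  # cutoff stated in the SumR2 column description

def sum_r2_by_site_x(rows, cutoff=R2_CUTOFF):
    """Return {SiteX: {"Deg": count, "SumR2": sum}} for pairs with R2 >= cutoff."""
    deg = defaultdict(int)
    total = defaultdict(float)
    for site_x, _site_y, r2 in rows:
        if r2 >= cutoff:
            deg[site_x] += 1
            total[site_x] += r2
    return {site: {"Deg": deg[site], "SumR2": round(total[site], 4)} for site in deg}

summary = sum_r2_by_site_x(pairs)
print(summary)
```

How SumR2 is converted to the score s_SumR2 is described in the main text and is not reproduced here.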

e.HydrographsAndRegressionPlots FOLDER

This folder contains five subfolders with details explained in Table 5 of the main text.

File naming structure in each folder: "HydrographsAndRegressionPlots_WellX_WellY.jpg". For example, "HydrographsAndRegressionPlots_AD_AF.jpg" where AD and AF are the site codes, with AD as Well X and AF as Well Y. If the wells have identical subtotal scores, "HydrographsAndRegressionPlots_AF_AD.jpg" also exists, with AF as Well X and AD as Well Y.
Description: In all plots, grey or black represents well X and orange represents well Y. Subplot A shows the two wells in the Arkansas Headwaters Basin. Labels show the subtotal score for each well, the geologic unit, and the well depth in meters. Subplot B plots the groundwater elevation versus time for each site. Subplot C plots groundwater elevation deviations from the mean versus time for each site for the months for which both wells had measurements, where means were calculated from the full dataset for each well. Subplot D plots groundwater elevation for Well Y versus groundwater elevation for Well X. The linear regression of Y on X is shown in blue, and the grey band around the model represents the 95% confidence interval as calculated by the predict() function in R. The p-value is based on the null hypothesis that the slope is zero.
Source: Maps were created by the R script developed for the present study. See bibliography of R packages used and map data sources in section 4. REFERENCES below.
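
The calculation behind Subplot C can be sketched as follows, in Python rather than the R used for the study. The monthly values, the 'YYYY-MM' keys, and the function name deviations_on_overlap are hypothetical. The point illustrated is that each mean is taken over the full record of its own well, while deviations are reported only for months in which both wells have a measurement.

```python
def deviations_on_overlap(series_x, series_y):
    """series_x and series_y map 'YYYY-MM' to groundwater elevation in meters."""
    # Means use every measurement available for each well.
    mean_x = sum(series_x.values()) / len(series_x)
    mean_y = sum(series_y.values()) / len(series_y)
    # Deviations are kept only for months present in both records.
    shared = sorted(set(series_x) & set(series_y))
    return [(month, series_x[month] - mean_x, series_y[month] - mean_y)
            for month in shared]

# Hypothetical monthly records for two wells.
well_x = {"2015-04": 2401.0, "2015-05": 2402.0, "2015-06": 2403.0, "2015-07": 2402.0}
well_y = {"2015-05": 2310.0, "2015-06": 2312.0, "2015-08": 2311.0}

rows = deviations_on_overlap(well_x, well_y)
for month, dev_x, dev_y in rows:
    print(month, dev_x, dev_y)
```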

f.PredictiveWellMaps FOLDER

Number of files: 93
Description: maps showing all the connections to each Well X
File naming structure: "MapNetwork_AA.pdf", where AA is the site code of Well X. Only one file exists for each Well X.
Predictive well maps are included for all wells correlated to at least one other well with R-squared values greater than or equal to 0.6. Gray lines are an abridged geologic map. R-squared values are indicated by connecting line colors and thicknesses, with lighter, thicker lines representing higher values. The size of the circle represents Subtotal for each well. Pink polygons represent well batches (see text section 2.3.1). See HydrographsAndRegressionPlots folder for corresponding graphs.
Source: Maps were created by the R script developed for the present study. See bibliography of R packages used and map data sources in section 4. REFERENCES below.

g.Criteria Correlation FOLDER

Number of files: 2

File name: CriteriaCorrelationMatrix.csv
Description: matrix of Pearson’s R values between scoring criteria
Records: 11 scoring criteria
1. x: distance to closest well in kilometers
2. n: number of measurements  
3. y: number of years in period of record  
4. r: groundwater elevation range in meters 
5. mu: groundwater elevation mean in meters
6. d: constructed depth of well in meters
7. g: wells per geologic unit  
8. a: wells per aquifer  
9. T: tritium activity in picoCuries/Liter  
10. Q: geochemical signature rating (sum from Principal Component Analysis) 
11. SumR2: sum of the R-squared values of at least 0.6 for each higher or equal scoring well
Column details: 
Same criteria as above
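
A matrix of this kind can be built by computing Pearson's R for every pair of criteria. The sketch below is in Python rather than the R used for the study, uses three of the eleven criteria with made-up values, and assumes no missing values. How the study handled wells lacking a criterion (for example, tritium) is not stated here.

```python
import math

def pearson_r(a, b):
    """Pearson's correlation coefficient between two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Hypothetical criterion values for four wells.
criteria = {
    "x": [0.5, 1.2, 3.4, 2.2],      # distance to closest well, km
    "n": [40, 35, 12, 20],          # number of measurements
    "d": [30.0, 45.0, 12.0, 60.0],  # constructed depth, m
}

names = list(criteria)
matrix = {row: {col: pearson_r(criteria[row], criteria[col]) for col in names}
          for row in names}

for row in names:
    print(row, [round(matrix[row][col], 3) for col in names])
```

The result is symmetric with ones on the diagonal, which is why the rows and columns of CriteriaCorrelationMatrix.csv list the same criteria.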

File name: CriteriaCorrelationPlots.pdf
Description: grid of plots of each criterion versus each other criterion

h.FinalResults FOLDER

Number of files: 10

File name: AllGeometryWithData.csv
Description: Table of all sites, locations, and scoring of criteria including total score
Records: numbers 1 through 126
Column naming convention: s_[criterion] denotes the score for the criterion, with 1 being least preferred and 3 being most preferred.
Column details: 
1. Site: two-letter site code for present study  
2. site_no: USGS site number  
3. x_coord: UTM x coordinate (Zone 13)  
4. y_coord: UTM y coordinate (Zone 13)  
5. Active: Y if actively monitored, otherwise N 
6. x: distance to closest well in kilometers
7. s_x: score for distance, with highest values scoring highest  
8. n: number of measurements  
9. s_n: score for number of measurements, with highest values scoring highest  
10. y: number of years in period of record
11. s_y: score for years, with highest values scoring highest  
12. mu: groundwater elevation mean in meters  
13. s_mu: score for groundwater elevation mean, with highest values scoring highest  
14. r: groundwater elevation range in meters  
15. s_r: score for groundwater elevation range, with highest values scoring highest  
16. d: constructed depth of well in meters
17. s_d: score for constructed depth of well, with highest values scoring highest  
18. Aquifer: name of aquifer in which well was completed  
19. a: wells per aquifer  
20. s_a: score for wells per aquifer, with lowest values scoring highest  
21. Unit: abbreviation for mapped geologic unit in which well is located  
22. UNIT_AGE: age of geologic unit  
23. ROCKTYPE1: predominant lithology found in geologic unit  
24. ROCKTYPE2: second most predominant lithology in geologic unit  
25. g: wells per geologic unit  
26. s_g: score for wells per geologic unit, with lowest values scoring highest  
27. T: tritium activity in picoCuries/Liter  
28. s_T: score for tritium activity, with lowest values scoring highest  
29. MinDate: oldest date in period of record  
30. MaxDate: most recent date in period of record  
31. s_D: score for inclusion of drought years, with inclusion of 2000–2012 scoring highest  
32. Q: geochemical signature (sum from Principal Component Analysis)  
33. Qmax: maximum value for each geochemical signature grouping  
34. s_Q: score for geochemical signature
35. v: notes of other priority considerations (vertical data and low tritium values)
36. s_v: score for other priority considerations  
37. Subtotal: total of all previous scores  
38. Deg: number of wells correlated to higher or equal scoring well
39. SumR2: sum of r-squared values of at least 0.6 for each higher or equal scoring well 
40. s_SumR2: score of sum of r-squared values, with highest and NA values scoring highest  
41. Predictive: U for uncorrelated, Y for other wells with s_SumR2 = 3, otherwise N  
42. Total: Subtotal + s_SumR2  
43. BatchID: well batch number (wells in first quartile for distance), NA if not batched  
44. AvgSkew: mean of groundwater elevation skewness from all datasets paired with others  
45. Prioritized: Y if prioritized, otherwise N
46. geometry: site UTM coordinates (Zone 13) for use in sf package in R 
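
As a quick consistency check on a table with these columns, the sketch below (Python, not the study's R) reads a CSV and confirms that Total equals Subtotal plus s_SumR2, then lists the prioritized sites. The three sample rows are hypothetical and include only a handful of the 46 columns. Reading the scores as numbers is an assumption about how the values are stored in the file.

```python
import csv
import io

# Hypothetical excerpt with a subset of the columns described above.
sample_csv = """Site,Active,Subtotal,s_SumR2,Total,Prioritized
AD,Y,24,3,27,Y
AF,N,21,1,22,N
BC,N,23,3,26,Y
"""

def load_sites(handle):
    """Read rows and convert the score columns used below to numbers."""
    rows = []
    for row in csv.DictReader(handle):
        for column in ("Subtotal", "s_SumR2", "Total"):
            row[column] = float(row[column])
        rows.append(row)
    return rows

sites = load_sites(io.StringIO(sample_csv))

# Sites whose Total does not equal Subtotal + s_SumR2 (expected to be none).
inconsistent = [r["Site"] for r in sites
                if r["Total"] != r["Subtotal"] + r["s_SumR2"]]
prioritized = [r["Site"] for r in sites if r["Prioritized"] == "Y"]

print("inconsistent:", inconsistent)
print("prioritized:", prioritized)
```

To run it against the downloaded file, replace io.StringIO(sample_csv) with open("AllGeometryWithData.csv", newline="").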

File name: BatchPrioritized.csv
Description: Prioritized well from each batch, corresponding to the actively monitored well, if present, and otherwise the top-scoring well in the batch
Records: numbers 1 through xx
Column naming convention: s_[criterion] denotes the score for the criterion, with 1 being least preferred and 3 being most preferred.
Column details: 
1. Site: two-letter site code for present study
2. x:  distance to nearest well in kilometers
3. s_x:  score for distance, with highest values scoring highest
4. n:  number of measurements
5. s_n:  score for number of measurements, with highest values scoring highest
6. y:  number of years of record
7. s_y:  score for period of record, with highest values scoring highest
8. mu:  mean groundwater elevation in meters
9. s_mu:  score for mean groundwater elevation, with highest values scoring highest
10. r:  groundwater elevation range in meters
11. s_r:  score for groundwater elevation range, with highest values scoring highest
12. d:  depth of well in meters
13. s_d:  score for depth of well, with highest values scoring highest
14. Aquifer:  name of aquifer in which well was completed
15. a: well count in that aquifer
16. s_a: score for aquifer count, with lowest values scoring highest
17. Unit: geologic unit 
18. UNIT_AGE: geologic age of formation
19. ROCKTYPE1: predominant lithology found in the formation  
20. ROCKTYPE2: second most predominant lithology in the formation  
21. Active: Y if well is actively monitored, otherwise N 
22. x_coord: UTM X coordinate of well (Zone 13)  
23. y_coord: UTM Y coordinate of well (Zone 13) 
24. g: wells per geologic unit  
25. s_g: score for wells per geologic unit, with lowest values scoring highest    
26. T: tritium activity in picoCuries/Liter    
27. s_T: score for tritium activity, with lowest values scoring highest    
28. MinDate: oldest date in period of record    
29. MaxDate: most recent date in period of record    
30. s_D: score for inclusion of drought years, with inclusion of 2000–2012 scoring highest   
31. Q: geochemical signature (sum from Principal Component Analysis)   
32. Qmax: maximum value for each geochemical signature grouping    
33. s_Q: score for geochemical signature 
34. v: notes of other priority considerations (vertical data and low tritium values) 
35. s_v: score for other priority considerations   
36. Subtotal: total of all previous scores   
37. Deg: number of wells correlated to higher or equal scoring well  
38. SumR2: sum of r-squared values of at least 0.6 for each higher or equal scoring well  
39. s_SumR2: score of sum of r-squared values, with highest and NA values scoring highest   
40. Predictive: U for uncorrelated, Y for other wells with s_SumR2 = 3, otherwise N  
41. Total: Subtotal + s_SumR2   
42. BatchID: well batch number (wells in first quartile for distance), NA if not batched    
43. AvgSkew: mean of groundwater elevation skewness from all datasets paired with others    
44. HighScore: top score of well batch (used in Script to extract wells for Total=HighScore)  
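
The selection rule in the description above (the actively monitored well if present, otherwise the top-scoring well in the batch) can be sketched as follows. This is a Python illustration under stated assumptions, not the study's R script. The wells are hypothetical, ties are all kept, and a batch with more than one active well keeps every active well.

```python
from collections import defaultdict

# Hypothetical wells; BatchID of None stands for "NA, not batched".
wells = [
    {"Site": "AD", "BatchID": 1, "Active": "N", "Total": 27},
    {"Site": "AF", "BatchID": 1, "Active": "Y", "Total": 22},
    {"Site": "BC", "BatchID": 2, "Active": "N", "Total": 26},
    {"Site": "BK", "BatchID": 2, "Active": "N", "Total": 24},
    {"Site": "CA", "BatchID": None, "Active": "N", "Total": 25},
]

def prioritize_batches(rows):
    """Return {BatchID: [Site, ...]} with the prioritized well(s) per batch."""
    batches = defaultdict(list)
    for row in rows:
        if row["BatchID"] is not None:
            batches[row["BatchID"]].append(row)

    chosen = {}
    for batch_id, members in batches.items():
        active = [w["Site"] for w in members if w["Active"] == "Y"]
        if active:
            chosen[batch_id] = active
        else:
            high_score = max(w["Total"] for w in members)  # HighScore column
            chosen[batch_id] = [w["Site"] for w in members
                                if w["Total"] == high_score]
    return chosen

selected = prioritize_batches(wells)
print(selected)
```

In batch 1 the active well AF is chosen even though AD has the higher Total, which is the behavior the description implies.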

File name: MapsSitesExcluded.kml
Description: GIS-ready map of excluded wells 
Source: Map was created by the R script developed for the present study. See bibliography of R packages used and map data sources in section 4. REFERENCES below.
Coordinate Reference System: North American Datum 1983, Universal Transverse Mercator 13N, EPSG 26913
Records: 42
Column details:
1. Site: two-letter site code for present study  
2. site_no: USGS site number  
3. x_coord: UTM x coordinate (Zone 13)  
4. y_coord: UTM y coordinate (Zone 13)  
5. Active: Y if actively monitored, otherwise N 
6. x: distance to closest well in kilometers
7. s_x: score for distance, with highest values scoring highest  
8. n: number of measurements  
9. s_n: score for number of measurements, with highest values scoring highest  
10. y: number of years in period of record
11. s_y: score for years, with highest values scoring highest  
12. mu: groundwater elevation mean in meters  
13. s_mu: score for groundwater elevation mean, with highest values scoring highest  
14. r: groundwater elevation range in meters  
15. s_r: score for groundwater elevation range, with highest values scoring highest  
16. d: constructed depth of well in meters
17. s_d: score for constructed depth of well, with highest values scoring highest  
18. Aquifer: name of aquifer in which well was completed  
19. a: wells per aquifer  
20. s_a: score for wells per aquifer, with lowest values scoring highest  
21. Unit: abbreviation for mapped geologic unit in which well is located  
22. UNIT_AGE: age of geologic unit  
23. ROCKTYPE1: predominant lithology found in geologic unit  
24. ROCKTYPE2: second most predominant lithology in geologic unit  
25. g: wells per geologic unit  
26. s_g: score for wells per geologic unit, with lowest values scoring highest  
27. T: tritium activity in picoCuries/Liter  
28. s_T: score for tritium activity, with lowest values scoring highest  
29. MinDate: oldest date in period of record  
30. MaxDate: most recent date in period of record  
31. s_D: score for inclusion of drought years, with inclusion of 2000–2012 scoring highest  
32. Q: geochemical signature (sum from Principal Component Analysis)  
33. Qmax: maximum value for each geochemical signature grouping  
34. s_Q: score for geochemical signature
35. v: notes of other priority considerations (vertical data and low tritium values)
36. s_v: score for other priority considerations  
37. Subtotal: total of all previous scores  
38. Deg: number of wells correlated to higher or equal scoring well
39. SumR2: sum of r-squared values of at least 0.6 for each higher or equal scoring well 
40. s_SumR2: score of sum of r-squared values, with highest and NA values scoring highest  
41. Predictive: U for uncorrelated, Y for other wells with s_SumR2 = 3, otherwise N  
42. Total: Subtotal + s_SumR2  
43. BatchID: well batch number (wells in first quartile for distance), NA if not batched  
44. AvgSkew: mean of groundwater elevation skewness from all datasets paired with others  
45. Prioritized: Y if prioritized, otherwise N
46. geometry: site UTM coordinates (Zone 13) for use in sf package in R 

File name: MapsSitesExcludedWithLabels.pdf
Description: Map of excluded wells including labels
Source: Map was created by the R script developed for the present study. See bibliography of R packages used and map data sources in section 4. REFERENCES below.

File name: MapsSitesPrioritized.kml
Description: GIS-ready map of prioritized wells
Source: Map was created by the R script developed for the present study. See bibliography of R packages used and map data sources in section 4. REFERENCES below.
Coordinate Reference System: North American Datum 1983, Universal Transverse Mercator 13N, EPSG 26913
Records: 287
Column details:
1. Site: two-letter site code for present study  
2. site_no: USGS site number  
3. x_coord: UTM x coordinate (Zone 13)  
4. y_coord: UTM y coordinate (Zone 13)  
5. Active: Y if actively monitored, otherwise N 
6. x: distance to closest well in kilometers
7. s_x: score for distance, with highest values scoring highest  
8. n: number of measurements  
9. s_n: score for number of measurements, with highest values scoring highest  
10. y: number of years in period of record
11. s_y: score for years, with highest values scoring highest  
12. mu: groundwater elevation mean in meters  
13. s_mu: score for groundwater elevation mean, with highest values scoring highest  
14. r: groundwater elevation range in meters  
15. s_r: score for groundwater elevation range, with highest values scoring highest  
16. d: constructed depth of well in meters
17. s_d: score for constructed depth of well, with highest values scoring highest  
18. Aquifer: name of aquifer in which well was completed  
19. a: wells per aquifer  
20. s_a: score for wells per aquifer, with lowest values scoring highest  
21. Unit: abbreviation for mapped geologic unit in which well is located  
22. UNIT_AGE: age of geologic unit  
23. ROCKTYPE1: predominant lithology found in geologic unit  
24. ROCKTYPE2: second most predominant lithology in geologic unit  
25. g: wells per geologic unit  
26. s_g: score for wells per geologic unit, with lowest values scoring highest  
27. T: tritium activity in picoCuries/Liter  
28. s_T: score for tritium activity, with lowest values scoring highest  
29. MinDate: oldest date in period of record  
30. MaxDate: most recent date in period of record  
31. s_D: score for inclusion of drought years, with inclusion of 2000–2012 scoring highest  
32. Q: geochemical signature (sum from Principal Component Analysis)  
33. Qmax: maximum value for each geochemical signature grouping  
34. s_Q: score for geochemical signature
35. v: notes of other priority considerations (vertical data and low tritium values)
36. s_v: score for other priority considerations  
37. Subtotal: total of all previous scores  
38. Deg: number of wells correlated to higher or equal scoring well
39. SumR2: sum of r-squared values of at least 0.6 for each higher or equal scoring well 
40. s_SumR2: score of sum of r-squared values, with highest and NA values scoring highest  
41. Predictive: U for uncorrelated, Y for other wells with s_SumR2 = 3, otherwise N  
42. Total: Subtotal + s_SumR2  
43. BatchID: well batch number (wells in first quartile for distance), NA if not batched  
44. AvgSkew: mean of groundwater elevation skewness from all datasets paired with others  
45. Prioritized: Y if prioritized, otherwise N
46. geometry: site UTM coordinates (Zone 13) for use in sf package in R 

File name: MapsSitesPrioritizedWithLabels.pdf
Description: Map of prioritized wells including labels
Source: Map was created by the R script developed for the present study. See bibliography of R packages used and map data sources in section 4. REFERENCES below.

File name: MapsSitesScored.kml
Description: GIS-ready map of all wells with total scores
Source: Map was created by the R script developed for the present study. See bibliography of R packages used and map data sources in section 4. REFERENCES below.
Coordinate Reference System: North American Datum 1983, Universal Transverse Mercator 13N, EPSG 26913
Records: 287
Column details:
1. Site: two-letter site code for present study  
2. site_no: USGS site number  
3. x_coord: UTM x coordinate (Zone 13)  
4. y_coord: UTM y coordinate (Zone 13)  
5. Active: Y if actively monitored, otherwise N 
6. x: distance to closest well in kilometers
7. s_x: score for distance, with highest values scoring highest  
8. n: number of measurements  
9. s_n: score for number of measurements, with highest values scoring highest  
10. y: number of years in period of record
11. s_y: score for years, with highest values scoring highest  
12. mu: groundwater elevation mean in meters  
13. s_mu: score for groundwater elevation mean, with highest values scoring highest  
14. r: groundwater elevation range in meters  
15. s_r: score for groundwater elevation range, with highest values scoring highest  
16. d: constructed depth of well in meters
17. s_d: score for constructed depth of well, with highest values scoring highest  
18. Aquifer: name of aquifer in which well was completed  
19. a: wells per aquifer  
20. s_a: score for wells per aquifer, with lowest values scoring highest  
21. Unit: abbreviation for mapped geologic unit in which well is located  
22. UNIT_AGE: age of geologic unit  
23. ROCKTYPE1: predominant lithology found in geologic unit  
24. ROCKTYPE2: second most predominant lithology in geologic unit  
25. g: wells per geologic unit  
26. s_g: score for wells per geologic unit, with lowest values scoring highest  
27. T: tritium activity in picoCuries/Liter  
28. s_T: score for tritium activity, with lowest values scoring highest  
29. MinDate: oldest date in period of record  
30. MaxDate: most recent date in period of record  
31. s_D: score for inclusion of drought years, with inclusion of 2000–2012 scoring highest  
32. Q: geochemical signature (sum from Principal Component Analysis)  
33. Qmax: maximum value for each geochemical signature grouping  
34. s_Q: score for geochemical signature
35. v: notes of other priority considerations (vertical data and low tritium values)
36. s_v: score for other priority considerations  
37. Subtotal: total of all previous scores  
38. Deg: number of wells correlated to higher or equal scoring well
39. SumR2: sum of r-squared values of at least 0.6 for each higher or equal scoring well 
40. s_SumR2: score of sum of r-squared values, with highest and NA values scoring highest  
41. Predictive: U for uncorrelated, Y for other wells with s_SumR2 = 3, otherwise N  
42. Total: Subtotal + s_SumR2  
43. BatchID: well batch number (wells in first quartile for distance), NA if not batched  
44. AvgSkew: mean of groundwater elevation skewness from all datasets paired with others  
45. Prioritized: Y if prioritized, otherwise N
46. geometry: site UTM coordinates (Zone 13) for use in sf package in R 

File name: SitesCorrelatedPrioritizedMaxByX.csv
Description: Table of the correlation with the highest R-squared value for prioritized wells
Records: numbers 1 through 31
Column details: 
1. SiteX: Two-letter code for well with Subtotal (initial score) greater than or equal to Well Y
2. SiteY: Two-letter code for well with Subtotal (initial score) less than or equal to Well X
3. SiteXSubtotal: Subtotal for well X
4. SiteXGeol: geologic unit for well X
5. SiteXdepth: depth in meters for well X
6. SiteYSubtotal: Subtotal for well Y
7. SiteYGeol: geologic unit for well Y
8. SiteYdepth: depth in meters for well Y
9. Dist: distance between well X and well Y in kilometers
10. n: number of overlapping measurements
11. SiteXSkew: skew of paired site X groundwater elevations
12. SiteYSkew: skew of paired site Y groundwater elevations
13. StartDate: first year and month of overlapping data
14. EndDate: last year and month of overlapping data
15. Yint: y-intercept of linear regression model in meters
16. slope: slope of linear regression model
17. StdErr: standard error of slope of linear regression model
18. p.val: p-value is calculated in the linear regression (lm) function in R using two-sided t-distribution and the t-statistic.
19. R2: r-squared (coefficient of determination) 
20. R: Pearson's correlation coefficient (magnitude equal to the square root of R-squared, with the sign of the slope)
21. ResidualSkew: skewness of residuals in linear regression model
22. ResidualMean: average of residual values in linear regression model
23. SiteXTotal: Total for well X equal to Subtotal plus s(SumR2) (score of sum of the R-squared values)
24. SiteYTotal: Total for well Y equal to Subtotal plus s(SumR2) (score of sum of the R-squared values)
25. maxR2: highest R-squared value for correlated prioritized wells 
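
A table of this form can be derived from the pair table by keeping, for each Well X, the row with the largest R-squared value. The sketch below is in Python rather than the study's R. The pair rows are hypothetical, and when two rows tie on R2 the first one encountered is kept, which is an assumption.

```python
# Hypothetical pair rows with a subset of the columns described above.
pairs = [
    {"SiteX": "AD", "SiteY": "AF", "R2": 0.82},
    {"SiteX": "AD", "SiteY": "BC", "R2": 0.91},
    {"SiteX": "BC", "SiteY": "BK", "R2": 0.67},
]

def max_r2_by_site_x(rows):
    """Return {SiteX: row} where row has the highest R2 for that SiteX."""
    best = {}
    for row in rows:
        current = best.get(row["SiteX"])
        if current is None or row["R2"] > current["R2"]:
            best[row["SiteX"]] = row
    return best

strongest = max_r2_by_site_x(pairs)
for site_x, row in strongest.items():
    print(site_x, row["SiteY"], row["R2"])
```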

File name: SitesCorrelatedWithDataCombined.csv
Description: Table of all regression models and total scores for all correlated pairs
Records: numbers 1 through 928
Column details: 
1. SiteX: Two-letter code for well with Subtotal (initial score) greater than or equal to Well Y
2. SiteY: Two-letter code for well with Subtotal (initial score) less than or equal to Well X
3. SiteXSubtotal: Subtotal for well X 
4. SiteXGeol: geologic unit for well X
5. SiteXdepth: depth in meters for well X
6. SiteYSubtotal: Subtotal for well Y
7. SiteYGeol: geologic unit for well Y
8. SiteYdepth: depth in meters for well Y
9. Dist: distance between well X and well Y in kilometers
10. n: number of overlapping measurements
11. SiteXSkew: skew of paired site X groundwater elevations
12. SiteYSkew: skew of paired site Y groundwater elevations
13. StartDate: first year and month of overlapping data
14. EndDate: last year and month of overlapping data
15. Yint: y-intercept of linear regression model in meters
16. slope: slope of linear regression model
17. StdErr: standard error of slope of linear regression model
18. p.val: p-value is calculated in the linear regression (lm) function in R using two-sided t-distribution and the t-statistic.
19. R2: r-squared (coefficient of determination) 
20. R: Pearson's correlation coefficient (magnitude equal to the square root of R-squared, with the sign of the slope)
21. ResidualSkew: skewness of residuals in linear regression model
22. ResidualMean: average of residual values in linear regression model
23. SiteXTotal: total score for well X 
24. SiteYTotal: total score for well Y 

File name: SitesPrioritized.csv
Description: Table of scoring for prioritized sites
Records: numbers 1 through 42
Column details: 
1. Site: two-letter site code for present study  
2. x: distance to closest well in kilometers  
3. s_x: score for distance, with highest values scoring highest  
4. n: number of measurements  
5. s_n: score for number of measurements, with highest values scoring highest  
6. y: number of years in period of record  
7. s_y: score for years, with highest values scoring highest  
8. mu: groundwater elevation mean in meters  
9. s_mu: score for groundwater elevation mean, with highest values scoring highest  
10. r: groundwater elevation range in meters  
11. s_r: score for groundwater elevation range, with highest values scoring highest  
12. d: constructed depth of well in meters  
13. s_d: score for constructed depth of well, with highest values scoring highest  
14. Aquifer: name of aquifer in which well was completed  
15. a: wells per aquifer  
16. s_a: score for wells per aquifer, with lowest values scoring highest  
17. Unit: abbreviation for mapped geologic unit in which well is located  
18. UNIT_AGE: age of geologic unit  
19. ROCKTYPE1: predominant lithology found in geologic unit  
20. ROCKTYPE2: second most predominant lithology in geologic unit  
21. Active: Y if actively monitored, otherwise N  
22. x_coord: UTM x coordinate (Zone 13)  
23. y_coord: UTM y coordinate (Zone 13)  
24. g: wells per geologic unit  
25. s_g: score for wells per geologic unit, with lowest values scoring highest  
26. T: tritium activity in picoCuries/Liter  
27. s_T: score for tritium activity, with lowest values scoring highest  
28. MinDate: oldest date in period of record  
29. MaxDate: most recent date in period of record  
30. s_D: score for inclusion of drought years, with inclusion of 2000–2012 scoring highest  
31. Q: geochemical signature (sum from Principal Component Analysis)  
32. Qmax: maximum value for each geochemical signature grouping  
33. s_Q: score for geochemical signature  
34. v: notes of other priority considerations (vertical data and low tritium values)  
35. s_v: score for other priority considerations  
36. Subtotal: total of all previous scores  
37. Deg: number of wells correlated to higher or equal scoring well  
38. SumR2: sum of r-squared values of at least 0.6 for each higher or equal scoring well  
39. s_SumR2: score of sum of r-squared values, with highest and NA values scoring highest  
40. Predictive: U for uncorrelated, Y for other wells with s_SumR2 = 3, otherwise N  
41. Total: Subtotal + s_SumR2  
42. BatchID: well batch number (wells in first quartile for distance), NA if not batched  
43. AvgSkew: mean of groundwater elevation skewness from all datasets paired with others  
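
The sketch below shows one possible reading of how columns 37 through 41 fit together, using base R. It is illustrative only: the pair and site tables are hypothetical, the grouping (each well Y accumulates r-squared values from its higher or equal scoring wells X) is an interpretation of the column descriptions, and the `s_SumR2` values are placeholders because the rule that converts SumR2 into a score is not given in this section.

```r
# Hypothetical pair table: SiteX scores higher than or equal to SiteY.
pairs <- data.frame(
  SiteX = c("AA", "AA", "AB", "AA"),
  SiteY = c("AB", "AC", "AC", "AD"),
  R2    = c(0.82, 0.65, 0.71, 0.40)
)

# Hypothetical site table; s_SumR2 holds placeholder scores, not a derived rule.
sites <- data.frame(
  Site     = c("AA", "AB", "AC", "AD"),
  Subtotal = c(30, 27, 25, 22),
  s_SumR2  = c(3, 2, 3, 3)
)

# Keep only correlations with r-squared of at least 0.6.
strong <- pairs[pairs$R2 >= 0.6, ]
by_y   <- factor(strong$SiteY, levels = sites$Site)

# Deg: count of qualifying higher or equal scoring wells for each site.
sites$Deg <- as.integer(table(by_y))

# SumR2: sum of qualifying r-squared values; NA when a site has none.
sites$SumR2 <- as.numeric(tapply(strong$R2, by_y, sum))

# Predictive: U for uncorrelated, Y for other wells with s_SumR2 = 3, else N.
sites$Predictive <- ifelse(is.na(sites$SumR2), "U",
                           ifelse(sites$s_SumR2 == 3, "Y", "N"))

# Total: Subtotal + s_SumR2.
sites$Total <- sites$Subtotal + sites$s_SumR2

print(sites)
```

In this toy table, wells AA and AD have no qualifying correlations, so their SumR2 is NA and they are flagged U, consistent with NA values receiving the highest s_SumR2 score.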

SECTION 4. REFERENCES

References for R packages and map data are included below. See text for additional references.

Abatzoglou, J.T., 2013. Development of gridded surface meteorological data for ecological applications and modelling. International Journal of Climatology 33, 121–131. https://doi.org/10.1002/joc.3413

Attali, D., Baker, C., 2023. ggExtra: Add Marginal Histograms to “ggplot2”, and More “ggplot2” Enhancements.

Birk, M.A., 2023. measurements: Tools for Units of Measurement.

Colorado’s Decision Support Systems, 2023b. GIS Data by Category: District Boundaries [WWW Document]. Colorado’s Decision Support Systems, Colorado Water Conservation Board / Division of Water Resources. URL https://cdss.colorado.gov/gis-data/gis-data-by-category (accessed 8.14.23).

Colorado Department of Public Health and Environment, 2023a. Colorado City Boundaries [WWW Document]. CDPHE Open Data, Colorado Department of Public Health and Environment. URL https://data-cdphe.opendata.arcgis.com/datasets/CDPHE::colorado-city-boundaries/about (accessed 10.22.23).

Colorado Department of Public Health and Environment, 2023b. Colorado County Boundaries [WWW Document]. CDPHE Open Data, Colorado Department of Public Health and Environment. URL https://data-cdphe.opendata.arcgis.com/datasets/CDPHE::colorado-county-boundaries/about (accessed 10.22.23).

Microsoft Corporation, Weston, S., 2022. doParallel: Foreach Parallel Adaptor for the “parallel” Package.

Csardi, G., Nepusz, T., 2006. The igraph software package for complex network research. InterJournal Complex Systems, 1695.

Csárdi, G., Nepusz, T., Traag, V., Horvát, S., Zanini, F., Noom, D., Müller, K., 2025. igraph: Network Analysis and Visualization in R. https://doi.org/10.5281/zenodo.7682609

DeCicco, L., Hirsch, R., Lorenz, D., Read, J., Walker, J., Carr, L., Watkins, D., Blodgett, D., Johnson, M., Krall, A., 2024. dataRetrieval: R packages for discovering and retrieving water data available from U.S. federal hydrologic web services. U.S. Geological Survey, Reston, VA. https://doi.org/10.5066/P9X4L3GE

Dunnington, D., 2023. ggspatial: Spatial Data Framework for ggplot2.

Esri, 2014. World_Shaded_Relief (Map Server) [WWW Document]. ArcGIS REST Services Directory. URL https://services.arcgisonline.com/ArcGIS/rest/services/World_Shaded_Relief/MapServer (accessed 4.27.24).

Grosjean, P., 2024. SciViews::R. UMONS, MONS, Belgium.

Hijmans, R.J., 2023. raster: Geographic Data Analysis and Modeling.

Kassambara, A., 2023. ggpubr: “ggplot2” Based Publication Ready Plots.

Komsta, L., Novomestky, F., 2022. moments: Moments, Cumulants, Skewness, Kurtosis and Related Tests.

Lang, M., Murdoch, D., R Core Team, 2024. backports: Reimplementations of Functions Introduced Since R-3.0.0.

Microsoft, Weston, S., 2022. foreach: Provides Foreach Looping Construct.

Patil, I., Makowski, D., Ben-Shachar, M.S., Wiernik, B.M., Bacher, E., Lüdecke, D., 2022. datawizard: An R Package for Easy Data Preparation and Statistical Transformations. Journal of Open Source Software 7, 4684. https://doi.org/10.21105/joss.04684

Pebesma, E., 2018. Simple Features for R: Standardized Support for Spatial Vector Data. The R Journal 10, 439–446. https://doi.org/10.32614/RJ-2018-009

Pebesma, E., Bivand, R., 2023. Spatial Data Science: With applications in R. Chapman and Hall/CRC. https://doi.org/10.1201/9780429459016

Pebesma, E., Mailund, T., Hiebert, J., 2016. Measurement Units in R. R Journal 8, 486–494. https://doi.org/10.32614/RJ-2016-061

Pedersen, T.L., 2024. patchwork: The Composer of Plots.

Posit, 2023. RStudio 2023.12.1 Build 402 “Ocean Storm” Release. Posit. https://posit.co/downloads/.

R Core Team, 2024. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria.

Schwalb-Willmann, J., 2024. basemaps: Accessing Spatial Basemaps in R.

Slowikowski, K., 2024. ggrepel: Automatically Position Non-Overlapping Text Labels with “ggplot2.”

Stoeser, D.B., Green, G.N., Morath, L.C., Heran, W.D., Wilson, A.B., Moore, D.W., Van Gosen, B.S., 2005. Preliminary integrated geologic map databases for the United States Central States: Montana, Wyoming, Colorado, New Mexico, Kansas, Oklahoma, Texas, Missouri, Arkansas, and Louisiana - the State of Colorado. U.S. Geological Survey Open-File Report 2005-1351.

Tweto, O., 1979. Geologic Map of Colorado: U.S. Geological Survey Special Geologic Map, scale 1:500,000.

Urbanek, S., 2022. jpeg: Read and write JPEG images.

U.S. Census, 2024. States: cb_2018_us_state_20m.zip [WWW Document]. Cartographic Boundary Shapefiles, US Census. URL https://www.census.gov/geographies/mapping-files/2018/geo/carto-boundary-file.html (accessed 1.20.24).

U.S. Geological Survey (USGS), 2023a. Digital Spatial Data Sets: 1:250,000-scale Hydrologic Units (huc250k) [WWW Document]. Hydrologic Unit Maps, Water Resources of the United States, United States Geological Survey. URL https://water.usgs.gov/GIS/huc.html (accessed 8.14.23).

U.S. Geological Survey (USGS), 2023b. National Hydrography Dataset [WWW Document]. The National Map, United States Geological Survey. URL https://apps.nationalmap.gov/downloader/#/ (accessed 1.20.24).

U.S. Geological Survey (USGS), 2024. USGS water data for the Nation: U.S. Geological Survey National Water Information System database. https://doi.org/10.5066/F7P55KJN

Wickham, H., 2007. Reshaping Data with the reshape Package. Journal of Statistical Software 21, 1–20.

Wickham, H., 2016. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York.

Wickham, H., François, R., Henry, L., Müller, K., Vaughan, D., 2023a. dplyr: A Grammar of Data Manipulation.

Wickham, H., Hester, J., Chang, W., Bryan, J., 2022. devtools: Tools to Make Developing R Packages Easier.

Wickham, H., Pedersen, T.L., Seidel, D., 2023b. scales: Scale Functions for Visualization.

Credits

Funding Agencies

This resource was created using funding from the following sources:
Agency Name Award Title Award Number
Colorado Groundwater Association 2024 Harlan Erker Memorial Scholarship
Colorado Section of the American Water Resources Association 2023-2024 Rich Herbert Memorial Scholarship
U.S. Geological Survey Water Mission Area

How to Cite

Fahrney, E. E., D. Mays, C. P. Newman (2025). Codes and Results to Prioritize Wells for Groundwater Monitoring in the Arkansas River Headwaters Basin, Colorado, USA, HydroShare, http://www.hydroshare.org/resource/7e0192daba5e4b5ebdc3619a867507f6

This resource is shared under the Creative Commons Attribution CC BY.

http://creativecommons.org/licenses/by/4.0/