
CZO to HydroShare Migration Report


Authors:
Owners: This resource does not have an owner who is an active HydroShare user. Contact CUAHSI (help@cuahsi.org) for information on this resource.
Type: Resource
Storage: The size of this resource is 3.6 MB
Created: Jul 26, 2020 at 11:02 p.m.
Last updated: Jul 27, 2020 at 6:06 p.m.
Citation: See how to cite this resource
Sharing Status: Public
Views: 1500
Downloads: 30

Abstract

From 2007 to 2019, the Critical Zone Observatories (CZOs) stored their data at their respective universities. A central catalog of metadata at https://criticalzone.org kept track of the datasets. With the transition from CZOs to CZ clusters, it was agreed to centralize all datasets in HydroShare. This resource documents that transition. The Readme.md file gives an overview and description of what was done, as does the poster by Miguel Leon. Specifics on how metadata was stored on criticalzone.org can be found in "CZO Metadata Definitions.pdf". How that metadata was translated into HydroShare is defined in "Metadata Mapping from CZO to HydroShare.xlsx", and the controlled vocabulary conversions are found in "Map CZO Variables to ODM2 VariableNames.xlsx".

Content

Readme.md

CZO to HydroShare Migration Report

The Critical Zone Observatories (CZOs) stored data at their respective universities from their inception. A central catalog of datasets was added to the criticalzone.org website to help people searching for data. With the transition from CZ Observatories to CZ Clusters, it was agreed to centralize the datasets. HydroShare was picked as the repository for centralization, and the migration was performed in 2019 and completed in early 2020. Detailed documentation about the project can be found in the HydroShare resource, CZO to HydroShare Migration Resources. The Resource includes a poster by Miguel Leon, which provides an overview. This project added two new features to HydroShare to help the CZOs: the CZO Title Builder and Keyword Autocomplete. The CZOs are also the first community to be added to the HydroShare Communities feature. Each of these is described in more detail below.

Datasets

From the 10 CZOs and the national office, over 3,500 files totaling roughly 50 GB of data were copied to HydroShare. All CZOs retained copies of the original datasets. Some data that already reside in database management systems or in other repositories were not copied; instead, links to those datasets have been placed in HydroShare for completeness and ease of discovery. In total, 600 datasets are stored in other repositories. If a dataset already has a DOI issued by another institution, that attribution should be specified for the citation. Otherwise, HydroShare's default citation will be shown. Having links to the other repositories allows researchers to search for all datasets in one place. The Resource contains a "CZO Copied Files Manifest.xlsx", which lists all files copied into HydroShare.

Metadata

The metadata catalog on criticalzone.org was mapped to the metadata fields related to resources in HydroShare. There is not a one-to-one relationship between the two sets of metadata, so any metadata from criticalzone.org that did not have a parallel metadata field in HydroShare was added under "Additional Metadata". The approach taken to exporting metadata from the legacy site is documented in "CZO Metadata Definitions.pdf" and a list of mappings can be found in "Metadata Mapping from CZO to HydroShare.xlsx" within the Resource.
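The mapping rule described above can be sketched in a few lines of Python. This is only an illustration: the field names in `FIELD_MAP` and the `additional_metadata` key are hypothetical placeholders, and the actual mappings are the ones listed in "Metadata Mapping from CZO to HydroShare.xlsx".

```python
# Hypothetical field names for illustration only; the real mapping is
# documented in "Metadata Mapping from CZO to HydroShare.xlsx".
FIELD_MAP = {
    "title": "title",
    "description": "abstract",
    "keywords": "subjects",
}


def map_record(czo_record, field_map=FIELD_MAP):
    """Map a legacy metadata record onto target fields.

    Fields without a counterpart in field_map are kept under
    "additional_metadata" instead of being dropped.
    """
    mapped = {"additional_metadata": {}}
    for field_name, value in czo_record.items():
        target = field_map.get(field_name)
        if target is not None:
            mapped[target] = value
        else:
            mapped["additional_metadata"][field_name] = value
    return mapped
```

Keeping unmatched fields in a separate dictionary means no legacy metadata is lost, even when the two schemas do not line up one-to-one.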

Controlled Vocabulary

A major effort of this migration was to rationalize the unconstrained tags used in criticalzone.org into a controlled vocabulary to improve the ability to search across CZOs for the same kinds of data. The Observations Data Model 2 (ODM2) Variable Names were adopted as the controlled vocabulary for CZO datasets. In total, 4,300 terms were standardized into roughly 400 variable names. A complete list of the terms and their translations is provided in "Map CZO Variables to ODM2 VariableNames.xlsx" in the Resource.
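As a rough sketch of how such a translation table can be applied, the Python below looks up free-form tags in a dictionary after normalizing case and whitespace. The entries in `TERM_MAP` are illustrative and are not taken from the spreadsheet; the function names are likewise hypothetical.

```python
# Illustrative entries only; the full translation table is in
# "Map CZO Variables to ODM2 VariableNames.xlsx".
TERM_MAP = {
    "air temp": "Temperature",
    "air temperature": "Temperature",
    "discharge": "Discharge",
    "streamflow": "Discharge",
}


def normalize(tag):
    """Lower-case a tag and collapse runs of whitespace."""
    return " ".join(tag.lower().split())


def to_controlled(tags, term_map=TERM_MAP):
    """Return (controlled names, tags with no mapping).

    Controlled names are de-duplicated and keep first-seen order.
    """
    matched, unmatched = [], []
    for tag in tags:
        name = term_map.get(normalize(tag))
        if name is None:
            unmatched.append(tag)
        elif name not in matched:
            matched.append(name)
    return matched, unmatched
```

Returning the unmatched tags separately makes it easy to review terms that still need a mapping rather than silently discarding them.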

Keyword Autocomplete

The CZO community also wanted common controlled keywords to be suggested, improving the data quality and searchability of metadata for resources. The common vocabulary chosen for the CZO community is a subset of the ODM2 vocabulary for variable names published and maintained in the ODM2 controlled vocabulary. You can request new terms from the ODM2 controlled vocabulary website. You are not restricted to the controlled vocabulary list: you can enter other appropriate terms from the type-ahead interface or add custom subject keywords inline.
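The general idea of type-ahead suggestion can be sketched as a case-insensitive prefix match against the controlled list. This is not HydroShare's implementation; the function `suggest`, its `limit` parameter, and the sample vocabulary are assumptions made for illustration.

```python
def suggest(prefix, vocabulary, limit=5):
    """Return up to `limit` vocabulary terms starting with `prefix`.

    Matching ignores case. An empty prefix yields no suggestions.
    """
    needle = prefix.strip().lower()
    if not needle:
        return []
    hits = [term for term in vocabulary if term.lower().startswith(needle)]
    return sorted(hits)[:limit]


# Sample terms for illustration; the real list is a subset of the
# ODM2 variable names.
sample_vocabulary = ["Discharge", "Dissolved oxygen", "Temperature"]
suggestions = suggest("dis", sample_vocabulary)
```

Because the suggestions are only hints, a keyword that matches nothing in the list can still be entered as a custom subject keyword.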

CZO Title Builder

One goal of the CZO community is to enforce a common way for the data managers to enter the title of a resource. When creating a new resource in HydroShare, the Title Builder allows a user to pick appropriate topics and add custom text and the time period to build a standardized title across all CZO data. The components of the Title Builder are:

  1. "Select CriticalZone Region" - A pick list with abbreviations for the CZO regions. For a resource representing more than one CZO, select "Cross-CZO".
  2. Available Topics and Selected Topics - Drawn from a controlled list of CZO topics. Select an available topic (or multiple topics) and click ">" to move it to the list of selected topics. To add new topics to the controlled list of CZO-related topics, email help@cuahsi.org.
  3. Optional custom text - A free text field for custom text further describing the dataset.
  4. Location - A free text field for the location where the data were collected.
  5. Start Year, End Year, and Ongoing - Two 4-digit numeric fields for the start and end year. You can also leave the end year blank and select the Ongoing check box to indicate that data collection for the resource is ongoing.

The entered information will result in a title being generated in this format:

"CZO Region -- topics -- optional text -- location (start year - end year)".

Example title:

LCZO -- Stream Water Chemistry, Stream Ecology -- Data and R scripts -- Eastern Puerto Rico (2009-2014)
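The title format can be reproduced with a short Python sketch. The function `build_title` and its parameter names are hypothetical and do not come from HydroShare. The sketch assumes the optional text segment is simply omitted when empty, and it does not cover the Ongoing case, since how that is rendered in the title is not specified here.

```python
def build_title(region, topics, location, start_year, end_year, custom_text=""):
    """Assemble a title in the standardized CZO format.

    Assumption: when custom_text is empty, its segment is left out.
    """
    if not topics:
        raise ValueError("at least one topic is required")
    for year in (start_year, end_year):
        if not 1000 <= int(year) <= 9999:
            raise ValueError("years must be 4-digit numbers")
    parts = [region, ", ".join(topics)]
    if custom_text:
        parts.append(custom_text)
    parts.append(f"{location} ({start_year}-{end_year})")
    return " -- ".join(parts)


title = build_title(
    region="LCZO",
    topics=["Stream Water Chemistry", "Stream Ecology"],
    location="Eastern Puerto Rico",
    start_year=2009,
    end_year=2014,
    custom_text="Data and R scripts",
)
```

With these inputs, `title` matches the example title shown above.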

CZO Community

Each CZO is set up as a group within HydroShare. A Community is a group of groups, so the individual CZOs are part of the CZO National Community. The goal is to create a structure that maintains individual and group control over resources while providing continuity for the larger community. Please see How to Set Up Access for a CZO Resource for how to do this. For researchers who are not part of a group, the Community provides a conceptual framework for finding CZO-related datasets. The CZO National Community was the first community to be added to HydroShare. Please see How to Manage the CZO Community for more information. To navigate to the community from the HydroShare home page, select Collaborate from the top menu. Collaborations will then list Groups and Communities. Select Community and navigate to the CZO National Community.

Communities have a publicly accessible landing page that can be customized. The community's landing page lists the community products: public and discoverable resources that are products of the CZOs within the community. Please refer to the help on landing pages for more information.
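The "group of groups" structure can be pictured with a small Python data model. This is a conceptual sketch only: the class names, the `sharing` attribute, and its values are assumptions for illustration and do not reflect HydroShare's internal data model or API.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    title: str
    sharing: str  # assumed values: "public", "discoverable", or "private"


@dataclass
class Group:
    name: str
    resources: list = field(default_factory=list)


@dataclass
class Community:
    name: str
    groups: list = field(default_factory=list)

    def products(self):
        """Public and discoverable resources across all member groups."""
        return [
            resource
            for group in self.groups
            for resource in group.resources
            if resource.sharing in ("public", "discoverable")
        ]
```

In this picture each group keeps its own list of resources, and the community only aggregates what its member groups have chosen to expose.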

Credits

Funding Agencies

This resource was created using funding from the following sources:
Agency Name | Award Title | Award Number
National Science Foundation | Development of a Critical Zone Observatory National Office | EAR-1360760
National Science Foundation | Development of a Critical Zone Observatory National Office | EAR-1840388
National Science Foundation | Development of a Critical Zone Observatory National Office | EAR-1848978

Contributors

People or Organizations that contributed technically, materially, financially, or provided general support for the creation of the resource's content but are not considered authors.

Name | Organization | Address | Phone | Author Identifiers
Mark O'Brien | CUAHSI | | |
Zhiyu (Drew) Li | University of Illinois at Urbana-Champaign | Illinois, US | |
Luigi Marini | University of Illinois at Urbana-Champaign | | |

How to Cite

Lubinski, D., M. C. Leon, C. Bode, M. Seul, L. Derry (2020). CZO to HydroShare Migration Report, HydroShare, http://www.hydroshare.org/resource/076f2c9dae5943f1b9b96d1428ea938c

This resource is shared under the Creative Commons Attribution CC BY.

http://creativecommons.org/licenses/by/4.0/
