Google Dataset Search enables easier data accessibility
Environmental data available from NCEI are expanding their reach through a partnership with Google. Google has launched Dataset Search, a dedicated search engine for environmental and social science datasets. The search enables easier, broader access to many NOAA datasets, such as weather, geophysical, and ocean records.
As the Nation’s chief archive of global weather, climate, geophysical, and ocean data, we steward the largest collection of environmental data on Earth. NCEI hosts over 38 petabytes of data, which includes a copy. The data are used by all levels of government, by individuals, and across the public and private sectors. Our authoritative environmental data, products, and services include observations by NOAA satellite, radar, and ground observing systems. Our data create economic opportunity, mitigate climate- and weather-related losses, and preserve ecological resources.
Expanding the reach of NOAA datasets through Google provides the public another avenue to find and use our data more easily. With the new search tool, opportunities open up for greater use of environmental data, according to Dr. Ed Kearns, Chief Data Officer for NOAA.
“This type of search has long been the dream for many researchers in the open data and sciences communities,” Kearns says. “And for NOAA, whose mission includes the sharing of our data with others, this tool is key to making our data more accessible to an even wider community of users.”
Ins and Outs of Datasets
Datasets consist of individual pieces of data and additional information about the data. First, there are the raw data, such as a daily maximum temperature value. The dataset also includes metadata, or the data behind the data, which add details and history. Metadata primarily describe where, when, and how the data were recorded. This information helps the public and expert users understand the data and find the best results from a search.
When many individual data records are compiled into one group, a dataset is created. For example, a user may want to find historical temperature data from cities across the globe. In this case, NOAA's Global Historical Climatology Network-Daily dataset could be used to find this information.
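To make the idea concrete, here is a minimal Python sketch of raw values paired with metadata and then compiled into a dataset. The class name, field names, station labels, and temperature values are all hypothetical and chosen only for illustration; they do not reflect the structure of any actual NOAA dataset.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Observation:
    """One raw value plus the metadata that describes it (hypothetical layout)."""
    max_temp_c: float   # raw data: a daily maximum temperature value
    station: str        # metadata: where the value was recorded
    observed_on: date   # metadata: when it was recorded
    method: str         # metadata: how it was recorded


def compile_dataset(name, observations):
    """Group many individual records under one dataset name."""
    return {"name": name, "records": list(observations)}


def records_for_station(dataset, station):
    """Use the 'where' metadata to narrow the dataset to one station."""
    return [r for r in dataset["records"] if r.station == station]


# Made-up sample records, not real measurements.
dataset = compile_dataset(
    "Example daily temperatures",
    [
        Observation(21.5, "STATION_A", date(2018, 9, 1), "ground thermometer"),
        Observation(23.0, "STATION_A", date(2018, 9, 2), "ground thermometer"),
        Observation(17.2, "STATION_B", date(2018, 9, 1), "ground thermometer"),
    ],
)

for record in records_for_station(dataset, "STATION_A"):
    print(record.observed_on, record.max_temp_c)
```

Without the station and date fields, the three temperature values would be indistinguishable, which is why metadata matter for both understanding the data and searching them.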
NOAA and other data repositories, such as NASA and Harvard’s Dataverse, also provide greater detail about their datasets. Google’s search presents that information to the user:
- Who created the dataset
- When it was published
- How the data were collected
- The terms for using the data
Dataset Search guides users to the published dataset on the provider’s website or portal. One of Google’s goals is to encourage the scientific community to use metadata standards, which strengthens the search engine’s capabilities and, ultimately, contributes to wider use of the data.
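As an illustration of what standardized metadata can look like, the sketch below builds a description covering the four items listed above (creator, publication date, collection method, and terms of use) as a JSON-LD record using the schema.org Dataset vocabulary. The article does not name a specific standard, so the choice of schema.org here is an assumption, and the function name, provider name, dates, and example.org URLs are hypothetical placeholders.

```python
import json


def build_dataset_record(name, description, creator, date_published,
                         measurement_technique, license_url, url):
    """Return a schema.org Dataset description as a plain dictionary."""
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        # Who created the dataset
        "creator": {"@type": "Organization", "name": creator},
        # When it was published
        "datePublished": date_published,
        # How the data were collected
        "measurementTechnique": measurement_technique,
        # The terms for using the data
        "license": license_url,
        # Where the published dataset lives on the provider's site
        "url": url,
    }


# Placeholder values for illustration only.
record = build_dataset_record(
    name="Example daily temperature dataset",
    description="Daily maximum temperatures from example stations.",
    creator="Example Data Provider",
    date_published="2018-09-01",
    measurement_technique="Ground station thermometer readings",
    license_url="https://example.org/terms-of-use",
    url="https://example.org/datasets/daily-temperature",
)

print(json.dumps(record, indent=2))
```

A provider would typically embed output like this in the dataset's landing page so that a crawler can read the same fields a human visitor sees.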